Notes on Reward Model Training
As I’m currently working on a project related to RL with video generation models, my dear boss asked me to study reward model training practices, and transla...
This post serves as my reading notes; it may in practice work as an educational post, but it is not intended as one.
I read some papers today. They are good.
As 2025 has been the best year of my life, I’ve decided to make 2026 even better! I decided to start this site as my starter pack for 2026. It wi...