
There's no "just" in RL. Fine-tuning is very important and can make a lot of difference.



Indeed, this is quite obvious with Claude models vs. Gemini. I fully believe Gemini is the more powerful model, but its post-training process is nowhere near what Anthropic does, which results in Gemini being horrible at coding sessions, while Claude is excellent.

Apparently GPT-5 uses the same pretrain as 4o did, hah.


