Reinforcement Learning from Human Feedback
rlhfbook.com
95 points by onurkanbkrc 9 hours ago
https://arxiv.org/abs/2504.12501
dang - 4 hours ago
Related. Others?
RLHF Book - https://news.ycombinator.com/item?id=42902936 - Feb 2025 (37 comments)
verdverm - 8 hours ago
Last time I saw Nathan say something about the book, he was actively working on the next version and looking for feedback; check his socials.
klelatti - 9 hours ago
Web version with links, etc:
dang - 4 hours ago
Thanks! We've switched to that above from https://arxiv.org/abs/2504.12501, and put the latter in the toptext.