You must log in or # to comment.
If you—like me—had no idea what “RLHF” means and could not figure it from context, it is evidently “reinforcement learning from human feedback”.
Guess they trained on my posts. I’ve always used them.
I think the better question is, why do so many people not use them?


