Reinforcement learning with human feedback (Q2177): Difference between revisions

(Created a new Item)
(One intermediate revision by the same user not shown)
Property / depends on
Property / depends on: Reinforcement learning with human feedback / rank: Normal rank

Latest revision as of 13:42, 27 January 2026

Language: English
Label: Reinforcement learning with human feedback
Description: Training a model using human preferences
Also known as: RLHF

Statements