
Reinforcement learning with human feedback (Q2177)

Revision as of 13:42, 27 January 2026 by Leonie (Changed claim: depends on (P1): Reinforcement learning with human feedback (Q2177))
Training a model using human preferences
  • RLHF
Language: English
Label: Reinforcement learning with human feedback
Description: Training a model using human preferences
Also known as: RLHF

Statements