Reinforcement learning with human feedback (Q2177): Difference between revisions
(Created a new Item)
(Changed claim: depends on (P1): Reinforcement learning with human feedback (Q2177))
Property / depends on
Property / depends on: Reinforcement learning with human feedback / rank
Normal rank
Latest revision as of 13:42, 27 January 2026
Training a model using human preferences
- RLHF
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Reinforcement learning with human feedback | Training a model using human preferences | RLHF |