Issues
Search results
- Status: Open.#5874 In huggingface/trl;
- Status: Open.#5865 In huggingface/trl;
- Status: Open.#5863 In huggingface/trl;
- Status: Open.#5847 In huggingface/trl;
- Status: Open.#5831 In huggingface/trl;
- GRPOTrainer silently uses near-greedy decoding when temperature=1.0 (transformers >= 4.50 + Qwen2.5)
  Status: Open.#5783 In huggingface/trl;
- Status: Open.#5770 In huggingface/trl;
- Status: Open.#5761 In huggingface/trl;
- Status: Open.#5742 In huggingface/trl;
- Status: Open.#5741 In huggingface/trl;
- Status: Open.#5724 In huggingface/trl;
- Status: Open.#5713 In huggingface/trl;