Reward Augmented Decoding (RAD) for Qwen 2.5 3B Uncensored Model to Reduce Toxicity

Diagram: RAD_Diagram.png

This repository contains code for steering the uncensored Qwen 2.5 3B model toward non-harmful output using Reward Augmented Decoding (RAD). As in the paper, candidate next tokens are used to generate multiple continuations; each continuation is scored for toxicity, and the least toxic one is chosen.
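The selection step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the repository's actual code: `select_least_toxic`, `toy_propose`, `toy_rollout`, and `toy_toxicity` are hypothetical names, and the toy scorer is a flagged-word counter standing in for a real toxicity classifier. In a real setup, `propose` and `rollout` would wrap the language model.

```python
from typing import Callable, List, Tuple

Tokens = List[str]


def select_least_toxic(
    prefix: Tokens,
    propose: Callable[[Tokens, int], Tokens],
    rollout: Callable[[Tokens], Tokens],
    toxicity: Callable[[Tokens], float],
    k: int = 3,
) -> Tuple[str, float]:
    """Return the candidate token whose continuation has the lowest toxicity score."""
    best_token, best_score = "", float("inf")
    for token in propose(prefix, k):
        # Extend the prefix with this candidate and let the model continue.
        continuation = rollout(prefix + [token])
        score = toxicity(continuation)
        if score < best_score:  # strict "<" keeps the earlier candidate on ties
            best_token, best_score = token, score
    if best_score == float("inf"):
        raise ValueError("propose() returned no candidates")
    return best_token, best_score


# --- Toy stand-ins (hypothetical; assumptions for illustration only) ---
FLAGGED = {"stupid", "idiot"}


def toy_propose(prefix: Tokens, k: int) -> Tokens:
    # Pretend these are the model's top-k next tokens, most likely first.
    return ["stupid", "kind", "idiot"][:k]


def toy_rollout(tokens: Tokens) -> Tokens:
    # Pretend the model extends the text with a short fixed continuation.
    return tokens + ["person"]


def toy_toxicity(tokens: Tokens) -> float:
    # Fraction of tokens that appear on the flagged-word list.
    return sum(t in FLAGGED for t in tokens) / len(tokens)


token, score = select_least_toxic(
    ["you", "are", "a"], toy_propose, toy_rollout, toy_toxicity
)
print(token, score)
```

With `k=1` only the most likely candidate is considered, so the toxicity score has no influence on the choice. Larger `k` gives the scorer more alternatives at the cost of one rollout per candidate.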

Citation: If you find this repository useful, please consider citing the following paper:

@article{zhang2023rad,
  title={Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model},
  author={Haikang Deng and Colin Raffel},
  journal={arXiv preprint arXiv:2310.09520},
  year={2023}
}

Original paper: https://arxiv.org/abs/2310.09520

Repository contents:

Hallucination_Reduction_Tests
Toxicity_Reduction
.gitattributes
.gitignore
RAD_Diagram.png
README.md

Rohit909-creator/Reward-Augmented-Decoding-RAD-

Folders and files

Latest commit

History

Repository files navigation

Reward Augmented Decoding (RAD) for Qwen 2.5 3B Uncensored Model to Reduce Toxicity

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages