- [02/05/2026] CoPE was released on arXiv!
CoPE is a plug-and-play enhancement of RoPE that softly clips the unstable low-frequency components, delivering consistent gains both within the training context and during long-context extrapolation.
With a simple yet effective soft-clipping strategy, CoPE:
1️⃣ Eliminates severe OOD outliers: components whose periods exceed the pre-training context window are the primary cause of out-of-distribution failures during extrapolation.
2️⃣ Refines long-range semantic signals by alleviating the hidden long-term decay of semantic attention introduced by RoPE.
3️⃣ Prevents spectral leakage induced by hard frequency truncation, which otherwise causes long-range oscillatory ringing in the attention scores across relative token distances and introduces spurious correlations.
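To make the intuition above concrete, here is a minimal numerical sketch of softly clipping RoPE's low frequencies. The softplus-based smooth maximum, the `sharpness` parameter, and the exact cutoff are illustrative assumptions on our part, not the formulation from the paper; see the released code for CoPE's actual definition.

```python
import numpy as np

def rope_inv_freq(dim, base=10000.0):
    # Standard RoPE inverse frequencies: theta_i = base^(-2i/d)
    return base ** (-np.arange(0, dim, 2) / dim)

def soft_clip_inv_freq(inv_freq, train_ctx=8192, sharpness=10.0):
    # Smoothly raise every frequency below the cutoff f_min = 2*pi/train_ctx
    # toward f_min, so that no rotary component has a period longer than the
    # pre-training window. A softplus-based smooth maximum avoids the abrupt
    # spectral edge that a hard max() would create. (Illustrative form only.)
    f_min = 2 * np.pi / train_ctx
    beta = sharpness / f_min  # controls how sharp the transition is
    return f_min + np.logaddexp(0.0, beta * (inv_freq - f_min)) / beta

inv_freq = rope_inv_freq(128)           # head dim 128, as in Llama-3-8B
clipped = soft_clip_inv_freq(inv_freq)  # 8k pre-training window, as in Llama-3-8B
# High frequencies pass through almost unchanged, while sub-cutoff
# frequencies are pinned just above 2*pi/8192.
```

In this sketch, the components a hard clip would snap to the cutoff are instead bent toward it smoothly, which is the property that avoids the ringing described in point 3️⃣.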
All our models and data are released on Hugging Face, including RoPE, HardClip, and CoPE checkpoints (64k) obtained via continued pre-training and SFT, starting from Llama-3-8B (8k). [Link]
Our training code is based on ProLong. We recommend using two separate environments for training and evaluation to avoid dependency conflicts.
- Set up the training environment.
cd train/
bash setup_env.sh
- Download training data.
git clone https://huggingface.co/datasets/haoranli-ml/prolong-data-64K datasets/long-context-65536
git clone https://huggingface.co/datasets/haoranli-ml/prolong-ultrachat-64K datasets/prolong-ultrachat-64K
- Start training.
bash train_64k.sh
bash train_sft.sh
The default method during training is CoPE; you can easily switch to HardClip or vanilla RoPE by modifying CoPE/train/training/modeling_flash_llama_cope.py.
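For orientation, the three variants can be viewed as different choices of the inverse-frequency vector fed to the rotary embedding. The dispatcher below is a hypothetical illustration (the function name, the clip-to-cutoff rule for HardClip, and the softplus soft clip for CoPE are our assumptions); the actual switch lives in the modeling file above.

```python
import numpy as np

def make_inv_freq(dim, variant="cope", base=10000.0, train_ctx=8192):
    # Illustrative dispatcher for the three positional-encoding variants.
    # (Hypothetical sketch; not the repository's real implementation.)
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)
    f_min = 2 * np.pi / train_ctx  # slowest frequency whose full period fits in training
    if variant == "rope":          # vanilla RoPE: leave all frequencies alone
        return inv_freq
    if variant == "hardclip":      # hard truncation at the cutoff (assumed form)
        return np.maximum(inv_freq, f_min)
    if variant == "cope":          # soft clip via a softplus smooth maximum (assumed form)
        beta = 10.0 / f_min
        return f_min + np.logaddexp(0.0, beta * (inv_freq - f_min)) / beta
    raise ValueError(f"unknown variant: {variant}")
```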
We primarily conduct evaluations on the HELMET benchmark, given its diverse collection of real-world tasks, while also including results on synthetic benchmarks such as RULER and InfiniteBench. For standard short-context benchmark evaluations, we utilize lm-evaluation-harness.
- Set up the evaluation environment.
bash setup_eval_env.sh
- Replace
CoPE/transformers-4.50.0/src/transformers/models/llama/modeling_llama.py
with CoPE/modeling_cope.py, keeping the file name modeling_llama.py. Note that you must modify LlamaRotaryEmbedding in the modeling file to select CoPE, RoPE, or HardClip.
@article{li2026cope,
title={CoPE: Clipped RoPE as A Scalable Free Lunch for Long Context LLMs},
author={Li, Haoran and Ren, Sucheng and Yuille, Alan and Wang, Feng},
journal={arXiv preprint arXiv:2602.05258},
year={2026}
}

