goodfire-ai/memorization_kfac

K-FAC curvature edit (minimal)

Two commands to reproduce the K-FAC treatment used in our paper.

TL;DR

Compute A = E[aa^T] (second moment of the layer's inputs) and G = E[gg^T] (second moment of the gradients at its pre-activation outputs) for each MLP projection, eigendecompose both, and keep only the top curvature mass when editing each weight W. This suppresses rote recitation while preserving shared structure.
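As a rough illustration of the idea, here is a minimal NumPy sketch of a curvature-truncating edit. The function name and details are hypothetical; the repo's actual PyTorch implementation may differ in how it ranks and masks components.

```python
import numpy as np

def kfac_edit(W, A, G, keep_mass=0.8):
    """Keep only the coefficients of W carrying the top `keep_mass`
    fraction of K-FAC curvature.

    A: (d_in, d_in) input second moment E[aa^T]
    G: (d_out, d_out) output-gradient second moment E[gg^T]
    The K-FAC curvature of W factorizes as G (kron) A, so the rank-one
    directions u_G u_A^T form an eigenbasis with eigenvalue lam_G * lam_A.
    """
    lam_A, U_A = np.linalg.eigh(A)          # ascending eigenvalues
    lam_G, U_G = np.linalg.eigh(G)
    C = U_G.T @ W @ U_A                      # W in the Kronecker eigenbasis
    curv = np.outer(lam_G, lam_A)            # curvature of each coefficient
    # Rank coefficients by curvature and keep the top `keep_mass` fraction.
    order = np.argsort(curv, axis=None)[::-1]
    cum = np.cumsum(np.sort(curv, axis=None)[::-1])
    n_keep = int(np.searchsorted(cum, keep_mass * cum[-1]) + 1)
    mask = np.zeros(C.size, dtype=bool)
    mask[order[:n_keep]] = True
    C_kept = np.where(mask.reshape(C.shape), C, 0.0)
    return U_G @ C_kept @ U_A.T              # back to the original basis
```

With keep_mass=1.0 this is the identity (all coefficients survive); lowering it zeroes the low-curvature directions, which is where rote memorization is hypothesized to live.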

Requirements

  • Python 3.10+
  • PyTorch (CUDA recommended)
  • NumPy
  • transformers (if loading Hugging Face models)

Install your environment as usual, or add a requirements.txt and run `uv pip install -r requirements.txt`.

Usage

1. Collect K-FAC factors

python data/collect_kfac_multilayer.py \
  --model-size 7b \
  --layers 28,29,30,31 \
  --projections gate,up,down \
  --out data/kfac_factors/olmo2-7b

Streams text through the model and saves A = E[aa^T] and G = E[gg^T] for each requested MLP projection.
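In PyTorch this is typically done by registering a forward hook on each projection to capture its inputs a, and a backward hook to capture the gradients g at its outputs. The accumulation itself reduces to a running second moment, sketched here in NumPy (class name hypothetical; the repo's collection script may batch and normalize differently):

```python
import numpy as np

class SecondMoment:
    """Running estimate of E[x x^T] over streamed batches of vectors."""

    def __init__(self, dim):
        self.sum = np.zeros((dim, dim))
        self.count = 0

    def update(self, X):
        # X: (n_tokens, dim) -- activations or gradients from one batch
        self.sum += X.T @ X
        self.count += X.shape[0]

    def value(self):
        return self.sum / max(self.count, 1)
```

One such accumulator per factor (A from inputs, G from output gradients) per projection yields the saved K-FAC factors.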

2. Apply edit & evaluate

python evaluations/eval_mem_kfac.py \
  --model-size 7b \
  --layers-json '{"31": {"gate": 0.8, "up": 0.8, "down": 0.8}}' \
  --use-cache

Each keep-mass value in --layers-json lies in [0, 1] and controls the fraction of curvature mass to retain for that projection.
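For intuition, a given keep-mass fraction maps to a number of retained eigendirections via the cumulative eigenvalue spectrum. A small hypothetical helper (not part of the repo's scripts):

```python
import numpy as np

def n_components_for_mass(eigvals, keep_mass):
    """Smallest number of top eigenvalues whose sum reaches
    `keep_mass` of the total spectrum mass."""
    s = np.sort(np.asarray(eigvals))[::-1]   # descending
    cum = np.cumsum(s)
    return int(np.searchsorted(cum, keep_mass * cum[-1]) + 1)
```

For a spectrum [4, 3, 2, 1], keep_mass=0.7 needs only the top 2 eigenvalues (4 + 3 = 7 out of 10), showing how a flat keep-mass can still discard many low-curvature directions.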

Outputs

  • Printed/saved metrics from the evaluator (e.g., perplexity and any configured memorization metrics).
  • Optionally, an edited state dict / checkpoint depending on script flags.

Citation

If this code helps your work, please cite the paper:
OpenReview: https://openreview.net/pdf?id=MzRDxPUmgK
