Hi Again
I used float16 in the parameter settings and can now load the model into memory. However, when I run the demo, I get this error:
Error(s) in loading state_dict for LLaMA: size mismatch for transformer.h.0.attn.c_attn.lora_A: copying a param with shape torch.Size([128, 5120]) from checkpoint, the shape in current model is torch.Size([128, 4096]).
I think I get this error because the checkpoint stores a parameter of shape [128, 5120], while the current model expects [128, 4096]. Since 5120 is the hidden size of LLaMA 13B and 4096 is the hidden size of LLaMA 7B, the LoRA checkpoint may have been trained against a different LLaMA variant than the 7B model I downloaded from HuggingFace.
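A quick way to check which base model the adapter matches is to print the shapes of the LoRA tensors in the checkpoint. This is a minimal sketch; the path lora_checkpoint.pt is a placeholder for the actual adapter file, and it assumes the file holds a plain state_dict of tensors:

```python
import torch

# Load the adapter weights onto the CPU without instantiating the full model.
# Assumes the checkpoint is a flat state_dict (name -> tensor) of LoRA weights.
state_dict = torch.load("lora_checkpoint.pt", map_location="cpu")

# Print the shape of every LoRA tensor; a second dimension of 5120
# points to the 13B hidden size, while 4096 would match LLaMA 7B.
for name, tensor in state_dict.items():
    if "lora" in name:
        print(name, tuple(tensor.shape))
```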
Do you have any ideas on how to solve this issue?
Thanks a lot for your support.