The code seems to be referencing hf_hub, but I do not want to download the model each time. Right now my code looks like this:
import torch
from transformers import AutoModelForCausalLM

def load(self):
    access_token = "hf_rdS...."
    self._model = AutoModelForCausalLM.from_pretrained(
        "./mistralaiprivate/Mistral-7B-Instruct-v0.1",
        torch_dtype=torch.float16,
        device_map="auto",
        token=access_token,
    )
I'm using this to deploy a Docker image for the model.
Could you please help me out here?
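
For reference, here is a minimal sketch of one way this might be handled. It assumes the model files were already saved under the path used above (for example, copied into the Docker image at build time). The helper names `resolve_local_model_dir` and `load_model` are hypothetical. `local_files_only=True` and the `HF_HUB_OFFLINE` environment variable are existing Hugging Face options that should keep `from_pretrained` from contacting the Hub.

```python
import os
from pathlib import Path

# Path taken from the post; assumed to contain a model saved with save_pretrained().
MODEL_DIR = Path("./mistralaiprivate/Mistral-7B-Instruct-v0.1")


def resolve_local_model_dir(model_dir) -> str:
    """Return the absolute model path, failing early if it is not on disk.

    Hypothetical helper: it only checks for config.json, which a directory
    written by save_pretrained() contains.
    """
    path = Path(model_dir).resolve()
    if not (path / "config.json").is_file():
        raise FileNotFoundError(f"No config.json under {path}; is the model in the image?")
    return str(path)


def load_model(model_dir=MODEL_DIR):
    """Load the model strictly from local files (sketch, not tested on the real model)."""
    # The offline switch is read when huggingface_hub is first imported, so it is
    # set before the imports below. If transformers was already imported elsewhere,
    # local_files_only=True is what still prevents Hub lookups.
    os.environ.setdefault("HF_HUB_OFFLINE", "1")

    import torch
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        resolve_local_model_dir(model_dir),
        torch_dtype=torch.float16,
        device_map="auto",
        local_files_only=True,  # never fall back to downloading from the Hub
    )
```

With this approach the access token should not be needed at load time, since nothing is fetched. In a Docker setup the same idea would mean adding the model directory to the image (or mounting it as a volume) and setting `HF_HUB_OFFLINE=1` in the container environment.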