System hangs when using Llamacpp as LLM #2

The following code appears to load the llama.cpp model properly, but then the CPU load ramps up and the script hangs, for hours if I let it run.
If service_context=service_context is removed from GPTSimpleVectorIndex.from_documents(), it uses OpenAI's API and works fine. What step is missing to run LLaMA locally? It prints all the usual llama.cpp loading debug output, so the model is being loaded.
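```
from langchain.llms import LlamaCpp
from llama_index import (
    GPTSimpleVectorIndex,
    LLMPredictor,
    ServiceContext,
    download_loader,
)

# Wrap the local llama.cpp model so llama_index uses it as the LLM.
llm_predictor = LLMPredictor(llm=LlamaCpp(model_path="~/Code/llama.cpp/models/30B/ggml-model-q4_0.bin", n_threads=10))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)

# Load the notes from the Obsidian vault.
ObsidianReader = download_loader('ObsidianReader')
documents = ObsidianReader('~/Documents/Obsidian').load_data()

# Build the index with the local model; without service_context it uses OpenAI's API.
index = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)

print(index.query("Any query here"))
```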
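One way to narrow this down might be to time a single short completion from the model on its own, outside the index. This is only a minimal sketch: it assumes the "hang" could be very slow CPU inference rather than a true deadlock, which the report above does not confirm. `time_call` and `fake_llm` are hypothetical names introduced here for illustration, and `fake_llm` is a stand-in so the snippet runs without a model file.

```python
import time
from typing import Callable, Tuple


def time_call(generate: Callable[[str], str], prompt: str) -> Tuple[str, float]:
    """Run generate(prompt) once and return (output, elapsed seconds)."""
    start = time.perf_counter()
    output = generate(prompt)
    elapsed = time.perf_counter() - start
    return output, elapsed


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for the real model so the sketch is runnable as-is.
    return "echo: " + prompt


output, seconds = time_call(fake_llm, "Hello")
print(f"{len(output)} characters in {seconds:.3f}s")
```

Passing the LlamaCpp object from the snippet above in place of `fake_llm` (assuming it can be called directly with a prompt string, as LangChain LLM wrappers of that period could) would show how long one short prompt takes. If that alone takes minutes, a query that sends much longer prompts could plausibly run for a very long time.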
