An intelligent Q&A system powered by Retrieval-Augmented Generation (RAG).
DeepKnowledge.net is an advanced chatbot that integrates large language models with your private data sources using Retrieval-Augmented Generation (RAG). This approach provides precise, source-grounded answers while ensuring data privacy.
- Multi-source Integration: Seamlessly process content from websites and documents (PDF/DOCX).
- Source Citation: Offers transparent references to original data sources for every response.
- Relevance Scoring: Efficiently ranks information based on query relevance.
- Conversational Memory: Supports context-aware follow-up questions to maintain dialogue continuity.
- Language Models: Uses OpenRouter as the single API provider while keeping DeepSeek for chat interactions and OpenAI's text-embedding-ada-002 for embeddings.
- RAG Framework: Powered by LlamaIndex.
- Vector Store: Employs LlamaIndex In-Memory Vector Store for efficient data retrieval.
- User Interface: Built with Streamlit for a seamless web experience.
-
Clone the repository:
git clone https://github.com/ErnestAroozoo/DeepKnowledge.net.git cd DeepKnowledge.net -
Install necessary dependencies:
pip install -r requirements.txt
-
Set up environment variables:
cp .env.example .env # then edit .env and paste your OpenRouter API key
Update the .env file with your OpenRouter credentials:
# OpenRouter Configuration
OPENROUTER_API_KEY=your-openrouter-key
OPENROUTER_API_HOST=https://openrouter.ai/api/v1
OPENROUTER_CHAT_MODEL=deepseek/deepseek-chat
OPENROUTER_EMBED_MODEL=text-embedding-ada-002
# OpenRouter Headers
OPENROUTER_SITE_URL=https://DeepKnowledge.net
OPENROUTER_APP_NAME=DeepKnowledge.netNote: You only need an OpenRouter API key now.
-
Launch the application:
streamlit run app.py
-
Add data sources:
- Websites: Input valid URLs for content parsing.
- Documents: Upload PDF/DOCX files for text extraction.
-
Engage with the chatbot by:
- Asking natural language queries.
- Following up with questions using chat history.
- Requesting source verification for responses.
| Type | Formats | Processing Method |
|---|---|---|
| Web Content | URLs | Web page parsing |
| Documents | PDF, DOCX | Text extraction |
