diff --git a/.github/workflows/code-scans.yaml b/.github/workflows/code-scans.yaml
index 5139404..e044d7c 100644
--- a/.github/workflows/code-scans.yaml
+++ b/.github/workflows/code-scans.yaml
@@ -7,7 +7,7 @@ on:
         description: 'Pull request number'
         required: true
   push:
-    branches: [ main ]
+    branches: [ main, dev/Audify ]
   pull_request:
     types: [opened, synchronize, reopened, ready_for_review]
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index f2ae3c8..efbb177 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1,22 +1,313 @@
-# Contributing to Audify
-
-Thank you for your interest in contributing to the
-**Audify** by Cloud2 Labs.
-
-## Scope of Contributions
-Appropriate contributions include:
-- Documentation improvements
-- Bug fixes
-- Reference architecture enhancements
-- Educational clarity and examples
-
-Major feature additions or architectural changes require prior discussion with
-the Cloud2 Labs maintainers.
-
-## Contribution Guidelines
-- Follow existing coding and documentation standards
-- Avoid production-specific assumptions
-- Do not introduce sensitive, proprietary, or regulated data
-
-By submitting a contribution, you agree that your work may be used, modified,
-and redistributed by Cloud2 Labs.
+# Contributing to Audify
+
+Thanks for your interest in contributing to Audify.
+
+Audify is an open-source AI application that turns documents into editable, two-speaker podcast-style scripts and downloadable audio, using a FastAPI microservices backend and a React frontend. We welcome improvements across the codebase, documentation, bug reports, UX refinements, observability, and feature enhancements.
+
+Before you start, read the relevant section below. It helps keep contributions focused, reviewable, and aligned with the current project setup.
+
+---
+
+## Quick Setup Checklist
+
+Before you dive in, make sure you have these installed:
+
+```bash
+# Check Python (3.11+ recommended)
+python --version
+
+# Check Node.js (18+ recommended)
+node --version
+
+# Check npm
+npm --version
+
+# Check Docker
+docker --version
+docker compose version
+
+# Check Git
+git --version
+```
+
+New to contributing?
+
+1. Open an issue or pick an existing one to work on.
+2. Sync your branch from `dev/Audify` (see the sketch after this list).
+3. Follow the local setup guide below.
+4. Run the app locally and verify your change before opening a PR.
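+
+Step 2 in practice, as a minimal sketch (assuming your `origin` remote points at this repository; adjust the remote name if you work from a fork):
+
+```bash
+# Update the development branch, then branch off it for your change
+git fetch origin
+git checkout dev/Audify
+git pull origin dev/Audify
+git checkout -b fix/your-change   # use a descriptive branch name
+```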
+
+## Table of Contents
+
+- [How do I...?](#how-do-i)
+  - [Get help or ask a question?](#get-help-or-ask-a-question)
+  - [Report a bug?](#report-a-bug)
+  - [Suggest a new feature?](#suggest-a-new-feature)
+  - [Set up Audify locally?](#set-up-audify-locally)
+  - [Start contributing code?](#start-contributing-code)
+  - [Improve the documentation?](#improve-the-documentation)
+  - [Submit a pull request?](#submit-a-pull-request)
+- [Code guidelines](#code-guidelines)
+- [Pull request checklist](#pull-request-checklist)
+- [Branching model](#branching-model)
+- [Thank you](#thank-you)
+
+---
+
+## How do I...?
+
+### Get help or ask a question?
+
+- Start with the main project docs in [`README.md`](./README.md), [`docs/PROJECT_ARCHITECTURE.md`](./docs/PROJECT_ARCHITECTURE.md), and the service-level READMEs under [`api/`](./api).
+- Review relevant config files such as [`simple_backend.py`](./simple_backend.py), [`api/llm-service/app/config.py`](./api/llm-service/app/config.py), and [`api/tts-service/app/config.py`](./api/tts-service/app/config.py).
+- If something is still unclear, open a GitHub issue with your question and the context you already checked.
+
+### Report a bug?
+
+1. Search existing issues first.
+2. If the bug is new, open a GitHub issue.
+3. Include your environment, what happened, what you expected, and exact steps to reproduce.
+4. Add screenshots, logs, request payloads, or response details if relevant.
+
+### Suggest a new feature?
+
+1. Open a GitHub issue describing the feature.
+2. Explain the problem, who it helps, and how it fits Audify.
+3. If the change is large, get alignment in the issue before writing code.
+
+### Set up Audify locally?
+
+#### Prerequisites
+
+- Python 3.11+
+- Node.js 18+ and npm
+- Git
+- Docker with Docker Compose v2
+- One inference path for script generation:
+  - Ollama or another OpenAI-compatible local inference endpoint, or
+  - an OpenAI-compatible API endpoint for fallback or hosted inference
+- OpenAI API key for TTS generation
+
+#### Option 1: Local Development
+
+##### Step 1: Clone the repository
+
+```bash
+git clone https://github.com/cld2labs/Audify.git
+cd Audify
+```
+
+##### Step 2: Configure environment variables
+
+Create the root `.env` file:
+
+```bash
+cp .env.example .env
+```
+
+If `.env.example` is not present in your branch, create `.env` manually using the values documented in [`README.md`](./README.md).
+
+Create `api/llm-service/.env` with your inference settings. Example:
+
+```env
+SERVICE_PORT=8002
+OPENAI_API_KEY=sk-...
+OPENAI_BASE_URL=
+INFERENCE_API_ENDPOINT=
+INFERENCE_API_TOKEN=
+INFERENCE_MODEL_NAME=gpt-4o-mini
+VLLM_BASE_URL=http://localhost:11434/v1
+VLLM_MODEL=Qwen/Qwen3-1.7B
+DEFAULT_MODEL=gpt-4o-mini
+DEFAULT_TONE=conversational
+DEFAULT_MAX_LENGTH=2000
+TEMPERATURE=0.7
+MAX_TOKENS=2048
+MAX_RETRIES=3
+```
+
+Create `api/tts-service/.env` with your TTS settings. Example:
+
+```env
+SERVICE_PORT=8003
+OPENAI_API_KEY=sk-...
+TTS_MODEL=tts-1-hd
+DEFAULT_HOST_VOICE=alloy
+DEFAULT_GUEST_VOICE=nova
+OUTPUT_DIR=static/audio
+AUDIO_FORMAT=mp3
+AUDIO_BITRATE=192k
+SILENCE_DURATION_MS=500
+MAX_CONCURRENT_REQUESTS=5
+MAX_SCRIPT_LENGTH=100
+```
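+
+Before moving on, it can help to confirm all three env files exist. A quick check (a sketch; adjust the paths if your setup differs):
+
+```bash
+# Verify the expected .env files are in place
+for f in .env api/llm-service/.env api/tts-service/.env; do
+  [ -f "$f" ] && echo "OK       $f" || echo "MISSING  $f"
+done
+```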
+
+##### Step 3: Install backend dependencies
+
+```bash
+python -m venv .venv
+source .venv/bin/activate  # Windows: .venv\Scripts\activate
+pip install -r requirements.txt
+pip install -r api/pdf-service/requirements.txt
+pip install -r api/llm-service/requirements.txt
+pip install -r api/tts-service/requirements.txt
+```
+
+##### Step 4: Install frontend dependencies
+
+```bash
+cd ui
+npm install
+cd ..
+```
+
+##### Step 5: Start the backend services
+
+Open separate terminals and start:
+
+```bash
+# Terminal 1: gateway
+source .venv/bin/activate
+python simple_backend.py
+```
+
+```bash
+# Terminal 2: PDF service
+source .venv/bin/activate
+cd api/pdf-service
+uvicorn app.main:app --reload --host 0.0.0.0 --port 8001
+```
+
+```bash
+# Terminal 3: LLM service
+source .venv/bin/activate
+cd api/llm-service
+uvicorn app.main:app --reload --host 0.0.0.0 --port 8002
+```
+
+```bash
+# Terminal 4: TTS service
+source .venv/bin/activate
+cd api/tts-service
+uvicorn app.main:app --reload --host 0.0.0.0 --port 8003
+```
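+
+If juggling four terminals gets tedious, a small helper script can launch everything at once. Below is a minimal sketch (a hypothetical `scripts/dev.sh`, not part of the repo; run it from the repository root):
+
+```bash
+#!/usr/bin/env bash
+set -e
+source .venv/bin/activate
+
+# Stop every background service when this script exits (e.g. on Ctrl+C)
+trap 'kill 0' EXIT
+
+python simple_backend.py &
+(cd api/pdf-service && uvicorn app.main:app --reload --host 0.0.0.0 --port 8001) &
+(cd api/llm-service && uvicorn app.main:app --reload --host 0.0.0.0 --port 8002) &
+(cd api/tts-service && uvicorn app.main:app --reload --host 0.0.0.0 --port 8003) &
+
+wait
+```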
+
+##### Step 6: Start the frontend
+
+Open another terminal:
+
+```bash
+cd ui
+npm run dev
+```
+
+##### Step 7: Access the application
+
+- Frontend: `http://localhost:5173` in local Vite development, or `http://localhost:3000` when using Docker
+- Backend gateway health check: `http://localhost:8000/health`
+- PDF service docs: `http://localhost:8001/docs`
+- LLM service docs: `http://localhost:8002/docs`
+- TTS service docs: `http://localhost:8003/docs`
+
+#### Option 2: Docker
+
+From the repository root:
+
+```bash
+# Create and configure the required env files first
+docker compose up --build
+```
+
+This starts:
+
+- Frontend on `http://localhost:3000`
+- Backend gateway on `http://localhost:8000`
+- PDF service on `http://localhost:8001`
+- LLM service on `http://localhost:8002`
+- TTS service on `http://localhost:8003`
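+
+Whether you run the stack locally or with Docker, you can smoke-test it from the command line using the endpoints listed above (a sketch; each docs page should return HTTP 200 once its service is up):
+
+```bash
+# Gateway health check
+curl -s http://localhost:8000/health
+
+# Each service's OpenAPI docs page should respond with 200
+for port in 8001 8002 8003; do
+  echo -n "port $port: "
+  curl -s -o /dev/null -w "%{http_code}\n" "http://localhost:$port/docs"
+done
+```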
+
+#### Common Troubleshooting
+
+- If ports `3000`, `8000`, `8001`, `8002`, or `8003` are already in use, find the conflicting process (for example with `lsof -i :8000`) and stop it before starting Audify.
+- If script generation fails, confirm the LLM service `.env` points to a reachable model endpoint.
+- If you use Ollama with Docker, make sure Ollama is running on the host and reachable from the container.
+- If audio generation fails, verify `OPENAI_API_KEY` is set in `api/tts-service/.env`.
+- If a Docker build fails, clear cached layers with `docker compose build --no-cache`, then run `docker compose up` again.
+- If Python packages fail to install, confirm you are using a supported Python version.
+
+### Start contributing code?
+
+1. Open or choose an issue.
+2. Create a feature branch from `dev/Audify`.
+3. Keep the change focused on a single problem.
+4. Run the app locally and verify the affected workflow.
+5. Update docs when behavior, setup, configuration, or architecture changes.
+6. Open a pull request from your feature branch into `dev/Audify`.
+
+### Improve the documentation?
+
+Documentation updates are welcome. Relevant files currently live in:
+
+- [`README.md`](./README.md)
+- [`docs/`](./docs/)
+- [`api/pdf-service/README.md`](./api/pdf-service/README.md)
+- [`api/llm-service/README.md`](./api/llm-service/README.md)
+- [`api/tts-service/README.md`](./api/tts-service/README.md)
+- [`benchmarks/README.md`](./benchmarks/README.md)
+
+### Submit a pull request?
+
+Follow the checklist below before opening your PR. Your pull request should:
+
+- Stay focused on one issue or topic.
+- Explain what changed and why.
+- Include manual verification steps.
+- Include screenshots or short recordings for UI changes.
+- Reference the related GitHub issue when applicable.
+
+Note: pull requests should target the `dev/Audify` branch unless maintainers ask otherwise.
+
+---
+
+## Code guidelines
+
+- Follow the existing project structure and patterns before introducing new abstractions.
+- Keep frontend changes consistent with the React + Vite + Tailwind setup already in use under [`ui/`](./ui/).
+- Keep backend changes consistent with the FastAPI microservice structure under [`api/`](./api/) and the gateway in [`simple_backend.py`](./simple_backend.py).
+- Avoid unrelated refactors in the same pull request.
+- Do not commit secrets, API keys, local `.env` files, generated audio, or benchmark artifacts that do not belong in version control.
+- Prefer clear, small commits and descriptive pull request summaries.
+- Update documentation when contributor setup, behavior, environment variables, or service logic changes.
+- If you change API contracts, verify both the service endpoint and the frontend consumer still match.
+
+---
+
+## Pull request checklist
+
+Before submitting your pull request, confirm the following:
+
+- You tested the affected flow locally.
+- The application still starts successfully in the environment you changed.
+- You removed debug code, stray logs, and commented-out experiments.
+- You documented any new setup steps, environment variables, or behavior changes.
+- You kept the pull request scoped to one issue or topic.
+- You added screenshots for UI changes when relevant.
+- You did not commit secrets, API keys, sample private documents, or generated media outputs by mistake.
+- You are opening the pull request against `dev/Audify`.
+
+If one or more of these are missing, the pull request may be sent back for changes before review.
+
+---
+
+## Branching model
+
+- Base new work from `dev/Audify`.
+- Open pull requests against `dev/Audify`.
+- Use descriptive branch names such as `fix/script-generation-timeout` or `docs/update-contributing-guide`.
+- Rebase or merge the latest `dev/Audify` before opening your PR if your branch has drifted.
+
+---
+
+## Thank you
+
+Thanks for contributing to Audify. Whether you are fixing a bug, improving the docs, refining the UI, strengthening the service architecture, or making the generation workflow more reliable, your work helps make the project more useful and easier to maintain.
diff --git a/README.md b/README.md
index 6cfaefa..32e39cf 100644
--- a/README.md
+++ b/README.md
@@ -395,14 +395,17 @@ The table below compares inference performance across different providers, deplo
 
 | Provider | Model | Deployment | Context Window | Avg Input Tokens | Avg Output Tokens | Avg Tokens / Request | P50 Latency (ms) | P95 Latency (ms) | Throughput (req/s) | Hardware |
 |----------|-------|------------|----------------|------------------|-------------------|----------------------|------------------|------------------|--------------------|----------|
 | vLLM | `Qwen/Qwen3:1.7b` | Local | 4,096 | 1,183 | 1,308 | 2,492 | 58,855 | 59,773 | 0.0162 | Apple Silicon Metal (Macbook Pro M4) |
+| OPEA EI / SLM | `Qwen/Qwen3:1.7b` | Local | 8,192 | 1,075 | 350 | 1,425 | 10,446 | 23,445 | 0.0853 | Xeon CPU (CPU only) |
 | OpenAI (Cloud) | `gpt-4o-mini` | API (Cloud) | 128K | 1,625 | 680 | 2,330 | 19,848 | 20,733 | 0.0276 | Cloud GPUs |
 
 > **Notes:**
 >
 > - Context Window for vLLM (4,096) reflects the `LLM_MAX_TOKENS` / `--max-model-len` used during benchmarking, not the model's native maximum context. vLLM shares its configured context between input and output tokens.
+> - OPEA EI is configured with an 8,192-token context window for this benchmark run.
 > - All benchmarks use the same Audify script-generation prompt and identical inputs across 3 runs.
 > - Token counts may vary slightly per run due to non-deterministic model output.
 > - vLLM on Apple Silicon requires [vllm-metal](https://github.com/vllm-project/vllm-metal); the standard `pip install vllm` package does not provide macOS Metal support.
+> - [Intel OPEA EI](https://github.com/opea-project/Enterprise-Inference) runs on Intel Xeon CPUs without GPU acceleration.
 
 ---