diff --git a/.env.example b/.env.example
index d6eabcc..3bebfa5 100644
--- a/.env.example
+++ b/.env.example
@@ -6,9 +6,9 @@ OPENAI_API_KEY=your_openai_api_key_here
 # DeepSeek API Key for summarization
 DEEPSEEK_API_KEY=
 
-# Origin that is allowed to use the `OPENAI_API_KEY` & `DEEPSEEK_API_KEY` provided by the API
-# instead of needing to provide them as `Authorization` in each request from client
-ALLOWED_ORIGIN=http://localhost:3001
+# Origins that are allowed to use the AI models keys provided by the API, seperate them using a comma
+# instead of need to provide them as Authorization in each request from client
+ALLOWED_ORIGINS=http://localhost:3001
 
 # Max Tokens (Limit the response length)
 OPENAI_MAX_TOKENS=500
@@ -22,4 +22,7 @@ AWS_S3_BUCKET=your-bucket-name
 
 USE_S3=false
 
-MAX_TRANSCRIPT_TOKENS=
\ No newline at end of file
+MAX_TRANSCRIPT_TOKENS=
+
+FASTAPI_URL=
+GEMENI_API_KEY=
\ No newline at end of file
diff --git a/.github/workflows/deploy.yml b/.github/workflows/deploy.yml
new file mode 100644
index 0000000..c95b9a9
--- /dev/null
+++ b/.github/workflows/deploy.yml
@@ -0,0 +1,31 @@
+name: Deploy API
+
+on:
+  push:
+    branches:
+      - prod
+
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout repo
+        uses: actions/checkout@v4
+
+      - name: Setup SSH
+        run: |
+          mkdir -p ~/.ssh
+          echo "${{ secrets.PROD_SSH_KEY }}" > ~/.ssh/id_rsa
+          chmod 600 ~/.ssh/id_rsa
+          ssh-keyscan -H ${{ secrets.PROD_HOST }} >> ~/.ssh/known_hosts
+
+      - name: Deploy via SSH
+        run: |
+          ssh ${{ secrets.PROD_USER }}@${{ secrets.PROD_HOST }} << 'EOF'
+            cd ~/api
+            git pull origin prod
+            pnpm install
+            pnpm build
+            pm2 restart letssummarize-api
+          EOF
diff --git a/docs/authentication.md b/docs/authentication.md
index 5808b07..c4c19d3 100644
--- a/docs/authentication.md
+++ b/docs/authentication.md
@@ -3,7 +3,7 @@
 The Summarization API uses **API Key authentication** to control access.
 There are two ways to provide an API key:
 
-1. **Environment Variables (`.env`)** – The API owner can set `OPENAI_API_KEY` and `DEEPSEEK_API_KEY` to allow authorized access without requiring clients to provide API keys. However this only works if the **origin** is same as the value of `ALLOWED_ORIGIN`.
+1. **Environment Variables (`.env`)** – The API owner can set `OPENAI_API_KEY` and `DEEPSEEK_API_KEY` to allow authorized access without requiring clients to provide API keys. However this only works if the **origin** is a value in `ALLOWED_ORIGINS`.
 2. **Request Headers** – Clients can include an API key in the `Authorization` header for each request.
 
 > When there is both a `.env` key and a request header key, the request header key will be used.
@@ -45,15 +45,15 @@ Authorization: Bearer YOUR_API_KEY
 
 ## 2. Allowed Origin Configuration
 
-The API restricts access to the default keys to specific **frontend applications** by defining `ALLOWED_ORIGIN`.
+The API restricts access to the default keys to specific **frontend applications** by defining `ALLOWED_ORIGINS`.
 If an origin is **not** allowed, then the api key must be provided in the request headers.
 
 ```ini
-ALLOWED_ORIGIN=http://localhost:3001
+ALLOWED_ORIGINS=http://localhost:3001,http://localhost:3000
 ```
 
-- If a frontend **matches `ALLOWED_ORIGIN`**, it **does not** need to send API keys.
-- If a frontend **is not listed in `ALLOWED_ORIGIN`**, it must include API keys in request headers.
+- If a frontend **is listed in `ALLOWED_ORIGINS`**, it **does not** need to send API keys.
+- If a frontend **is not listed in `ALLOWED_ORIGINS`**, it must include API keys in request headers.
 
 ---
 
@@ -65,9 +65,9 @@ The API uses a **security guard (`ApiKeyGuard`)** to verify API keys before proc
 
 | **Scenario** | **What Happens?** | **Notes** |
 | ---------------------------------------------- | -------------------------------------------- | -------------------------------------------- |
-| **Valid API Key in Request Header** | ✅ Request is allowed | Does not require origin to be provided in `ALLOWED_ORIGIN` |
-| **Valid API Key in `.env` but not in request** | ✅ Request is allowed (uses `.env` key) | Requires origin to be provided in `ALLOWED_ORIGIN` |
-| **Valid API Key in `.env` and in request** | ✅ Request is allowed (uses request api key) | ✅Even if the origin is same as the valud of `ALLOWED_ORIGIN`, api key in the request will be used |
+| **Valid API Key in Request Header** | ✅ Request is allowed | Does not require origin to be provided in `ALLOWED_ORIGINS` |
+| **Valid API Key in `.env` but not in request** | ✅ Request is allowed (uses `.env` key) | Requires origin to be provided in `ALLOWED_ORIGINS` |
+| **Valid API Key in `.env` and in request** | ✅ Request is allowed (uses request api key) | ✅Even if the origin is provided in `ALLOWED_ORIGINS`, api key in the request will be used |
 | **No API Key provided in request or `.env`** | ❌ Request is rejected | |
 
 ---
@@ -81,7 +81,7 @@ curl -X POST http://localhost:3000/summarize/text \
   -d '{ "content": { "text": "This is a test." }, "options": {} }'
 ```
 
-This will be valid only if the `ALLOWED_ORIGIN` is `http://localhost:3001`.
+This will be valid only if the `ALLOWED_ORIGINS` contains `http://localhost:3001`.
 
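Review note: the documentation hunks above describe `ALLOWED_ORIGINS` as a comma-separated list that the request origin is checked against. A minimal sketch of that behaviour follows, for illustration only. The helper names `parseAllowedOrigins` and `isAllowedOrigin` are hypothetical; the repository's own parsing lives in `src/utils/constants`, which is not shown in this diff and may differ (for example in how it trims whitespace or handles an unset variable).

```typescript
// Hypothetical helper (not from the repository): split a comma-separated
// env value into a list of origins, trimming whitespace and dropping empties.
function parseAllowedOrigins(raw: string | undefined): string[] {
  return (raw ?? '')
    .split(',')
    .map((origin) => origin.trim())
    .filter((origin) => origin.length > 0);
}

// Hypothetical helper: true only when an Origin header is present
// and matches one of the configured entries exactly.
function isAllowedOrigin(origin: string | undefined, allowed: string[]): boolean {
  return origin !== undefined && allowed.indexOf(origin) !== -1;
}

const allowed = parseAllowedOrigins('http://localhost:3001, http://localhost:3000');
console.log(isAllowedOrigin('http://localhost:3001', allowed)); // true
console.log(isAllowedOrigin('https://example.com', allowed)); // false
```

The comparison is an exact string match, so a trailing slash or a different port would count as a different origin under this sketch.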
 ---
 
diff --git a/docs/getting-started.md b/docs/getting-started.md
index 112ed1a..3752658 100644
--- a/docs/getting-started.md
+++ b/docs/getting-started.md
@@ -60,9 +60,9 @@ OPENAI_API_KEY=
 # DeepSeek API Key for summarization (Optional if provided in request headers)
 DEEPSEEK_API_KEY=
 
-# Origin that is allowed to use the `OPENAI_API_KEY` & `DEEPSEEK_API_KEY` provided by the API
-# instead of needing to provide them as `Authorization` in each request from client
-ALLOWED_ORIGIN=http://localhost:3001
+# Origins that are allowed to use the AI models keys provided by the API, seperate them using a comma
+# instead of need to provide them as Authorization in each request from client
+ALLOWED_ORIGINS=http://localhost:3001,http://localhost:3000
 
 # Max Tokens (Limit the response length)
 OPENAI_MAX_TOKENS=500
@@ -87,9 +87,9 @@ Below is a breakdown of the `.env` variables and their functions:
 | **Variable** | **Description** | **Required?** | **Default Value** |
 | ----------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------- | ----------------------- |
 | `NODE_ENV` | Node.js environment (development, production, etc.) | ❌ No | None |
-| `OPENAI_API_KEY` | API key for using OpenAI GPT-4o, Whisper, and TTS-1 models. Only works if the origin matches the value of `ALLOWED_ORIGIN`. _(can be provided in request headers instead)_ | ❌ No | None |
-| `DEEPSEEK_API_KEY` | API key for using DeepSeek Chat (DeepSeek-V3). Only works if the origin matches the value of `ALLOWED_ORIGIN`. _(can be provided in request headers instead)_ | ❌ No | None |
-| `ALLOWED_ORIGIN` | Specifies the allowed frontend origin that can access API-provided keys | ❌ No | `http://localhost:3001` |
+| `OPENAI_API_KEY` | API key for using OpenAI GPT-4o, Whisper, and TTS-1 models. Only works if the origin is listed in the `ALLOWED_ORIGINS`. _(can be provided in request headers instead)_ | ❌ No | None |
+| `DEEPSEEK_API_KEY` | API key for using DeepSeek Chat (DeepSeek-V3). Only works if the origin is listed in the `ALLOWED_ORIGINS`. _(can be provided in request headers instead)_ | ❌ No | None |
+| `ALLOWED_ORIGINS` | Specifies the allowed frontend origins that can access API-provided keys | ❌ No | `http://localhost:3001` |
 | `OPENAI_MAX_TOKENS` | Maximum token limit for OpenAI-generated responses | ❌ No | `500` |
 | `DEEPSEEK_MAX_TOKENS` | Maximum token limit for DeepSeek-generated responses | ❌ No | `1000` |
 | `AWS_ACCESS_KEY_ID` | AWS Access Key for S3 storage (for text-to-speech audio files) | ⚠️ Required only if `USE_S3` is `true` | None |
diff --git a/package.json b/package.json
index 8985306..7a5ecb5 100644
--- a/package.json
+++ b/package.json
@@ -8,6 +8,7 @@
   "scripts": {
     "build": "nest build",
     "format": "prettier --write \"src/**/*.ts\" \"test/**/*.ts\"",
+    "prestart": "pnpm run build",
     "start": "nest start",
     "start:dev": "nest start --watch",
     "start:debug": "nest start --debug --watch",
@@ -21,16 +22,20 @@
   },
   "dependencies": {
     "@aws-sdk/client-s3": "^3.758.0",
+    "@google/genai": "^0.6.0",
+    "@nestjs/axios": "^4.0.0",
     "@nestjs/common": "^11.0.1",
     "@nestjs/core": "^11.0.1",
     "@nestjs/platform-express": "^11.0.1",
     "@nestjs/schedule": "^5.0.1",
     "@nestjs/serve-static": "^5.0.3",
     "aws-sdk": "^2.1692.0",
+    "axios": "^1.8.4",
     "class-transformer": "^0.5.1",
     "class-validator": "^0.14.1",
     "dotenv": "^16.4.7",
     "fluent-ffmpeg": "^2.1.3",
+    "form-data": "^4.0.2",
     "mammoth": "^1.9.0",
     "multer": "1.4.5-lts.1",
     "openai": "^4.87.3",
@@ -38,6 +43,8 @@
     "reflect-metadata": "^0.2.2",
     "rxjs": "^7.8.1",
     "youtube-transcript": "^1.2.1",
+    "youtubei.js": "^13.3.0",
+    "yt-dlp-exec": "^1.0.2",
     "ytdl-mp3": "^5.2.2"
   },
   "devDependencies": {
@@ -92,7 +99,8 @@
   "pnpm": {
     "onlyBuiltDependencies": [
       "ffmpeg-static",
-      "ytdl-mp3"
+      "ytdl-mp3",
+      "yt-dlp-exec"
     ]
   }
 }
diff --git a/pnpm-lock.yaml b/pnpm-lock.yaml
index 949714e..44e5133 
100644 --- a/pnpm-lock.yaml +++ b/pnpm-lock.yaml @@ -11,6 +11,12 @@ importers: '@aws-sdk/client-s3': specifier: ^3.758.0 version: 3.758.0 + '@google/genai': + specifier: ^0.6.0 + version: 0.6.0 + '@nestjs/axios': + specifier: ^4.0.0 + version: 4.0.0(@nestjs/common@11.0.12(class-transformer@0.5.1)(class-validator@0.14.1)(reflect-metadata@0.2.2)(rxjs@7.8.2))(axios@1.8.4)(rxjs@7.8.2) '@nestjs/common': specifier: ^11.0.1 version: 11.0.12(class-transformer@0.5.1)(class-validator@0.14.1)(reflect-metadata@0.2.2)(rxjs@7.8.2) @@ -29,6 +35,9 @@ importers: aws-sdk: specifier: ^2.1692.0 version: 2.1692.0 + axios: + specifier: ^1.8.4 + version: 1.8.4 class-transformer: specifier: ^0.5.1 version: 0.5.1 @@ -41,6 +50,9 @@ importers: fluent-ffmpeg: specifier: ^2.1.3 version: 2.1.3 + form-data: + specifier: ^4.0.2 + version: 4.0.2 mammoth: specifier: ^1.9.0 version: 1.9.0 @@ -49,7 +61,7 @@ importers: version: 1.4.5-lts.1 openai: specifier: ^4.87.3 - version: 4.88.0 + version: 4.88.0(ws@8.18.1) pdf-parse: specifier: ^1.1.1 version: 1.1.1 @@ -62,6 +74,12 @@ importers: youtube-transcript: specifier: ^1.2.1 version: 1.2.1 + youtubei.js: + specifier: ^13.3.0 + version: 13.3.0 + yt-dlp-exec: + specifier: ^1.0.2 + version: 1.0.2 ytdl-mp3: specifier: ^5.2.2 version: 5.2.2 @@ -506,6 +524,9 @@ packages: '@bcoe/v8-coverage@0.2.3': resolution: {integrity: sha512-0hYQ8SB4Db5zvZB4axdMHGwEaQjkZzFjQiN9LVYvIFB2nSUHW9tYpxWriPrWDASIxiaXax83REcLxuSdnGPZtw==} + '@bufbuild/protobuf@2.2.5': + resolution: {integrity: sha512-/g5EzJifw5GF8aren8wZ/G5oMuPoGeS6MQD3ca8ddcvdXR5UELUfdTZITCGNhNXynY/AYl3Z4plmxdj/tRl/hQ==} + '@colors/colors@1.5.0': resolution: {integrity: sha512-ooWCrlZP11i8GImSjTHYHLkvFDP48nS4+204nGb1RiX/WXYHmJA2III9/e2DWVabCESdW7hBAEzHRqUn9OUVvQ==} engines: {node: '>=0.1.90'} @@ -629,6 +650,14 @@ packages: resolution: {integrity: sha512-JubJ5B2pJ4k4yGxaNLdbjrnk9d/iDz6/q8wOilpIowd6PJPgaxCuHBnBszq7Ce2TyMrywm5r4PnKm6V3iiZF+g==} engines: {node: ^18.18.0 || ^20.9.0 || >=21.1.0} + 
'@fastify/busboy@2.1.1': + resolution: {integrity: sha512-vBZP4NlzfOlerQTnba4aqZoMhE/a9HY7HRqoOPaETQcSQuWEIyZMHGfVu6w9wGtGK5fED5qRs2DteVCjOH60sA==} + engines: {node: '>=14'} + + '@google/genai@0.6.0': + resolution: {integrity: sha512-wmLQM+K//DpcFjnHu10vBDbUua3W+CJjRF6nTblkNwzUEk4Tdb3WiMa53jl8J/X8h0jXOxXSrBuYrh1Rl3RxZQ==} + engines: {node: '>=18.0.0'} + '@humanfs/core@0.19.1': resolution: {integrity: sha512-5DyQ4+1JEUzejeK1JGICcideyfUbGixgS9jNgex5nqkW+cY7WZhxBigmieN5Qnw9ZosSNVC9KQKyb+GUaGyKUA==} engines: {node: '>=18.18.0'} @@ -982,6 +1011,13 @@ packages: resolution: {integrity: sha512-zM0mVWSXE0a0h9aKACLwKmD6nHcRiKrPpCfvaKqG1CqDEyjEawId0ocXxVzPMCAm6kkWr2P025msfxXEnt8UGQ==} engines: {node: '>= 10'} + '@nestjs/axios@4.0.0': + resolution: {integrity: sha512-1cB+Jyltu/uUPNQrpUimRHEQHrnQrpLzVj6dU3dgn6iDDDdahr10TgHFGTmw5VuJ9GzKZsCLDL78VSwJAs/9JQ==} + peerDependencies: + '@nestjs/common': ^10.0.0 || ^11.0.0 + axios: ^1.3.1 + rxjs: ^7.0.0 + '@nestjs/cli@11.0.5': resolution: {integrity: sha512-ab/d8Ple+dMSQ4pC7RSNjhntpT8gFQQE8y/F/ilaplp7zPGpuxbayRtYbsA/wc1UkJHORDckrqFc8Jh8mrTq2A==} engines: {node: '>= 20.11'} @@ -1875,6 +1911,9 @@ packages: base64-js@1.5.1: resolution: {integrity: sha512-AKpaYlHn8t4SVbOHCy+b5+KKgvR4vrsD8vbvrbiQJps7fKDTkjkDry6ji0rUJjC0kzbNePLwzxq8iypo41qeWA==} + bignumber.js@9.1.2: + resolution: {integrity: sha512-2/mKyZH9K85bzOEfhXDBFZTGd1CTs+5IHpeFQo9luiBG7hghdC851Pj2WAhb6E3R6b9tZj/XKhbg4fum+Kepug==} + bin-version-check@5.1.0: resolution: {integrity: sha512-bYsvMqJ8yNGILLz1KP9zKLzQ6YpljV3ln1gqhuLkUtyfGi3qXKGuK2p+U4NAvjVFzDFiBBtOpCOSFNuYYEGZ5g==} engines: {node: '>=12'} @@ -1925,6 +1964,9 @@ packages: buffer-crc32@0.2.13: resolution: {integrity: sha512-VO9Ht/+p3SN7SKWqcrgEzjGbRSJYTx+Q1pTQC0wrWqHx0vpJraQ6GtHx8tvcg1rlK1byhU5gccxgOgj7B0TDkQ==} + buffer-equal-constant-time@1.0.1: + resolution: {integrity: sha512-zRpUiDwd/xk6ADqPMATG8vc9VPrkck7T07OIx0gnjmJAnHnTVXNQG3vfvWNuiZIkwu9KrKdA1iJKfsfTVxE6NA==} + buffer-from@1.1.2: resolution: {integrity: 
sha512-E+XQCRwSbaaiChtv6k6Dwgc+bx+Bs6vuKJHHl5kox/BaKbhiXzqQOwK4cO22yElGp2OCmjwVhT3HmxgyPGnJfQ==} @@ -2193,6 +2235,10 @@ packages: resolution: {integrity: sha512-uV2QOWP2nWzsy2aMp8aRibhi9dlzF5Hgh5SHaB9OiTGEyDTiJJyx0uy51QXdyWbtAHNua4XJzUKca3OzKUd3vA==} engines: {node: '>= 8'} + dargs@7.0.0: + resolution: {integrity: sha512-2iy1EkLdlBzQGvbweYRFxmFath8+K7+AKB0TlhHWkNuH+TmovaMH/Wp7V7R4u7f4SnX3OgLsU9t1NI9ioDnUpg==} + engines: {node: '>=8'} + dargs@8.1.0: resolution: {integrity: sha512-wAV9QHOsNbwnWdNW2FYvE1P56wtgSbM+3SZcdGiWQILwVjACCXDCI3Ai8QlCjMDB8YK5zySiXZYBiwGmNY3lnw==} engines: {node: '>=12'} @@ -2305,6 +2351,9 @@ packages: eastasianwidth@0.2.0: resolution: {integrity: sha512-I88TYZWc9XiYHRQ4/3c5rjjfgkjhLyW2luGIheGERbNQ6OY7yTybanSpDXZa8y7VUP9YmDcYa+eyq4ca7iLqWA==} + ecdsa-sig-formatter@1.0.11: + resolution: {integrity: sha512-nagl3RYrbNv6kQkeJIpt6NJZy8twLB/2vtz6yN9Z4vRKHN4/QZJIEbqohALSgwKdnksuY3k5Addp5lg8sVoVcQ==} + ee-first@1.1.1: resolution: {integrity: sha512-WMwm9LhRUo+WUaRN+vRuETqG89IgZphVSNkdFgeb6sS/E4OrDIN7t48CAewSHXc6C8lefD8KKfr5vY61brQlow==} @@ -2490,6 +2539,9 @@ packages: resolution: {integrity: sha512-yblEwXAbGv1VQDmow7s38W77hzAgJAO50ztBLMcUyUBfxv1HC+LGwtiEN+Co6LtlqT/5uwVOxsD4TNIilWhwdQ==} engines: {node: '>=4'} + extend@3.0.2: + resolution: {integrity: sha512-fjquC59cD7CyW6urNXK0FBufkZcoiGG80wTuPujX590cB5Ttln20E2UB4S/WARVqhXffZl2LNgS+gQdPIIim/g==} + external-editor@3.1.0: resolution: {integrity: sha512-hMQ4CX1p1izmuLYyZqLMO/qGNw10wSv9QDCPfzXfyFrOaCSSoRfqE1Kf1s5an66J5JZC62NewG+mK49jOCtQew==} engines: {node: '>=4'} @@ -2659,6 +2711,14 @@ packages: function-bind@1.1.2: resolution: {integrity: sha512-7XHNxH7qX9xG5mIwxkhumTox/MIRNcOgDrxWsMt2pAr23WHp6MrRlN7FBSFpCpr+oVO0F744iUgR82nJMfG2SA==} + gaxios@6.7.1: + resolution: {integrity: sha512-LDODD4TMYx7XXdpwxAVRAIAuB0bzv0s+ywFonY46k126qzQHT9ygyoa9tncmOiQmmDrik65UYsEkv3lbfqQ3yQ==} + engines: {node: '>=14'} + + gcp-metadata@6.1.1: + resolution: {integrity: 
sha512-a4tiq7E0/5fTjxPAaH4jpjkSv/uCaU2p5KC6HVGrvl0cDjA8iBZv4vv1gyzlmK0ZUKqwpOyQMKzZQe3lTit77A==} + engines: {node: '>=14'} + gensync@1.0.0-beta.2: resolution: {integrity: sha512-3hN7NaskYvMDLQY55gnW3NQ+mesEAepTqlg+VEbj7zzqEMBVNhzcGYYeqFo/TlYz6eQiFcp1HcsCZO+nGgS8zg==} engines: {node: '>=6.9.0'} @@ -2728,6 +2788,14 @@ packages: resolution: {integrity: sha512-iInW14XItCXET01CQFqudPOWP2jYMl7T+QRQT+UNcR/iQncN/F0UNpgd76iFkBPgNQb4+X3LV9tLJYzwh+Gl3A==} engines: {node: '>=18'} + google-auth-library@9.15.1: + resolution: {integrity: sha512-Jb6Z0+nvECVz+2lzSMt9u98UsoakXxA2HGHMCxh+so3n90XgYWkq5dur19JAJV7ONiJY22yBTyJB1TSkvPq9Ng==} + engines: {node: '>=14'} + + google-logging-utils@0.0.2: + resolution: {integrity: sha512-NEgUnEcBiP5HrPzufUkBzJOD/Sxsco3rLNo1F1TNf7ieU8ryUzBhqba8r756CjLX7rn3fHl6iLEwPYuqpoKgQQ==} + engines: {node: '>=14'} + gopd@1.2.0: resolution: {integrity: sha512-ZUKRh6/kUFoAiTAtTYPZJ3hw9wNxx+BIBOijnlG9PnrJsCcSjs1wyyD6vJpaYtgnzDrKYRSqf3OO6Rfa93xsRg==} engines: {node: '>= 0.4'} @@ -2742,6 +2810,10 @@ packages: graphemer@1.4.0: resolution: {integrity: sha512-EtKwoO6kxCL9WO5xipiHTZlSzBm7WLT627TqC/uVRd0HKmq8NXyebnNYxDoBi7wt8eTWrUrKXCOVaFq9x1kgag==} + gtoken@7.1.0: + resolution: {integrity: sha512-pCcEwRi+TKpMlxAQObHDQ56KawURgyAf6jtIY046fJ5tIv3zDe/LEIubckAO8fj6JnAxLdmWkUfNyulQ2iKdEw==} + engines: {node: '>=14.0.0'} + has-flag@4.0.0: resolution: {integrity: sha512-EykJT/Q1KjTWctppgIAgfSO0tKVuZUjhgMr17kqTumMl6Afv3EISleU7qZUzoXDFTAHTDC4NOoG/ZxU3EvlMPQ==} engines: {node: '>=8'} @@ -2961,6 +3033,10 @@ packages: resolution: {integrity: sha512-knxG2q4UC3u8stRGyAVJCOdxFmv5DZiRcdlIaAQXAbSfJya+OhopNotLQrstBhququ4ZpuKbDc/8S6mgXgPFPw==} engines: {node: '>=10'} + is-unix@2.0.10: + resolution: {integrity: sha512-CcasZSEOQUoE7JHy56se4wyRhdJfjohuMWYmceSTaDY4naKyd1fpLiY8rJsIT6AKfVstQAhHJOfPx7jcUxK61Q==} + engines: {node: '>= 12'} + isarray@1.0.0: resolution: {integrity: sha512-VLghIWNM6ELQzo7zwmcg0NmTVyWKYjvIeM83yjp0wRDTmUnrM678fQbcKBo6n2CJEF0szoG//ytg+TKla89ALQ==} @@ -3137,6 +3213,9 
@@ packages: node-notifier: optional: true + jintr@3.3.0: + resolution: {integrity: sha512-ZsaajJ4Hr5XR0tSPhOZOTjFhxA0qscKNSOs41NRjx7ZOGwpfdp8NKIBEUtvUPbA37JXyv1sJlgeOOZHjr3h76Q==} + jiti@2.4.2: resolution: {integrity: sha512-rg9zJN+G4n2nfJl5MW3BMygZX56zKPNVEYYqq7adpmMh4Jn2QNEwhvQlFy6jPVdcod7txZtKHWnyZiA3a0zP7A==} hasBin: true @@ -3161,6 +3240,9 @@ packages: engines: {node: '>=6'} hasBin: true + json-bigint@1.0.0: + resolution: {integrity: sha512-SiPv/8VpZuWbvLSMtTDU8hEfrZWg/mH/nV/b4o0CYbSxu1UIQPLdwKOCIyLQX+VIPO5vrLX3i8qtqFyhdPSUSQ==} + json-buffer@3.0.1: resolution: {integrity: sha512-4bV5BfR2mqfQTJm+V5tPPdf+ZpuhiIvTuAB5g8kcrXOZpTT/QwwVRWBywX1ozr6lEuPdbHxwaJlm9G6mI2sfSQ==} @@ -3194,6 +3276,12 @@ packages: jszip@3.10.1: resolution: {integrity: sha512-xXDvecyTpGLrqFrvkrUSoxxfJI5AH7U8zxxtVclpsUtMCq4JQ290LY8AW5c7Ggnr/Y/oK+bQMbqK2qmtk3pN4g==} + jwa@2.0.0: + resolution: {integrity: sha512-jrZ2Qx916EA+fq9cEAeCROWPTfCwi1IVHqT2tapuqLEVVDKFDENFw1oL+MwrTvH6msKxsd1YTDVw6uKEcsrLEA==} + + jws@4.0.0: + resolution: {integrity: sha512-KDncfTmOZoOMTFG4mBlG0qUIOlc03fmzH+ru6RgYVZhPkyiy/92Owlt/8UEN+a4TXR1FQetfIpJE8ApdvdVxTg==} + keyv@4.5.4: resolution: {integrity: sha512-oxVHkHR/EJf2CNXnWxRLW6mg7JyCCUcG0DtEGmL2ctUo1PNTin1PUil+r/+4r5MpVgC/fn1kjsx7mjSujKqIpw==} @@ -3417,6 +3505,11 @@ packages: resolution: {integrity: sha512-FP+p8RB8OWpF3YZBCrP5gtADmtXApB5AMLn+vdyA+PyxCjrCs00mjyUozssO33cwDeT3wNGdLxJ5M//YqtHAJw==} hasBin: true + mkdirp@1.0.4: + resolution: {integrity: sha512-vVqVZQyf3WLx2Shd0qJ9xuvqgAyKPLAiqITEtqW0oIUjzo3PePDd6fW9iFz30ef7Ysp/oiWqbhszeGWW2T6Gzw==} + engines: {node: '>=10'} + hasBin: true + ms@2.1.2: resolution: {integrity: sha512-sGkPx+VjMtmA6MX27oA4FBFELFCZZ4S4XqeGOXCv68tT+jb3vk/RyaKWP0PTKyWtmLSM0b+adUTEvbs1PEaH2w==} @@ -3454,6 +3547,15 @@ packages: node-ensure@0.0.0: resolution: {integrity: sha512-DRI60hzo2oKN1ma0ckc6nQWlHU69RH6xN0sjQTjMpChPfTYvKZdcQFfdYK2RWbJcKyUizSIy/l8OTGxMAM1QDw==} + node-fetch@2.6.13: + resolution: {integrity: 
sha512-StxNAxh15zr77QvvkmveSQ8uCQ4+v5FkvNTj0OESmiHu+VRi/gXArXtkWMElOsOUNLtUEvI4yS+rdtOHZTwlQA==} + engines: {node: 4.x || >=6.0.0} + peerDependencies: + encoding: ^0.1.0 + peerDependenciesMeta: + encoding: + optional: true + node-fetch@2.7.0: resolution: {integrity: sha512-c4FRfUm/dbcWZ7U+1Wq0AwCyFL+3nt2bEw05wfxSz+DWpWsitgmSgYmy2dQdWyKC1694ELPqMs/YzUSNozLt8A==} engines: {node: 4.x || >=6.0.0} @@ -4248,6 +4350,10 @@ packages: undici-types@6.20.0: resolution: {integrity: sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==} + undici@5.29.0: + resolution: {integrity: sha512-raqeBD6NQK4SkWhQzeYKd1KmIG6dllBOTt55Rmkt4HtI9mwdWtJljnrXjAFUBLTSN67HWrOIZ3EPF4kjUw80Bg==} + engines: {node: '>=14.0'} + undici@7.5.0: resolution: {integrity: sha512-NFQG741e8mJ0fLQk90xKxFdaSM7z4+IQpAgsFI36bCDY9Z2+aXXZjVy2uUksMouWfMI9+w5ejOq5zYYTBCQJDQ==} engines: {node: '>=20.18.1'} @@ -4383,6 +4489,18 @@ packages: resolution: {integrity: sha512-7KxauUdBmSdWnmpaGFg+ppNjKF8uNLry8LyzjauQDOVONfFLNKrKvQOxZ/VuTIcS/gge/YNahf5RIIQWTSarlg==} engines: {node: ^12.13.0 || ^14.15.0 || >=16.0.0} + ws@8.18.1: + resolution: {integrity: sha512-RKW2aJZMXeMxVpnZ6bck+RswznaxmzdULiBr6KY7XkTnW8uvt0iT9H5DkHUChXrc+uurzwa0rVI16n/Xzjdz1w==} + engines: {node: '>=10.0.0'} + peerDependencies: + bufferutil: ^4.0.1 + utf-8-validate: '>=5.0.2' + peerDependenciesMeta: + bufferutil: + optional: true + utf-8-validate: + optional: true + xml2js@0.6.2: resolution: {integrity: sha512-T4rieHaC1EXcES0Kxxj4JWgaUQHDk+qwHcYOCFHfiwKz7tOVPLq7Hjq9dM1WCMhylqMEfP7hMcOIChvotiZegA==} engines: {node: '>=4.0.0'} @@ -4438,6 +4556,13 @@ packages: resolution: {integrity: sha512-TvEGkBaajKw+B6y91ziLuBLsa5cawgowou+Bk0ciGpjELDfAzSzTGXaZmeSSkUeknCPpEr/WGApOHDwV7V+Y9Q==} engines: {node: '>=18.0.0'} + youtubei.js@13.3.0: + resolution: {integrity: sha512-tbl7rxltpgKoSsmfGUe9JqWUAzv6HFLqrOn0N85EbTn5DLt24EXrjClnXdxyr3PBARMJ3LC4vbll100a0ABsYw==} + + yt-dlp-exec@1.0.2: + resolution: {integrity: 
sha512-swKtruQmGBs+Xrxy0wCZ2FxCT167EpBYIWdj/klTzNB2HrHng/qFlKo/C0WVlopbww8/uMIGQR6grXQ2ObcrAw==} + engines: {node: '>= 12'} + ytdl-mp3@5.2.2: resolution: {integrity: sha512-CYrjnsynoUUOQlYT8ql8yJXMWyL7qmjY6LvEmo9B/FlEUwYfwqMv6/XYOgkNoRPEj8XJXnFUPPyhhccb7yIQ5Q==} engines: {node: '>=20.0.0'} @@ -5149,6 +5274,8 @@ snapshots: '@bcoe/v8-coverage@0.2.3': {} + '@bufbuild/protobuf@2.2.5': {} + '@colors/colors@1.5.0': optional: true @@ -5329,6 +5456,18 @@ snapshots: '@eslint/core': 0.12.0 levn: 0.4.1 + '@fastify/busboy@2.1.1': {} + + '@google/genai@0.6.0': + dependencies: + google-auth-library: 9.15.1 + ws: 8.18.1 + transitivePeerDependencies: + - bufferutil + - encoding + - supports-color + - utf-8-validate + '@humanfs/core@0.19.1': {} '@humanfs/node@0.16.6': @@ -5750,6 +5889,12 @@ snapshots: '@napi-rs/nice-win32-x64-msvc': 1.0.1 optional: true + '@nestjs/axios@4.0.0(@nestjs/common@11.0.12(class-transformer@0.5.1)(class-validator@0.14.1)(reflect-metadata@0.2.2)(rxjs@7.8.2))(axios@1.8.4)(rxjs@7.8.2)': + dependencies: + '@nestjs/common': 11.0.12(class-transformer@0.5.1)(class-validator@0.14.1)(reflect-metadata@0.2.2)(rxjs@7.8.2) + axios: 1.8.4 + rxjs: 7.8.2 + '@nestjs/cli@11.0.5(@swc/cli@0.6.0(@swc/core@1.11.11)(chokidar@4.0.3))(@swc/core@1.11.11)(@types/node@22.13.10)': dependencies: '@angular-devkit/core': 19.1.8(chokidar@4.0.3) @@ -6889,6 +7034,8 @@ snapshots: base64-js@1.5.1: {} + bignumber.js@9.1.2: {} + bin-version-check@5.1.0: dependencies: bin-version: 6.0.0 @@ -6956,6 +7103,8 @@ snapshots: buffer-crc32@0.2.13: {} + buffer-equal-constant-time@1.0.1: {} + buffer-from@1.1.2: {} buffer@4.9.2: @@ -7227,6 +7376,8 @@ snapshots: shebang-command: 2.0.0 which: 2.0.2 + dargs@7.0.0: {} + dargs@8.1.0: {} debug@3.2.7: @@ -7302,6 +7453,10 @@ snapshots: eastasianwidth@0.2.0: {} + ecdsa-sig-formatter@1.0.11: + dependencies: + safe-buffer: 5.2.1 + ee-first@1.1.1: {} ejs@3.1.10: @@ -7522,6 +7677,8 @@ snapshots: ext-list: 2.2.2 sort-keys-length: 1.0.1 + extend@3.0.2: {} + 
external-editor@3.1.0: dependencies: chardet: 0.7.0 @@ -7710,6 +7867,26 @@ snapshots: function-bind@1.1.2: {} + gaxios@6.7.1: + dependencies: + extend: 3.0.2 + https-proxy-agent: 7.0.6 + is-stream: 2.0.1 + node-fetch: 2.7.0 + uuid: 9.0.1 + transitivePeerDependencies: + - encoding + - supports-color + + gcp-metadata@6.1.1: + dependencies: + gaxios: 6.7.1 + google-logging-utils: 0.0.2 + json-bigint: 1.0.0 + transitivePeerDependencies: + - encoding + - supports-color + gensync@1.0.0-beta.2: {} get-caller-file@2.0.5: {} @@ -7785,6 +7962,20 @@ snapshots: globals@16.0.0: {} + google-auth-library@9.15.1: + dependencies: + base64-js: 1.5.1 + ecdsa-sig-formatter: 1.0.11 + gaxios: 6.7.1 + gcp-metadata: 6.1.1 + gtoken: 7.1.0 + jws: 4.0.0 + transitivePeerDependencies: + - encoding + - supports-color + + google-logging-utils@0.0.2: {} + gopd@1.2.0: {} got@13.0.0: @@ -7805,6 +7996,14 @@ snapshots: graphemer@1.4.0: {} + gtoken@7.1.0: + dependencies: + gaxios: 6.7.1 + jws: 4.0.0 + transitivePeerDependencies: + - encoding + - supports-color + has-flag@4.0.0: {} has-own-prop@2.0.0: {} @@ -7993,6 +8192,8 @@ snapshots: is-unicode-supported@0.1.0: {} + is-unix@2.0.10: {} + isarray@1.0.0: {} isexe@2.0.0: {} @@ -8366,6 +8567,10 @@ snapshots: - supports-color - ts-node + jintr@3.3.0: + dependencies: + acorn: 8.14.1 + jiti@2.4.2: {} jmespath@0.16.0: {} @@ -8383,6 +8588,10 @@ snapshots: jsesc@3.1.0: {} + json-bigint@1.0.0: + dependencies: + bignumber.js: 9.1.2 + json-buffer@3.0.1: {} json-parse-even-better-errors@2.3.1: {} @@ -8412,6 +8621,17 @@ snapshots: readable-stream: 2.3.8 setimmediate: 1.0.5 + jwa@2.0.0: + dependencies: + buffer-equal-constant-time: 1.0.1 + ecdsa-sig-formatter: 1.0.11 + safe-buffer: 5.2.1 + + jws@4.0.0: + dependencies: + jwa: 2.0.0 + safe-buffer: 5.2.1 + keyv@4.5.4: dependencies: json-buffer: 3.0.1 @@ -8595,6 +8815,8 @@ snapshots: dependencies: minimist: 1.2.8 + mkdirp@1.0.4: {} + ms@2.1.2: {} ms@2.1.3: {} @@ -8627,6 +8849,10 @@ snapshots: node-ensure@0.0.0: {} + 
node-fetch@2.6.13: + dependencies: + whatwg-url: 5.0.0 + node-fetch@2.7.0: dependencies: whatwg-url: 5.0.0 @@ -8663,7 +8889,7 @@ snapshots: dependencies: mimic-fn: 2.1.0 - openai@4.88.0: + openai@4.88.0(ws@8.18.1): dependencies: '@types/node': 18.19.80 '@types/node-fetch': 2.6.12 @@ -8672,6 +8898,8 @@ snapshots: form-data-encoder: 1.7.2 formdata-node: 4.4.1 node-fetch: 2.7.0 + optionalDependencies: + ws: 8.18.1 transitivePeerDependencies: - encoding @@ -9392,6 +9620,10 @@ snapshots: undici-types@6.20.0: {} + undici@5.29.0: + dependencies: + '@fastify/busboy': 2.1.1 + undici@7.5.0: {} unicorn-magic@0.1.0: {} @@ -9544,6 +9776,8 @@ snapshots: imurmurhash: 0.1.4 signal-exit: 3.0.7 + ws@8.18.1: {} + xml2js@0.6.2: dependencies: sax: 1.2.1 @@ -9586,6 +9820,23 @@ snapshots: youtube-transcript@1.2.1: {} + youtubei.js@13.3.0: + dependencies: + '@bufbuild/protobuf': 2.2.5 + jintr: 3.3.0 + tslib: 2.8.1 + undici: 5.29.0 + + yt-dlp-exec@1.0.2: + dependencies: + dargs: 7.0.0 + execa: 5.1.1 + is-unix: 2.0.10 + mkdirp: 1.0.4 + node-fetch: 2.6.13 + transitivePeerDependencies: + - encoding + ytdl-mp3@5.2.2: dependencies: '@distube/ytdl-core': 4.16.5 diff --git a/src/app.module.ts b/src/app.module.ts index 3b9b10c..5a5781d 100644 --- a/src/app.module.ts +++ b/src/app.module.ts @@ -1,10 +1,11 @@ -import { MiddlewareConsumer, Module, NestModule } from '@nestjs/common'; +import { Module } from '@nestjs/common'; import { AppController } from './app.controller'; import { SummarizationModule } from './summarization/summarization.module'; import { ScheduleModule } from '@nestjs/schedule'; import { ServeStaticModule } from '@nestjs/serve-static'; import { join } from 'path'; import { PUBLIC_DIR } from './utils/constants'; +import { HttpModule } from '@nestjs/axios'; @Module({ imports: [ @@ -13,7 +14,8 @@ import { PUBLIC_DIR } from './utils/constants'; rootPath: join(__dirname, '..', 'downloads'), serveRoot: PUBLIC_DIR, }), - SummarizationModule + SummarizationModule, + HttpModule ], 
   controllers: [AppController],
   providers: [],
diff --git a/src/main.ts b/src/main.ts
index 2dc8961..efaa60f 100644
--- a/src/main.ts
+++ b/src/main.ts
@@ -1,7 +1,7 @@
 import { NestFactory } from '@nestjs/core';
 import { AppModule } from './app.module';
 import { ValidationPipe } from '@nestjs/common';
-import { Express } from 'express';
+import { CORS_ORIGINS } from './utils/constants';
 
 async function bootstrap() {
   const app = await NestFactory.create(AppModule);
@@ -9,23 +9,15 @@ async function bootstrap() {
   app.useGlobalPipes(new ValidationPipe({ whitelist: true, transform: true }));
 
   app.enableCors({
-    origin: true,
+    origin: CORS_ORIGINS,
     credentials: true,
+    methods: ['GET', 'HEAD', 'PUT', 'PATCH', 'POST', 'DELETE'],
+    allowedHeaders: ['Content-Type', 'Authorization'],
+    exposedHeaders: ['Authorization'],
+    optionsSuccessStatus: 204,
   });
 
-  if (process.env.NODE_ENV === 'production') {
-    await app.init();
-    const expressApp = app.getHttpAdapter().getInstance();
-    return expressApp;
-  } else {
-    await app.listen(process.env.PORT ?? 3000);
-  }
+  await app.listen(process.env.PORT ?? 3000);
 }
 
-// Run locally in development
-if (process.env.NODE_ENV !== 'production') {
-  bootstrap();
-}
-
-// Export for Vercel in production
-export default bootstrap;
+bootstrap();
diff --git a/src/python/transcribe_api/__pycache__/transcribe_api.cpython-311.pyc b/src/python/transcribe_api/__pycache__/transcribe_api.cpython-311.pyc
new file mode 100644
index 0000000..975372b
Binary files /dev/null and b/src/python/transcribe_api/__pycache__/transcribe_api.cpython-311.pyc differ
diff --git a/src/python/transcribe_api/transcribe_api.py b/src/python/transcribe_api/transcribe_api.py
index affad9c..9e28d0f 100644
--- a/src/python/transcribe_api/transcribe_api.py
+++ b/src/python/transcribe_api/transcribe_api.py
@@ -26,7 +26,7 @@
 )  # Max seconds to wait for model
 
 # 🎙️ Whisper Transcription Configuration
-WHISPER_BEAM_SIZE = int(os.getenv("WHISPER_BEAM_SIZE", "5"))
+WHISPER_BEAM_SIZE = int(os.getenv("WHISPER_BEAM_SIZE", "1"))
 WHISPER_LANGUAGE = os.getenv("WHISPER_LANGUAGE", "en")
 WHISPER_TEMPERATURE = float(os.getenv("WHISPER_TEMPERATURE", "0.3"))
 
@@ -132,9 +132,8 @@ async def transcribe_audio(file: UploadFile = File(...)):
         segments, info = model.transcribe(
             temp_file_path,
             beam_size=WHISPER_BEAM_SIZE,
-            language=WHISPER_LANGUAGE if WHISPER_LANGUAGE != "auto" else None,
+            language=None,
             temperature=WHISPER_TEMPERATURE,
-            vad_filter=True,  # Filter out non-speech parts
             vad_parameters={"min_silence_duration_ms": 500},
         )
 
diff --git a/src/summarization/dto/summarization-options.dto.ts b/src/summarization/dto/summarization-options.dto.ts
index 2af90b7..209333e 100644
--- a/src/summarization/dto/summarization-options.dto.ts
+++ b/src/summarization/dto/summarization-options.dto.ts
@@ -1,5 +1,5 @@
 import { IsOptional, IsEnum, IsBoolean, IsString, Length, MaxLength } from "class-validator";
-import { SummarizationLanguage, SummarizationModel, SummarizationSpeed, SummaryFormat, SummaryLength } from "../enums/summarization-options.enum";
+import { STTModel, SummarizationLanguage, SummarizationModel, SummarizationSpeed, SummaryFormat, SummaryLength } from "../enums/summarization-options.enum";
 
 export class SummarizationOptionsDto {
   @IsOptional()
@@ -26,6 +26,10 @@ export class SummarizationOptionsDto {
   @IsEnum(SummarizationLanguage)
   lang?: SummarizationLanguage;
 
+  @IsOptional()
+  @IsEnum(STTModel)
+  sttModel?: STTModel;
+
   @IsOptional()
   @IsString()
   @MaxLength(200)
diff --git a/src/summarization/enums/summarization-options.enum.ts b/src/summarization/enums/summarization-options.enum.ts
index f26a9f7..5b64975 100644
--- a/src/summarization/enums/summarization-options.enum.ts
+++ b/src/summarization/enums/summarization-options.enum.ts
@@ -14,6 +14,7 @@ export enum SummaryFormat {
 export enum SummarizationModel {
   OPENAI = 'openai',
   DEEPSEEK = 'deepseek',
+  GEMENI = 'gemini',
   DEFAULT = OPENAI,
 }
 
@@ -26,5 +27,11 @@ export enum SummarizationSpeed {
 export enum SummarizationLanguage {
   EN = 'english',
   AR = 'arabic',
-  DEFAULT = 'english'
+  DEFAULT = 'default'
+}
+
+export enum STTModel {
+  FAST_WHISPER = 'faster-whisper',
+  OPENAI_WHISPER = 'whisper-1',
+  DEFAULT = OPENAI_WHISPER
 }
\ No newline at end of file
diff --git a/src/summarization/guards/api-key.guard.ts b/src/summarization/guards/api-key.guard.ts
index d600d58..2998533 100644
--- a/src/summarization/guards/api-key.guard.ts
+++ b/src/summarization/guards/api-key.guard.ts
@@ -1,15 +1,31 @@
 import { CanActivate, ExecutionContext, UnauthorizedException } from "@nestjs/common";
 import { Request } from "express";
-
+import { STTModel, SummarizationModel, SummarizationSpeed } from "../enums/summarization-options.enum";
+import { ALLOWED_ORIGINS } from 'src/utils/constants';
 
 export class ApiKeyGuard implements CanActivate {
-  private readonly allowedOrigin = process.env.ALLOWED_ORIGIN || 'http://localhost:3000';
+  private readonly allowedOrigins: string[] = ALLOWED_ORIGINS;
 
   canActivate(context: ExecutionContext): boolean {
+    console.log('allowedOrigins ', this.allowedOrigins)
     const request:Request = context.switchToHttp().getRequest();
     const origin = request.headers.origin;
 
-    if(origin && origin === this.allowedOrigin) {
+    if(origin && this.allowedOrigins.includes(origin)) {
+      return true;
+    }
+
+    const { options } = request.body;
+    const { model, sttModel, speed, listen } = options;
+
+
+    // Special case for Gemini with fast speed and no listen
+    if (model === SummarizationModel.GEMENI && speed === SummarizationSpeed.FAST && !listen) {
+      return true;
+    }
+
+    // Special case for Gemini with slow speed and Fast-Whisper STT model and no listen
+    if(model === SummarizationModel.GEMENI && speed === SummarizationSpeed.SLOW && sttModel === STTModel.FAST_WHISPER && !listen) {
       return true;
     }
 
@@ -18,6 +34,8 @@ export class ApiKeyGuard implements CanActivate {
       throw new UnauthorizedException("Missing API key")
     }
 
+
+
     const apiKey = authHeader.split(' ')[1]?.trim();
 
     if(!apiKey) {
diff --git a/src/summarization/interfaces/custom-yt-flags.ts b/src/summarization/interfaces/custom-yt-flags.ts
new file mode 100644
index 0000000..5fb150d
--- /dev/null
+++ b/src/summarization/interfaces/custom-yt-flags.ts
@@ -0,0 +1,5 @@
+import { YtFlags } from 'yt-dlp-exec';
+
+export type CustomYtFlags = YtFlags & {
+  extractorArgs?: string;
+};
diff --git a/src/summarization/interfaces/summarization-options.interface.ts b/src/summarization/interfaces/summarization-options.interface.ts
index 4543c08..15cbcfb 100644
--- a/src/summarization/interfaces/summarization-options.interface.ts
+++ b/src/summarization/interfaces/summarization-options.interface.ts
@@ -1,4 +1,4 @@
-import { SummarizationLanguage, SummarizationModel, SummarizationSpeed, SummaryFormat, SummaryLength } from "../enums/summarization-options.enum";
+import { STTModel, SummarizationLanguage, SummarizationModel, SummarizationSpeed, SummaryFormat, SummaryLength } from "../enums/summarization-options.enum";
 
 export interface SummarizationOptions {
   length?: SummaryLength,
@@ -7,5 +7,6 @@ export interface SummarizationOptions {
listen?: boolean, speed?: SummarizationSpeed, lang?: SummarizationLanguage, + sttModel?: STTModel, customInstructions?: string, } \ No newline at end of file diff --git a/src/summarization/summarization.controller.ts b/src/summarization/summarization.controller.ts index c1d2159..3ee5837 100644 --- a/src/summarization/summarization.controller.ts +++ b/src/summarization/summarization.controller.ts @@ -65,6 +65,6 @@ export class SummarizationController { @Get() getMessage() { - return 'Hello there.'; + return 'Hello World.'; } } diff --git a/src/summarization/summarization.module.ts b/src/summarization/summarization.module.ts index f00822a..7832a42 100644 --- a/src/summarization/summarization.module.ts +++ b/src/summarization/summarization.module.ts @@ -1,8 +1,10 @@ import { Module } from '@nestjs/common'; import { SummarizationController } from './summarization.controller'; import { SummarizationService } from './summarization.service'; +import { HttpModule } from '@nestjs/axios'; @Module({ + imports: [HttpModule], controllers: [SummarizationController], providers: [SummarizationService] }) diff --git a/src/summarization/summarization.service.ts b/src/summarization/summarization.service.ts index 23b2650..fab563c 100644 --- a/src/summarization/summarization.service.ts +++ b/src/summarization/summarization.service.ts @@ -13,17 +13,22 @@ import { getSummarizationOptions, preparePrompt, summarizeWithDeepSeek, + summarizeWithGemini, summarizeWithOpenAi, } from '../utils/summarization.util'; import { AUDIO_FORMAT, DEFAULT_DEEPSEEK_API_KEY, DEFAULT_OPENAI_API_KEY, + DEFAULT_GEMENI_API_KEY, DOWNLOAD_DIR, MAX_FILE_AGE, USE_S3, + PO_TOKEN, + COOKIES_PATH, } from '../utils/constants'; import { + STTModel, SummarizationModel, SummarizationSpeed, } from './enums/summarization-options.enum'; @@ -36,24 +41,24 @@ import { isValidYouTubeUrl, } from '../utils/video.util'; import { SummarizationOptions } from './interfaces/summarization-options.interface'; -import { getApiKey } from 
'src/utils/api-key.util'; -import { convertTextToSpeech } from 'src/utils/tts.util'; -import { transcribeAudio } from 'src/utils/transcription.util'; +import { getApiKey } from '../utils/api-key.util'; +import { convertTextToSpeech } from '../utils/tts.util'; +import { transcribeUsingOpenAIWhisper, transcribeUsingFastWhisper } from '../utils/transcription.util'; import { join } from 'path'; -import { uploadDownloadedAudioToS3 } from 'src/utils/s3.util'; +import { uploadDownloadedAudioToS3 } from '../utils/s3.util'; +import { HttpService } from '@nestjs/axios'; +import { GoogleGenAI } from "@google/genai"; +import ytDlpExec, { YtFlags } from 'yt-dlp-exec'; +import { CustomYtFlags } from './interfaces/custom-yt-flags'; @Injectable() export class SummarizationService { private downloader: Downloader; - constructor() { + constructor(private readonly httpService: HttpService) { if (!USE_S3) { ensureDownloadDirectory(); } - this.downloader = new Downloader({ - getTags: false, - outputDir: DOWNLOAD_DIR, - }); } /** @@ -130,7 +135,7 @@ export class SummarizationService { } else { console.error('error ', error); throw new BadRequestException( - 'This video does not have a YouTube transcript. Please use SLOW mode instead. Or check your network connection', + 'This video does not have a YouTube transcript. Please use SLOW mode instead. Or check your network connection.', ); } } @@ -178,6 +183,8 @@ export class SummarizationService { if (options?.model === SummarizationModel.DEEPSEEK) { apiKey = getApiKey(userApiKey, DEFAULT_DEEPSEEK_API_KEY); summary = await summarizeWithDeepSeek(apiKey, prompt); + } else if (options?.model === SummarizationModel.GEMENI) { + summary = await summarizeWithGemini(prompt); } else { apiKey = getApiKey(userApiKey, DEFAULT_OPENAI_API_KEY); summary = await summarizeWithOpenAi(apiKey, prompt); @@ -236,7 +243,13 @@ export class SummarizationService { console.log(`summarizing YOutube Video Using audio ${videoUrl} ... 
`); try { const audioPath = await this.downloadAudio(videoUrl); - const transcript = await transcribeAudio(audioPath, userApiKey); + let transcript: string; + if (options?.sttModel === STTModel.FAST_WHISPER) { + transcript = await transcribeUsingFastWhisper(audioPath, this.httpService); + } else { + transcript = await transcribeUsingOpenAIWhisper(audioPath, userApiKey); + } + const {summary, audioFilePath} = await this.summarizeText(transcript, options, userApiKey); const vidMetadata = await extractYouTubeVideoMetadata(videoUrl); @@ -267,14 +280,25 @@ export class SummarizationService { const startTime = new Date(); try { - // Download the audio using ytdl-mp3 - const result = await this.downloader.downloadSong(videoUrl); + + await ytDlpExec(videoUrl, { + extractAudio: true, + audioFormat: AUDIO_FORMAT, + output: audioPath, + noCheckCertificate: true, + noWarnings: true, + preferFreeFormats: true, + referer: 'youtube.com', + userAgent: 'googlebot', + extractorArgs: `youtube:po_token=web.gvs+${PO_TOKEN}`, + cookies: COOKIES_PATH + } as CustomYtFlags); const endTime = new Date(); const duration = (endTime.getTime() - startTime.getTime()) / 1000; // Rename the file because ytdl-mp3 uses the video tile as the file name by default - await fsPromises.rename(result.outputFile, audioPath); + // await fsPromises.rename(result.outputFile, audioPath); if (!existsSync(audioPath)) { throw new Error('Audio file was not created.'); @@ -302,7 +326,7 @@ export class SummarizationService { const failTime = new Date(); const duration = (failTime.getTime() - startTime.getTime()) / 1000; console.error( - `Download failed at ${failTime.toISOString()}. Time taken: ${duration} seconds. Error: ${error.message}`, + `Download failed at ${failTime.toISOString()} Time taken: ${duration} seconds. 
Error: ${error.message}`, ); throw new Error('Failed to download audio'); } diff --git a/src/utils/constants.ts b/src/utils/constants.ts index 2ed2ff8..fa3236f 100644 --- a/src/utils/constants.ts +++ b/src/utils/constants.ts @@ -1,9 +1,10 @@ -import { join } from 'path'; +import path, { join } from 'path'; import { config } from 'dotenv'; config() export const DEFAULT_OPENAI_API_KEY = process.env.OPENAI_API_KEY; export const DEFAULT_DEEPSEEK_API_KEY = process.env.DEEPSEEK_API_KEY; +export const DEFAULT_GEMENI_API_KEY = process.env.GEMENI_API_KEY; export const OPENAI_MAX_TOKENS: number = Number(process.env.OPENAI_MAX_TOKENS) || 300; export const DEEPSEEK_MAX_TOKENS: number = Number(process.env.DEEPSEEK_MAX_TOKENS) || 1000; @@ -15,4 +16,13 @@ export const USE_S3: boolean = process.env.USE_S3 === 'true'; export const DOWNLOAD_DIR = join(process.cwd(), 'downloads'); export const PUBLIC_DIR = '/public/audio'; export const AUDIO_FORMAT = 'mp3'; -export const MAX_FILE_AGE = 1000 * 60; // 1 day \ No newline at end of file +export const MAX_FILE_AGE = 1000 * 60; // 1 day + +export const COOKIES_PATH = path.resolve(process.cwd(), 'cookies.txt'); + +export const CORS_ORIGINS: string[] = process.env.CORS ? process.env.CORS.split(',') : ['http://localhost:3001', 'https://letssummarize.vercel.app']; +export const FASTAPI_URL = process.env.FASTAPI_URL || 'your-fastapi-url'; + +export const PO_TOKEN = process.env.PO_TOKEN; + +export const ALLOWED_ORIGINS = process.env.ALLOWED_ORIGINS ? 
process.env.ALLOWED_ORIGINS.split(',') : ['http://localhost:3001']; \ No newline at end of file diff --git a/src/utils/summarization.util.ts b/src/utils/summarization.util.ts index 3a9cd74..b2d5c39 100644 --- a/src/utils/summarization.util.ts +++ b/src/utils/summarization.util.ts @@ -6,9 +6,11 @@ import { SummarizationSpeed, SummarizationModel, SummarizationLanguage, + STTModel, } from '../summarization/enums/summarization-options.enum'; import { SummarizationOptions } from '../summarization/interfaces/summarization-options.interface'; -import { DEEPSEEK_MAX_TOKENS, OPENAI_MAX_TOKENS } from './constants'; +import { DEEPSEEK_MAX_TOKENS, DEFAULT_GEMENI_API_KEY, OPENAI_MAX_TOKENS } from './constants'; +import { GoogleGenAI } from '@google/genai'; /** * Creates a complete SummarizationOptions object with default values for missing options @@ -25,6 +27,7 @@ export function getSummarizationOptions( model: options?.model ?? SummarizationModel.DEFAULT, speed: options?.speed ?? SummarizationSpeed.DEFAULT, lang: options?.lang ?? SummarizationLanguage.DEFAULT, + sttModel: options?.sttModel ?? STTModel.DEFAULT, customInstructions: options?.customInstructions ?? 
undefined, }; } @@ -76,35 +79,24 @@ export function validateSummarizationOptions( export function preparePrompt(options: SummarizationOptions, text: string) { const { length, format, lang } = getSummarizationOptions(options); let prompt: string; + let language: string; + + if (!lang || lang === SummarizationLanguage.DEFAULT) { + language = "the same language as the text"; + } else { + language = lang; + } if ( - options?.customInstructions && - options?.lang === SummarizationLanguage.DEFAULT - ) { - prompt = `Summarize the following text based on these special requirements: ${options.customInstructions}`; - } else if ( - options?.customInstructions && - options?.lang !== SummarizationLanguage.DEFAULT - ) { - prompt = `Summarize the following text in ${lang} based on these special requirements: ${options.customInstructions}`; + options?.customInstructions) { + prompt = `Summarize the following text in ${language} based on these special requirements: ${options.customInstructions}`; } else { if ( - options?.format === SummaryFormat.DEFAULT && - options?.lang === SummarizationLanguage.DEFAULT + options?.format === SummaryFormat.DEFAULT ) { - prompt = `Summarize the following text in a ${length} length. Focus on the key points, main arguments, and important details. Ensure the summary is coherent and complete`; - } else if ( - options?.format === SummaryFormat.DEFAULT && - options?.lang !== SummarizationLanguage.DEFAULT - ) { - prompt = `Summarize the following text in a ${length} length, in ${lang}. Focus on the key points, main arguments, and important details. Ensure the summary is coherent and complete`; - } else if ( - options?.format !== SummaryFormat.DEFAULT && - options?.lang === SummarizationLanguage.DEFAULT - ) { - prompt = `Summarize the following text in a ${length} length, in ${format} style. Focus on the key points, main arguments, and important details. 
Ensure the summary is coherent and complete`; + prompt = `Summarize the following text in a ${length} length in ${language}. Focus on the key points, main arguments, and important details. Ensure the summary is coherent and complete`; } else { - prompt = `Summarize the following text in a ${length} length, in ${format} style in ${lang}. Focus on the key points, main arguments, and important details. Ensure the summary is coherent and complete`; + prompt = `Summarize the following text in a ${length} length, in ${format} style in ${language}. Focus on the key points, main arguments, and important details. Ensure the summary is coherent and complete`; } } @@ -157,7 +149,7 @@ export async function summarizeWithOpenAi( console.error( `Summarization failed at ${failTime.toISOString()}. Time taken: ${duration} seconds. Error: ${error.message}`, ); - throw new Error(`Failed to summarize text: ${error.message}`); + return `OpenAI summarization failed: ${error?.response?.data?.error || error.message}`; } } @@ -198,6 +190,28 @@ export async function summarizeWithDeepSeek( response.choices[0]?.message?.content || 'Could not generate a summary.' 
); } catch (error) { - throw new Error(`Failed to summarize text: ${error.message}`); + return `DeepSeek summarization failed: ${error?.response?.data?.error || error.message}` + } +} + +export async function summarizeWithGemini( + prompt: string, +): Promise { + if (!DEFAULT_GEMENI_API_KEY) { + throw new Error("Gemint API key is not provided in the .env"); + } + + const googleAI = new GoogleGenAI({ apiKey: DEFAULT_GEMENI_API_KEY }); + console.log("Summarizing with Gemini ...") + + try { + const response = await googleAI.models.generateContent({ + model: "gemini-2.0-flash", + contents: prompt, + }); + console.log(response.text); + return response.text || "Could not generate a summary"; + } catch (error) { + return `Gemini summarization failed: ${error?.response?.data?.error || error.message}`; } } diff --git a/src/utils/transcription.util.ts b/src/utils/transcription.util.ts index 15c1f93..5f54f19 100644 --- a/src/utils/transcription.util.ts +++ b/src/utils/transcription.util.ts @@ -1,10 +1,14 @@ import OpenAI from 'openai'; import { existsSync } from 'fs'; -import { MAX_FILE_AGE } from './constants'; +import { FASTAPI_URL, MAX_FILE_AGE } from './constants'; import { getApiKey } from './api-key.util'; -import { createReadStream } from 'fs'; import { DEFAULT_OPENAI_API_KEY } from './constants'; import { downloadFileFromS3 } from './s3.util'; +import FormData from 'form-data'; +import { createReadStream } from 'fs'; +import axios from 'axios'; +import { HttpService } from '@nestjs/axios'; +import { firstValueFrom } from 'rxjs'; /** * Transcribes an audio file using OpenAI's Whisper model. @@ -12,7 +16,7 @@ import { downloadFileFromS3 } from './s3.util'; * @param userApiKey - Optional user-provided OpenAI API key (used when users integrate their own applications with our service). * @returns The transcribed text from the audio file. 
*/ -export async function transcribeAudio( +export async function transcribeUsingOpenAIWhisper( audioPath: string, userApiKey?: string, ): Promise<string> { @@ -27,7 +31,8 @@ export async function transcribeAudio( let fileStream; if (audioPath.startsWith('http')) { // If it's an S3 URL, download the file first - const { stream, cleanup: cleanupFn } = await downloadFileFromS3(audioPath); + const { stream, cleanup: cleanupFn } = + await downloadFileFromS3(audioPath); fileStream = stream; cleanup = cleanupFn; console.log('Downloaded file from S3 for transcription'); @@ -67,3 +72,29 @@ export async function transcribeAudio( throw new Error(`Failed to transcribe audio: ${error.message}`); } } + +export const transcribeUsingFastWhisper = async ( + audioFilePath: string, + httpService: HttpService, +) => { + try { + const formData = new FormData(); + formData.append('file', createReadStream(audioFilePath), 'audio.mp3'); + + const response = await firstValueFrom( + httpService.post(FASTAPI_URL, formData, { + headers: { + ...formData.getHeaders(), + }, + }), + ); + + return response.data.text; + } catch (error) { + console.error( + 'Error transcribing audio:', + error.response?.data || error.message, + ); + throw new Error('Transcription failed'); + } +}; diff --git a/src/utils/video.util.ts b/src/utils/video.util.ts index 5bd3e8a..b60514f 100644 --- a/src/utils/video.util.ts +++ b/src/utils/video.util.ts @@ -98,17 +98,31 @@ export async function extractYouTubeVideoMetadata( */ export async function fetchYouTubeTranscript(videoId: string): Promise<string> { try { - const transcriptItems = await YoutubeTranscript.fetchTranscript(videoId); - const fullTranscript = transcriptItems - .map((item) => item.text.trim()) - .filter((text) => text.length > 0)
+ // .join(' '); + const youtubei = await import ('youtubei.js'); + const Innertube = youtubei.Innertube; + const youtube = await Innertube.create({ + lang: 'en', + location: 'US', + retrieve_player: false, + }); - const words = fullTranscript.split(' '); + const info = await youtube.getInfo(videoId); + const transcriptData = await info.getTranscript(); + + const segments = transcriptData?.transcript?.content?.body?.initial_segments || []; + const fullTranscript = segments.map(segment => segment.snippet.text).join(' '); + // const words = fullTranscript.split(' '); + const words = fullTranscript.split(/\s+/); const safeLength = Math.floor(MAX_TRANSCRIPT_TOKENS / 4); return words.slice(0, safeLength).join(' '); } catch (error) { - throw new Error( + throw new Error( `Could not fetch transcript from YouTube: ${error.message}`, ); } diff --git a/test/http/test.http b/test/http/test.http index b0227b4..05a82d2 100644 --- a/test/http/test.http +++ b/test/http/test.http @@ -1,15 +1,19 @@ @openaiKey = {{$dotenv OPENAI_API_KEY}} @deepseekKey = {{$dotenv DEEPSEEK_API_KEY}} -@allowedOrigin = {{$dotenv ALLOWED_ORIGIN}} -@baseUrl = http://localhost:3000 -@ytVideo = https://www.youtube.com/shorts/cKQC2-g6CRY +@allowedOrigin = http://localhost:3001 +@baseUrl = http://localhost:3002 +@baseProdUrl = https://letssummarize.technway.biz +@ytVideo = https://www.youtube.com/watch?v=R03DjtCPkGE&pp=ygUGdGVkIGVk @text = `I'm going to show you the best way to start practicing designing apps and websites in Figma. So in this video I'm going to give you step-by-step instructions. You can literally follow click by click. I'll only tell you the stuff that you need to get started designing interfaces. So let's get started. So we're going to be looking at a tool called Figma and it has a few advantages. One, most importantly for you, it's free to get started if you're working by yourself. 
We like using it at AGN Smart because it also has really good collaboration so we can have multiple people working on the same design file at the same time. It's also really fast. It works on any computer whether you have a Mac or a PC or Linux. Whatever you have it works right in the browser and it also has a mobile companion app so you can preview your designs on a mobile screen. So there are really no downsides to starting with a tool like Figma. As you're watching the video, if you have any questions about how to do a particular effect in Figma or any comment or something that you want to recommend, please put it in the comments below. And if you want to find out more tips about UI and UX, make sure to subscribe to our free newsletter. The link to that is in the description below and it's a great resource for anyone starting in UI and UX. So this is the website. You just go to Figma.com and I'm already signed in but you can sign up very quickly even with your Google account and get started. But before we jump right into Figma, I want to show you the way I would recommend to get started. So you just want to start practicing. Now for that, I'm not going to ask you to start designing something from scratch because I believe that would be very hard with someone, especially if you're a complete beginner in this space and you have no grounding in design principles and things like that. So the best way for you to get started is actually to copy other designs. And the reason this is so good is because you can see how this design was created so that when you get stuck on something, you can actually see how this person who created this file achieved particular effect or look inside of Figma. And this is totally fine in the beginning because you're not going to be selling these. You're not going to be saying that you designed something when you copied it from someone else. This is just for your own practice and it's a really good way to get started. 
So as you can see here, this is what Figma looks like after you log in and start a file. And I haven't even shown you how to start a file because I want you to use another file as your starting point as opposed to a blank file. And like I said, we're not going to cover everything that you can see here on the screen in terms of what all the various buttons do. We're just going to focus about how you can get started. Now to do that, I wanted to start off with a template. And what I literally did was I typed into Google Figma resources and I got a bunch of results` @fileUrl = ./media/example.pdf +GET {{baseUrl}} +Content-Type: application/json + ### Summarize YouTube Video -POST {{baseUrl}}/summarize/video +POST {{baseProdUrl}}/summarize/video Content-Type: application/json Origin: {{allowedOrigin}} @@ -21,9 +25,10 @@ Origin: {{allowedOrigin}} "length": "comprehensive", "format": "default", "speed": "slow", - "model": "deepseek", - "listen": true, - "customInstructions": "use tables and add your opinion in the end", + "model": "gemini", + "sttModel": "whisper-1", + "listen": false, + "customInstructions": "", "lang": "english" } } @@ -40,7 +45,7 @@ Origin: {{allowedOrigin}} "options": { "length": "brief", "format": "bullet-points", - "model": "deepseek", + "model": "gemini", "listen": false } } @@ -48,20 +53,21 @@ Origin: {{allowedOrigin}} ### Summarize File POST {{baseUrl}}/summarize/file Content-Type: multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW -Authorization: Bearer {{openaiKey}} - -------WebKitFormBoundary7MA4YWxkTrZu0gW -Content-Disposition: form-data; name="file"; filename="example.pdf" -Content-Type: application/pdf Origin: {{allowedOrigin}} -< {{fileUrl}} ------WebKitFormBoundary7MA4YWxkTrZu0gW -Content-Disposition: form-data; name="length" +Content-Disposition: form-data; name="options" +Content-Type: application/json -standard +{ + "format": "bullet-points", + "model": "gemini", + "speed": "fast", + "listen": false +} 
------WebKitFormBoundary7MA4YWxkTrZu0gW -Content-Disposition: form-data; name="format" +Content-Disposition: form-data; name="file"; filename="example.pdf" +Content-Type: application/pdf -narrative -------WebKitFormBoundary7MA4YWxkTrZu0gW--> \ No newline at end of file +< {{fileUrl}} +------WebKitFormBoundary7MA4YWxkTrZu0gW----> \ No newline at end of file diff --git a/vercel.json b/vercel.json index 81e4b21..cd272ac 100644 --- a/vercel.json +++ b/vercel.json @@ -2,15 +2,15 @@ "version": 2, "builds": [ { - "src": "dist/main.js", + "src": "src/main.ts", "use": "@vercel/node" } ], "routes": [ { "src": "/(.*)", - "dest": "dist/main.js", - "methods": ["GET", "POST"] + "dest": "src/main.ts", + "methods": ["GET", "POST", "OPTIONS"] } ] }