🍏 Apple SEC Filing Scanner (NLP Pipeline)

🎯 Project Overview: Accelerated Due Diligence

This project is a high-throughput, AI-powered financial document analyzer that extracts investment-relevant insights from large regulatory filings such as 10-K and 10-Q reports.

It turns hours of manual review into automated data signals, giving analysts and quantitative models faster access to the contents of each filing.

🚀 Technical Architecture & Performance

The solution is engineered for speed and efficiency using a Natural Language Processing (NLP) Pipeline built on Python and the Hugging Face transformers library.
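Because the summarizer works on document chunks, the first pipeline step is splitting the filing text into pieces small enough for the model. The following is a minimal sketch of that step, not the notebook's actual code: the function name `chunk_text`, the word-based splitting, and the 400-word default are all assumptions chosen for illustration.

```python
from typing import List


def chunk_text(text: str, max_words: int = 400) -> List[str]:
    """Split text into consecutive chunks of at most max_words words.

    Hypothetical helper: the real notebook may chunk by characters or tokens.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    # Step through the word list in strides of max_words.
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


if __name__ == "__main__":
    sample = "Net sales increased during the year compared to the prior year"
    for i, chunk in enumerate(chunk_text(sample, max_words=4)):
        print(i, chunk)
```

Word counts are only a rough proxy for model tokens, so a real run would likely keep the chunk size well below the model's input limit or enable truncation.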

⚡️ Critical Performance Metric

The project was optimized using GPU acceleration, reducing the analysis time for a full 10-K document from over 60 minutes (on CPU) to under 2 minutes (on a T4 GPU). This ~30x speed increase is essential for scalable, daily document processing.
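The ~30x figure follows from the two bounds quoted above (60 minutes / 2 minutes = 30). A hedged sketch of how a run might select the GPU when one is present is shown below. The helper names `pick_device` and `speedup` are hypothetical; the return values follow the integer `device` convention of the transformers `pipeline` function (`0` for the first CUDA GPU, `-1` for CPU).

```python
def pick_device() -> int:
    """Return 0 if a CUDA GPU is usable, otherwise -1 (CPU).

    Hypothetical helper; falls back to CPU when torch is not installed.
    """
    try:
        import torch  # imported lazily so the helper works without torch
    except ImportError:
        return -1
    return 0 if torch.cuda.is_available() else -1


def speedup(cpu_minutes: float, gpu_minutes: float) -> float:
    """Ratio of CPU time to GPU time for the same document."""
    return cpu_minutes / gpu_minutes


if __name__ == "__main__":
    print("device:", pick_device())
    print("speedup:", speedup(60, 2))
```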

🧠 Model Stack

Summarization: Utilizes the lightweight sshleifer/distilbart-cnn-12-6 model to efficiently condense document chunks.
Signal Extraction: Uses the distilbert-base-cased-distilled-squad model for Question Answering (QA) to precisely identify high-value information.
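As a rough illustration of how these two models could be loaded, the sketch below builds one pipeline per task. It assumes an installed transformers version that still provides the "summarization" and "question-answering" pipeline tasks. The function name `build_pipelines` and the constant names are hypothetical, and the notebook may configure the models differently.

```python
SUMMARIZER_MODEL = "sshleifer/distilbart-cnn-12-6"
QA_MODEL = "distilbert-base-cased-distilled-squad"


def build_pipelines(device: int = -1):
    """Create the summarization and QA pipelines on the given device.

    device follows the transformers convention: -1 is CPU, 0 is the first GPU.
    Calling this downloads both models on first use.
    """
    from transformers import pipeline  # lazy import: heavy dependency

    summarizer = pipeline("summarization", model=SUMMARIZER_MODEL, device=device)
    qa = pipeline("question-answering", model=QA_MODEL, device=device)
    return summarizer, qa


# Example usage (requires network access and the models above):
#   summarizer, qa = build_pipelines(device=0)
#   summary = summarizer(chunk, truncation=True)[0]["summary_text"]
#   result = qa(question="How much did revenue increase?", context=summary)
#   print(result["answer"], result["score"])
```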

💡 Key Investment Signals Extracted

The final stage of the pipeline focuses the QA model on extracting specific factors critical to investment decisions:

Revenue Jumps
Cash-Flow Risks
Buy/Sell Signals (General Risk Factors)
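One way to drive the QA model toward these three signals is to pair each signal with a question and run every question against the summarized text. The sketch below illustrates that idea under stated assumptions: the question wording, the `SIGNAL_QUESTIONS` mapping, and the `extract_signals` helper are hypothetical and not taken from the notebook. The `qa` argument is any callable that accepts `question` and `context` keywords and returns a dict with `answer` and `score` keys, which is the shape a transformers question-answering pipeline returns for a single input.

```python
from typing import Any, Callable, Dict

# Hypothetical question wording for each signal listed above.
SIGNAL_QUESTIONS: Dict[str, str] = {
    "Revenue Jumps": "How much did revenue increase?",
    "Cash-Flow Risks": "What are the risks to cash flow?",
    "Buy/Sell Signals (General Risk Factors)": "What are the main risk factors?",
}


def extract_signals(
    qa: Callable[..., Dict[str, Any]],
    context: str,
    questions: Dict[str, str] = SIGNAL_QUESTIONS,
) -> Dict[str, Dict[str, Any]]:
    """Ask one question per signal and keep the answer with its confidence score."""
    results: Dict[str, Dict[str, Any]] = {}
    for signal, question in questions.items():
        output = qa(question=question, context=context)
        results[signal] = {"answer": output["answer"], "score": output["score"]}
    return results
```

Keeping the score alongside each answer lets a downstream step discard low-confidence extractions instead of treating every answer as a signal.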

🛠️ Setup and Execution

Prerequisites

Runtime: Ensure a T4 GPU runtime is active in Colab, or that a local environment is configured for CUDA.
Document: Place the target SEC filing (e.g., apple10k.pdf) in the working directory.
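A small sketch of how the filing might be read once it is in place, assuming PyPDF2 version 2.0 or later (which provides `PdfReader` and `page.extract_text()`). The helper name `load_filing_text` is hypothetical; it checks for the file first so that a missing document fails with a clear message.

```python
from pathlib import Path


def load_filing_text(pdf_path: str = "apple10k.pdf") -> str:
    """Return the concatenated text of every page in the PDF.

    Hypothetical helper; raises FileNotFoundError if the filing is missing.
    """
    path = Path(pdf_path)
    if not path.is_file():
        raise FileNotFoundError(f"Filing not found in working directory: {path}")

    from PyPDF2 import PdfReader  # lazy import, only needed once the file exists

    reader = PdfReader(str(path))
    # extract_text() can return an empty result for image-only pages.
    return "\n".join(page.extract_text() or "" for page in reader.pages)
```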

Installation

!pip install PyPDF2 transformers accelerate torch --quiet
