Natural Language Processing service for Readmigo.
Tech stack:

- Language: Python 3.11+
- Framework: FastAPI
- NLP Libraries: spaCy, NLTK, jieba

Features:

- Text tokenization and analysis
- Sentence boundary detection
- Word difficulty assessment
- Chinese text segmentation
- Bilingual text alignment

Project structure:

├── app/
│ ├── main.py # FastAPI application
│ ├── routers/ # API routes
│ ├── services/ # NLP services
│ └── models/ # Data models
├── scripts/ # Utility scripts
└── tests/ # Test cases
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Download spaCy models
python -m spacy download en_core_web_sm
# Start development server
uvicorn app.main:app --reload

| Endpoint | Description |
|---|---|
| POST /tokenize | Tokenize text into words |
| POST /sentences | Split text into sentences |
| POST /difficulty | Assess word difficulty level |
| POST /align | Align bilingual paragraphs |
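
The endpoints above can be called with any HTTP client once the development server is running. Below is a minimal sketch using only the Python standard library. The request body shape (a JSON object with a `text` field) and the helper names `build_request` and `call_endpoint` are assumptions for illustration, not something this README specifies. The address is uvicorn's default; FastAPI normally serves the actual request and response schemas at `/docs`, so check there before relying on these field names.

```python
import json
from urllib.request import Request, urlopen

# Assumption: the server was started with `uvicorn app.main:app --reload`
# and listens on uvicorn's default host and port.
BASE_URL = "http://127.0.0.1:8000"


def build_request(endpoint: str, payload: dict, base_url: str = BASE_URL) -> Request:
    """Build a JSON POST request for one of the service endpoints."""
    # ensure_ascii=False keeps Chinese text readable in the encoded body.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/{endpoint.lstrip('/')}",
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def call_endpoint(endpoint: str, payload: dict, base_url: str = BASE_URL):
    """Send the request and decode the JSON response (needs a running server)."""
    with urlopen(build_request(endpoint, payload, base_url), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (hypothetical payload; the "text" field name is an assumption):
# result = call_endpoint("/sentences", {"text": "Hello there. How are you?"})
```

Keeping request construction separate from sending makes it possible to inspect the URL and body without a live server.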