Fix readability sentence boundary counting #17

Merged
practicalmind-dev merged 1 commit into
PracticalMind:main from
puneetdixit200:fix/15-readability-sentence-boundaries
May 23, 2026

Conversation

@puneetdixit200

Summary

  • count sentence-ending punctuation without splitting inside decimals, URLs, or common abbreviations
  • add regression coverage for Dr./Mr., decimal values, and domain names
  • clamp cosine similarity scores into the valid [-1, 1] range to avoid float32 rounding leaking values above 1.0
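
For readers without the diff open, here is a rough sketch of the two ideas in the summary. It is not the merged code: `count_sentences`, `cosine_similarity`, and the `ABBREVIATIONS` tuple are hypothetical names chosen for illustration, and the abbreviation list is an assumed subset. The sketch uses plain Python floats rather than float32 arrays, so it shows only the clamping step, not the rounding itself.

```python
import math
import re

# Assumed abbreviation list for illustration; the real list in the PR is not shown here.
ABBREVIATIONS = ("dr", "mr", "mrs", "ms", "prof", "e.g", "i.e", "vs")

# A run of terminators counts only when followed by optional closing
# quotes/brackets and then whitespace or end of text. This alone skips the
# dots inside decimals (3.50), domain names (example.com), and URLs.
_TERMINATOR = re.compile(r"[.!?]+(?=[\"')\]]*(?:\s|$))")

# The alphabetic token (possibly dotted, like "e.g") right before a terminator.
_WORD_BEFORE = re.compile(r"[A-Za-z]+(?:\.[A-Za-z]+)*$")


def count_sentences(text: str) -> int:
    """Count sentence-ending punctuation, ignoring abbreviation periods."""
    count = 0
    for match in _TERMINATOR.finditer(text):
        if match.group() == ".":
            word = _WORD_BEFORE.search(text[: match.start()])
            if word and word.group().lower() in ABBREVIATIONS:
                continue  # "Dr." / "Mr." style period, not a sentence end
        count += 1
    return count


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity clamped into [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 0.0  # assumed convention for zero vectors, not taken from the PR
    # Rounding can push the ratio slightly outside the valid range; clamp it.
    return max(-1.0, min(1.0, dot / norm))


sample = "Dr. Smith paid 3.50 dollars at example.com. Mr. Jones saw it! Did he?"
print(count_sentences(sample))
```

One known limitation of this heuristic: a sentence that genuinely ends on a listed abbreviation is not counted, which is why the assumed list leaves out entries such as "etc".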

Closes #15

Verification

  • UV_CACHE_DIR=/tmp/assayer-15-uv-cache UV_PYTHON_INSTALL_DIR=/tmp/assayer-15-uv-python UV_PROJECT_ENVIRONMENT=/tmp/assayer-15-uv-venv uv run --python 3.12 --with pytest --with pytest-asyncio --with numpy --with ruff pytest -m "not integration" -q
  • UV_CACHE_DIR=/tmp/assayer-15-uv-cache UV_PYTHON_INSTALL_DIR=/tmp/assayer-15-uv-python UV_PROJECT_ENVIRONMENT=/tmp/assayer-15-uv-venv uv run --python 3.12 --with pytest --with pytest-asyncio --with numpy --with ruff ruff check .
  • git diff --check HEAD~1 HEAD

@practicalmind-dev practicalmind-dev merged commit eccda52 into PracticalMind:main May 23, 2026
2 checks passed

Development

Successfully merging this pull request may close these issues.

fix: readability_stats() miscounts sentences on abbreviations and decimals

2 participants