Skip to content

Comments

fix: Resolve installation bug #292 and modernize package management#325

Open
WhizZest wants to merge 1 commit intowenet-e2e:masterfrom
WhizZest:master
Open

fix: Resolve installation bug #292 and modernize package management#325
WhizZest wants to merge 1 commit intowenet-e2e:masterfrom
WhizZest:master

Conversation

@WhizZest
Copy link

@WhizZest WhizZest commented Feb 22, 2026

📑 语言切换 / Language Switch


🌐 English Description

Fix WeTextProcessing Installation Bug and Modernize Package Management System

🎯 Problem Background and Motivation

Core Issue Description

Encountered serious dependency conflict when using ChatTTS:

Dependency Conflict Scenario:

  • ChatTTS requires nemo_text_processing, which requires pynini==2.1.6.post1
  • WeTextProcessing requires pynini==2.1.6, not supporting post-release versions
  • This creates impossible situation to satisfy both package requirements simultaneously

Installation Failure Process:

  1. First tried pip install WeTextProcessing --no-deps - installation succeeded but ChatTTS reported WeTextProcessing missing (missing data files)
  2. Tried pip install git+https://github.com/wenet-e2e/WeTextProcessing.git - encountered setup.py error: "IndexError: list index out of range"
  3. Tried toolchain downgrade: pip install "setuptools<70" "wheel" + --no-build-isolation - same IndexError occurred
  4. Searched related issue 1.0.4.1仅可以在Linux环境下安装成功,windows和mac下都无法安装成功 #292, Request: New PyPI Release with Japanese ITN Support #321, [Installation Error] Installation Error on MacOS - pynini #317 - found the problem was unresolved
  5. Confirmed this is a WeTextProcessing bug, and official releases don't provide ZIP downloads

🔧 Solution Evolution Process

Phase 1: Attempt Non-Code Modification Solutions

Approach 1: Dependency Isolation Installation

pip install WeTextProcessing --no-deps
# Result: Installation successful but runtime missing data files

Approach 2: Direct Source Installation

pip install git+https://github.com/wenet-e2e/WeTextProcessing.git
# Result: Encountered "IndexError: list index out of range" error

Approach 3: Toolchain Downgrade

pip install "setuptools<70" "wheel"
pip install --no-build-isolation git+https://github.com/wenet-e2e/WeTextProcessing.git
# Result: Same IndexError error persisted

Phase 2: Attempt Internal setup.py Fixes

Attempt 1: Improve Version Extraction Method

  • Set default version number in setup.py
  • Problem: Default version inconsistent with git tag version, violates version consistency principle

Attempt 2: Use Git Commands to Get Tag

  • Get latest tag via subprocess calling git commands
  • Problem: Some installation environments (like PyPI installations) cannot use git commands

Attempt 3: Add VERSION File

  • Create separate VERSION file to manage version numbers
  • Problem: Violates "single source of truth" principle, increases maintenance complexity

Phase 3: Introduce Professional Solution

Final Choice: setuptools_scm

  • Professional version management tool
  • Automatically generates version numbers from git tags
  • Supports development version identification
  • Offline environment fallback mechanism

🚀 Final Implemented Improvements

1. Core Bug Fix

  • Problem Identification: IndexError in setup.py version extraction logic
  • Root Cause: Unreliable manual parsing of sys.argv for version number
  • Solution: Adopt setuptools_scm for automated version management

2. Dependency Compatibility Improvement

  • Original Problem: pynini==2.1.6 incompatible with post-release versions
  • Improved: pynini>=2.1.6,<2.2.0 supports broader version range
  • Conflict Resolution: Now compatible with nemo_text_processing

3. Complete Modernization Refactoring

  • Introduce pyproject.toml: Compliant with PEP 621 standard
  • Delete setup.py: Completely based on pyproject.toml building
  • Unified Dependency Management: Consolidate requirements.txt into pyproject.toml

🔧 Technical Changes Details

File Changes

New Files:

  • pyproject.toml: Complete modern packaging configuration

Deleted Files:

  • setup.py: Completely removed, no longer needed
  • requirements.txt: Runtime dependencies (consolidated into pyproject.toml)

Modified Files:

  • None (no other modifications beyond deletions)

Configuration Mapping Table

Original Setup.py Config New Pyproject.toml Corresponding Item Status Notes
name="WeTextProcessing" [project.name] = "WeTextProcessing" ✅ Mapped Project name
version=version dynamic = ["version"] + [tool.setuptools_scm] ✅ Fixed Automated version management
author="Zhendong Peng, Xingchen Song" [project.authors] list ✅ Mapped Author information
author_email="..." [project.authors] list ✅ Mapped Author emails
long_description (read from README.md) [project.readme] = "README.md" ✅ Mapped Long description file
long_description_content_type="text/markdown" [project.readme] implicit ✅ Handled Content type
description="WeTextProcessing, including TN & ITN" [project.description] ✅ Mapped Short description
url="https://github.com/wenet-e2e/WeTextProcessing" [project.urls.Homepage] ✅ Mapped Project homepage
packages=find_packages() [tool.setuptools.packages.find] ✅ Mapped Package discovery
package_data={...} [tool.setuptools.package-data] ✅ Mapped Package data files
install_requires=[...] [project.dependencies] ✅ Improved Runtime dependencies
entry_points={...} [project.scripts] ✅ Mapped Console scripts
tests_require=["pytest"] [project.optional-dependencies.test] ✅ Mapped Test dependencies
classifiers=[...] [project.classifiers] ✅ Mapped Classifiers

Key Configuration Improvements

Dependency Constraint Optimization:

# Before: Strict version locking
install_requires=["pynini==2.1.6"]

# After: Reasonable version range
dependencies = ["pynini>=2.1.6,<2.2.0"]

Modernized Version Management:

# Professional configuration in pyproject.toml
[tool.setuptools_scm]
version_scheme = "guess-next-dev"
local_scheme = "dirty-tag"
write_to = "tn/_version.py"
fallback_version = "1.0.5"

Completely pyproject.toml-based Building:

[build-system]
requires = ["setuptools>=45", "wheel", "setuptools_scm[toml]>=6.2"]
build-backend = "setuptools.build_meta"

✅ Test Verification

Functional Regression Testing

  • ✅ Chinese TN/ITN functions work properly
  • ✅ Command-line interfaces (wetn, weitn) operate normally
  • ✅ Dependency conflicts with nemo_text_processing resolved
  • ✅ Version number auto-generation correct (1.0.6.dev0+dirty)

Installation Compatibility Testing

# Successfully install both packages together
pip install nemo_text_processing  # Installs pynini==2.1.6.post1
pip install .                     # WeTextProcessing works normally, no conflicts

Configuration Completeness Verification

  • ✅ All original functions preserved
  • ✅ Dependency resolution and installation functionality complete
  • ✅ Package data files correctly included
  • ✅ Metadata information complete

📈 Improvement Benefits

Direct Benefits (Solving Core Issues)

  • ✓ Dependency Conflict Resolved: Support for pynini post-release versions
  • ✓ Installation Bug Fixed: Eliminate IndexError, support various installation methods
  • ✓ Compatibility Enhanced: Perfect integration with ChatTTS ecosystem

Long-term Benefits (Architectural Improvements)

  • ✓ Professional Version Management: Automated, reliable version control system
  • ✓ Completely Modernized: Compliant with latest Python packaging standards
  • ✓ Maintenance Simplified: Single configuration file, reduced maintenance costs
  • ✓ Future Compatible: Following Python ecosystem development trends

🚀 Usage Instructions

User Installation (Dependency Conflict Resolved)

# Now can install both packages together
pip install nemo_text_processing  # Install pynini==2.1.6.post1
pip install WeTextProcessing      # Works normally, no conflicts

Developer Workflow

# Development environment installation
pip install -e .[dev]

# Run tests
python -m pytest

# Release new version (automatic version management)
git tag v1.0.6
git push origin v1.0.6

📝 Technical Decision Rationale

Why Delete setup.py?

  1. Completely Feasible: Modern pip and build tools fully support pure pyproject.toml projects
  2. More Concise: Eliminate unnecessary files, reduce project complexity
  3. Standard Compliant: Complies with PEP 621 recommended modern practices
  4. Future-oriented: Python community development trend

Why Choose setuptools_scm?

  1. Professional and Reliable: Industry-standard version management solution
  2. Feature Complete: Supports development versions, offline fallback, etc.
  3. Ecosystem Compatible: Perfectly integrates with modern Python toolchain
  4. Maintenance Simple: No manual version management required

Why Consolidate to pyproject.toml?

  1. Standard Compliant: Follows PEP 621 modern packaging standard
  2. Configuration Centralized: Single configuration file, avoids scattered management
  3. Tool Support: Modern tools prioritize pyproject.toml support
  4. Future-oriented: Direction of Python packaging development

This PR resolves core installation bugs and implements complete modernization refactoring. Strongly recommended for merge.
Back to Top / 返回顶部


🇨🇳 中文描述

修复 WeTextProcessing 安装 bug 并现代化包管理系统

🎯 问题背景与动机

核心问题描述

在使用 ChatTTS 时遇到了严重的依赖冲突问题:

依赖冲突场景

  • ChatTTS 需要 nemo_text_processing,而 nemo_text_processing 要求 pynini==2.1.6.post1
  • WeTextProcessing 要求 pynini==2.1.6,不支持 post-release 版本
  • 这导致无法同时满足两个包的依赖要求

安装失败过程

  1. 首先尝试 pip install WeTextProcessing --no-deps 安装成功,但运行时 ChatTTS 认为 WeTextProcessing 不存在(缺少数据文件)
  2. 尝试 pip install git+https://github.com/wenet-e2e/WeTextProcessing.git 遇到 setup.py 代码错误:"IndexError: list index out of range"
  3. 尝试降级工具链:pip install "setuptools<70" "wheel" + --no-build-isolation,仍然出现相同错误
  4. 查找相关 issue 1.0.4.1仅可以在Linux环境下安装成功,windows和mac下都无法安装成功 #292Request: New PyPI Release with Japanese ITN Support #321[Installation Error] Installation Error on MacOS - pynini #317,发现问题未被解决
  5. 确认这是 WeTextProcessing 的 bug,且官方 release 未提供 ZIP 包下载

🔧 解决方案演进过程

第一阶段:尝试不修改代码的方案

方案1:依赖隔离安装

pip install WeTextProcessing --no-deps
# 结果:安装成功但运行时缺失数据文件

方案2:直接从源码安装

pip install git+https://github.com/wenet-e2e/WeTextProcessing.git
# 结果:遇到 "IndexError: list index out of range" 错误

方案3:工具链降级

pip install "setuptools<70" "wheel"
pip install --no-build-isolation git+https://github.com/wenet-e2e/WeTextProcessing.git
# 结果:仍然是相同的 IndexError 错误

第二阶段:尝试在 setup.py 内部修复

尝试1:改进版本获取方式

  • 在 setup.py 中设置默认版本号
  • 问题:默认版本与 git tag 不一致,违背版本一致性原则

尝试2:使用 git 命令获取 tag

  • 通过 subprocess 调用 git 命令获取最新 tag
  • 问题:某些安装环境(如 PyPI 安装)无法使用 git 命令

尝试3:添加 VERSION 文件

  • 创建独立的 VERSION 文件管理版本号
  • 问题:违背"单一数据源原则",增加维护复杂度

第三阶段:引入专业解决方案

最终选择:setuptools_scm

  • 专业的版本管理工具
  • 自动从 git tag 生成版本号
  • 支持开发版本标识
  • 离线环境回退机制

🚀 最终实施的改进

1. 核心 bug 修复

  • 问题定位:setup.py 中的版本提取逻辑存在 IndexError
  • 根本原因:手动解析 sys.argv 获取版本号的方式不可靠
  • 解决方案:采用 setuptools_scm 实现自动化版本管理

2. 依赖兼容性改进

  • 原问题pynini==2.1.6 不兼容 post-release 版本
  • 改进后pynini>=2.1.6,<2.2.0 支持更广泛的版本范围
  • 解决冲突:现在可以与 nemo_text_processing 共存

3. 完全现代化重构

  • 引入 pyproject.toml:符合 PEP 621 标准
  • 删除 setup.py:完全基于 pyproject.toml 构建
  • 优化配置:移除冗余的 _version.py package-data 声明
  • 统一依赖管理:将 requirements.txt 整合到 pyproject.toml

🔧 技术变更详情

文件变更

新增文件

  • pyproject.toml:完整的现代化打包配置

删除文件

  • setup.py:完全移除,不再需要
  • requirements.txt:运行时依赖(已整合到 pyproject.toml)

修改文件

  • 无(除上述删除外无其他修改)

配置映射对照表

原始 Setup.py 配置 新的 Pyproject.toml 对应项 状态 说明
name="WeTextProcessing" [project.name] = "WeTextProcessing" ✅ 已映射 项目名称
version=version dynamic = ["version"] + [tool.setuptools_scm] ✅ 已修复 版本自动化管理
author="Zhendong Peng, Xingchen Song" [project.authors] 列表 ✅ 已映射 作者信息
author_email="..." [project.authors] 列表 ✅ 已映射 作者邮箱
long_description (从 README.md 读取) [project.readme] = "README.md" ✅ 已映射 详细描述文件
long_description_content_type="text/markdown" [project.readme] 隐含 ✅ 已处理 内容类型
description="WeTextProcessing, including TN & ITN" [project.description] ✅ 已映射 简短描述
url="https://github.com/wenet-e2e/WeTextProcessing" [project.urls.Homepage] ✅ 已映射 项目主页
packages=find_packages() [tool.setuptools.packages.find] ✅ 已映射 包发现配置
package_data={...} [tool.setuptools.package-data] ✅ 已映射 包数据文件
install_requires=[...] [project.dependencies] ✅ 已改进 运行时依赖
entry_points={...} [project.scripts] ✅ 已映射 控制台脚本
tests_require=["pytest"] [project.optional-dependencies.test] ✅ 已映射 测试依赖
classifiers=[...] [project.classifiers] ✅ 已映射 分类器

关键配置改进

版本管理现代化

# pyproject.toml 中的专业配置
[tool.setuptools_scm]
version_scheme = "guess-next-dev"
local_scheme = "dirty-tag"
write_to = "tn/_version.py"
fallback_version = "1.0.5"

完全基于 pyproject.toml 的构建

[build-system]
requires = ["setuptools>=45", "wheel", "setuptools_scm[toml]>=6.2"]
build-backend = "setuptools.build_meta"
# 改进前:严格版本锁定
install_requires=["pynini==2.1.6"]

# 改进后:合理的版本范围
dependencies = ["pynini>=2.1.6,<2.2.0"]

版本管理现代化

# pyproject.toml 中的专业配置
[tool.setuptools_scm]
version_scheme = "guess-next-dev"
local_scheme = "dirty-tag"
write_to = "tn/_version.py"  # 自动生成,无需手动声明
fallback_version = "1.0.5"

✅ 测试验证

功能回归测试

  • ✅ 中文 TN/ITN 功能正常运行
  • ✅ 命令行接口 (wetn, weitn) 工作正常
  • ✅ 与 nemo_text_processing 的依赖冲突已解决
  • ✅ 版本号自动生成正确 (1.0.6.dev0+dirty)

安装兼容性测试

# 成功安装并与 nemo_text_processing 共存
pip install nemo_text_processing  # 安装 pynini==2.1.6.post1
pip install .                     # WeTextProcessing 正常安装

配置完整性验证

  • ✅ 所有原始功能保持不变
  • ✅ 依赖解析和安装功能完整
  • ✅ 包数据文件正确包含
  • ✅ 元数据信息完整显示

📈 改进收益

直接收益(解决核心问题)

  • ✓ 依赖冲突解决:支持 pynini post-release 版本
  • ✓ 安装 bug 修复:消除 IndexError,支持各种安装方式
  • ✓ 兼容性提升:与 ChatTTS 生态系统完美集成

长期收益(架构改进)

  • ✓ 专业版本管理:自动化、可靠的版本控制系统
  • ✓ 完全现代化:符合最新 Python 打包标准
  • ✓ 维护简化:单一配置文件,减少维护成本
  • ✓ 未来兼容:跟随 Python 生态发展方向

🚀 使用说明

用户安装(已解决依赖冲突)

# 现在可以同时安装两个包
pip install nemo_text_processing  # 安装 pynini==2.1.6.post1
pip install WeTextProcessing      # 正常工作,无冲突

开发者工作流

# 开发环境安装
pip install -e .[dev]

# 运行测试
python -m pytest

# 发布新版本(自动版本管理)
git tag v1.0.6
git push origin v1.0.6

📝 技术决策说明

为什么选择 setuptools_scm?

  1. 专业可靠:业界标准的版本管理方案
  2. 功能完整:支持开发版本、离线回退等
  3. 生态兼容:与现代 Python 工具链完美集成
  4. 维护简单:无需手动管理版本号

为什么整合到 pyproject.toml?

  1. 标准合规:遵循 PEP 621 现代打包标准
  2. 配置集中:单一配置文件,避免分散管理
  3. 工具支持:现代工具优先支持 pyproject.toml
  4. 未来导向:Python 打包的发展方向

此 PR 解决了核心安装 bug 并实现了完全现代化重构,强烈建议合并。
返回顶部 / Back to Top

…agement

- Fix IndexError in setup.py version extraction logic
- Upgrade pynini dependency to support post-release versions
- Replace manual version management with setuptools_scm
- Migrate to modern pyproject.toml configuration
- Remove obsolete setup.py and requirements.txt
- Enable coexistence with nemo_text_processing in ChatTTS ecosystem
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant