aliyun · tianhao909 · Mar 18, 2026
diff --git a/.github/ISSUE_TEMPLATE/bug_report.yml b/.github/ISSUE_TEMPLATE/bug_report.yml
@@ -0,0 +1,87 @@
+name: Bug Report
+description: Submit a bug report to help improve SimAI
+title: "[BUG]: "
+labels: ["bug"]
+body:
+  - type: markdown
+    attributes:
+      value: |
+        Thanks for taking the time to fill out this bug report!
+
+        It would be very helpful if you could provide as much detail as possible.
+
+  - type: textarea
+    id: bug-description
+    attributes:
+      label: Describe the Bug
+      description: A clear and concise description of what the bug is.
+    validations:
+      required: true
+
+  - type: textarea
+    id: reproduction
+    attributes:
+      label: Reproduction Details
+      description: |
+        Please provide detailed steps to reproduce the issue.
+        Include the branch names or commit IDs of SimAI/AICB you are using.
+      placeholder: |
+        1. **Branches / Commit IDs**: SimAI branch `master` (commit `abc1234`), AICB branch `master`
+        2. Go to '...'
+        3. Run `...`
+        4. See error: ...
+    validations:
+      required: true
+
+  - type: textarea
+    id: expected
+    attributes:
+      label: Expected Behavior
+      description: What did you expect to happen?
+    validations:
+      required: true
+
+  - type: textarea
+    id: actual
+    attributes:
+      label: Actual Behavior
+      description: What actually happened? Please include any error messages or logs.
+    validations:
+      required: true
+
+  - type: textarea
+    id: environment
+    attributes:
+      label: Environment
+      description: Please provide details about your environment.
+      placeholder: |
+        - OS: Ubuntu 20.04
+        - GCC/G++: 9.4.0
+        - Python: 3.8.10
+        - Docker image (if applicable): ...
+        - CUDA version (if applicable): ...
+        - SimAI branch/commit: master / abc1234
+        - AICB branch/commit: master / def5678
+    validations:
+      required: true
+
+  - type: textarea
+    id: usage-scenario
+    attributes:
+      label: Usage Scenario (Optional)
+      description: |
+        If possible, please describe your usage scenario for SimAI:
+        - What task or project you are working on
+        - The underlying goals or business context
+
+        This information will help us collect relevant use cases and optimize the SimAI simulator to better meet your needs.
+    validations:
+      required: false
+
+  - type: textarea
+    id: screenshots
+    attributes:
+      label: Screenshots / Logs
+      description: If applicable, add screenshots or log snippets to help explain your problem.
+    validations:
+      required: false
diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1,8 @@
+blank_issues_enabled: false
+contact_links:
+  - name: SimAI Documentation
+    url: https://github.com/aliyun/SimAI/tree/master/docs
+    about: Refer to the SimAI documentation to help you get started.
+  - name: SimAI Community (DingTalk / WeChat)
+    url: https://github.com/aliyun/SimAI#contact-us
+    about: Join our DingTalk or WeChat community groups for discussion and support.
diff --git a/.github/ISSUE_TEMPLATE/feature_request.yml b/.github/ISSUE_TEMPLATE/feature_request.yml
@@ -0,0 +1,37 @@
+name: Feature Request
+description: Suggest an improvement for SimAI
+title: "[FEATURE]: "
+labels: ["enhancement"]
+body:
+  - type: markdown
+    attributes:
+      value: |
+        Thank you for suggesting a feature to improve SimAI!
+
+  - type: textarea
+    id: feature-description
+    attributes:
+      label: Feature Description
+      description: A clear and concise description of the feature you'd like.
+    validations:
+      required: true
+
+  - type: textarea
+    id: problem
+    attributes:
+      label: Problem / Motivation
+      description: |
+        What problem does this feature solve? Why is it needed?
+        Please describe the use case or scenario where this feature would be helpful.
+    validations:
+      required: true
+
+  - type: textarea
+    id: alternatives
+    attributes:
+      label: Alternatives Considered
+      description: |
+        Have you considered any alternative solutions or workarounds?
+        Please describe them if applicable.
+    validations:
+      required: false
diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md
@@ -0,0 +1,34 @@
+## Description
+
+<!-- Please provide a clear and concise description of what this PR does. -->
+
+## Related Issue
+
+<!-- Link to the related issue, e.g., Fixes #123 or Resolves #456 -->
+
+## Type of Change
+
+- [ ] Bug fix (non-breaking change which fixes an issue)
+- [ ] New feature (non-breaking change which adds functionality)
+- [ ] Performance improvement
+- [ ] Refactoring (no functional changes)
+- [ ] Documentation update
+- [ ] Build / CI configuration change
+
+## Checklist
+
+- [ ] I have read the [CONTRIBUTING.md](https://github.com/aliyun/SimAI/blob/master/CONTRIBUTING.md) guide
+- [ ] My code follows the existing code style of this project
+- [ ] I have tested my changes locally
+- [ ] I have added/updated documentation as needed
+- [ ] My changes do not introduce new warnings or errors
+- [ ] I have verified that simulation accuracy is not degraded (if applicable)
+
+## Test Results
+
+<!-- Describe the tests you ran and their results. -->
+<!-- For simulation changes, include before/after accuracy comparison if possible. -->
+
+## Additional Notes
+
+<!-- Any additional information that reviewers should know. -->
diff --git a/.github/workflows/lint.yml b/.github/workflows/lint.yml
@@ -0,0 +1,52 @@
+name: Lint
+
+on:
+  push:
+    branches: [master, main]
+  pull_request:
+    branches: [master, main]
+
+jobs:
+  python-lint:
+    name: Python Lint
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.10"
+
+      - name: Install linters
+        run: pip install flake8
+
+      - name: Run flake8
+        run: |
+          # Stop the build if there are Python syntax errors or undefined names
+          flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
+          # Treat all other issues as warnings (non-blocking)
+          flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
+
+  markdown-lint:
+    name: Markdown Lint
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: markdownlint
+        uses: DavidAnson/markdownlint-cli2-action@v19
+        with:
+          globs: |
+            README.md
+            CONTRIBUTING.md
+            CHANGELOG.md
+            docs/**/*.md
+          config: |
+            {
+              "default": true,
+              "MD013": false,
+              "MD033": false,
+              "MD041": false
+            }
+        continue-on-error: true
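Reviewer note: the `Run flake8` step is a two-pass gate. The first invocation selects only syntax errors and undefined names and fails the build on any hit; the second adds `--exit-zero`, so style findings are reported but never fatal. A minimal Python sketch of the same gating for local use follows. The flags are copied from the workflow; the helper names `flake8_cmd` and `run_lint` and the injectable `runner` argument are assumptions made for illustration and are not part of this PR.

```python
import subprocess
import sys

# Flags copied verbatim from the two flake8 invocations in lint.yml.
BLOCKING = ["--count", "--select=E9,F63,F7,F82", "--show-source", "--statistics"]
ADVISORY = ["--count", "--exit-zero", "--max-complexity=10",
            "--max-line-length=127", "--statistics"]


def flake8_cmd(flags, path="."):
    """Build a flake8 command line that uses the current interpreter."""
    return [sys.executable, "-m", "flake8", path, *flags]


def run_lint(path=".", runner=subprocess.call):
    """Run both passes and return the exit code a CI step would see.

    `runner` is a hypothetical injection point so the gating logic can be
    exercised without flake8 installed.
    """
    rc = runner(flake8_cmd(BLOCKING, path))
    if rc != 0:
        # Blocking pass failed: stop here, as the shell step would.
        return rc
    # Advisory pass: --exit-zero makes flake8 report findings but exit 0.
    return runner(flake8_cmd(ADVISORY, path))
```

With flake8 installed, calling `sys.exit(run_lint())` from the repository root should approximate the CI step's pass/fail result.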
diff --git a/.gitignore b/.gitignore
@@ -1,4 +1,4 @@
-.vscode
+# .vscode
 astra-sim-alibabacloud/build/simai_analytical/build/
 astra-sim-alibabacloud/build/astra_ns3/build/
 astra-sim-alibabacloud/extern/
@@ -8,3 +8,16 @@ test/log/
 *.log
 .cur*
 .DS_Store
+
+# Local experiment outputs and scratch files
+*.csv
+*.txt
+tmp_simai_inference_workload/
+aicb/
+Spectrum-X*
+
+fth-test/*
+
+# Personal dev / fth files
+fth.sh
+**/fth.sh
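Reviewer note: a `.gitignore` pattern that contains no `/` is matched against every path component at any depth. Two consequences for the entries above: `*.txt` also hides new, untracked files such as a `CMakeLists.txt` or `requirements.txt` (already-tracked files are unaffected), and `fth.sh` alone already covers what `**/fth.sh` adds. The sketch below approximates that rule with Python's `fnmatch`; it is not git's matcher, it handles only slash-free patterns, and the function name and example paths are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def ignored_by_simple_pattern(path, pattern):
    """Approximate gitignore matching for a pattern without '/'.

    Such a pattern applies at any directory level, so the path is treated
    as ignored when any of its components matches the glob.
    """
    if "/" in pattern:
        raise ValueError("this sketch only handles patterns without '/'")
    return any(fnmatchcase(part, pattern) for part in PurePosixPath(path).parts)


# Hypothetical paths, chosen only to illustrate the rule.
print(ignored_by_simple_pattern("some/module/CMakeLists.txt", "*.txt"))
print(ignored_by_simple_pattern("scripts/dev/fth.sh", "fth.sh"))
print(ignored_by_simple_pattern("docs/en/overview.md", "*.txt"))
```

If the broad `*.txt` rule is intentional, a negation entry such as `!CMakeLists.txt` is the usual way to keep specific files visible to git.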
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -0,0 +1,62 @@
+<p align="left">
+    <a href="CHANGELOG_CN.md">中文</a>&nbsp; ｜ &nbsp;English
+</p>
+
+# Changelog
+
+All notable changes to SimAI will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
+
+> **Note**: This changelog covers v1.0 (initial open-source release) and later versions.
+
+## [Unreleased]
+
+## [1.6.0] - 2026-03-16
+
+### Added
+
+- GPU memory calculation module: accurate parameter counting and KV cache management for DeepSeek-V3-671B, Qwen3-MoE-235B, and Qwen3-Next-80B
+- PD-separation memory planning for independent Prefill/Decode memory budgets
+- Improved AICB decode time estimation using endpoint (first/last) linear interpolation and a global cache
+- 4-scenario end-to-end inference test suite (`run_scenarios.sh`)
+- SimAI 1.6 Technical Report (EN/ZH)
+- Complete bilingual documentation system (30+ files under `docs/en/`, `docs/zh/`)
+- GitHub community health files: issue/PR templates, Code of Conduct, Security Policy, Contributing Guide
+
+### Changed
+
+- Replaced print statements with logging across vidur-alibabacloud modules
+- Added bilingual docstrings for public APIs
+- Standardized TODO comment format
+
+### Removed
+
+- Removed ~390 lines of dead code in vidur-alibabacloud
+- Removed personal debug markers from 8 files
+
+## [1.5.0] - 2025-12-30
+
+### Added
+
+- **End-to-end multi-request inference simulation**: Full simulation support for multi-request inference workloads.
+- **Prefill/Decode separation**: Model complex inference scenarios with Prefill/Decode phase separation.
+- **Modern model support**: Added support for DeepSeek, Qwen3-MoE, and Qwen3-Next models.
+- **Request scheduling via Vidur**: Integrated request scheduling component adapted from Microsoft's [Vidur](https://github.com/microsoft/vidur) (see [vidur-alibabacloud](./vidur-alibabacloud/)).
+- **AICB inference workload generation**: AICB now supports generating prefill/decode inference workloads for DeepSeek, Qwen3-MoE, and Qwen3-Next.
+- **DeepSeek training workload support**: AICB now supports generating training workloads for DeepSeek (contributed by [@parthpower](https://github.com/parthpower)).
+- **SimCCL initial release**: First public release of the SimCCL collective communication transformation module.
+
+## [1.0.0] - 2024-10-18
+
+### Added
+
+- Initial open-source release of SimAI: a full-stack simulator for large-scale AI training
+- Core components: AICB, SimCCL, astra-sim-alibabacloud, ns-3-alibabacloud
+- SimAI-Analytical: fast simulation using bus bandwidth abstraction
+- SimAI-Simulation: full-stack NS3-based network simulation
+- SimAI-Physical (Beta): CPU RDMA cluster physical traffic generation
+
+### Academic
+
+- SimAI paper accepted by **NSDI'25 Spring**. See [paper](https://arxiv.org/abs/2410.07346).
diff --git a/CHANGELOG_CN.md b/CHANGELOG_CN.md
@@ -0,0 +1,62 @@
+<p align="left">
+    中文&nbsp; ｜ &nbsp;<a href="CHANGELOG.md">English</a>
+</p>
+
+# 更新日志
+
+SimAI 的所有重要变更均记录在此文件中。
+
+格式基于 [Keep a Changelog](https://keepachangelog.com/en/1.1.0/)。
+
+> **注意**：本更新日志涵盖 v1.0（首次开源发布）及之后的版本。
+
+## [未发布]
+
+## [1.6.0] - 2026-03-16
+
+### 新增
+
+- GPU 内存计算模块：支持 DeepSeek-V3-671B、Qwen3-MoE-235B、Qwen3-Next-80B 的精确参数计数与 KV Cache 管理
+- PD 分离内存规划：Prefill/Decode 阶段独立的内存预算计算
+- 改进 AICB decode 时间估算（首尾线性插值 + 全局缓存）
+- 4 场景端到端推理测试套件（`run_scenarios.sh`）
+- SimAI 1.6 技术报告（EN/ZH）
+- 完整双语文档系统（`docs/en/`、`docs/zh/` 下 30+ 文件）
+- GitHub 社区规范文件：Issue/PR 模板、行为准则、安全政策、贡献指南
+
+### 变更
+
+- vidur-alibabacloud 各模块 print 输出替换为 logging
+- 公开 API 添加双语 docstring
+- TODO 注释格式统一规范化
+
+### 移除
+
+- 清理 vidur-alibabacloud 中约 390 行死代码
+- 清理 8 个文件中的个人调试标记
+
+## [1.5.0] - 2025-12-30
+
+### 新增
+
+- **端到端多请求推理仿真**：全面支持多请求推理工作负载的端到端仿真。
+- **Prefill/Decode 分离**：支持 Prefill/Decode 阶段分离等复杂推理场景建模。
+- **主流模型支持**：新增对 DeepSeek、Qwen3-MoE 和 Qwen3-Next 模型的支持。
+- **基于 Vidur 的请求调度**：集成了基于微软 [Vidur](https://github.com/microsoft/vidur) 适配的请求调度组件（详见 [vidur-alibabacloud](./vidur-alibabacloud/)）。
+- **AICB 推理工作负载生成**：AICB 现已支持为 DeepSeek、Qwen3-MoE 和 Qwen3-Next 生成 prefill/decode 推理工作负载。
+- **DeepSeek 训练工作负载支持**：AICB 新增 DeepSeek 训练工作负载生成支持（由 [@parthpower](https://github.com/parthpower) 贡献）。
+- **SimCCL 首次发布**：SimCCL 集合通信转换模块首次对外公开发布。
+
+## [1.0.0] - 2024-10-18
+
+### 新增
+
+- SimAI 首次开源发布：业界首个全栈高精度 AI 大规模训练模拟器
+- 核心组件：AICB、SimCCL、astra-sim-alibabacloud、ns-3-alibabacloud
+- SimAI-Analytical：基于总线带宽抽象的快速仿真
+- SimAI-Simulation：基于 NS3 的全栈网络仿真
+- SimAI-Physical（Beta）：CPU RDMA 集群物理流量生成
+
+### 学术
+
+- SimAI 论文被 **NSDI'25 Spring** 接收。详见 [论文](https://arxiv.org/abs/2410.07346)。