feat: add 5 China authority data sources (AM batch 2026-04-28)#185

Merged
mingcha-dev merged 1 commit into MLT-OSS:main from firstdata-dev:feat/add-china-sources-20260428-am
Apr 28, 2026

Conversation

@firstdata-dev
Collaborator

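New China authority data sources · AM batch · 2026-04-28

This PR adds 5 authoritative Chinese data sources. All of them passed ID deduplication, domain deduplication, the blacklist check, and the website accessibility check.

New data sources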

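| ID | Institution | Website | Type | Domains |
|---|---|---|---|---|
| china-imcas | Institute of Microbiology, CAS (中国科学院微生物研究所) | im.cas.cn | research | biology/health/science |
| china-craes | Chinese Research Academy of Environmental Sciences (中国环境科学研究院) | craes.cn | research | environment/science/policy |
| china-cpma | Chinese Preventive Medicine Association (中华预防医学会) | cpma.org.cn | research | health/science/social |
| china-cisri | China Iron & Steel Research Institute Group Co., Ltd. (中国钢研科技集团有限公司) | cisri.com.cn | research | industry/science/technology |
| china-ibcas | Institute of Botany, CAS (中国科学院植物研究所) | ibcas.ac.cn | research | biology/environment/science |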

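Verification checklist

  • ID deduplication (grep /tmp/all-source-ids.txt)
  • Domain deduplication (grep /tmp/all-source-websites.txt)
  • Blacklist check (check-blacklist.sh)
  • website URL is reachable (HTTP 200)
  • Site title matches the institution name
  • data_url uses the root path (none of the subpaths were reachable)
  • make check passes (all 545 IDs are unique)
  • data_content is an array
  • domains use hyphens
  • authority_level is compliant
  • No api_docs field
  • country = CN, geographic_scope = national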
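
The field-level items in the checklist above could be automated roughly as follows. This is a minimal sketch, not the repository's actual `make check` logic: the function name `check_source`, the flat record layout, and the sample values (including the `https://` scheme and the placeholder `data_content` entry) are assumptions made for illustration. Only the field names come from the checklist.

```python
from urllib.parse import urlparse


def check_source(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    # data_content must be an array, not a single string
    if not isinstance(record.get("data_content"), list):
        problems.append("data_content must be an array")
    # multi-word domains should be hyphenated, not underscored or spaced
    for domain in record.get("domains", []):
        if "_" in domain or " " in domain:
            problems.append(f"domain {domain!r} should use hyphens")
    # this batch requires that api_docs is not present at all
    if "api_docs" in record:
        problems.append("api_docs field must be absent")
    if record.get("country") != "CN":
        problems.append("country must be CN")
    if record.get("geographic_scope") != "national":
        problems.append("geographic_scope must be national")
    # batch-specific rule: data_url points at the site root
    if urlparse(record.get("data_url", "")).path not in ("", "/"):
        problems.append("data_url should be the site root")
    return problems


# Hypothetical record; values are illustrative, not copied from the repository.
example = {
    "id": "china-imcas",
    "website": "https://im.cas.cn",
    "data_url": "https://im.cas.cn",
    "data_content": ["placeholder item"],
    "domains": ["biology", "health", "science"],
    "country": "CN",
    "geographic_scope": "national",
}
print(check_source(example))
```

Returning a list of messages rather than raising on the first failure lets a reviewer see every problem in a record at once.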

- china-imcas: Institute of Microbiology, CAS (中国科学院微生物研究所)
  → im.cas.cn | research | biology/health/science
- china-craes: Chinese Research Academy of Environmental Sciences (中国环境科学研究院)
  → craes.cn | research | environment/science/policy
- china-cpma: Chinese Preventive Medicine Association (中华预防医学会)
  → cpma.org.cn | research | health/science/social
- china-cisri: China Iron & Steel Research Institute Group (中国钢研科技集团)
  → cisri.com.cn | research | industry/science/technology
- china-ibcas: Institute of Botany, CAS (中国科学院植物研究所)
  → ibcas.ac.cn | research | biology/environment/science
Collaborator

@mingcha-dev mingcha-dev left a comment

🔍 明察 QA Review — PR #185 APPROVED

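| Check | china-cpma | china-cisri | china-ibcas | china-imcas | china-craes |
|---|---|---|---|---|---|
| ID dedup | | | | | |
| Domain dedup | | | | | |
| URL 200 | ✅ 200 | ✅ 200 | ✅ 200 | ✅ 200 | ✅ 200 |
| Domain format | | | | | |
| HTTPS | | | | | |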

All 5 sources passed.
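
For reference, the ID and domain dedup rows could be checked with something like the sketch below. It is an assumption-laden illustration, not the QA tooling actually used in this review: `host_of`, `find_duplicates`, the `www.` normalization, and the "existing" ID and host are all hypothetical.

```python
from urllib.parse import urlparse


def host_of(url: str) -> str:
    """Lower-cased hostname with a leading 'www.' removed."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def find_duplicates(new_sources, existing_ids, existing_hosts):
    """Return (id, kind) pairs for every ID or domain clash in the batch."""
    seen_ids, seen_hosts = set(existing_ids), set(existing_hosts)
    clashes = []
    for src in new_sources:
        host = host_of(src["website"])
        if src["id"] in seen_ids:
            clashes.append((src["id"], "id"))
        if host in seen_hosts:
            clashes.append((src["id"], "domain"))
        # also catch duplicates inside the batch itself
        seen_ids.add(src["id"])
        seen_hosts.add(host)
    return clashes


# Illustrative inputs; the existing ID and host are made up.
batch = [
    {"id": "china-imcas", "website": "https://im.cas.cn"},
    {"id": "china-ibcas", "website": "https://www.ibcas.ac.cn"},
]
print(find_duplicates(batch, {"example-existing"}, {"example.org"}))
```

Comparing normalized hostnames instead of raw URLs means `https://www.ibcas.ac.cn` and `https://ibcas.ac.cn/` would count as the same domain.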

@mingcha-dev mingcha-dev merged commit 9195a7a into MLT-OSS:main Apr 28, 2026
3 checks passed