Purpose
Long-running TechEngine tracker for finding missing TechAPI records, drafting import candidates, requesting PR validation, and keeping TechAPI data PRs reviewable.
This issue is the TechEngine-side companion to GetTechAPI/TechAPI#1. It should stay open while the dataset rebuild continues. TechAPI PRs may use `Closes GetTechAPI/TechAPI#1` for Development linking, but this TechEngine issue tracks the automation and validation work behind those PRs.
Current Status
Latest Dataset Snapshot From PR #25
| Category | Total | Verified | Unverified | Missing verified | Verified % |
| --- | --- | --- | --- | --- | --- |
| brand | 129 | 0 | 0 | 129 | n/a |
| soc | 195 | 58 | 137 | 0 | 29.7% |
| smartphone | 6,544 | 184 | 6,360 | 0 | 2.8% |
| gpu | 2,030 | 0 | 2,030 | 0 | 0.0% |
| cpu | 3,977 | 976 | 3,001 | 0 | 24.5% |
| all | 12,896 | 1,218 | 11,549 | 129 | 9.5% |
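The percentages in the snapshot appear consistent with dividing verified records by verified plus unverified records, leaving out records whose verified flag is missing. The sketch below reproduces the column under that assumption. The function names `verified_percent` and `format_percent` are hypothetical and are not taken from TechEngine.

```python
from typing import Dict, Optional, Tuple


def verified_percent(verified: int, unverified: int) -> Optional[float]:
    """Return verified coverage in percent, or None when nothing can be measured."""
    measurable = verified + unverified
    if measurable == 0:
        # Matches the "n/a" shown for brand, where every record lacks the flag.
        return None
    return round(100 * verified / measurable, 1)


def format_percent(value: Optional[float]) -> str:
    """Render a coverage value the way the snapshot table does."""
    return "n/a" if value is None else f"{value:.1f}%"


# (verified, unverified) pairs copied from the snapshot above.
snapshot: Dict[str, Tuple[int, int]] = {
    "brand": (0, 0),
    "soc": (58, 137),
    "smartphone": (184, 6360),
    "gpu": (0, 2030),
    "cpu": (976, 3001),
    "all": (1218, 11549),
}

for category, (verified, unverified) in snapshot.items():
    print(category, format_percent(verified_percent(verified, unverified)))
```

Returning `None` instead of `0.0` for an empty denominator keeps "nothing to measure" distinct from "nothing verified", which is the difference between the brand and gpu rows.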
TechEngine Responsibilities
- Maintain coverage reports that show missing CPU, GPU, smartphone, SoC, and brand records
- Generate or support import batches where source coverage is useful
- Validate TechAPI PRs from reusable branches such as `data/import-staging` and `feat/site`
- Post two useful PR comments:
  - Changed-data review: what changed, examples, source/verified counts, and heuristic warnings
  - Validation stats: totals, verified coverage, warning callouts, and key command output
- Keep validation warnings actionable without failing expected bulk-import cases
- Add site-build verification only when `site/` files changed
- Keep project metadata, priority, labels, milestones, assignees, and issue linkage filled in automatically where possible
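The heuristic warnings mentioned for the changed-data review could look roughly like the sketch below. It is illustrative only: the three rules (normalized duplicates, pipe or double-space artifacts, a letter repeated three times) and the name `heuristic_warnings` are assumptions, not the checks TechEngine actually runs. The record names in the usage note are example inputs.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List


def normalize(name: str) -> str:
    """Collapse whitespace and case so near-identical names compare equal."""
    return re.sub(r"\s+", " ", name).strip().casefold()


def heuristic_warnings(names: Iterable[str]) -> Dict[str, List[str]]:
    """Map each suspicious name to the reasons it was flagged (assumed rules)."""
    names = list(names)
    counts = Counter(normalize(name) for name in names)
    flagged: Dict[str, List[str]] = {}
    for name in names:
        reasons = []
        if counts[normalize(name)] > 1:
            reasons.append("possible duplicate")
        if "|" in name or re.search(r"\s{2,}", name):
            reasons.append("possible table or source artifact")
        if re.search(r"([A-Za-z])\1\1", name):
            reasons.append("typo-like repeated letter")
        if reasons:
            flagged[name] = reasons
    return flagged
```

For example, `heuristic_warnings(["Exynos | 2400", "Apple A17 Pro"])` flags only the first name, as a possible table or source artifact. Because the result is advisory, a reviewer can scan the flagged names without the PR failing.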
Validation Contract
Data PR validation should include:
| Check | Expected behavior |
| --- | --- |
| `python -m app.validate` | Hard fail on schema/API validation errors |
| `python integrity_check.py TechAPI/data --strict` | Hard fail on structural anomalies; summarize stable advisory outliers |
| Verified coverage warning | Warn when category or overall verified coverage is low; do not fail bulk import PRs |
| Heuristic review | Flag suspicious names, typo-like patterns, duplicates, or source artifacts |
| Site build | Run only when TechAPI site/ files changed |
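One way to read the contract is that only the two commands can fail a PR, while the site build is gated on changed paths. The sketch below shows that split. It assumes a nonzero exit code means failure and that changed paths are relative to the TechAPI repository root. The helper names (`run_command`, `failed_hard_checks`, `needs_site_build`) are hypothetical.

```python
import subprocess
from typing import Callable, Iterable, List, Sequence, Tuple

# Commands are taken from the contract; the labels are illustrative.
HARD_FAIL_CHECKS: List[Tuple[str, Sequence[str]]] = [
    ("schema/API validation", ["python", "-m", "app.validate"]),
    ("integrity check", ["python", "integrity_check.py", "TechAPI/data", "--strict"]),
]


def run_command(command: Sequence[str]) -> int:
    """Run a command and return its exit code without raising on failure."""
    return subprocess.run(list(command), check=False).returncode


def failed_hard_checks(
    runner: Callable[[Sequence[str]], int] = run_command,
) -> List[str]:
    """Return labels of hard-fail checks whose exit code was nonzero (assumed failure)."""
    return [name for name, command in HARD_FAIL_CHECKS if runner(command) != 0]


def needs_site_build(changed_files: Iterable[str]) -> bool:
    """Site build runs only when a changed path is under site/."""
    return any(path.startswith("site/") for path in changed_files)
```

Passing the runner as a parameter lets the decision logic be exercised without invoking the real commands. A workflow would treat a non-empty result from `failed_hard_checks()` as a failed run and post coverage and heuristic findings as comments only.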
Recent Linked Work
| PR / Issue | Repository | Status | Main change |
| --- | --- | --- | --- |
| TechAPI#25 | TechAPI | Open | Import 5,000 PhoneDB raw smartphone variants plus 45 Mobiles 2025 records |
| TechAPI#24 | TechAPI | Merged | Add smartphone and SoC records, improve PR metadata and project automation |
| TechAPI#23 | TechAPI | Merged | Import a larger smartphone batch |
| TechAPI#22 | TechAPI | Merged | Add smartphone and SoC records from Kaggle-derived sources |
| TechEngine#19 | TechEngine | Open | Auto-generated coverage gaps report |
| TechEngine#18 | TechEngine | Open | Deterministic dump timestamps for daily refresh readiness |
Remaining Work
- Keep coverage reports useful by reducing obvious table-artifact slugs and source noise
- Expand automated candidate generation beyond CPU/GPU into smartphone and SoC sources where structured data is reliable
- Continue improving PR comments so reviewers can see what changed without opening thousands of files
- Make low verified coverage warnings clear and category-specific
- Keep issue and project metadata synchronized for TechAPI and TechEngine PRs
- Preserve weekly coverage workflows unless a separate daily/PR workflow is explicitly added
- Avoid closing this tracker until the TechAPI dataset rebuild and supporting automation are mature
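Category-specific coverage warnings might be produced along these lines. This is a minimal sketch: the 10% threshold, the message wording, and the name `coverage_warnings` are all assumptions, since the tracker does not define what counts as "low".

```python
from typing import Dict, List, Tuple


def coverage_warnings(
    stats: Dict[str, Tuple[int, int]],
    threshold: float = 10.0,  # assumed cutoff; not specified by the tracker
) -> List[str]:
    """Return one advisory message per category whose verified coverage is low.

    `stats` maps a category to (verified, unverified) counts. The function
    only reports; it never raises, so bulk import PRs are not failed by it.
    """
    warnings = []
    for category, (verified, unverified) in stats.items():
        measurable = verified + unverified
        if measurable == 0:
            warnings.append(f"{category}: no verified or unverified records to measure")
            continue
        percent = 100 * verified / measurable
        if percent < threshold:
            warnings.append(
                f"{category}: verified coverage {percent:.1f}% is below {threshold:.1f}%"
            )
    return warnings
```

Naming the category and the measured value in each message lets a reviewer see which part of the dataset needs verification without opening the full stats comment.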
Operational Notes
- Assignees: @Seungpyo1007 and @TechEngineBot
- Labels: `enhancement`
- Milestone: Daily automation
- Project: TechEngine work
- Priority: High
- Start date: 2026-05-29
- Target date: 2026-09-30
TechEngineBot should comment on relevant PRs and update linked tracker issues whenever validation or metadata automation runs.