data(smartphone): import remaining PhoneDB tail records #30

Merged
Seungpyo1007 merged 10 commits into main from data/import-staging on Jun 19, 2026

Seungpyo1007 commented Jun 19, 2026

Member

Summary

  • import the remaining non-overlapping PhoneDB tail records into the reusable data/import-staging branch
  • add 6,148 raw smartphone records and 219 net-new SoC records after rebasing over the parallel PhoneDB tail PR and dropping duplicate SoC stubs
  • refresh the published static site/public/v1 dump so the homepage/API snapshot matches the data tree

Data source

  • PhoneDB: https://phonedb.net
  • Imported records remain verified: false for later TechEngine/manual audit.
  • The PhoneDB ID sweep reached the tail of the source range (last_id: 1). PhoneDB is effectively exhausted for this scraper; the 100k target needs additional sources/categories rather than duplicate fabrication.

Dataset size after dump

  • brands: 189
  • socs: 2,067
  • smartphones: 31,671
  • gpus: 2,030
  • cpus: 3,977
  • total: 39,934

Verification

  • python -m app.validate
  • python TechEngine\integrity_check.py data --strict
  • git diff --check origin/main...HEAD
  • cd site && npm.cmd run build

Closes #1

Seungpyo1007 added the enhancement (New feature or request) and data (Dataset changes) labels on Jun 19, 2026
TechEngineBot commented Jun 19, 2026

Member

TechEngine change review: PASS

| Check | Result |
| --- | --- |
| python -m app.validate | PASS |
| python integrity_check.py TechAPI/data --strict | PASS |

Changed data

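| Category | Added | Modified | Deleted | Added verified | Added unverified | Added Kaggle-sourced |
| --- | --- | --- | --- | --- | --- | --- |
| brand | 0 | 0 | 0 | 0 | 0 | 0 |
| soc | 219 | 0 | 0 | 0 | 219 | 0 |
| smartphone | 6148 | 0 | 0 | 0 | 6148 | 0 |
| gpu | 0 | 0 | 0 | 0 | 0 | 0 |
| cpu | 0 | 0 | 0 | 0 | 0 | 0 |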

Changed record examples

soc added

  • soc/apple/2012/apple-a5-apl0498-s5l8940.json - Apple A5 APL0498 S5L8940
  • soc/apple/2012/apple-samsung-intrinsity-apple-a4-apl0398-s5l8930.json - Samsung-Intrinsity Apple A4 APL0398 S5L8930
  • soc/apple/2014/apple-a5r2-apl2498.json - Apple A5R2 APL2498
  • soc/apple/2014/apple-a6-apl0598-s5l8950x.json - Apple A6 APL0598 S5L8950X
  • soc/apple/2014/apple-a7-apl0698-s5l8960x.json - Apple A7 APL0698 S5L8960X
  • soc/apple/2014/apple-a8-apl1011-t7000.json - Apple A8 APL1011 T7000
  • soc/apple/2015/apple-a9-apl0898-s8000.json - Apple A9 APL0898 S8000
  • soc/apple/2015/apple-a9-apl1022-s8003.json - Apple A9 APL1022 S8003
  • soc/arm/2004/arm-720t.json - ARM 720T
  • soc/arm/2008/powervr-mbx.json - PowerVR MBX
  • soc/arm/2009/centrality-atlas-iii.json - Centrality Atlas III
  • soc/arm/2009/sirf-atlas-iii-at640.json - SiRF Atlas-III AT640
  • soc/arm/2010/arm-1136j-s.json - ARM 1136J-S
  • soc/arm/2010/sirf-atlasiv.json - SiRF atlasIV
  • soc/arm/2010/st-ericsson-t6719.json - ST-Ericsson T6719
  • ... 204 more

smartphone added

  • smartphone/acer/2009/acer-betouch-e100-acer-c1.json - Acer beTouch E100 (Acer C1)
  • smartphone/acer/2009/acer-betouch-e100-b-acer-c1.json - Acer beTouch E100 B (Acer C1)
  • smartphone/acer/2009/acer-betouch-e101-acer-e1.json - Acer beTouch E101 (Acer E1)
  • smartphone/acer/2009/acer-betouch-e200-acer-l1.json - Acer beTouch E200 (Acer L1)
  • smartphone/acer/2009/acer-betouch-e200-b-acer-l1.json - Acer beTouch E200 B (Acer L1)
  • smartphone/acer/2009/acer-dx650.json - Acer DX650
  • smartphone/acer/2009/acer-liquid-s100-acer-a1.json - Acer Liquid S100 (Acer A1)
  • smartphone/acer/2009/acer-m900-tempo-m900.json - Acer M900 / Tempo M900
  • smartphone/acer/2009/acer-neotouch-s200-acer-f1.json - Acer neoTouch S200 (Acer F1)
  • smartphone/acer/2009/acer-neotouch-s200-b-acer-f1.json - Acer neoTouch S200 B (Acer F1)
  • smartphone/acer/2009/acer-tempo-dx900.json - Acer Tempo DX900
  • smartphone/acer/2009/acer-tempo-f900.json - Acer Tempo F900
  • smartphone/acer/2009/acer-tempo-x960.json - Acer Tempo X960
  • smartphone/acer/2010/acer-betouch-e110-b.json - Acer beTouch E110 B
  • smartphone/acer/2010/acer-betouch-e110.json - Acer beTouch E110
  • ... 6133 more
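
The record paths listed above all appear to follow a `category/brand/year/slug.json` layout. As a minimal sketch, assuming that layout holds for the whole data tree, a reviewer could group a list of changed paths by category or brand as shown here. The names `RecordPath`, `parse_record_path`, and `count_by` are hypothetical helpers for illustration, not part of TechEngine.

```python
from collections import Counter
from pathlib import PurePosixPath
from typing import NamedTuple


class RecordPath(NamedTuple):
    """Parts of a record path, assuming 'category/brand/year/slug.json'."""
    category: str
    brand: str
    year: int
    slug: str


def parse_record_path(path: str) -> RecordPath:
    """Split a record path into its four parts, or raise ValueError."""
    parts = PurePosixPath(path).parts
    if len(parts) != 4 or not parts[3].endswith(".json"):
        raise ValueError(f"unexpected record path: {path!r}")
    category, brand, year, filename = parts
    return RecordPath(category, brand, int(year), filename[: -len(".json")])


def count_by(paths, field):
    """Count paths by one RecordPath field, e.g. 'category' or 'brand'."""
    return Counter(getattr(parse_record_path(p), field) for p in paths)


# A few paths copied from the changed record examples above.
paths = [
    "soc/apple/2012/apple-a5-apl0498-s5l8940.json",
    "soc/arm/2004/arm-720t.json",
    "smartphone/acer/2009/acer-dx650.json",
    "smartphone/acer/2010/acer-betouch-e110.json",
]
print(count_by(paths, "category"))
print(count_by(paths, "brand"))
```

Running the same grouping over the full list of added paths would be one way to cross-check the per-category and per-brand counts reported by the bot.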

Heuristic review

  • Added records by manufacturer/brand: samsung: 1142, lg: 636, huawei: 397, htc: 331, zte: 324, motorola: 296, alcatel: 256, sony: 221

  • Added records by source class: other: 6367

  • Heuristic warnings: 14 total; showing first 14.

    • smartphone: smartphone/coolpad/2014/coolpad-ivvi-k1-k1-nt-dual-sim-td-lte.json: repeated adjacent word in name
    • smartphone: smartphone/coolpad/2016/coolpad-ivvi-i3-i3-01-dual-sim-lte-64gb.json: repeated adjacent word in name
    • smartphone: smartphone/huawei/2013/huawei-ascend-d2-d2-0082.json: repeated adjacent word in name
    • smartphone: smartphone/huawei/2013/huawei-ascend-d2-d2-2010-cdma.json: repeated adjacent word in name
    • smartphone: smartphone/huawei/2013/huawei-ascend-d2-d2-5000-td.json: repeated adjacent word in name
    • smartphone: smartphone/huawei/2013/huawei-ascend-d2-d2-6070-td-lte.json: repeated adjacent word in name
    • smartphone: smartphone/huawei/2013/huawei-ascend-d2-d2-6114-hw-03e-huawei-u9701l.json: repeated adjacent word in name
    • smartphone: smartphone/huawei/2014/huawei-ascend-g6-g6-l11-4g-lte-a.json: repeated adjacent word in name
    • smartphone: smartphone/huawei/2014/huawei-ascend-g6-g6-l22-4g-lte-a.json: repeated adjacent word in name
    • smartphone: smartphone/huawei/2014/huawei-ascend-g6-g6-l33-4g-lte-a.json: repeated adjacent word in name
    • smartphone: smartphone/huawei/2014/huawei-ascend-g6-g6-t00-td.json: repeated adjacent word in name
    • smartphone: smartphone/huawei/2014/huawei-ascend-g6-g6-u00.json: repeated adjacent word in name
    • smartphone: smartphone/motorola/2011/motorola-milestone-3-xt860-xt860-4g.json: repeated adjacent word in name
    • smartphone: smartphone/nokia/2009/nokia-x6-x6-00-32gb-nokia-alvin.json: repeated adjacent word in name
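
The "repeated adjacent word in name" warnings above can be approximated from the file slugs alone. The sketch below is not the bot's actual check, which is not shown in this PR and presumably runs on the record's name field; it assumes that splitting the slug on hyphens gives a reasonable stand-in for the name's words. `repeated_adjacent_tokens` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def repeated_adjacent_tokens(path: str) -> list[str]:
    """Return tokens that appear twice in a row in the slug of a record path.

    Assumption: the slug (file name without '.json') split on '-' mirrors
    the words of the record name closely enough for this heuristic.
    """
    slug = PurePosixPath(path).stem
    tokens = slug.split("-")
    return [a for a, b in zip(tokens, tokens[1:]) if a == b]


# Paths copied from the warnings and the changed record examples.
flagged = "smartphone/huawei/2013/huawei-ascend-d2-d2-0082.json"
clean = "smartphone/acer/2009/acer-betouch-e100-acer-c1.json"

print(repeated_adjacent_tokens(flagged))
print(repeated_adjacent_tokens(clean))
```

Most of the 14 warnings look like a model family followed by a variant code that starts with the same token (for example `d2` then `d2-0082`), so they are likely naming conventions from the source rather than data errors.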

TechEngineBot commented Jun 19, 2026

Member

TechEngine validation stats: PASS

Data summary

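| Category | Total | Verified | Unverified | Missing verified | Tracked | Verified % of tracked |
| --- | --- | --- | --- | --- | --- | --- |
| brand | 189 | 0 | 60 | 129 | 60 | 0.0% |
| soc | 2067 | 58 | 2009 | 0 | 2067 | 2.8% |
| smartphone | 31671 | 184 | 31487 | 0 | 31671 | 0.6% |
| gpu | 2030 | 0 | 2030 | 0 | 2030 | 0.0% |
| cpu | 3977 | 976 | 3001 | 0 | 3977 | 24.5% |
| all | 39934 | 1218 | 38587 | 129 | 39805 | 3.1% |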

Warning

Tracked verified coverage is below 50% for brand 0.0% (0/60), gpu 0.0% (0/2030), smartphone 0.6% (184/31671), soc 2.8% (58/2067), all 3.1% (1218/39805), cpu 24.5% (976/3977).
Tracked coverage excludes records missing the verified field; see the Missing verified column for those records.
This does not fail validation. Keep imported records verified: false until manual audit, but treat this as follow-up verification work before relying on the affected categories as curated data.
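
The coverage figures in the warning follow from the data summary: tracked records are the total minus those missing the verified field, and the percentage is verified divided by tracked. A minimal sketch of that arithmetic, using the numbers from the summary (`tracked_coverage` is a hypothetical helper, and rounding to one decimal place is assumed from the reported values):

```python
def tracked_coverage(total: int, verified: int, missing_verified: int):
    """Return (tracked, verified % of tracked rounded to one decimal)."""
    tracked = total - missing_verified
    pct = 100.0 * verified / tracked if tracked else 0.0
    return tracked, round(pct, 1)


# (total, verified, missing verified) per category, from the data summary.
summary = {
    "brand": (189, 0, 129),
    "soc": (2067, 58, 0),
    "smartphone": (31671, 184, 0),
    "gpu": (2030, 0, 0),
    "cpu": (3977, 976, 0),
    "all": (39934, 1218, 129),
}

for category, (total, verified, missing) in summary.items():
    tracked, pct = tracked_coverage(total, verified, missing)
    print(f"{category}: {verified}/{tracked} = {pct}%")
```

Note that the 129 brand records without a verified field shrink the denominator, so the "all" figure is computed over 39,805 tracked records, not 39,934.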

Validation notes

  • Full advisory outlier listings are suppressed on successful runs because they are dataset-wide and mostly stable between PRs.
  • Failure runs still include a detailed log excerpt for debugging.

Key output:

## app.validate
## integrity_check.py --strict
loaded CPU=3977 GPU=2030
✅ integrity gate: no hard anomalies.
| Integrity section | Flagged lines |
| --- | --- |
| structural | 0 |
| CPU name/tier consistency (desktop mainstream only) | 0 |
| CPU single>multi (cinebench/geekbench; should be multi>=single) | 0 |
| CPU era-vs-score outliers | 8 |
| CPU cross-source ratio outliers (possible wrong-variant) | 152 |
| GPU cross-source ratio outliers + sanity | 18 |
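
For illustration, the "CPU single>multi" section describes a simple invariant: a multi-core benchmark score should be at least the single-core score. The sketch below shows one way such a check could look. It is not taken from integrity_check.py; the record shape and the `name`, `single`, and `multi` keys are hypothetical.

```python
def single_exceeds_multi(records):
    """Return names of records whose single-core score exceeds the multi-core score.

    Records missing either score are skipped rather than flagged.
    The 'name', 'single', and 'multi' keys are assumed for this sketch.
    """
    flagged = []
    for record in records:
        single = record.get("single")
        multi = record.get("multi")
        if single is None or multi is None:
            continue
        if single > multi:
            flagged.append(record["name"])
    return flagged


# Made-up sample data, only to exercise the check.
samples = [
    {"name": "cpu-a", "single": 1200, "multi": 5400},
    {"name": "cpu-b", "single": 900, "multi": 850},   # violates multi >= single
    {"name": "cpu-c", "single": 700, "multi": None},  # skipped: no multi score
]
print(single_exceeds_multi(samples))
```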

Seungpyo1007 force-pushed the data/import-staging branch from 1b98089 to 9be9248 on June 19, 2026 10:31
Seungpyo1007 merged commit db95032 into main on Jun 19, 2026
4 checks passed
github-project-automation (bot) moved this from In Progress to Done in TechAPI-Project on Jun 19, 2026
Labels

data (Dataset changes), enhancement (New feature or request)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Massive dataset rebuild: CPU + brand + GPU + smartphone + SoC (1989-2026)

2 participants