data(smartphone): import PhoneDB raw variants (batch 2-3) + normalize names#28

Merged
Seungpyo1007 merged 20 commits into main from data/import-staging
Jun 19, 2026

@Seungpyo1007 Seungpyo1007 commented Jun 19, 2026

Member

Summary

Continues the PhoneDB variant-level import (batch 1 landed in #27). Adds ~5,450 new variant-level smartphones + 86 SoC seeds, normalizes 7,168 device names (fixes the #27 double-space heuristic warnings), and refreshes the public dump. All raw seed (verified: false) with per-record source_urls to phonedb.net.

Changes vs main

  • +5,450 smartphones (PhoneDB device ids ~25,227→16,228; years skew 2020–2023).
  • +86 SoC seeds (auto-detected chipsets, verified: false).
  • 2,596 existing (batch-1) smartphone names normalized (whitespace collapse).
  • site/public/v1 dump refreshed as the final commit → now 24,655 smartphones / 1,773 SoC (was stale at 19,205 / 1,560).
  • Distinct per-batch commit messages (brand/year); Refs #1 in each commit body.
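
The whitespace collapse mentioned above can be sketched as follows. This is a minimal illustration, not the import script itself: the function name `normalize_name` and the sample input are assumptions, and the real normalizer may handle more cases.

```python
import re

# Any run of whitespace (spaces, tabs, newlines) counts as one separator.
_WS_RUN = re.compile(r"\s+")


def normalize_name(name: str) -> str:
    """Collapse each whitespace run to a single space and trim both ends."""
    return _WS_RUN.sub(" ", name).strip()


# Hypothetical double-space input of the kind flagged on #27.
raw = "Alcatel U5  HD LTE EMEA 8GB 5047Y (TCL 5047)"
print(normalize_name(raw))
```

The function is idempotent, so re-running it over already-normalized batch-1 names leaves them unchanged.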

Validation (local)

  • python -m app.validate → ✅ passed
  • python TechEngine/integrity_check.py data --strict → ✅ no hard anomalies
  • Dump generated from an isolated git-archive snapshot + fresh DB (so a concurrent session's in-progress writes to the shared data/ dir could not contaminate it).
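
One way to take such an isolated snapshot is to export a committed tree with `git archive` and unpack it into a scratch directory, so nothing is read from the shared working tree. The sketch below is an assumption about how this could be done, not the command actually used for this PR; `archive_command`, `snapshot`, and the `data` default are hypothetical names.

```python
import io
import subprocess
import tarfile
from pathlib import Path


def archive_command(ref: str, subdir: str = "data") -> list[str]:
    """Build the `git archive` call that exports one directory of a commit."""
    return ["git", "archive", "--format=tar", ref, subdir]


def snapshot(ref: str, dest: Path, subdir: str = "data") -> Path:
    """Unpack `subdir` as committed at `ref` into `dest`.

    Only committed content is exported, so uncommitted writes in the
    working tree never reach the snapshot.
    """
    dest.mkdir(parents=True, exist_ok=True)
    result = subprocess.run(
        archive_command(ref, subdir), check=True, capture_output=True
    )
    with tarfile.open(fileobj=io.BytesIO(result.stdout)) as tar:
        tar.extractall(dest)
    return dest / subdir
```

A dump built from `snapshot("HEAD", Path(tempfile.mkdtemp()))` against a fresh database would then depend only on the commit, not on the state of the shared data/ directory.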

Closes #1

@Seungpyo1007

Member Author

Validation review — PASS ✅

Commands run (local):

  • python -m app.validate → ✅ passed
  • python TechEngine/integrity_check.py data --strict → ✅ no hard anomalies

Data changes vs main:

| category | added | modified |
| --- | --- | --- |
| smartphone | 5,450 | 2,596 (name normalize) |
| soc | 86 | |

Example record (data/smartphone/samsung/2026/samsung-sm-s9480-galaxy-s26-ultra-5g-...json):

{ "slug": "samsung-sm-s9480-galaxy-s26-ultra-5g-dual-sim-td-lte-cn-hk-tw-512gb-samsung-miracle-3",
  "name": "Samsung SM-S9480 Galaxy S26 Ultra 5G Dual SIM TD-LTE CN HK TW 512GB (Samsung Miracle 3)",
  "brand": "samsung", "soc": "qualcomm-snapdragon-8-elite-gen-5-sm8850-1-ad-for-galaxy",
  "release_date": "2026-03-11", "ram_gb": 12, "battery_mah": 5000, "os": "Android",
  "verified": false, "source_urls": ["https://phonedb.net/index.php?m=device&id=25688"] }
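
The raw-seed conventions visible in this record (slug derived from the name, `verified: false`, a phonedb.net source URL) can be checked with a short sketch. The slug rule here is inferred from the examples in this PR, not taken from the project's code, and `slugify` and `check_raw_seed` are hypothetical helper names.

```python
import re
from urllib.parse import urlparse


def slugify(name: str) -> str:
    """Lowercase, drop punctuation, and join words with single hyphens.

    Inferred rule: characters other than letters, digits, whitespace,
    and hyphens are removed, so "15,4" becomes "154".
    """
    kept = re.sub(r"[^a-z0-9\s-]", "", name.lower())
    return re.sub(r"[\s-]+", "-", kept).strip("-")


def check_raw_seed(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record looks fine."""
    problems = []
    if record.get("slug") != slugify(record.get("name", "")):
        problems.append("slug does not match name")
    if record.get("verified") is not False:
        problems.append("raw seed must have verified: false")
    urls = record.get("source_urls") or []
    if not urls or any(urlparse(u).hostname != "phonedb.net" for u in urls):
        problems.append("source_urls must point at phonedb.net")
    return problems


record = {
    "slug": "samsung-sm-s9480-galaxy-s26-ultra-5g-dual-sim-td-lte-cn-hk-tw-512gb-samsung-miracle-3",
    "name": "Samsung SM-S9480 Galaxy S26 Ultra 5G Dual SIM TD-LTE CN HK TW 512GB (Samsung Miracle 3)",
    "verified": False,
    "source_urls": ["https://phonedb.net/index.php?m=device&id=25688"],
}
print(check_raw_seed(record))  # → []
```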

Heuristic warnings: the double-space names flagged on #27 are fixed (7,168 normalized). Remaining integrity-gate ratio outliers are pre-existing GPU/CPU benchmark notes, unrelated to this PR.

Site build: N/A (no site/ changes).

@Seungpyo1007

Member Author

Dataset stats (post-merge projection)

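| category | total | verified | unverified | verified % |
| --- | --- | --- | --- | --- |
| smartphone | 24,655 | 184 | 24,471 | 0.7% |
| soc | 1,773 | 58 | 1,715 | 3.3% |
| brand | 189 | | | |
| cpu | 3,977 | | | |
| gpu | 2,030 | | | |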

Warning

Verified coverage is far below 50% (smartphone 0.7%, soc 3.3%) — expected for bulk raw seed imports. Verification is deferred to TechEngine / manual audit per the import workflow.

TechEngineBot commented Jun 19, 2026

Member

TechEngine change review: PASS

| Check | Result |
| --- | --- |
| python -m app.validate | PASS |
| python integrity_check.py TechAPI/data --strict | PASS |

Changed data

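| Category | Added | Modified | Deleted | Added verified | Added unverified | Added Kaggle-sourced |
| --- | --- | --- | --- | --- | --- | --- |
| brand | 0 | 0 | 0 | 0 | 0 | 0 |
| soc | 86 | 0 | 0 | 0 | 86 | 0 |
| smartphone | 5450 | 2596 | 0 | 0 | 5450 | 0 |
| gpu | 0 | 0 | 0 | 0 | 0 | 0 |
| cpu | 0 | 0 | 0 | 0 | 0 | 0 |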

Changed record examples

soc added

  • soc/apple/2016/apple-a10-fusion-apl1w24-t8010.json - Apple A10 Fusion APL1W24 T8010
  • soc/apple/2019/apple-a11-bionic-apl1w72-t8015.json - Apple A11 Bionic APL1W72 T8015
  • soc/apple/2020/apple-a13-bionic-apl1w85-t8030.json - Apple A13 Bionic APL1W85 T8030
  • soc/apple/2020/apple-a14-bionic-apl1w01-t8101.json - Apple A14 Bionic APL1W01 T8101
  • soc/apple/2022/apple-a15-bionic-apl1w07-t8110.json - Apple A15 Bionic APL1W07 T8110
  • soc/apple/2022/apple-a15-bionic-lite-apl1w07-t8110.json - Apple A15 Bionic Lite APL1W07 T8110
  • soc/arm/2020/powervr-ge8320.json - PowerVR GE8320
  • soc/arm/2022/jlq-jr510.json - JLQ JR510
  • soc/hisilicon/2020/hisilicon-honor-kirin810-hi6280.json - HiSilicon Honor KIRIN810 Hi6280
  • soc/hisilicon/2020/hisilicon-honor-kirin820-5g.json - HiSilicon Honor KIRIN820 5G
  • soc/hisilicon/2020/hisilicon-honor-kirin9000e-5g.json - HiSilicon Honor KIRIN9000E 5G
  • soc/hisilicon/2021/hisilicon-honor-kirin820e-5g.json - HiSilicon Honor KIRIN820E 5G
  • soc/hisilicon/2021/hisilicon-honor-kirin9000-4g.json - HiSilicon Honor KIRIN9000 4G
  • soc/hisilicon/2021/hisilicon-honor-kirin9000-5g.json - HiSilicon Honor KIRIN9000 5G
  • soc/hisilicon/2021/hisilicon-honor-kirin985-5g.json - HiSilicon Honor KIRIN985 5G
  • ... 71 more

smartphone added

  • smartphone/acer/2022/acer-sospiro-a60-latam.json - Acer Sospiro A60 LATAM
  • smartphone/alcatel/2016/alcatel-one-touch-pop-4-global-dual-sim-lte-5056d-pop-4-plus-tcl-5056.json - Alcatel One Touch Pop 4+ Global Dual SIM LTE 5056D / Pop 4 Plus (TCL 5056)
  • smartphone/alcatel/2016/alcatel-one-touch-pop-4-lte-latam-5056a-pop-4-plus-tcl-5056.json - Alcatel One Touch Pop 4+ LTE LATAM 5056A / Pop 4 Plus (TCL 5056)
  • smartphone/alcatel/2016/alcatel-one-touch-pop-4-plus-dual-sim-lte-am-5056e-pop-4-tcl-5056.json - Alcatel One Touch Pop 4 Plus Dual SIM LTE AM 5056E / Pop 4+ (TCL 5056)
  • smartphone/alcatel/2016/alcatel-one-touch-pop-4-plus-lte-am-5056g-pop-4-tcl-5056.json - Alcatel One Touch Pop 4 Plus LTE AM 5056G / Pop 4+ (TCL 5056)
  • smartphone/alcatel/2017/alcatel-u5-3g-dual-sim-emea-8gb-4047d-tcl-4047.json - Alcatel U5 3G Dual SIM EMEA 8GB 4047D (TCL 4047)
  • smartphone/alcatel/2017/alcatel-u5-3g-emea-8gb-4047x-tcl-4047.json - Alcatel U5 3G EMEA 8GB 4047X (TCL 4047)
  • smartphone/alcatel/2017/alcatel-u5-dual-sim-td-lte-apac-8gb-5044i-tcl-5044.json - Alcatel U5 Dual SIM TD-LTE APAC 8GB 5044I (TCL 5044)
  • smartphone/alcatel/2017/alcatel-u5-hd-lte-emea-8gb-5047y-tcl-5047.json - Alcatel U5 HD LTE EMEA 8GB 5047Y (TCL 5047)
  • smartphone/alcatel/2017/alcatel-u5-td-lte-apac-8gb-5044t-optus-x-spirit-tcl-5044.json - Alcatel U5 TD-LTE APAC 8GB 5044T / Optus X Spirit (TCL 5044)
  • smartphone/alcatel/2018/alcatel-1-dual-sim-lte-emea-16gb-5033f-tcl-u3a.json - Alcatel 1 Dual SIM LTE EMEA 16GB 5033F (TCL U3A)
  • smartphone/alcatel/2018/alcatel-u5-hd-dual-sim-td-lte-apac-16gb-5047i-tcl-5047.json - Alcatel U5 HD Dual SIM TD-LTE APAC 16GB 5047I (TCL 5047)
  • smartphone/alcatel/2018/alcatel-u5-hd-premium-dual-sim-lte-emea-16gb-5047u-tcl-5047.json - Alcatel U5 HD Premium Dual SIM LTE EMEA 16GB 5047U (TCL 5047)
  • smartphone/alcatel/2019/alcatel-1s-2019-dual-sim-lte-latam-5024j-tcl-5024.json - Alcatel 1S 2019 Dual SIM LTE LATAM 5024J (TCL 5024)
  • smartphone/alcatel/2019/alcatel-1s-2019-lte-latam-5024a-tcl-5024.json - Alcatel 1S 2019 LTE LATAM 5024A (TCL 5024)
  • ... 5435 more

smartphone modified

  • smartphone/apple/2023/apple-iphone-15-5g-a3089-dual-sim-td-lte-jp-ca-mx-sa-128gb-apple-iphone-154.json - Apple iPhone 15 5G A3089 Dual SIM TD-LTE JP CA MX SA 128GB (Apple iPhone 15,4)
  • smartphone/apple/2023/apple-iphone-15-5g-a3089-dual-sim-td-lte-jp-ca-mx-sa-256gb-apple-iphone-154.json - Apple iPhone 15 5G A3089 Dual SIM TD-LTE JP CA MX SA 256GB (Apple iPhone 15,4)
  • smartphone/apple/2023/apple-iphone-15-5g-a3089-dual-sim-td-lte-jp-ca-mx-sa-512gb-apple-iphone-154.json - Apple iPhone 15 5G A3089 Dual SIM TD-LTE JP CA MX SA 512GB (Apple iPhone 15,4)
  • smartphone/apple/2023/apple-iphone-15-5g-a3090-global-dual-sim-td-lte-128gb-apple-iphone-154.json - Apple iPhone 15 5G A3090 Global Dual SIM TD-LTE 128GB (Apple iPhone 15,4)
  • smartphone/apple/2023/apple-iphone-15-5g-a3090-global-dual-sim-td-lte-256gb-apple-iphone-154.json - Apple iPhone 15 5G A3090 Global Dual SIM TD-LTE 256GB (Apple iPhone 15,4)
  • smartphone/apple/2023/apple-iphone-15-5g-a3090-global-dual-sim-td-lte-512gb-apple-iphone-154.json - Apple iPhone 15 5G A3090 Global Dual SIM TD-LTE 512GB (Apple iPhone 15,4)
  • smartphone/apple/2023/apple-iphone-15-5g-a3092-dual-sim-td-lte-cn-hk-128gb-apple-iphone-154.json - Apple iPhone 15 5G A3092 Dual SIM TD-LTE CN HK 128GB (Apple iPhone 15,4)
  • smartphone/apple/2023/apple-iphone-15-5g-a3092-dual-sim-td-lte-cn-hk-256gb-apple-iphone-154.json - Apple iPhone 15 5G A3092 Dual SIM TD-LTE CN HK 256GB (Apple iPhone 15,4)
  • smartphone/apple/2023/apple-iphone-15-5g-a3092-dual-sim-td-lte-cn-hk-512gb-apple-iphone-154.json - Apple iPhone 15 5G A3092 Dual SIM TD-LTE CN HK 512GB (Apple iPhone 15,4)
  • smartphone/apple/2023/apple-iphone-15-5g-uw-a2846-dual-sim-td-lte-128gb-apple-iphone-154.json - Apple iPhone 15 5G UW A2846 Dual SIM TD-LTE 128GB (Apple iPhone 15,4)
  • smartphone/apple/2023/apple-iphone-15-5g-uw-a2846-dual-sim-td-lte-256gb-apple-iphone-154.json - Apple iPhone 15 5G UW A2846 Dual SIM TD-LTE 256GB (Apple iPhone 15,4)
  • smartphone/apple/2023/apple-iphone-15-5g-uw-a2846-dual-sim-td-lte-512gb-apple-iphone-154.json - Apple iPhone 15 5G UW A2846 Dual SIM TD-LTE 512GB (Apple iPhone 15,4)
  • smartphone/apple/2023/apple-iphone-15-plus-5g-a3093-dual-sim-td-lte-jp-ca-mx-sa-128gb-apple-iphone-155.json - Apple iPhone 15 Plus 5G A3093 Dual SIM TD-LTE JP CA MX SA 128GB (Apple iPhone 15,5)
  • smartphone/apple/2023/apple-iphone-15-plus-5g-a3093-dual-sim-td-lte-jp-ca-mx-sa-256gb-apple-iphone-155.json - Apple iPhone 15 Plus 5G A3093 Dual SIM TD-LTE JP CA MX SA 256GB (Apple iPhone 15,5)
  • smartphone/apple/2023/apple-iphone-15-plus-5g-a3093-dual-sim-td-lte-jp-ca-mx-sa-512gb-apple-iphone-155.json - Apple iPhone 15 Plus 5G A3093 Dual SIM TD-LTE JP CA MX SA 512GB (Apple iPhone 15,5)
  • ... 2581 more

Heuristic review

  • Added records by manufacturer/brand: samsung: 1114, oppo: 670, xiaomi: 662, vivo: 513, huawei: 467, motorola: 377, apple: 251, zte: 243

  • Added records by source class: other: 5536

  • Heuristic warnings: 1 total; showing first 1.

    • smartphone: smartphone/fujitsu/2022/fujitsu-raku-raku-easy-smartphone-td-lte-jp-f-52b.json: repeated adjacent word in name
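
A check of the "repeated adjacent word" kind could look roughly like the sketch below. The tokenization rule is an assumption rather than TechEngine's actual heuristic, and because the flagged record's name is not shown here, the example runs on the slug stem from the path instead.

```python
import re


def repeated_adjacent_words(text: str) -> list[str]:
    """Return each word that is immediately followed by an identical word.

    Words are runs of ASCII letters and digits, compared case-insensitively.
    """
    tokens = [t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t]
    return [a for a, b in zip(tokens, tokens[1:]) if a == b]


# Slug stem taken from the flagged file path.
print(repeated_adjacent_words("fujitsu-raku-raku-easy-smartphone-td-lte-jp-f-52b"))
```

A hit is only advisory: a doubled word may be a legitimate part of a product name, which is one reason to keep this as a warning instead of a hard failure.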

TechEngineBot commented Jun 19, 2026

Member

TechEngine validation stats: PASS

Data summary

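| Category | Total | Verified | Unverified | Missing verified | Tracked | Verified % of tracked |
| --- | --- | --- | --- | --- | --- | --- |
| brand | 189 | 0 | 60 | 129 | 60 | 0.0% |
| soc | 1773 | 58 | 1715 | 0 | 1773 | 3.3% |
| smartphone | 24655 | 184 | 24471 | 0 | 24655 | 0.7% |
| gpu | 2030 | 0 | 2030 | 0 | 2030 | 0.0% |
| cpu | 3977 | 976 | 3001 | 0 | 3977 | 24.5% |
| all | 32624 | 1218 | 31277 | 129 | 32495 | 3.7% |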

Warning

Tracked verified coverage is below 50% for brand 0.0% (0/60), gpu 0.0% (0/2030), smartphone 0.7% (184/24655), soc 3.3% (58/1773), all 3.7% (1218/32495), cpu 24.5% (976/3977).
Tracked coverage excludes records missing the verified field; see the Missing verified column for those records.
This does not fail validation. Keep imported records verified: false until manual audit, but treat this as follow-up verification work before relying on the affected categories as curated data.
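
The tracked-coverage figures in this warning can be reproduced from the Data summary counts: tracked is verified plus unverified, and records missing the verified field are left out. The sketch below is illustrative only; the function names and the way the threshold is applied are assumptions, not the bot's code.

```python
# (verified, unverified) per category, copied from the Data summary table.
COUNTS = {
    "brand": (0, 60),
    "soc": (58, 1715),
    "smartphone": (184, 24471),
    "gpu": (0, 2030),
    "cpu": (976, 3001),
}
THRESHOLD = 0.50  # warning level quoted in the comment


def tracked_coverage(verified: int, unverified: int) -> float:
    """Share of tracked records (verified + unverified) that are verified."""
    tracked = verified + unverified
    return verified / tracked if tracked else 0.0


def below_threshold(counts: dict, threshold: float = THRESHOLD) -> list[tuple]:
    """List (name, coverage, verified, tracked) rows under the threshold."""
    totals = dict(counts)
    totals["all"] = (
        sum(v for v, _ in counts.values()),
        sum(u for _, u in counts.values()),
    )
    rows = [
        (name, tracked_coverage(v, u), v, v + u)
        for name, (v, u) in totals.items()
        if tracked_coverage(v, u) < threshold
    ]
    # Lowest coverage first; ties broken alphabetically.
    return sorted(rows, key=lambda row: (row[1], row[0]))


for name, cov, verified, tracked in below_threshold(COUNTS):
    print(f"{name} {cov:.1%} ({verified}/{tracked})")
```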

Validation notes

  • Full advisory outlier listings are suppressed on successful runs because they are dataset-wide and mostly stable between PRs.
  • Failure runs still include a detailed log excerpt for debugging.

Key output:

## app.validate
## integrity_check.py --strict
loaded CPU=3977 GPU=2030
✅ integrity gate: no hard anomalies.
| Integrity section | Flagged lines |
| --- | --- |
| structural | 0 |
| CPU name/tier consistency (desktop mainstream only) | 0 |
| CPU single>multi (cinebench/geekbench — should be multi>=single) | 0 |
| CPU era-vs-score outliers | 8 |
| CPU cross-source ratio outliers (possible wrong-variant) | 152 |
| GPU cross-source ratio outliers + sanity | 18 |

@Seungpyo1007 Seungpyo1007 merged commit a427fd6 into main Jun 19, 2026
4 checks passed
@github-project-automation github-project-automation Bot moved this from In Progress to Done in TechAPI-Project Jun 19, 2026

Labels

data (Dataset changes), enhancement (New feature or request)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Massive dataset rebuild: CPU + brand + GPU + smartphone + SoC (1989-2026)

2 participants