
data(mobile): import GSMArena Kaggle device records #34

Merged
Seungpyo1007 merged 4 commits into main from data/import-staging on Jun 20, 2026


@Seungpyo1007

Member

Summary

  • import GSMArena Kaggle device records from arwinneil/gsmarena-phone-dataset
  • add 2,225 smartphone records in the variant folder layout
  • add 302 tablet records and 2 watch records in the variant folder layout
  • refresh the published site/public/v1 dump for smartphones, tablets, and watches

Source

Data policy

  • all imported bulk records are verified: false
  • records keep source_urls and variant.source_dataset for later audit
  • smartphone imports only use mapped SoCs so existing API validation remains strict
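The policy above could be checked mechanically. A minimal sketch, assuming each record is a JSON object with a top-level `verified` field, a `source_urls` list, and a nested `variant.source_dataset` string. The helper name `policy_violations` is hypothetical and is not part of `app.validate`.

```python
from typing import Any


def policy_violations(record: dict[str, Any]) -> list[str]:
    """Return the data-policy rules a bulk-imported record breaks (hypothetical helper)."""
    problems = []
    # Bulk imports must stay unverified until a manual audit.
    if record.get("verified") is not False:
        problems.append("verified must be false")
    # Provenance: at least one source URL is expected (assumed to be a list).
    if not record.get("source_urls"):
        problems.append("source_urls is missing or empty")
    # Provenance: the dataset name is assumed to live under variant.source_dataset.
    variant = record.get("variant") or {}
    if not variant.get("source_dataset"):
        problems.append("variant.source_dataset is missing")
    return problems


# Illustrative record; the URL is a placeholder, not a real source entry.
example = {
    "verified": False,
    "source_urls": ["https://example.com/device"],
    "variant": {"source_dataset": "arwinneil/gsmarena-phone-dataset"},
}
print(policy_violations(example))  # []
```

An empty list means the record satisfies all three assumed rules; each string names one rule that failed.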

Verification

  • python -m app.validate PASS
  • python TechEngine\integrity_check.py data --strict PASS
  • cd site && npm.cmd run build PASS (2 page(s) built in 123.02s)
  • git diff --check origin/main...HEAD PASS

Closes #1

@TechEngineBot

TechEngineBot commented Jun 20, 2026

Member

TechEngine change review: PASS

| Check | Result |
| --- | --- |
| python -m app.validate | PASS |
| python integrity_check.py TechAPI/data --strict | PASS |

Changed data

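| Category | Added | Modified | Deleted | Added verified | Added unverified | Added Kaggle-sourced |
| --- | --- | --- | --- | --- | --- | --- |
| brand | 0 | 0 | 0 | 0 | 0 | 0 |
| soc | 0 | 0 | 0 | 0 | 0 | 0 |
| smartphone | 2225 | 0 | 0 | 0 | 2225 | 2225 |
| tablet | 302 | 0 | 0 | 0 | 302 | 302 |
| watch | 2 | 0 | 0 | 0 | 2 | 2 |
| pda | 0 | 0 | 0 | 0 | 0 | 0 |
| gpu | 0 | 0 | 0 | 0 | 0 | 0 |
| cpu | 0 | 0 | 0 | 0 | 0 | 0 |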

Changed record examples

smartphone added

  • smartphone/acer/2009/betouch-e100/acer-betouch-e100.json - Acer beTouch E100
  • smartphone/acer/2009/betouch-e101/acer-betouch-e101.json - Acer beTouch E101
  • smartphone/acer/2009/betouch-e200/acer-betouch-e200.json - Acer beTouch E200
  • smartphone/acer/2009/liquid/acer-liquid.json - Acer Liquid
  • smartphone/acer/2009/neotouch/acer-neotouch.json - Acer neoTouch
  • smartphone/acer/2010/betouch-e130/acer-betouch-e130.json - Acer beTouch E130
  • smartphone/acer/2010/liquid-e/acer-liquid-e.json - Acer Liquid E
  • smartphone/acer/2010/liquid-mt/acer-liquid-mt.json - Acer Liquid mt
  • smartphone/acer/2010/neotouch-p300/acer-neotouch-p300.json - Acer neoTouch P300
  • smartphone/acer/2010/stream/acer-stream.json - Acer Stream
  • smartphone/acer/2011/liquid-mini-e310/acer-liquid-mini-e310.json - Acer Liquid mini E310
  • smartphone/acer/2012/liquid-gallant-duo/acer-liquid-gallant-duo.json - Acer Liquid Gallant Duo
  • smartphone/acer/2012/liquid-gallant-e350/acer-liquid-gallant-e350.json - Acer Liquid Gallant E350
  • smartphone/acer/2012/liquid-z110/acer-liquid-z110.json - Acer Liquid Z110
  • smartphone/acer/2013/liquid-e1/acer-liquid-e1.json - Acer Liquid E1
  • ... 2210 more

tablet added

  • tablet/acer/2011/iconia-smart/acer-iconia-smart.json - Acer Iconia Smart
  • tablet/acer/2011/iconia-tab-a100/acer-iconia-tab-a100.json - Acer Iconia Tab A100
  • tablet/acer/2011/iconia-tab-a101/acer-iconia-tab-a101.json - Acer Iconia Tab A101
  • tablet/acer/2012/iconia-tab-a110/acer-iconia-tab-a110.json - Acer Iconia Tab A110
  • tablet/acer/2013/iconia-tab-a1-810/acer-iconia-tab-a1-810.json - Acer Iconia Tab A1-810
  • tablet/acer/2013/iconia-tab-a1-811/acer-iconia-tab-a1-811.json - Acer Iconia Tab A1-811
  • tablet/acer/2013/iconia-tab-b1-710/acer-iconia-tab-b1-710.json - Acer Iconia Tab B1-710
  • tablet/acer/2013/iconia-tab-b1-a71/acer-iconia-tab-b1-a71.json - Acer Iconia Tab B1-A71
  • tablet/acer/2014/iconia-a1-830/acer-iconia-a1-830.json - Acer Iconia A1-830
  • tablet/acer/2014/iconia-b1-720/acer-iconia-b1-720.json - Acer Iconia B1-720
  • tablet/acer/2014/iconia-b1-721/acer-iconia-b1-721.json - Acer Iconia B1-721
  • tablet/acer/2014/iconia-tab-8-a1-840fhd/acer-iconia-tab-8-a1-840fhd.json - Acer Iconia Tab 8 A1-840FHD
  • tablet/acer/2015/iconia-one-8-b1-820/acer-iconia-one-8-b1-820.json - Acer Iconia One 8 B1-820
  • tablet/acer/2015/predator-8/acer-predator-8.json - Acer Predator 8
  • tablet/acer/2016/iconia-talk-s/acer-iconia-talk-s.json - Acer Iconia Talk S
  • ... 287 more

watch added

  • watch/intex/2015/irist-smartwatch/intex-irist-smartwatch.json - Intex IRist Smartwatch
  • watch/lg/2015/watch-urbane-2nd-edition-lte/lg-watch-urbane-2nd-edition-lte.json - Lg Watch Urbane 2nd Edition LTE
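The example paths above appear to follow a `category/brand/year/model/brand-model.json` pattern. A minimal sketch of that mapping, assuming slugs are lowercase with runs of non-alphanumeric characters collapsed into single hyphens. `slugify` and `record_path` are hypothetical names, and the actual importer may treat edge cases differently.

```python
import re
from pathlib import PurePosixPath


def slugify(text: str) -> str:
    """Lowercase the text and collapse non-alphanumeric runs into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def record_path(category: str, brand: str, year: int, model: str) -> PurePosixPath:
    """Build category/brand/year/model/brand-model.json (assumed layout)."""
    brand_slug, model_slug = slugify(brand), slugify(model)
    return PurePosixPath(
        category, brand_slug, str(year), model_slug, f"{brand_slug}-{model_slug}.json"
    )


print(record_path("tablet", "Acer", 2014, "Iconia Tab 8 A1-840FHD"))
# tablet/acer/2014/iconia-tab-8-a1-840fhd/acer-iconia-tab-8-a1-840fhd.json
```

The printed path matches one of the tablet entries listed in this comment, which is the only evidence for the assumed slug rule.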

Heuristic review

  • Added records by manufacturer/brand: samsung: 225, htc: 172, huawei: 149, lg: 144, blu: 131, lenovo: 126, zte: 104, motorola: 97
  • Added records by source class: kaggle: 2529
  • Heuristic warnings: none found.
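The per-brand tally above could be reproduced from record paths alone, assuming the second path segment is the brand slug, as the listed examples suggest. A small sketch over a handful of those paths (not the full import); `count_by_brand` is a hypothetical helper, not the bot's implementation.

```python
from collections import Counter

# A few paths copied from the record examples in this comment.
paths = [
    "smartphone/acer/2009/liquid/acer-liquid.json",
    "tablet/acer/2015/predator-8/acer-predator-8.json",
    "watch/intex/2015/irist-smartwatch/intex-irist-smartwatch.json",
    "watch/lg/2015/watch-urbane-2nd-edition-lte/lg-watch-urbane-2nd-edition-lte.json",
]


def count_by_brand(record_paths):
    """Count records per brand, taking the brand from the second path segment."""
    return Counter(p.split("/")[1] for p in record_paths)


print(count_by_brand(paths).most_common())
# [('acer', 2), ('intex', 1), ('lg', 1)]
```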

@TechEngineBot

TechEngineBot commented Jun 20, 2026

Member

TechEngine validation stats: PASS

Data summary

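| Category | Total | Verified | Unverified | Missing verified | Tracked | Verified % of tracked |
| --- | --- | --- | --- | --- | --- | --- |
| brand | 189 | 0 | 60 | 129 | 60 | 0.0% |
| soc | 2075 | 58 | 2017 | 0 | 2075 | 2.8% |
| smartphone | 36355 | 184 | 36171 | 0 | 36355 | 0.5% |
| tablet | 754 | 0 | 754 | 0 | 754 | 0.0% |
| watch | 80 | 0 | 80 | 0 | 80 | 0.0% |
| pda | 110 | 0 | 110 | 0 | 110 | 0.0% |
| gpu | 2030 | 0 | 2030 | 0 | 2030 | 0.0% |
| cpu | 3977 | 976 | 3001 | 0 | 3977 | 24.5% |
| all | 45570 | 1218 | 44223 | 129 | 45441 | 2.7% |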

Warning

Tracked verified coverage is below 50% for brand 0.0% (0/60), tablet 0.0% (0/754), watch 0.0% (0/80), pda 0.0% (0/110), gpu 0.0% (0/2030), smartphone 0.5% (184/36355), all 2.7% (1218/45441), soc 2.8% (58/2075), and 1 more.
Tracked coverage excludes records missing the verified field; see the Missing verified column for those records.
This does not fail validation. Keep imported records verified: false until manual audit, but treat this as follow-up verification work before relying on the affected categories as curated data.
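The coverage figures in the warning can be recomputed from the data summary. A minimal sketch, assuming (as the table columns suggest) that Tracked = Verified + Unverified and that records missing the `verified` field are excluded. `tracked_coverage` is a hypothetical helper and the 50% threshold is taken from the warning text.

```python
def tracked_coverage(verified: int, unverified: int) -> float:
    """Verified share of tracked records, in percent (0.0 when nothing is tracked)."""
    tracked = verified + unverified
    return 100.0 * verified / tracked if tracked else 0.0


# (verified, unverified) pairs copied from the data summary above.
rows = {
    "soc": (58, 2017),
    "smartphone": (184, 36171),
    "cpu": (976, 3001),
    "all": (1218, 44223),
}

for name, (verified, unverified) in rows.items():
    pct = tracked_coverage(verified, unverified)
    flag = "below 50%" if pct < 50.0 else "ok"
    print(f"{name}: {pct:.1f}% ({flag})")
# soc: 2.8% (below 50%)
# smartphone: 0.5% (below 50%)
# cpu: 24.5% (below 50%)
# all: 2.7% (below 50%)
```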

Validation notes

  • Full advisory outlier listings are suppressed on successful runs because they are dataset-wide and mostly stable between PRs.
  • Failure runs still include a detailed log excerpt for debugging.

Key output:

## app.validate
## integrity_check.py --strict
loaded CPU=3977 GPU=2030
✅ integrity gate: no hard anomalies.
| Integrity section | Flagged lines |
| --- | --- |
| structural | 0 |
| CPU name/tier consistency (desktop mainstream only) | 0 |
| CPU single>multi (cinebench/geekbench — should be multi>=single) | 0 |
| CPU era-vs-score outliers | 8 |
| CPU cross-source ratio outliers (possible wrong-variant) | 152 |
| GPU cross-source ratio outliers + sanity | 18 |

@Seungpyo1007 Seungpyo1007 merged commit 58c6c26 into main Jun 20, 2026
4 checks passed
@github-project-automation github-project-automation Bot moved this from In Progress to Done in TechAPI-Project Jun 20, 2026

Labels

  • data: Dataset changes
  • enhancement: New feature or request
  • site: Homepage and public site changes

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Massive dataset rebuild: CPU + brand + GPU + smartphone + SoC (1989-2026)

2 participants