From messy app metadata and review text to product insights, SQL analysis, a stakeholder-ready dashboard, and a decision memo.
This project asks a product question that matters:
What drives app quality perception, and where do star ratings fail to reflect what users are actually saying?
I cleaned raw Google Play Store app and review data, built a reusable SQLite layer, wrote production-style SQL, ran segmentation and sentiment analysis in Python, used a lightweight ML model for interpretation, and packaged the outputs into a dashboard and memo.
| Area | What This Repo Delivers |
|---|---|
| Business focus | Product quality perception, sentiment gaps, update cadence, install-tier performance |
| Data | App metadata + review sentiment from the Google Play Store dataset |
| Analytics stack | Python, pandas, SQL, SQLite, seaborn, scikit-learn |
| Outputs | Executed notebooks, SQL queries, processed data, dashboard, memo, figures |
| Audience | Hiring managers, recruiters, PMs, analytics interviewers |
| Finding | Result | Why It Matters |
|---|---|---|
| Update cadence | Apps updated in the last 3 months averaged 4.25 stars vs 4.05 for apps untouched for 2+ years | Freshness is a strong quality signal |
| Scale vs quality | Apps in the 10M+ install tier averaged 4.32 stars vs 4.10 for apps below 100K installs | Large products tend to sustain better rating health |
| Ratings vs sentiment | Game showed the biggest mismatch: 4.40 average rating but only 0.058 average sentiment | Ratings alone can hide product friction |
| Portfolio health | Only 30.5% of apps landed in Star Apps; 14.8% landed in Declining Apps |
Portfolio segmentation helps prioritize intervention |
- It goes end to end: raw data, cleaning, SQL, Python, ML interpretation, dashboard, memo, and repo packaging.
- It is framed around a real product question instead of generic exploratory analysis.
- It keeps the ML honest: the model supports interpretation instead of pretending to be the whole project.
- It finishes with decision-ready outputs, not just notebooks.
- Changelog - how the project evolved from cleaning to SQL, modeling, dashboarding, and portfolio packaging
- Roadmap - what would turn this static case study into a stronger analytics product
- Validation and boundaries - what the project proves and what it does not claim
- Product memo - stakeholder-ready recommendations and limitations
flowchart LR
A["Raw App Data"] --> B["Cleaning + Standardization"]
C["Raw Review Data"] --> B
B --> D["SQLite Warehouse"]
D --> E["SQL Analysis"]
D --> F["Python Analysis"]
F --> G["Segmentation + Sentiment + Modeling"]
E --> H["Dashboard Assets"]
G --> H
G --> I["Product Memo"]
- cleaned raw fields like
Installs,Size,Price, andReviews - removed the corrupted source row and dropped unusable ratings
- standardized categorical fields like
TypeandContent Rating - deduplicated app-level records and stored clean tables in SQLite
- generated reusable processed datasets in
data/processed/
The SQL layer answers the core descriptive questions:
- Which categories perform best on rating and review volume?
- How does quality shift by install tier?
- How do free and paid apps compare within category?
- Which apps dominate review volume inside each category?
- How closely do app ratings align with review sentiment?
Core query files:
- category_performance.sql
- install_tiers.sql
- free_paid_rating_delta.sql
- top_apps_ranked.sql
- sentiment_join.sql
Python was used to go beyond descriptive SQL:
- RFM-style segmentation: recency, review volume, and installs
- Sentiment analysis: polarity distribution, category-level negatives, review text patterns
- Recency analysis: how update freshness relates to ratings and installs
- Model interpretation: a
RandomForestClassifierpredicting whether an app clears the4.0+rating threshold
- interactive local dashboard: dashboard/local_dashboard.html
- product memo: memo/product_memo.md
- PDF memo: memo/product_memo.pdf
- repo preview image: repo_preview.png
| Deliverable | Purpose |
|---|---|
| Local Dashboard | Interactive summary for category intelligence, sentiment, and app health |
| Product Memo | Executive summary with findings, recommendations, and limitations |
| Dashboard Preview | Repo-friendly static image for GitHub |
The ML component is intentionally lightweight. The goal was not to build a production model. The goal was to understand which features were most associated with apps clearing a 4.0 rating threshold.
Target
1= app rating>= 4.00= app rating< 4.0
Features
- category
- installs
- reviews
- size
- type
- content rating
- average sentiment
Reported metrics
- Precision: 0.844
- Recall: 0.793
- F1: 0.818
Takeaways
- review volume carried the most predictive weight
- app size and category context also mattered
- sentiment helped, but it worked best as part of the broader feature set
| Path | What You Will Find |
|---|---|
| data/processed/ | Cleaned datasets, SQL outputs, modeling summaries, and final metrics |
| notebooks/ | Executed notebooks for cleaning, SQL, Python analysis, and modeling |
| sql/ | Reusable SQL query files |
| figures/ | Charts used across the dashboard and memo |
| dashboard/ | Local HTML dashboard and preview assets |
| memo/ | Product memo in Markdown and PDF |
| scripts/ | Rebuild pipeline and notebook generator |
python3 -m pip install -r requirements.txtPlace these files in data/raw/:
googleplaystore.csvgoogleplaystore_user_reviews.csv
Dataset source: Kaggle - Google Play Store Apps
python3 scripts/project_pipeline.pypython3 scripts/generate_notebooks.py- cleaned app dataset: data/processed/apps_cleaned.csv
- cleaned review dataset: data/processed/reviews_cleaned.csv
- SQLite database: data/processed/google_playstore.db
- project metrics: data/processed/project_metrics.json
- executed notebooks: notebooks/
- chart exports: figures/
- memo outputs: memo/
Limitations
- This is observational data, so the project highlights patterns rather than proving causality.
- Installs and reviews are proxies for engagement, not true event-level behavior.
- Review sentiment is useful, but some apps have richer review coverage than others.
- The repo ships with a polished local dashboard instead of a public Tableau link.
Next Steps
- add event-level retention and funnel data
- compare pre/post update behavior instead of using freshness as a proxy
- track sentiment and ratings over time instead of treating them as static summaries
- layer in monetization or revenue data for stronger business prioritization
