Tianning-lab/global-property-intelligence

Global Property Intelligence

Cross-market real estate analytics engine covering 11 global markets with 48M+ government-sourced transactions.

[markets: 11] [transactions: 48M+] [sources: government open data] [schema: unified] [license: MIT]

What is this?

Global Property Intelligence is an open-source analytics engine that normalizes residential transaction data from 11 government sources into a unified schema, then runs 20 analytics modules across all markets simultaneously. Every data point comes from an official government registry -- no scraped listings, no estimates, no black boxes. Compare price dynamics in London vs Dubai vs Taipei from a single CLI.

Markets Covered

| Market | Country | Source | Transactions | Date Range | Update Freq |
|---|---|---|---|---|---|
| United Kingdom | GB | HM Land Registry Price Paid | 31,000,000 | 1995--2026 | Monthly |
| France | FR | DVF (DGFiP) | 8,300,000 | 2014--2025 | Semi-annual |
| Singapore | SG | HDB Resale Flat Prices | 954,000 | 1990--2026 | Monthly |
| New York City | US | DOF Rolling Sales | 558,000 | 2016--2025 | Monthly |
| Chicago | US | Cook County Assessor | 2,600,000 | 1999--2026 | Continuous |
| Dubai | AE | Dubai Land Department | 1,300,000 | 2003--2026 | Manual |
| Miami | US | Miami-Dade Property Appraiser | 940,000 | 2016--2026 | Continuous |
| Philadelphia | US | City of Philadelphia RTT + OPA | 5,000,000 | 2016--2026 | Continuous |
| Connecticut | US | CT OPM | 1,100,000 | 2001--2024 | Historical |
| Ireland | IE | PSRA Property Price Register | 800,000 | 2010--2022 | Monthly |
| Taiwan | TW | MOI Real Price Registration | 668,000 | 2012--2026 | Quarterly |

Total: 48M+ verified government transactions across 3 continents (Europe, Asia, North America).

Quick Start

git clone https://github.com/nwc-advisory/global-property-intelligence.git
cd global-property-intelligence
pip install -r requirements.txt
python cli.py dashboard

To load data from the Kaggle dataset:

# Download from: https://www.kaggle.com/datasets/ningtianning/global-property-transactions
python cli.py ingest --source kaggle --path ./data/
python cli.py dashboard

CLI Commands

| Command | Description |
|---|---|
| `dashboard` | Interactive market overview with key metrics across all 11 markets |
| `compare <market1> <market2>` | Side-by-side comparison of two markets (price, volume, yield) |
| `trend <market> --years N` | Price trend analysis with rolling averages and momentum |
| `hotspots <market>` | Identify fastest-appreciating areas within a market |
| `correlation` | Cross-market correlation matrix with significance testing |
| `seasonality <market>` | Monthly volume and price seasonality patterns |
| `distribution <market>` | Price distribution with percentiles, skew, kurtosis |
| `affordability <market>` | Price-to-income ratios using OECD household income data |
| `yield <market>` | Gross rental yield estimates by area |
| `volume <market>` | Transaction volume trends with 12-month moving average |
| `forex-adjust <market>` | USD-normalized prices removing currency fluctuation effects |
| `supply <market>` | Months of supply estimate from volume trends |
| `outliers <market>` | Statistical outlier detection (IQR + z-score) |
| `export <market> --format csv` | Export filtered data to CSV, Parquet, or JSON |
| `ingest --source <name>` | Download and ingest data from a government source |
| `update --market <name>` | Check for new data and update local database |
| `report <market>` | Generate a full PDF market report |
| `schema` | Print the unified data schema |
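The `volume` and `trend` commands mention a 12-month moving average and price momentum. The sketch below shows one plausible way to compute both from a monthly series using only the standard library. The function names `moving_average` and `momentum` and the sample figures are illustrative assumptions, not the project's actual implementation.

```python
from statistics import mean


def moving_average(values, window=12):
    """Trailing moving average; positions without a full window are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(mean(values[i + 1 - window : i + 1]))
    return out


def momentum(prices, lookback):
    """Percent change versus `lookback` periods earlier; None where undefined."""
    out = []
    for i, price in enumerate(prices):
        if i < lookback or prices[i - lookback] == 0:
            out.append(None)
        else:
            out.append((price / prices[i - lookback] - 1.0) * 100.0)
    return out


# Made-up monthly transaction counts (13 months), for illustration only.
monthly_volume = [100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220]
volume_ma = moving_average(monthly_volume, window=12)

# Made-up monthly median prices, for illustration only.
monthly_price = [100.0, 105.0, 110.25]
price_momentum = momentum(monthly_price, lookback=1)

print(volume_ma[-2:], price_momentum)
```

A trailing window is used here so that each value depends only on past data; a centered window would smooth better but cannot be computed for the most recent months.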

Analytics Modules

| Module | Description |
|---|---|
| Price Index | Hedonic price index per market, normalized to base year |
| Volume Tracker | Transaction counts with trend decomposition (seasonal + residual) |
| Cross-Market Correlation | Pearson/Spearman correlation matrix across all market pairs |
| Seasonality Decomposition | Monthly patterns in price and volume using STL decomposition |
| Hotspot Detection | Top-N fastest appreciating postcodes/areas by rolling 12M growth |
| Price Distribution | Histogram + KDE with percentile markers and normality testing |
| Affordability Ratio | Price-to-income using OECD data, indexed over time |
| Forex Adjustment | Local prices converted to USD using historical ECB/Fed rates |
| Outlier Detection | IQR fencing + modified z-score for anomalous transactions |
| Repeat Sales | Same-property resale pairs for true appreciation measurement |
| Yield Estimation | Gross yield from price levels vs area rental indices |
| Supply Indicator | Implied months of supply from volume trend extrapolation |
| Momentum Scoring | 3/6/12-month price momentum with mean reversion signals |
| Volatility | Rolling standard deviation of log returns per market |
| Regime Detection | Bull/bear/sideways classification using hidden Markov models |
| Comparable Search | Find N most similar transactions by price, type, area, date |
| Area Profiling | Statistical summary of any postcode/area across all metrics |
| Report Generator | PDF reports with charts, tables, and commentary |
| Data Quality | Completeness, consistency, and freshness scoring per market |
| Schema Validator | Verify ingested data conforms to the unified schema |
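As an illustration of the Outlier Detection row, here is a minimal sketch of IQR fencing combined with a modified z-score, using only the standard library. The README does not say how the two tests are combined; this sketch assumes a transaction is flagged only when both agree. The thresholds (`k=1.5`, `z_threshold=3.5`) are common textbook defaults, not values confirmed by the project, and the prices are made up.

```python
from statistics import median, quantiles


def iqr_fences(values, k=1.5):
    """Return (low, high) fences at k * IQR beyond the first and third quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def modified_z_scores(values):
    """Modified z-score: 0.6745 * (x - median) / MAD, robust to extreme values."""
    med = median(values)
    mad = median(abs(v - med) for v in values)  # median absolute deviation
    if mad == 0:
        return [0.0 for _ in values]  # no spread, so nothing can be scored
    return [0.6745 * (v - med) / mad for v in values]


def flag_outliers(values, k=1.5, z_threshold=3.5):
    """Assumption: flag a value only if it fails BOTH tests."""
    low, high = iqr_fences(values, k)
    scores = modified_z_scores(values)
    return [
        v
        for v, s in zip(values, scores)
        if (v < low or v > high) and abs(s) > z_threshold
    ]


# Made-up sale prices (thousands); the last one is an obvious anomaly.
prices = [200, 210, 215, 220, 225, 230, 240, 2500]
print(flag_outliers(prices))
```

Requiring both tests to agree keeps the flag conservative; switching the `and` to `or` would flag more aggressively.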

Sample Output

Dashboard

Global Property Intelligence - Market Dashboard
================================================

Market          Median Price (USD)   YoY Change   Volume (12M)   Data Through
-----------     ------------------   ----------   ------------   ------------
UK                      $382,400       +3.2%       1,040,000     2026-01
France                  $267,100       +1.8%         620,000     2025-06
Singapore               $412,500       +5.1%          28,400     2026-03
NYC                     $685,000       -1.4%          42,100     2025-12
Chicago                 $289,000       +4.7%         118,000     2026-02
Dubai                   $408,300       +8.9%          95,200     2026-01
Miami                   $445,000       +2.3%          61,800     2026-01
Philadelphia            $215,000       +3.9%         142,000     2026-01
Connecticut             $365,000       +6.1%          38,500     2024-12
Ireland                 $348,200       +4.2%          51,200     2022-12
Taiwan                  $327,800       +2.8%          82,400     2026-01

Cross-Market Correlation Matrix

             UK    FR    SG   NYC   CHI   DXB   MIA   PHL    CT    IE    TW
UK         1.00  0.72  0.31  0.58  0.61  0.18  0.54  0.63  0.67  0.81  0.22
France     0.72  1.00  0.28  0.41  0.39  0.15  0.38  0.42  0.45  0.69  0.19
Singapore  0.31  0.28  1.00  0.35  0.22  0.47  0.29  0.20  0.18  0.25  0.52
NYC        0.58  0.41  0.35  1.00  0.74  0.28  0.71  0.68  0.62  0.48  0.30
Chicago    0.61  0.39  0.22  0.74  1.00  0.16  0.65  0.78  0.71  0.52  0.18
Dubai      0.18  0.15  0.47  0.28  0.16  1.00  0.32  0.14  0.12  0.13  0.41
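A matrix like the one above can be built from per-market series with pairwise correlation. The sketch below implements Pearson and Spearman correlation with the standard library only. The README does not state which series are correlated (price levels, returns, or index values), so the sample monthly returns and the function names are assumptions for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def correlation_matrix(series, method=pearson):
    """Nested dict of pairwise correlations, rounded to two decimals."""
    names = list(series)
    return {
        a: {b: round(method(series[a], series[b]), 2) for b in names}
        for a in names
    }


# Made-up monthly returns per market, for illustration only.
returns = {
    "uk": [0.01, 0.02, -0.01, 0.03],
    "fr": [0.02, 0.04, -0.02, 0.06],
    "dxb": [0.03, -0.01, 0.02, -0.02],
}
matrix = correlation_matrix(returns)
print(matrix["uk"])
```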

Architecture

config.py          Market definitions, source URLs, schema constants
    |
adapters/          Per-market data adapters (download, parse, normalize)
    |
engine.py          Analytics engine: all 20 modules operate on unified schema
    |
  +---------+
  |         |
cli.py    report.py
  |         |
Terminal  PDF output

Each market has an adapter in adapters/ that handles source-specific parsing and maps fields to the unified schema. The analytics engine never sees raw source data -- only the normalized format.
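To make the adapter idea concrete, the sketch below shows one possible shape for it: a base class with `download()`, `parse()`, and `normalize()`, plus a toy adapter that maps source-specific fields onto the unified schema. The class names, the `run()` helper, and the toy source columns (`sold_on`, `amount`, `kind`, `postcode`) are hypothetical; the real adapters may be organized differently.

```python
import csv
import io
from abc import ABC, abstractmethod
from typing import Dict, List, Optional

Row = Dict[str, Optional[object]]


class MarketAdapter(ABC):
    """Hypothetical base class: one subclass per market."""

    market: str = ""
    source: str = ""

    @abstractmethod
    def download(self) -> str:
        """Return the raw source payload as text."""

    @abstractmethod
    def parse(self, raw: str) -> List[Dict[str, str]]:
        """Turn the raw payload into source-shaped rows."""

    @abstractmethod
    def normalize(self, rows: List[Dict[str, str]]) -> List[Row]:
        """Map source-shaped rows onto the unified schema."""

    def run(self) -> List[Row]:
        # The engine would only ever see the output of this pipeline.
        return self.normalize(self.parse(self.download()))


class ToyAdapter(MarketAdapter):
    """Illustrative adapter for an invented source, not a real registry."""

    market = "toy"
    source = "Toy Registry"
    TYPE_MAP = {"Flat": "apartment", "Detached": "house"}

    def download(self) -> str:
        # A real adapter would fetch a file; inline CSV keeps this runnable.
        return "id,sold_on,amount,kind,postcode\nA1,2024-03-05,250000,Flat,AB1 2CD\n"

    def parse(self, raw: str) -> List[Dict[str, str]]:
        return list(csv.DictReader(io.StringIO(raw)))

    def normalize(self, rows: List[Dict[str, str]]) -> List[Row]:
        return [
            {
                "market": self.market,
                "date": r["sold_on"],
                "price_local": float(r["amount"]),
                "currency": "GBP",
                "price_usd": None,  # FX conversion is left out of this sketch
                "property_type": self.TYPE_MAP.get(r["kind"], "other"),
                "location": r["postcode"],
                "area_sqm": None,  # this toy source has no floor area
                "price_per_sqm_usd": None,
                "source_id": r["id"],
                "source": self.source,
            }
            for r in rows
        ]


normalized = ToyAdapter().run()
print(normalized[0]["property_type"], normalized[0]["price_local"])
```

Keeping all source-specific logic inside `parse()` and `normalize()` is what lets downstream analytics stay unaware of where a row came from.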

Data Schema

All transactions are normalized to this unified schema:

| Column | Type | Description |
|---|---|---|
| market | string | Market identifier (uk, fr, sg, nyc, chi, dxb, mia, phl, ct, ie, tw) |
| date | date | Transaction date (YYYY-MM-DD) |
| price_local | float | Sale price in local currency |
| currency | string | ISO 4217 currency code (GBP, EUR, SGD, USD, AED, TWD) |
| price_usd | float | Sale price converted to USD at transaction-date exchange rate |
| property_type | string | Normalized type: apartment, house, land, commercial, other |
| location | string | Finest available location (postcode, arrondissement, district) |
| area_sqm | float | Floor area in square meters (null if unavailable) |
| price_per_sqm_usd | float | Derived: price_usd / area_sqm |
| source_id | string | Original transaction ID from government source |
| source | string | Government data source name |
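One way to picture a row of this schema in code is a small dataclass, with `price_per_sqm_usd` computed from `price_usd` and `area_sqm` rather than stored. This is a sketch under assumptions: the `Transaction` class, the `to_usd` helper, and the exchange rate of 1.25 are invented for illustration, and the project may represent rows differently (for example as DataFrame columns).

```python
import datetime as dt
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Transaction:
    """Hypothetical in-memory form of one unified-schema row."""

    market: str
    date: dt.date
    price_local: float
    currency: str
    price_usd: float
    property_type: str
    location: str
    area_sqm: Optional[float]
    source_id: str
    source: str

    @property
    def price_per_sqm_usd(self) -> Optional[float]:
        # Derived column: undefined when floor area is missing or zero.
        if not self.area_sqm:
            return None
        return self.price_usd / self.area_sqm


def to_usd(price_local: float, usd_per_local_unit: float) -> float:
    """Convert using a transaction-date rate quoted as USD per local unit."""
    return price_local * usd_per_local_unit


tx = Transaction(
    market="uk",
    date=dt.date(2024, 3, 5),
    price_local=250_000.0,
    currency="GBP",
    price_usd=to_usd(250_000.0, 1.25),  # 1.25 is a made-up rate
    property_type="apartment",
    location="AB1 2CD",
    area_sqm=80.0,
    source_id="A1",
    source="Toy Registry",
)
print(tx.price_usd, tx.price_per_sqm_usd)
```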

Use Cases

| Use Case | Description |
|---|---|
| Cross-border investment analysis | Compare price-per-sqm, yield, and momentum across 11 markets to identify relative value |
| PropTech data layer | Embed as the data backend for property valuation tools, dashboards, or APIs |
| Academic research | Longitudinal price studies with 30+ years of UK data and growing coverage elsewhere |
| Market timing | Momentum and regime detection modules signal market phase transitions |
| Portfolio diversification | Correlation matrix reveals which markets move independently |
| Content generation | Automated market reports, area profiles, and trend summaries for publications |
| Rental yield benchmarking | Cross-market yield comparison using standardized price and rental data |

API (Coming Soon)

A REST API will expose all analytics modules over HTTP:

GET /v1/markets                          # List all markets with latest metrics
GET /v1/markets/{market}/trend           # Price trend with parameters
GET /v1/markets/{market}/comps           # Comparable transactions
GET /v1/compare?markets=uk,fr,sg         # Cross-market comparison
POST /v1/analyze                         # Run any analytics module

Kaggle Dataset

The full transaction dataset is available on Kaggle for direct download:

Global Property Transactions: 11 Markets, 48M+ Sales

Updated monthly. Includes pre-built CSV files with the unified schema applied.

Contributing

New markets welcome. To add a market:

  1. Create an adapter in adapters/ that implements download(), parse(), and normalize()
  2. Add market config to config.py
  3. Ensure all output conforms to the unified schema
  4. Add tests in tests/
  5. Open a PR with sample data (1,000 rows) for validation
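Before opening a PR, a contributor might sanity-check the sample file against the unified schema. The sketch below is one way to do that with the standard library: it checks that all schema columns are present and that each value parses as the expected type. The `validate_sample` function and the sample rows are hypothetical; the project's own Schema Validator module may apply different or stricter rules.

```python
import csv
import io
from datetime import date


def _optional_float(text):
    """Empty string means null; anything else must parse as a float."""
    return None if text == "" else float(text)


# Column name -> parser that raises ValueError on bad input.
PARSERS = {
    "market": str,
    "date": date.fromisoformat,
    "price_local": float,
    "currency": str,
    "price_usd": float,
    "property_type": str,
    "location": str,
    "area_sqm": _optional_float,
    "price_per_sqm_usd": _optional_float,
    "source_id": str,
    "source": str,
}
PROPERTY_TYPES = {"apartment", "house", "land", "commercial", "other"}


def validate_sample(text):
    """Return a list of human-readable problems; an empty list means OK."""
    reader = csv.DictReader(io.StringIO(text))
    missing = set(PARSERS) - set(reader.fieldnames or [])
    if missing:
        return [f"missing columns: {sorted(missing)}"]
    errors = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for column, parse in PARSERS.items():
            try:
                parse(row[column])
            except ValueError:
                errors.append(f"line {line_no}: bad {column}: {row[column]!r}")
        if row["property_type"] not in PROPERTY_TYPES:
            errors.append(f"line {line_no}: unknown property_type")
    return errors


HEADER = ",".join(PARSERS)
good = HEADER + "\ntoy,2024-03-05,250000,GBP,312500,apartment,AB1 2CD,80,3906.25,A1,Toy Registry\n"
bad = HEADER + "\ntoy,05/03/2024,250000,GBP,312500,flat,AB1 2CD,,,A2,Toy Registry\n"
print(validate_sample(good), validate_sample(bad))
```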

Priority markets for contribution: Germany (Gutachterausschuss), Japan (REINS), Australia (Valuer General), Canada (CREA), South Korea (MOLIT).

License

MIT License. See LICENSE.

All underlying transaction data is sourced from government open data portals under their respective licenses. See dataset-card.md for per-source licensing details.

Built by

New Way Capital Advisory -- analytics infrastructure for real estate and wealth management.
