thigmen/netflix-catalog-custom-visuals

Netflix Catalog — Custom Visuals in Power BI

A two-page Power BI report built without any native visuals. Every chart, card, and header on the canvas is rendered with HTML, SVG, DAX-generated content, or Deneb (Vega/Vega-Lite) specs.

Animated landing page transitioning into the dashboard

The GIF above shows the cinematic intro on page 1, followed by the analytical dashboard on page 2.


What this is

A self-imposed constraint project: rebuild a complete Power BI report without using any of the native visual types. No bar charts, no treemaps, no native KPI cards, no built-in line charts. Everything is drawn from scratch using the platform's HTML Content visual, Deneb (the custom visual that runs Vega-Lite and Vega specs), and DAX measures that return HTML strings.

I picked the Netflix Titles dataset because it is rich in dimensions (countries, genres, ratings, dates, cast) but small enough to fit cleanly into a single .pbix.

A note on what the data actually shows: this is catalog metadata — what's available on Netflix, when it was added, and how it's classified. It is not viewership data. A few people on the launch post mistook it for "what people watched." It isn't. Netflix doesn't release viewership at this granularity publicly.


What's inside

Page 1 — Animated landing page

Landing page screenshot

A cinematic intro: rising bars, a gradient growth line that draws itself, the "Netflix Streaming Analytics" wordmark fading in, three live KPIs (titles / directors / countries), a glowing "Access Dashboard" call-to-action, and a subtle starfield. It's all SVG, all driven by SMIL animations inside a single DAX measure.

Page 2 — The analytical dashboard

Dashboard screenshot

Five components, none of them native:

| Region | What it is | How it's rendered |
|---|---|---|
| Editorial header | Netflix-red "N" tile, "Netflix Catalog." wordmark, tagline, four secondary pulls (Peak Year, Busiest Week, Avg Movie, Coverage) | DAX-generated HTML+SVG, served by HTML Content |
| KPI row | Four cards: Total Titles, Content Mix donut, Countries (with top 3), Genres (with top 3) | DAX-generated HTML+SVG, served by HTML Content |
| Country → Genre Sankey | Top 15 producing countries flowing into top 10 macro-genres, link width = title count, top-3 sources colour-coded | Layout pre-computed in Python, emitted as static SVG, loaded into the model via Power Query, served by HTML Content |
| Beeswarm | One dot per title, X = release year, Y = audience tier (Kids / Teens / Adults), shape = type (circle Movie, triangle TV) | Deneb spec running Vega's force transform on ~8,700 dots |

Tech stack

Power BI stack

  • Power BI Desktop (DAX, Power Query / M, data modelling)
  • Deneb custom visual (Daniel Marsh-Patrick) for Vega / Vega-Lite specs
  • HTML Content custom visual (Daniel Marsh-Patrick) for the DAX-driven HTML cards and the pre-rendered Sankey

Custom rendering

  • HTML + SVG embedded inside DAX measures
  • Vega force transform for the beeswarm packing
  • Hand-written Python Sankey layout (cubic-Bézier link routing, proportional node heights)

Data prep / pre-processing

  • Python 3.14 + pandas 3 + pyarrow for the cleaning pipeline
  • Parquet as the intermediate format consumed by Power BI

AI assistance

  • Claude Code was used as a coding copilot — debugging Vega force parameters, writing CSS for the HTML cards, generating boilerplate. Data modelling, design decisions, and the constraint-driven approach are mine.

Dataset

  • Source: Netflix Titles, by Shivam Bansal on Kaggle
  • Volume: 8,807 titles, 122 distinct countries, 42 raw genres, ~13 years of Netflix add-dates (Jan 2008 — Sep 2021)
  • Nature: Catalog metadata only. Each row describes one title on the platform: its type (movie or TV show), release year, the date Netflix added it, the cast, the director, the maturity rating, and the category tags. There is no viewership, watch-time, retention, or user-rating data. Anyone reading the dashboard should keep that in mind.

Repository structure

netflix-catalog-custom-visuals/
├── README.md                       — this file
├── LICENSE                         — MIT
├── .gitignore
│
├── pbix/
│   └── netflix-catalog.pbix        — the Power BI report
│
├── screenshots/                    — README.md inside explains what
│                                     each image should be
│   ├── landing-page.png
│   ├── dashboard.png
│   └── animation.gif
│
├── dax/
│   ├── scalar-measures.dax         — Total Titles, Movies Count, etc.
│   ├── landing-page.dax            — the animated intro banner
│   ├── editorial-header.dax        — page-2 masthead
│   ├── kpi-cards-combined.dax      — single measure rendering all 4 KPI cards
│   ├── kpi-card-total-titles.dax   — reference snippet for card 1
│   ├── kpi-card-content-mix.dax    — reference snippet for card 2
│   ├── kpi-card-countries.dax      — reference snippet for card 3
│   ├── kpi-card-genres.dax         — reference snippet for card 4
│   └── sankey-html.dax             — the measure that loads the SVG
│
├── deneb/
│   └── beeswarm-release-audience.json  — Vega spec for the beeswarm
│
├── html/
│   └── country-genre-sankey.html   — pre-rendered Sankey SVG
│
├── python/
│   ├── 01-data-profiling.py        — schema, nulls, cardinality report
│   ├── 02-data-cleaning.py         — repairs, type splits, bridge tables
│   └── 03-build-aggregations.py    — preview data + Sankey SVG renderer
│
└── docs/
    ├── technical-decisions.md      — longer-form rationale and tradeoffs
    ├── data-profile.md             — auto-generated by 01-data-profiling.py
    └── powerbi-assembly-guide.md   — step-by-step assembly notes

How to explore the report

To open the .pbix locally

  1. Install Power BI Desktop — Windows-only, free.
  2. Install the two custom visuals from the Power BI Marketplace (the menu on the Visualizations pane → Get more visuals):
    • Deneb (by Daniel Marsh-Patrick)
    • HTML Content (by Daniel Marsh-Patrick)
  3. Open pbix/netflix-catalog.pbix.
  4. The model embeds the cleaned Parquet data. The Sankey HTML helper table references an absolute path on my machine (C:\…); you'll need to repoint it via Transform Data → query sankey_html_table → Source step, pointing it at your local copy of html/country-genre-sankey.html.

To regenerate everything from scratch

# Python 3.10+, pandas 2+, pyarrow
pip install pandas pyarrow

# Download netflix_titles.csv from Kaggle and place it at
# data/raw/netflix_titles.csv (relative to the python/ scripts).

python python/01-data-profiling.py     # → docs/data-profile.md
python python/02-data-cleaning.py      # → cleaned parquets
python python/03-build-aggregations.py # → previews + Sankey SVG

The output of step 3 (html/country-genre-sankey.html) is what feeds the Sankey visual in Power BI.
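The cleaning step mentions bridge tables. As a rough illustration of what that usually means for a multi-valued column such as country, here is a dependency-free sketch. It is not the code in python/02-data-cleaning.py (which uses pandas); the function name and the sample rows are hypothetical, while show_id and country are column names from the Kaggle CSV.

```python
def build_country_bridge(rows):
    """Split comma-separated country strings into one (show_id, country) row each."""
    bridge = []
    for row in rows:
        raw = row.get("country") or ""  # treat missing values as empty
        for part in raw.split(","):
            name = part.strip()
            if name:  # skip blanks left by missing values or stray commas
                bridge.append({"show_id": row["show_id"], "country": name})
    return bridge


# Hypothetical sample rows, shaped like the Kaggle CSV.
sample = [
    {"show_id": "s1", "country": "United States, India"},
    {"show_id": "s2", "country": None},
    {"show_id": "s3", "country": "France,"},
]
bridge = build_country_bridge(sample)
```

A table shaped like this lets a title that lists several countries count once per country without duplicating rows in the main titles table.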


Technical highlights

HTML and SVG inside DAX measures

The HTML Content visual takes a single field as input. If that field returns an HTML string, it renders. So a DAX measure can act as a tiny template engine — concatenating HTML, embedding FORMAT(value, "#,##0", "en-US") calls for live numbers, and applying CSS classes that respond to the data. The KPI cards and the editorial header are both single-measure HTML strings. See dax/editorial-header.dax and dax/kpi-cards-combined.dax.

A subtle but important detail: explicitly passing "en-US" to FORMAT() forces US-style number formatting (comma thousands separator, dot decimal) regardless of the model's regional settings. Without it, models with non-US regional settings can render 8,807 as 8.807, which several international readers misread as 8.8.
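To make the template-engine idea and the separator problem concrete, here is a Python analogue. It is only a sketch of the pattern, not the DAX in dax/kpi-cards-combined.dax; the function names and CSS class names are hypothetical.

```python
def format_count(value: int, us_style: bool = True) -> str:
    """Format an integer with a thousands separator."""
    us = f"{value:,}"  # US style: comma as thousands separator
    if us_style:
        return us
    # Many European locales use a dot as the thousands separator instead.
    return us.replace(",", ".")


def kpi_card_html(label: str, value: int) -> str:
    """Concatenate a small HTML card, the way a DAX measure joins strings."""
    return (
        "<div class='kpi'>"
        f"<span class='kpi-value'>{format_count(value)}</span>"
        f"<span class='kpi-label'>{label}</span>"
        "</div>"
    )


card = kpi_card_html("Total Titles", 8807)
us_text = format_count(8807)                    # "8,807"
dot_text = format_count(8807, us_style=False)   # "8.807"
```

The second call shows the string that reads like a decimal to anyone expecting US conventions, which is the misreading that pinning the locale avoids.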

Why Deneb for the beeswarm

A beeswarm with ~8,700 individually-placed points is something Power BI's native scatter cannot do — there's no force-packing primitive. The beeswarm spec uses Vega's force transform (collision + x/y attraction) with static: true so the simulation runs once at load time and the result is cached. See deneb/beeswarm-release-audience.json and the deeper notes in docs/technical-decisions.md.
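For orientation, a Vega force transform of this kind has roughly the shape below, built here as a Python dict and serialised to JSON. This is a minimal sketch, not the spec in deneb/beeswarm-release-audience.json: the field names xfocus and yfocus, the radius, and the strength values are assumptions for illustration.

```python
import json


def beeswarm_force_transform(radius: float = 3.0, strength: float = 0.2) -> dict:
    """Build a Vega force transform that packs marks around target positions."""
    return {
        "type": "force",
        "static": True,  # compute the layout in one batch instead of animating
        "forces": [
            # Keep dots from overlapping.
            {"force": "collide", "iterations": 1, "radius": radius},
            # Pull each dot toward its target x and y (assumed field names).
            {"force": "x", "x": "xfocus", "strength": strength},
            {"force": "y", "y": "yfocus", "strength": strength},
        ],
    }


fragment = json.dumps(beeswarm_force_transform(), indent=2)
```

In a full spec this object would sit in the transform array of the symbol mark, with xfocus and yfocus set in the mark's encoding from the release-year and audience-tier scales.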

Pre-rendering the Sankey in Python

Most Sankey diagrams in Power BI either use a marketplace custom visual or load D3 + d3-sankey at runtime. Both have payload and styling constraints. Instead, this project pre-computes the entire Sankey layout in Python — node positions, link Béziers, colours assigned by source-country rank — and emits a self-contained ~19KB SVG. The Power BI side just loads that string into a one-row helper table and renders it. Zero JavaScript runtime in the visual.

The renderer lives in python/03-build-aggregations.py (function render_sankey_svg_html). The output goes to html/country-genre-sankey.html.
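The two core pieces of such a layout, proportional node heights and cubic-Bézier links, can be sketched in a few lines. This is an illustration under assumed names and parameters (node_heights, link_path, the gap value), not the body of render_sankey_svg_html.

```python
def node_heights(counts, total_height, gap=4.0):
    """Scale node heights so they are proportional to their title counts."""
    usable = total_height - gap * (len(counts) - 1)  # reserve space between nodes
    total = sum(counts)
    return [usable * c / total for c in counts]


def link_path(x0, y0, x1, y1):
    """SVG path for a cubic Bezier with horizontal tangents at both ends."""
    xm = (x0 + x1) / 2  # both control points sit at the horizontal midpoint
    return (
        f"M{x0:.1f},{y0:.1f} "
        f"C{xm:.1f},{y0:.1f} {xm:.1f},{y1:.1f} {x1:.1f},{y1:.1f}"
    )


heights = node_heights([30, 10], total_height=104.0)  # [75.0, 25.0]
path = link_path(0, 10, 200, 50)
# path == "M0.0,10.0 C100.0,10.0 100.0,50.0 200.0,50.0"
```

Placing both control points at the midpoint gives the familiar S-shaped ribbon; a real renderer would also set stroke-width (or draw a filled band) from the link's title count.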


Tradeoffs and lessons

The honest cons of this approach:

  • Debugging is slow. Iterating on a DAX-generated HTML string means a measure refresh on every change. Vega specs in Deneb are easier because they have an in-visual editor with live preview, but errors are still less obvious than in a browser console.
  • No cross-filtering. Native Power BI visuals talk to each other out of the box. Custom HTML rendered inside HTML Content is one-way — data flows in, but a click on the rendered HTML does not propagate selections back to the report. Bridging this requires bookmarks or Deneb specs with pbiContext.selection. This v1 has none of it.
  • Layout flexibility costs upfront work. If you hardcode pixel dimensions inside the HTML or Vega spec, you'll be re-tuning every time you resize a tile. Responsive sizing (width: 100%, viewBox, "width": {"signal": "width"}) needs to go in from the start.
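As a small illustration of the responsive-sizing point, an SVG wrapper that declares a viewBox and a percentage width scales with its tile instead of being tied to pixels. The helper name, the sample rectangle, and its dimensions are hypothetical.

```python
def responsive_svg(inner: str, view_w: int, view_h: int) -> str:
    """Wrap SVG content so it scales with its container, not a fixed pixel size."""
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {view_w} {view_h}" '
        'width="100%" preserveAspectRatio="xMidYMid meet">'
        f"{inner}</svg>"
    )


svg = responsive_svg('<rect width="400" height="120" fill="#E50914"/>', 400, 120)
```

Coordinates inside the wrapper stay in viewBox units, so resizing the tile changes the rendered size without touching the drawing code.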

When this approach is worth it: when the visual genuinely doesn't exist natively (force-packed beeswarm, custom-routed Sankey, animated intro), or when editorial typography is part of the deliverable.

When it isn't: when a native column chart conveys the same insight in 10% of the time. Most dashboards are this case. This project is the exception.


Linked posts

The launch posts on LinkedIn discussing the build and what I learned:


Notes

  • This is a personal portfolio project, not production code. Some shortcuts (hard-coded top 3 lists, absolute file path in the Sankey helper table) reflect that.
  • Claude Code was used as a coding copilot — for syntax checks, Vega parameter tuning, CSS boilerplate, and DAX formatting. Data modelling, design decisions, the choice of constraint, and the build sequence are my own.
  • Not affiliated with Netflix, Inc.

License

MIT. See LICENSE.
