ChatGPT Export Tool

A CLI for analyzing and exporting ChatGPT conversations.json files.

The project focuses on two things:

analyzing the structure of a ChatGPT export without loading the whole file into memory
exporting conversations to text or JSON with structural field filtering and metadata filtering

It uses streaming JSON parsing with ijson and is organized around small core modules so filtering, formatting, split behavior, and path generation can be changed independently.
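
As a rough illustration of the streaming approach, the sketch below counts conversations and messages one conversation at a time. It assumes the export is a top-level JSON array whose items carry a mapping of nodes; the function names iter_conversations and summarize are hypothetical and are not taken from this project's code. The json fallback is there only so the sketch runs without ijson installed, and it is not streaming.

```python
import io
import json

try:
    import ijson  # third-party streaming parser named in this README
except ImportError:  # fallback only so the sketch runs without ijson
    ijson = None


def iter_conversations(stream):
    """Yield conversation objects one at a time from a binary stream."""
    if ijson is not None:
        # The "item" prefix selects each element of the top-level array.
        yield from ijson.items(stream, "item")
    else:
        # Non-streaming fallback: this loads the whole document at once.
        yield from json.load(stream)


def summarize(stream):
    """Count conversations and messages without keeping conversations around."""
    conversations = 0
    messages = 0
    for conversation in iter_conversations(stream):
        conversations += 1
        # Assumption: each conversation has a "mapping" of node id -> node.
        mapping = conversation.get("mapping") or {}
        messages += sum(1 for node in mapping.values() if node.get("message"))
    return {"conversations": conversations, "messages": messages}
```

Called on a file opened in binary mode, for example summarize(open("conversations.json", "rb")), only one conversation is held in memory at a time when ijson is available.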

Persistent defaults can be stored in a single TOML config file. The repo ships a template at chatgpt_export.toml.example.

Installation

This project currently targets Python 3.10+.

git clone https://github.com/voidfreud/chatgpt-export-tool.git
cd chatgpt-export-tool
uv sync

To install the development tooling as well:

uv sync --group dev

You can then run the CLI with:

uv run chatgpt-export --help

To apply config defaults, copy the template and pass --config PATH.

Quick Start

Analyze an export:

uv run chatgpt-export analyze path/to/conversations.json

Include field coverage:

uv run chatgpt-export analyze path/to/conversations.json --fields

Export everything as text to stdout:

uv run chatgpt-export export path/to/conversations.json

Export everything as JSON to one file:

uv run chatgpt-export export path/to/conversations.json --format json --output conversations.json

Export one file per conversation:

uv run chatgpt-export export path/to/conversations.json --split subject --output-dir exports

Commands

`analyze`

analyze reports high-level structure and statistics for a conversations.json file.

It includes:

conversation count
message count
file size
date range
optional field coverage with --fields

Examples:

uv run chatgpt-export analyze data.json
uv run chatgpt-export analyze data.json --fields
uv run chatgpt-export analyze data.json --verbose --output analysis.txt
uv run chatgpt-export analyze data.json --debug

`export`

export writes conversations in either text or JSON format.

It supports:

structural field filtering through --fields
metadata filtering through --include and --exclude
transcript-oriented text export that follows the active branch
split modes for one output, one file per conversation, date folders, or ID-based files

Examples:

uv run chatgpt-export export data.json
uv run chatgpt-export export data.json --output conversations.txt
uv run chatgpt-export export data.json --format json --output conversations.json
uv run chatgpt-export export data.json --split subject --output-dir exports
uv run chatgpt-export export data.json --fields "groups minimal" --split subject --output-dir exports
uv run chatgpt-export export data.json --fields "include title,mapping" --include "model*" --exclude plugin_ids
cp chatgpt_export.toml.example chatgpt_export.toml
uv run chatgpt-export export data.json --config chatgpt_export.toml

Field Filtering

The --fields option controls which structural fields are retained before formatting.

Supported forms:

all
none
include field1,field2
exclude field1,field2
groups group1,group2

Examples:

uv run chatgpt-export export data.json --fields all
uv run chatgpt-export export data.json --fields none
uv run chatgpt-export export data.json --fields "include title,create_time,mapping"
uv run chatgpt-export export data.json --fields "exclude moderation_results,plugin_ids"
uv run chatgpt-export export data.json --fields "groups minimal"
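
To make the supported forms concrete, here is a minimal sketch of how such a spec could be parsed and applied to one conversation. It is not the project's implementation: parse_fields_spec, apply_fields, and the contents of EXAMPLE_GROUPS are hypothetical, and it assumes filtering only touches top-level keys and that none keeps nothing.

```python
def parse_fields_spec(spec):
    """Split a spec such as "include title,mapping" into (mode, names)."""
    mode, _, rest = spec.strip().partition(" ")
    if mode in ("all", "none"):
        return mode, []
    if mode not in ("include", "exclude", "groups"):
        raise ValueError(f"unsupported --fields form: {spec!r}")
    names = [name.strip() for name in rest.split(",") if name.strip()]
    if not names:
        raise ValueError(f"{mode} needs at least one name: {spec!r}")
    return mode, names


# Hypothetical group contents; the real definitions are in Fields.md.
EXAMPLE_GROUPS = {"minimal": ["title", "create_time", "mapping"]}


def apply_fields(conversation, spec, groups=EXAMPLE_GROUPS):
    """Return a copy of the conversation with only the selected top-level keys."""
    mode, names = parse_fields_spec(spec)
    if mode == "all":
        return dict(conversation)
    if mode == "none":
        return {}  # assumption: "none" retains no structural fields
    if mode == "groups":
        # Expand each group into its field names, then treat it as an include.
        names = [field for group in names for field in groups[group]]
        mode = "include"
    if mode == "include":
        return {key: value for key, value in conversation.items() if key in names}
    return {key: value for key, value in conversation.items() if key not in names}
```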

Available field groups:

conversation
message
metadata
minimal

See Fields.md for the current field-selection reference.

Metadata Filtering

The metadata filter runs after structural field filtering and applies only to keys inside nested message.metadata dictionaries.

Examples:

uv run chatgpt-export export data.json --include model_slug
uv run chatgpt-export export data.json --include "model*" --exclude plugin_ids
uv run chatgpt-export export data.json --fields "groups message" --include is_archived
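
The pattern matching in these examples can be approximated with shell-style globs from the standard library. The sketch below is an assumption about the semantics, not the tool's actual code: filter_metadata is a hypothetical name, an empty include list is taken to mean "keep everything", and exclude is assumed to win over include.

```python
from fnmatch import fnmatchcase


def filter_metadata(metadata, include=(), exclude=()):
    """Filter the keys of one message.metadata dictionary with glob patterns."""
    kept = {}
    for key, value in metadata.items():
        # Assumption: with no include patterns, every key is a candidate.
        if include and not any(fnmatchcase(key, pattern) for pattern in include):
            continue
        # Assumption: an exclude match always drops the key.
        if any(fnmatchcase(key, pattern) for pattern in exclude):
            continue
        kept[key] = value
    return kept
```

With include=["model*"] and exclude=["plugin_ids"], a dictionary holding model_slug, message_type, and plugin_ids would be reduced to model_slug alone.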

Currently supported metadata names include:

model_slug
message_type
plugin_ids
is_archived

Split Modes

export supports four split modes:

single: one combined output stream or one output file
subject: one file per conversation, named from title plus identifier
date: date folders with one file per conversation
id: one file per conversation, named from conversation ID

Important output behavior:

--split single with no --output writes to stdout
--split single --output FILE writes one file
--split subject, --split date, and --split id write into --output-dir
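
A per-conversation path for the three directory-based modes could be derived along these lines. The exact filename policy is not documented here, so the slug rules, the eight-character ID suffix, the date folder format, and the names slugify and output_path are all hypothetical; create_time is assumed to be a Unix timestamp.

```python
import re
from datetime import datetime, timezone
from pathlib import Path


def slugify(title):
    """Reduce a title to lowercase letters, digits, and hyphens."""
    slug = re.sub(r"[^A-Za-z0-9]+", "-", title).strip("-").lower()
    return slug or "untitled"


def output_path(conversation, split, output_dir, extension="txt"):
    """Build the output path for one conversation under a split mode."""
    conv_id = str(conversation.get("id", "unknown"))
    base = Path(output_dir)
    if split == "subject":
        # Title plus a short identifier, so equal titles do not collide.
        name = f"{slugify(conversation.get('title') or '')}_{conv_id[:8]}"
        return base / f"{name}.{extension}"
    if split == "date":
        # Assumption: create_time is seconds since the epoch, shown in UTC.
        created = datetime.fromtimestamp(
            float(conversation.get("create_time") or 0), tz=timezone.utc
        )
        return base / created.strftime("%Y-%m-%d") / f"{conv_id}.{extension}"
    if split == "id":
        return base / f"{conv_id}.{extension}"
    raise ValueError(f"split mode {split!r} does not write per-conversation files")
```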

Output Formats

Supported formats:

txt
json

txt is a transcript-oriented export that follows the active branch of the conversation tree. json writes the filtered conversation objects directly.
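
Following the active branch can be sketched as a walk from current_node back to the root through parent links, then reversing the result. This is an illustrative sketch rather than the project's transcript code: active_branch is a hypothetical name, and it assumes mapping is a dictionary from node ID to node and that each node stores its parent ID under parent.

```python
def active_branch(conversation):
    """Return the nodes on the active thread, ordered from root to current node."""
    mapping = conversation.get("mapping") or {}
    node_id = conversation.get("current_node")
    path = []
    seen = set()  # guards against cycles in malformed exports
    while node_id is not None and node_id in mapping and node_id not in seen:
        seen.add(node_id)
        node = mapping[node_id]
        path.append(node)
        node_id = node.get("parent")
    path.reverse()  # collected leaf-to-root, returned root-to-leaf
    return path
```

Nodes on abandoned branches, such as a regenerated answer that was later replaced, are never visited because no parent link on the active path points to them.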

By default, text export includes user text, assistant text, assistant thoughts, and user editable context when present. User editable context is rendered in a compact preview by default so transcripts stay readable. Text export hides tool plumbing, assistant code, reasoning recap, and blank/internal nodes unless the transcript policy is changed in config.

Text output defaults favor readability:

conversation context is rendered as a separate preamble block
visible turns are grouped into clearer chat-style User / Assistant sections
turn counts can be shown in the header
ChatGPT citation/navigation artifacts can be stripped from text output
long paragraphs can be wrapped for easier reading
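
Paragraph wrapping of this kind can be approximated with the standard textwrap module. The sketch below is only an illustration: wrap_paragraphs is a hypothetical name, paragraphs are assumed to be separated by blank lines, and a width of 0 or less is assumed to disable wrapping.

```python
import textwrap


def wrap_paragraphs(text, wrap_width=88):
    """Wrap each blank-line-separated paragraph to at most wrap_width columns."""
    if wrap_width <= 0:
        return text  # assumption: a non-positive width turns wrapping off
    paragraphs = text.split("\n\n")
    # textwrap.fill also collapses line breaks inside a paragraph.
    return "\n\n".join(textwrap.fill(p, width=wrap_width) for p in paragraphs)
```

Because textwrap.fill reflows whitespace, a real implementation would likely need to leave code blocks and other preformatted text untouched.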

Important transcript policy options include:

user_editable_context_mode
show_visually_hidden_content_types
include_content_types
exclude_content_types

Configuration

export accepts --config PATH and resolves defaults from one TOML file.

The repo ships chatgpt_export.toml.example as a template. Copy it to a local file such as chatgpt_export.toml and pass that path explicitly.

Configuration is intentionally kept to a single file, with sections such as:

[defaults] for format, split mode, field selection, and output directory
[transcript] for active-branch reconstruction and visibility rules
[text_output] for header fields, transcript layout, and date/time formats

Notable [text_output] options include:

layout_mode = "reading" | "compact"
heading_style = "plain" | "markdown"
include_turn_count_in_header = true | false
include_turn_numbers = true | false
turn_separator = "---"
strip_chatgpt_artifacts = true | false
wrap_width = 88

Practical transcript presets:

Reading-first transcript:

[text_output]
layout_mode = "reading"
heading_style = "plain"
include_turn_count_in_header = true
turn_separator = "---"
strip_chatgpt_artifacts = true
wrap_width = 88

Compact scanning transcript:

[text_output]
layout_mode = "compact"
include_turn_count_in_header = false
turn_separator = ""
wrap_width = 0

Markdown/notes transcript:

[text_output]
layout_mode = "reading"
heading_style = "markdown"
turn_separator = "---"

CLI arguments override TOML values. analyze does not currently use export config defaults.
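
The precedence rule can be pictured as three layers merged in order: built-in defaults, then the [defaults] table from the TOML file, then any CLI argument that was actually supplied. The sketch below is a hedged illustration; resolve_options, BUILTIN_DEFAULTS, and the key names format and split are assumptions, so check chatgpt_export.toml.example for the real keys.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # backport with the same API for Python 3.10

# Hypothetical built-in defaults used when neither layer sets a value.
BUILTIN_DEFAULTS = {"format": "txt", "split": "single"}


def resolve_options(toml_text, cli_args):
    """Merge defaults, TOML [defaults], and CLI arguments, in that order."""
    config = tomllib.loads(toml_text)
    resolved = dict(BUILTIN_DEFAULTS)
    resolved.update(config.get("defaults", {}))
    # A CLI value of None means the flag was not given, so it must not override.
    resolved.update({key: value for key, value in cli_args.items() if value is not None})
    return resolved
```

For example, a config file that sets split to subject combined with --split id on the command line resolves to id, while a format set only in the file is kept.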

Architecture

The structure is intentionally modular at the subsystem level:

command wiring and user-facing behavior live in chatgpt_export_tool/commands/
streaming parse and analysis are separate from export formatting and writing
structural field filtering and metadata filtering are separate concerns
split-key resolution, filename policy, and writing are isolated from export orchestration

The core package is also grouped into shallow subpackages by concern:

core/config/ for runtime config models, loading, and validation
core/transcript/ for branch reconstruction and transcript extraction
core/validation/ for field and metadata validation
core/output/ for formatting, naming, path resolution, and writing

That separation is deliberate: most behavior changes can be made in one small subsystem instead of in one large control file.

Development

Run the checks used during refactoring:

uv run pytest
uv run pytest --cov=chatgpt_export_tool --cov-report=term-missing
uv run ruff check chatgpt_export_tool tests pyproject.toml
uv run ruff format --check chatgpt_export_tool tests

If you need to format files:

uv run ruff format chatgpt_export_tool tests

Notes

Input handling is streaming, so large exports do not need to be loaded into memory just to analyze or iterate conversations.
Single-file JSON export writes one valid JSON document.
Split exports write one conversation per output file.
Text export follows the active thread path using current_node and parent links.
The field-selection and metadata-selection surface is documented in Fields.md.
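
One way to keep memory use flat while still producing a single valid JSON document is to write the array brackets and separators by hand and serialize one conversation at a time. This is a sketch of that general technique, not a description of the tool's writer; write_json_array is a hypothetical name.

```python
import io
import json


def write_json_array(conversations, out):
    """Write an iterable of conversations to a text stream as one JSON array."""
    out.write("[")
    for index, conversation in enumerate(conversations):
        if index:
            out.write(",")  # separator goes before every item except the first
        out.write("\n")
        json.dump(conversation, out, ensure_ascii=False)
    out.write("\n]\n")


# Usage with an in-memory stream; a generator works just as well as a list.
buffer = io.StringIO()
write_json_array(({"title": title} for title in ("first", "second")), buffer)
```

The output stays parseable for an empty input as well, since the brackets are written unconditionally.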
