feat: add CSV conversion command to ensrainbow CLI #1136
base: main
Conversation
🦋 Changeset detected. Latest commit: b02b7f1. The changes in this PR will be included in the next version bump. This PR includes changesets to release 14 packages.
- Introduced `convert-csv` command for converting CSV files to .ensrainbow format.
- Added support for single- and two-column CSV formats.
- Implemented error handling for invalid CSV data.
- Created tests for various CSV scenarios, including special characters and invalid formats.
- Updated package dependencies to include `csv-simple-parser` for CSV parsing.
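The single- vs two-column handling described above can be sketched as follows. This is a hypothetical illustration, not the PR's actual code: the real implementation uses a CSV parser (`csv-simple-parser`), and the type and function names here are stand-ins.

```typescript
// Hypothetical shape of the one- vs two-column CSV row handling.
interface RainbowRecord {
  label: string;
  providedHash?: string; // present only for two-column rows
}

function parseCsvRow(row: string[]): RainbowRecord {
  if (row.length === 1) {
    // Single column: only the label; its labelhash must be computed later.
    return { label: row[0] };
  }
  if (row.length === 2) {
    // Two columns: a pre-computed labelhash followed by the label.
    return { providedHash: row[0], label: row[1] };
  }
  throw new Error(`Invalid CSV row: expected 1 or 2 columns, got ${row.length}`);
}
```

Any other column count is treated as invalid CSV data, matching the error-handling bullet above.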
- Added new command-line options for CSV conversion: `--silent`, `--disable-dedup`, `--cache-size`, `--use-bloom-filter`, and `--bloom-filter-size`.
- Implemented a deduplication database using ClassicLevel with optional Bloom filter for faster processing.
- Updated the conversion process to support deduplication and improved memory management.
- Enhanced logging for large file processing and added tests for new deduplication features.
- Added a function to estimate memory usage of Maps for better tracking.
- Reduced default cache size in DeduplicationDB from 10000 to 1000.
- Enhanced backpressure handling during CSV writing to prevent memory overflow.
- Updated logging to include output backpressure events and improved performance for large files.
- Streamlined the CSV processing to operate in a completely sequential manner.
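The Map memory estimation mentioned above could look something like the sketch below. This is an illustrative approximation, not the PR's actual function: the overhead constant is an assumption, and real V8 memory accounting is more involved.

```typescript
// Rough sketch of estimating a Map's memory footprint. Assumes string
// keys/values at ~2 bytes per UTF-16 code unit, plus an assumed fixed
// per-entry overhead for the Map's internal bookkeeping.
const PER_ENTRY_OVERHEAD_BYTES = 64; // illustrative constant

function estimateMapMemoryBytes(map: Map<string, string>): number {
  let bytes = 0;
  for (const [key, value] of map) {
    bytes += (key.length + value.length) * 2 + PER_ENTRY_OVERHEAD_BYTES;
  }
  return bytes;
}
```

An estimate like this is cheap to compute on a progress interval and lets the converter log approximate cache pressure without walking the heap.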
Force-pushed from 6f4803c to 721a50d.
- Removed unused command-line options for deduplication and Bloom filter from the CLI interface.
- Updated default progress interval from 10000 to 50000 records for improved performance.
- Enhanced logging for file processing and memory management during CSV conversion.
- Cleaned up code for better readability and maintainability.
lightwalker-eth left a comment:
@djstrong Great work here 👍 Reviewed and shared some suggestions. Appreciate your advice 👍
"ensrainbow": patch
---

feat: add CSV conversion command to ensrainbow CLI
Suggested change:
- feat: add CSV conversion command to ensrainbow CLI
+ feat: add CSV conversion command to ensrainbow CLI to convert rainbow tables from CSV format to ensrainbow format
```bash title="Convert legacy SQL data"
pnpm run convert --input-file path/to/ens_names.sql.gz --output-file subgraph-0.ensrainbow
```
- **SQL Conversion**: Convert legacy ENS Subgraph data (`ens_names.sql.gz`) using `pnpm run convert`
Would it be a problem to flip these? In other words, the default conversion case moving forward will be converting from CSV files. Converting from SQL files is what should be the special case.
Therefore it seems to me that convert should convert a CSV and convert-sql should convert a legacy SQL file such as the legacy ENS Subgraph data.
What do you think?
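The naming the reviewer proposes could be dispatched as sketched below. The command names follow the suggestion in the comment; the dispatch function and `Converter` type are hypothetical, not the CLI's actual structure.

```typescript
// Sketch of the proposed command naming: `convert` handles CSV (the default
// case going forward), `convert-sql` handles legacy SQL dumps.
type Converter = (inputFile: string, outputFile: string) => void;

function resolveConverter(
  command: string,
  converters: { csv: Converter; sql: Converter },
): Converter {
  switch (command) {
    case "convert":
      return converters.csv; // default: CSV is the common case
    case "convert-sql":
      return converters.sql; // special case: legacy ENS Subgraph SQL dumps
    default:
      throw new Error(`Unknown command: ${command}`);
  }
}
```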
```bash title="Convert legacy SQL data"
pnpm run convert --input-file path/to/ens_names.sql.gz --output-file subgraph-0.ensrainbow
```
- **SQL Conversion**: Convert legacy ENS Subgraph data (`ens_names.sql.gz`) using `pnpm run convert`
For converting legacy ENS Subgraph data, suggest including a link to https://github.com/graphprotocol/ens-rainbow
@@ -0,0 +1,675 @@
---
title: Creating ENSRainbow Files
description: Complete guide to creating .ensrainbow files from SQL dumps and CSV data.
Suggested change:
- description: Complete guide to creating .ensrainbow files from SQL dumps and CSV data.
+ description: Complete guide to creating .ensrainbow files.
sidebar:
  label: Creating Files
  order: 3
keywords: [ensrainbow, file creation, conversion, sql, csv]
Suggested change:
- keywords: [ensrainbow, file creation, conversion, sql, csv]
+ keywords: [ensrainbow, file creation, conversion, csv]
Goal: The .sql file conversion case is a really niche use case. Happy for us to include docs about it, however we should limit our promotion of the .sql conversion case. It exists only for legacy purposes. In general we should only talk about the .csv conversion case.
}
const maybeLabelHash = providedHash.startsWith("0x") ? providedHash : `0x${providedHash}`;
try {
  const labelHash = labelHashToBytes(maybeLabelHash as LabelHash);
Hmm, where did we validate that the provided labelhash is actually the correct labelhash for the label? Or was there a performance issue if we did that?
At the very minimum I would want us to add a check here that the bytes we are given are the expected length of a valid labelhash. For example, "0x1234" is not enough bytes and should be invalid. Or is that already handled inside labelHashToBytes?
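The minimal length check the reviewer asks for could look like this. A labelhash is a keccak-256 output, i.e. exactly 32 bytes (64 hex characters after `0x`); whether `labelHashToBytes` already enforces this is the open question above, and the function name here is hypothetical.

```typescript
// Sketch: a well-formed labelhash is "0x" followed by exactly 64 hex chars.
// This checks shape only; it does NOT verify the hash matches the label.
function isWellFormedLabelHash(hash: string): boolean {
  return /^0x[0-9a-fA-F]{64}$/.test(hash);
}
```

A cheap shape check like this would reject inputs such as `"0x1234"` without paying the cost of recomputing keccak-256 per row.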
  label: label,
};
} else {
  // Two columns: validate and use provided hash
I see here it says "validate" but I don't see full validation?
outputStream: NodeJS.WritableStream,
lineNumber: number,
existingDb: ENSRainbowDB | null,
dedupDb: DeduplicationDB,
What if we removed DeduplicationDb and instead used existingDb for this purpose?
In other words, what if we changed the convert command so that it was always a combined "convert + ingest"?
In other words:
- If we ingest from a .ensrainbow file then it doesn't need to output a new .ensrainbow file
- If we ingest from a .csv file or a .sql file then it might initialize / update the `existingDb` while also producing the incremental .ensrainbow file as output.
If we did this then it seems we could ingest as we convert and then remove the need for this dedupDb?
Appreciate your advice if this is a good idea or if it would add a bunch of scope. Thanks
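The combined "convert + ingest" flow being floated could be sketched as below. Everything here is hypothetical: `existingDb` is modeled as an in-memory Map for illustration, while the real store would be a LevelDB instance and the real records carry labelhashes, not bare labels.

```typescript
// Sketch: dedup against the existing database instead of a separate dedupDb.
function convertRecord(
  label: string,
  existingDb: Map<string, true>,
  emit: (label: string) => void,
): boolean {
  if (existingDb.has(label)) {
    return false; // already ingested: skip, no dedupDb needed
  }
  existingDb.set(label, true); // ingest as we convert...
  emit(label); // ...while also emitting the incremental output record
  return true;
}
```

The trade-off discussed below is that the existing database becomes mutable during conversion, so a mid-run failure can leave it in a partially updated state.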
The drawback of the proposed approach is that the existing database would change during the process, or could even be left in an unhealthy state if an error occurs mid-process.
In DeduplicationDB we save only labels, so I hope the change would not increase memory consumption.
This is probably a good change for user experience, but it may take some time to implement and test.
The decision is to not combine the commands in this PR.
 */
export async function convertCsvCommand(options: ConvertCsvCommandOptions): Promise<void> {
  // Validate that existingDbPath is provided when labelSetVersion > 0
  if (options.labelSetVersion > 0 && !options.existingDbPath) {
Please see my other feedback suggesting that we remove the labelSetVersion option from this command as it can be determined dynamically through the existingDbPath
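Determining the version dynamically, as suggested, could be sketched like this. The `ExistingDbInfo` type and function name are hypothetical stand-ins for whatever the PR's actual database interface exposes.

```typescript
// Sketch: derive the next label set version from the existing database
// instead of taking a user-supplied --label-set-version option.
interface ExistingDbInfo {
  highestLabelSetVersion: number;
}

function resolveLabelSetVersion(existing: ExistingDbInfo | null): number {
  // No existing database: this is the first label set, version 0.
  if (existing === null) return 0;
  // Otherwise the new file extends the existing sets by one version.
  return existing.highestLabelSetVersion + 1;
}
```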
import { ENSRainbowDB } from "@/lib/database";
import { convertCsvCommand } from "./convert-csv-command";
Super work on these tests 🚀
`convert-csv` command for converting CSV files to .ensrainbow format. `fast-csv` for CSV parsing.