Fix -filter flag and extend stdout streaming to returns tables #70
Open
cjohnson-confluent wants to merge 2 commits into gregrahn:master from
The -filter flag was silently broken in the v2.3.0 upstream import due to three separate bugs, all of which were originally fixed by Greg Rahn in 2013 (commit 7992dbb) and later reverted:

1. params.h: option was named _FILTER but is_set() looks up "FILTER"; prefix matching never matched, so filter mode never activated
2. print.c (print_start): fpOutfile = pTdef->outfile ran unconditionally, overwriting the stdout assignment with NULL on every row
3. w_store_sales.c: returns generation ran even in filter mode, writing both sales and returns rows to stdout in a single pass (interleaved)

This commit restores and extends those fixes:

- Fix all three bugs above
- Add -filter support for the three returns tables (store_returns, catalog_returns, web_returns). Because returns are generated as a side effect of their parent sales table, a g_filter_tabid global tracks the target table and driver.c redirects child table requests to the parent generator; print_start routes the target to stdout and suppresses the parent's output to /dev/null
- Auto-detect OS in makefile (Darwin -> MACOS, else LINUX) so the same build works on both macOS and Linux without manual OS= override
- Add -Wno-implicit-int -Wno-deprecated-non-prototype to MACOS_CFLAGS for compatibility with modern clang's stricter K&R C handling
- Add validation error when -filter is used without -table <name>

Output verified byte-for-byte identical to non-filter mode.

Usage:
    ./dsdgen -scale N -table store_sales -filter -quiet 2>/dev/null | gzip > store_sales.gz
    ./dsdgen -scale N -table store_returns -filter -quiet 2>/dev/null | gzip > store_returns.gz
    (same pattern for catalog_sales/returns, web_sales/returns)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
print.c is linked into both dsdgen and dsqgen; driver.c is dsdgen-only. The extern declaration of g_filter_tabid in print.c caused an undefined symbol error when linking dsqgen, because the definition lived in driver.c.

Fix: move the definition (and its initializer) to print.c, and change driver.c to declare it extern. print.c is the right owner: it is the translation unit that actually reads the variable in print_start().

Smoke tested: normal file output, -filter stdout for sales and returns tables, error on -filter without -table, and dsqgen -filter all pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
What was broken
The -filter flag (which streams generated data to stdout instead of writing files) was silently broken by the v2.3.0 upstream import (12caac0), which reverted three fixes Greg Rahn had originally landed in 7992dbb:

1. params.h: The option was named _FILTER, but is_set() searches for "FILTER" using prefix matching. It never matched, so filter mode never activated for any table.
2. print.c (print_start): fpOutfile = pTdef->outfile ran unconditionally, overwriting the stdout assignment with NULL on every row.
3. w_store_sales.c: Returns generation ran even in filter mode, writing interleaved sales and returns rows to stdout in a single pass.
What's new
- Adds -filter support to the three returns tables (store_returns, catalog_returns, web_returns). Because returns are generated as a side effect of their parent sales table, a g_filter_tabid global tracks the target table; driver.c redirects child table requests to the parent generator, and print_start routes only the target to stdout, suppressing the parent's output to /dev/null.
- Auto-detects the OS in makefile (Darwin → MACOS, else LINUX) so the same build works on macOS and Linux without a manual OS= override.
- Adds compiler flags to MACOS_CFLAGS for compatibility with modern clang (-Wno-implicit-int -Wno-deprecated-non-prototype -fcommon).
- Reports a validation error when -filter is used without specifying -table <name>.

Verification
Output verified byte-for-byte identical to non-filter mode for all tables.
Usage