
Fix: Validate join columns before join execution (#118) #159

Open
manavgupta26 wants to merge 1 commit into DataHaskell:main from manavgupta26:fix-join-column-validation


@manavgupta26

Fix for Issue #118

Previously, when a non-existent column was passed to a join function (e.g. rightJoin ["Cats"]), the join proceeded silently with empty join keys and returned a Cartesian product of the two dataframes.

This happened because no validation was performed before computing row hashes.
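To illustrate the failure mode, here is a minimal sketch, not the library's code. It assumes a toy row type (a Map from column name to value) and compares keys directly instead of hashing them. The names Row, joinKey and naiveInnerJoin are hypothetical. When a requested column is missing, every row yields the same empty key, so every left row matches every right row.

```haskell
import qualified Data.Map.Strict as M
import Data.Maybe (mapMaybe)

-- Toy row: column name -> value. An assumption for illustration only,
-- not the dataframe library's internal representation.
type Row = M.Map String String

-- Build a join key from the requested columns, silently skipping any
-- column the row does not contain (i.e. no validation).
joinKey :: [String] -> Row -> [String]
joinKey cols row = mapMaybe (`M.lookup` row) cols

-- Naive inner join: pair every left row with every right row whose
-- key is equal. With an empty key on both sides, all pairs match.
naiveInnerJoin :: [String] -> [Row] -> [Row] -> [(Row, Row)]
naiveInnerJoin cols ls rs =
  [ (l, r) | l <- ls, r <- rs, joinKey cols l == joinKey cols r ]
```

With 2 left rows and 3 right rows that only have an "id" column, joining on ["Cats"] returns all 6 pairs, while joining on ["id"] returns only the rows whose ids agree.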

Changes:

  • Added validateJoinColumns helper function
  • Added validation at the beginning of:
    • innerJoin
    • leftJoin
    • fullOuterJoin
  • rightJoin delegates to leftJoin and is automatically covered

Now the join functions throw an error if any specified column is missing from either dataframe.
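The check could look roughly like the sketch below. This is not the validateJoinColumns helper from this PR: the name validateJoinColumnsSketch, the JoinError type, the use of plain String column names, and returning Either instead of throwing are all assumptions made to keep the example self-contained.

```haskell
-- Hypothetical error type: which side lacks which join columns.
data JoinError = MissingJoinColumns
  { side    :: String    -- "left" or "right"
  , missing :: [String]  -- requested columns not found on that side
  } deriving (Eq, Show)

-- Check that every join key exists in both column lists.
-- Arguments: join keys, left column names, right column names.
-- The left side is reported first if both sides have missing keys.
validateJoinColumnsSketch
  :: [String] -> [String] -> [String] -> Either JoinError ()
validateJoinColumnsSketch keys leftCols rightCols
  | not (null missingLeft)  = Left (MissingJoinColumns "left" missingLeft)
  | not (null missingRight) = Left (MissingJoinColumns "right" missingRight)
  | otherwise               = Right ()
  where
    missingLeft  = filter (`notElem` leftCols) keys
    missingRight = filter (`notElem` rightCols) keys
```

Running a check like this before any keys or hashes are computed means a typo in a column name is reported immediately instead of producing an oversized result.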


2 participants