feat: Implement Spark bin function#20479

Open
kazantsev-maksim wants to merge 12 commits into apache:main from kazantsev-maksim:spark_bin
@kazantsev-maksim
Contributor

Which issue does this PR close?

N/A

Rationale for this change

Add new function: https://spark.apache.org/docs/latest/api/sql/index.html#bin

What changes are included in this PR?

  • Implementation
  • Unit Tests
  • SLT tests

Are these changes tested?

Yes, tests added as part of this PR.

Are there any user-facing changes?

No, this is a new function.

@github-actions github-actions bot added sqllogictest SQL Logic Tests (.slt) spark labels Feb 22, 2026
## PySpark 3.5.5 Result: {'bin(13.3)': '1101', 'typeof(bin(13.3))': 'string', 'typeof(13.3)': 'decimal(3,1)'}
#query
#SELECT bin(13.3::decimal(3,1));
query T
Contributor

We should add test cases for null handling and for inputs read from actual table columns.
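
To make the requested coverage concrete, here is a minimal sketch in plain Rust (not the PR's code and not the DataFusion array API) of running bin over a nullable column. The helper names `bin_i64` and `bin_column` are hypothetical, and the sketch assumes that a NULL input yields a NULL output, which is the behavior the tests would need to confirm.

```rust
/// Hypothetical helper: binary string of an i64.
/// Rust's `{:b}` prints negative values in 64-bit two's complement.
fn bin_i64(value: i64) -> String {
    format!("{:b}", value)
}

/// Hypothetical helper: apply `bin_i64` to a nullable column.
/// Assumption: NULL in gives NULL out (modelled here as `None`).
fn bin_column(column: &[Option<i64>]) -> Vec<Option<String>> {
    column
        .iter()
        .copied()
        .map(|cell| cell.map(bin_i64))
        .collect()
}
```

A column holding 13, NULL and 0 would map to "1101", NULL and "0" under these assumptions.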

#SELECT bin(13::int);

## Original Query: SELECT bin(13.3);
## PySpark 3.5.5 Result: {'bin(13.3)': '1101', 'typeof(bin(13.3))': 'string', 'typeof(13.3)': 'decimal(3,1)'}
Contributor

Spark supports decimal inputs; we should add that support in this PR as well.
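
As a rough illustration of what decimal support could involve, the sketch below treats a decimal as an unscaled integer plus a scale, drops the fractional digits, and formats the result as a 64-bit value. The name `bin_decimal` is hypothetical, truncation toward zero is an assumption consistent with the PySpark result quoted above (13.3 gives 1101), and returning `None` on overflow is only a placeholder, not a claim about Spark's behavior.

```rust
/// Hypothetical helper: bin() of a decimal given as (unscaled value, scale),
/// e.g. 13.3 as decimal(3,1) is (133, 1).
fn bin_decimal(unscaled: i128, scale: u32) -> Option<String> {
    // 10^scale; None if the power itself overflows i128.
    let divisor = 10i128.checked_pow(scale)?;
    // Integer division in Rust truncates toward zero, dropping the fraction.
    let truncated = unscaled / divisor;
    // Placeholder overflow policy (assumption): give up if it does not fit in i64.
    if truncated < i64::MIN as i128 || truncated > i64::MAX as i128 {
        return None;
    }
    Some(format!("{:b}", truncated as i64))
}
```

With these assumptions, `bin_decimal(133, 1)` yields `Some("1101")`, matching the quoted result for `bin(13.3)`.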

Member

This could use coercible signature instead:

let int64 = Coercion::new_implicit(
    TypeSignatureClass::Native(logical_int64()),
    vec![TypeSignatureClass::Numeric],
    NativeType::Int64,
);
TypeSignature::Coercible(vec![int64])

Member

Actually, Spark's bin() even supports floating-point numbers:

spark-sql (default)> select bin(13);
1101
Time taken: 0.785 seconds, Fetched 1 row(s)
spark-sql (default)> select bin(-13);
1111111111111111111111111111111111111111111111111111111111110011
Time taken: 0.032 seconds, Fetched 1 row(s)
spark-sql (default)> select bin(13.3);
1101
Time taken: 0.04 seconds, Fetched 1 row(s)

The function documentation says: "Returns the string representation of the long value expr represented in binary".
So the snippet above should use logical_number.
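
The three spark-sql results above are consistent with converting the input to a 64-bit integer first and then printing its two's complement bits. A minimal Rust sketch of that reading follows; `bin_f64` is a hypothetical name, and using Rust's `as i64` cast (truncating toward zero) is an assumption that matches only the values shown here, not a statement about Spark's handling of NaN or out-of-range floats.

```rust
/// Hypothetical helper: bin() of a floating-point input.
fn bin_f64(value: f64) -> String {
    // Assumption: the fractional part is dropped before formatting.
    let as_long = value as i64;
    // `{:b}` on a negative i64 prints all 64 two's complement bits.
    format!("{:b}", as_long)
}
```

Under this sketch, `bin_f64(13.3)` gives "1101" and `bin_f64(-13.0)` gives the same 64-character string shown for `bin(-13)`.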

@kazantsev-maksim
Contributor Author

@martin-g @Jefffrey @jonathanc-n Thanks for the review!

Labels

spark sqllogictest SQL Logic Tests (.slt)

4 participants