Skip to content

[SPARK-55799][PYTHON][TESTS] Add datasource tests with simple worker#54580

Open
gaogaotiantian wants to merge 2 commits intoapache:masterfrom
gaogaotiantian:add-simple-worker-datasource-test
Open

[SPARK-55799][PYTHON][TESTS] Add datasource tests with simple worker#54580
gaogaotiantian wants to merge 2 commits intoapache:masterfrom
gaogaotiantian:add-simple-worker-datasource-test

Conversation

@gaogaotiantian
Copy link
Contributor

What changes were proposed in this pull request?

  • Fixed is_pyspark_module in worker logging so it can recognize simple workers
  • Add a simpler worker test suite that runs all data source tests with simple worker.
  • Fixed the test where it always expects 2 abort messages which is not guaranteed.

Why are the changes needed?

We support simple worker for data source but we have 0 test for it. Some issues were already fixed in the previous PRs and this is the one to enable the test.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

test_python_datasource passed locally.

Was this patch authored or co-authored using generative AI tooling?

No.

Comment on lines +226 to +230
def is_pyspark_module(frame: FrameType) -> bool:
module_name = frame.f_globals.get("__name__", "")
if module_name == "__main__":
if mod := sys.modules.get("__main__", None):
module_name = mod.__spec__.name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just so i understand, these logging changes were needed/discovered after adding the test or just a cleanup?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed for the test. The original code did not consider the simple worker case, where the worker module is executed with python -m XXX - the __name__ would be __main__ and our method to check if it's pyspark module will give a wrong result.

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im fine with this

Copy link
Contributor

@allisonwang-db allisonwang-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants