Implement P2300 bulk adapter for HPX executors#7240
Implement P2300 bulk adapter for HPX executors#7240shivansh023023 wants to merge 7 commits intoTheHPXProject:masterfrom
Conversation
|
Can one of the admins verify this patch? |
Up to standards ✅🟢 Issues
|
5cdcfbd to
7b3fc5f
Compare
|
@shivansh023023 Just FYI, before you invest more time into our old sender/receiver implementation: #7123 will soon remove almost all of our code related to this. Please focus your efforts on whatever stays after this has been merged. |
|
Thank you for the heads-up regarding #7123. I appreciate the guidance to avoid spending effort on code that will soon be deprecated. I will pause my work on the current P2300 bulk adapter and follow the progress of #7123 closely. Once that is merged, I’ll re-evaluate how to implement the bulk functionality within the new framework. In the meantime, I’ll focus on finishing the review items for PR #7070 (local convenience header) and cleaning up any remaining issues on my other active PRs that are unaffected by the sender/receiver removal. |
| HPX_NO_UNIQUE_ADDRESS std::decay_t<Shape> shape_; | ||
| HPX_NO_UNIQUE_ADDRESS std::decay_t<F> f_; | ||
|
|
||
| #if defined(HPX_HAVE_STDEXEC) |
There was a problem hiding this comment.
Please remove the case !defined(HPX_HAVE_STDEXEC) as we have dropped our own implementation of senders&receivers.
There was a problem hiding this comment.
#if defined(HPX_HAVE_STDEXEC) can be removed now as well.
|
|
||
| #include <hpx/config.hpp> | ||
| #include <hpx/assert.hpp> | ||
| #include <hpx/execution/queries/get_scheduler.hpp> |
There was a problem hiding this comment.
This file does not exist anymore. Please adapt.
| #include <hpx/executors/detail/index_queue_spawning.hpp> | ||
| #include <hpx/executors/execution_policy_mappings.hpp> | ||
| #include <hpx/executors/executor_scheduler.hpp> | ||
| #include <hpx/executors/executor_scheduler_bulk.hpp> |
There was a problem hiding this comment.
Can we get away without pulling this into the executor file? Would forward declaring some of the stdexec types be sufficient?
There was a problem hiding this comment.
I think you can rmeove these files now as you have added the forwarding declaration.
| #include <hpx/config.hpp> | ||
| #include <hpx/executors/execution_policy_mappings.hpp> | ||
| #include <hpx/executors/executor_scheduler.hpp> | ||
| #include <hpx/executors/executor_scheduler_bulk.hpp> |
There was a problem hiding this comment.
Same here: can avoid pulling in the stdexec types here?
7b3fc5f to
70c4501
Compare
|
Hi @hkaiser Thank you for the feedback. I have refactored the PR to align with the latest stdexec changes: Purged Legacy Code: Removed all !defined(HPX_HAVE_STDEXEC) blocks from executor_scheduler_bulk.hpp. Header Optimization: Removed the inclusions of executor_scheduler.hpp and executor_scheduler_bulk.hpp from the parallel_executor and sequenced_executor headers. Forward Declarations: Implemented forward declarations for the returned scheduler types in the executor headers to prevent bloat. Fixed Includes: Removed the reference to the deprecated get_scheduler.hpp header. The headers are now much cleaner. Let me know if there are any other areas you'd like me to slim down! |
| namespace hpx::execution::experimental { | ||
| template <typename Executor> | ||
| struct executor_scheduler; | ||
| } |
There was a problem hiding this comment.
May I suggest to create a forwarding header where this type is declared? This would avoid having the declaration copied all over the place.
- Integrated hpx::execution::experimental::bulk with internal bulk_sync_execute - Implemented robust tag_invoke(connect_t) with full const-correctness - Fixed receiver reference collapsing using std::decay_t - Added executor_algorithm_bulk unit tests for parallel and sequenced execution
…nder/receiver concepts, NVCC guard, dedup get_scheduler, fix post() UB, use get_scheduler in tests
bbfc7db to
1f4aeee
Compare
|
Hi @hkaiser Thank you for the architectural guidance! I have implemented your suggestions to clean up the header dependencies and tag_invoke logic: Forwarding Header: I created executor_scheduler_fwd.hpp and moved the declarations there, which allowed me to strip the redundant forward declarations from the individual executor headers. Tag-Invoke Overloads: I removed the default template arguments and duplicate overloads in parallel_executor and sequenced_executor, replacing them with the concrete friend functions you suggested. NVCC Fix: I ensured the NVCC/CUDACC guards for the destructors were correctly restored. Standard Test Patterns: I updated the unit tests to use the standard ex::get_scheduler(exec) call instead of manual constructor calls. The structure feels much more consistent with the rest of the HPX core modules now. Thanks! |
|
Hi @charan-003 Thanks for the catch on the stdexec compliance and the safety issues! I’ve updated the implementation as follows: Sender/Receiver Concepts: I added sender_concept to executor_sender and receiver_concept to executor_bulk_receiver. These are now correctly picked up by the stdexec concept checks. UB Fix in post: I refactored the lambda capture in executor_operation_state::start(). It now captures the state by reference and only moves the receiver inside the task body. This ensures that if hpx::parallel::execution::post throws, the receiver is still valid for the set_error call in the catch block. I really appreciate the eye for detail on that potential move-before-post UB,definitely a safer implementation now! |
| HPX_NO_UNIQUE_ADDRESS std::decay_t<Shape> shape_; | ||
| HPX_NO_UNIQUE_ADDRESS std::decay_t<F> f_; | ||
|
|
||
| #if defined(HPX_HAVE_STDEXEC) |
There was a problem hiding this comment.
#if defined(HPX_HAVE_STDEXEC) can be removed now as well.
| #include <hpx/executors/detail/index_queue_spawning.hpp> | ||
| #include <hpx/executors/execution_policy_mappings.hpp> | ||
| #include <hpx/executors/executor_scheduler.hpp> | ||
| #include <hpx/executors/executor_scheduler_bulk.hpp> |
There was a problem hiding this comment.
I think you can rmeove these files now as you have added the forwarding declaration.
d0b7f07 to
7ab5571
Compare
|
@hkaiser C++20 Concepts: I’ve replaced the legacy std::enable_if_t in the executor_scheduler constructor with a clean C++20 requires clause and included the header. Purged Redundant Guards: Removed the #if defined(HPX_HAVE_STDEXEC) guards in executor_scheduler_bulk.hpp, as we now rely strictly on the modern Sender/Receiver path. Header Optimization: I have completely removed the implementation headers (executor_scheduler.hpp and executor_scheduler_bulk.hpp) from parallel_executor.hpp and sequenced_executor.hpp. These files now rely solely on the lightweight executor_scheduler_fwd.hpp header, significantly reducing the include bloat for the core executors. The code is now much cleaner and follows the forwarding pattern we discussed. It should be ready for a final review! |
|
@shivansh023023 Thanks for the updates. I'd like to wait for #7257 to be finalized before merging anything else related to s&r. Please be patient as this may take a moment. |
Sure sir |
P2300
bulkIntegration for HPX ExecutorsDescription
This PR implements the P2300
bulksender adapter specifically for HPX's legacy executors, including theparallel_executorandsequenced_executor. By bridging thehpx::execution::experimental::bulkalgorithm with HPX's internalbulk_sync_executemechanism, this change ensures that data-parallel workloads are properly load-balanced and optimized using HPX's high-performance partitioners.Key Technical Improvements:
bulksender to the underlying HPX execution engine for native parallel performance.tag_invokeOverloads: Implemented a full suite ofconnect_toverloads (supporting&&,&, andconst&) to ensure compatibility with consumer algorithms likesync_wait.std::decay_twithin thebulk_receiverto prevent reference collapsing and ensure the safety of deferred functional object execution.executorsmodule rather than the coreexecutionmodule.Proposed Changes
libs/core/executors/include/hpx/executors/executor_scheduler_bulk.hpp: Implementsexecutor_bulk_senderandexecutor_bulk_receiver.libs/core/executors/include/hpx/executors/executor_scheduler.hpp: Exposesget_completion_scheduler_tfor ADL discovery.libs/core/executors/include/hpx/executors/parallel_executor.hpp&sequenced_executor.hpp: Integrated the new bulk headers.libs/core/executors/tests/unit/executor_algorithm_bulk.cpp: Comprehensive validation for sequential and parallel policies.Background context
This work is part of the ongoing effort to modernize HPX's execution model to align with the C++23 Sender/Receiver (P2300) standard. Implementing
bulkis a "Big Impact" milestone because it allows modern asynchronous pipelines to tap into the mature, multi-threaded performance of the HPX runtime.Checklist
executor_algorithm_bulk_testtarget.inspecttool.