Skip to content

feat: add executor pool support#687

Merged
wgtmac merged 1 commit into
apache:mainfrom
HuaHuaY:infra
Jun 5, 2026
Merged

feat: add executor pool support#687
wgtmac merged 1 commit into
apache:mainfrom
HuaHuaY:infra

Conversation

@HuaHuaY
Copy link
Copy Markdown
Contributor

@HuaHuaY HuaHuaY commented May 28, 2026

No description provided.

Copy link
Copy Markdown
Member

@wgtmac wgtmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work. The abstraction is clean and the test coverage is solid. Since the default executor is nullopt, existing paths stay single-threaded, so the risk is well contained.

A couple of things before merge:

  • The PR has no description. Worth adding the motivation and a short design note, ideally with the parallel/serial numbers that justify it.
  • The RetryRunner change from a runtime fluent API (OnlyRetryOn/StopRetryOn) to the compile-time RetryRunner<Policy> is a breaking API change that's independent of executor support. Consider splitting it into its own PR so it gets reviewed on its own merits.

Left a few inline notes.

Comment thread src/iceberg/util/task_group.cc
Comment thread src/iceberg/util/task_group.h
Comment thread src/iceberg/manifest/manifest_group.cc Outdated
Copy link
Copy Markdown
Member

@wgtmac wgtmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more minor notes, non-blocking.

Comment thread src/iceberg/manifest/manifest_filter_manager.cc Outdated
Comment thread src/iceberg/update/snapshot_update.cc Outdated
Comment thread src/iceberg/util/retry_util.h
Copy link
Copy Markdown
Member

@wgtmac wgtmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another pass, this time on the interface design and extensibility.

The overall shape is good: one virtual Executor that engines implement, threaded through the builders via PlanWith. The Arrow adapter test is a nice proof that an external pool drops in with basically one line, so "bring your own threadpool" is well covered.

On future async directions (C++23 coroutines, P2300 std::execution): the model here is synchronous and blocking, TaskGroup::Run() fans out and blocks on std::future::get(), and the planning APIs return Result<...> directly. So this is a parallel-for primitive, not a step toward a sender/receiver pipeline. That's a reasonable scope for now, I'd just flag it explicitly so nobody expects this interface to extend into async later, it'll be a separate one. Details inline.

Comment thread src/iceberg/util/executor.h
Comment thread src/iceberg/util/executor.h
Comment thread src/iceberg/util/task_group.cc
Comment thread src/iceberg/update/snapshot_update.h Outdated
Copy link
Copy Markdown
Member

@wgtmac wgtmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more pass. The main thing I found is a documentation gap around a new concurrency contract that now leaks onto user-supplied callbacks.

Once an executor is set, the user's ManifestWriterFactory and the shared FileIO get called from multiple worker threads. The tests already account for this (the factories use an atomic counter plus a barrier), so the requirement is understood, it's just not written down anywhere a downstream engine would see it. Worth documenting on the public surface.

Comment thread src/iceberg/manifest/manifest_merge_manager.h Outdated
Comment thread src/iceberg/delete_file_index.cc Outdated
*/

// Borrowed the file from Apache Arrow:
// https://github.com/apache/arrow/blob/main/cpp/src/arrow/util/functional.h
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to update LICENSE file

@wgtmac wgtmac merged commit 171ac57 into apache:main Jun 5, 2026
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants