feat(catalog): enrich catalog interface with more methods#102
feat(catalog): enrich catalog interface with more methods#102ChaomingZhangCN wants to merge 16 commits intoalibaba:mainfrom
Conversation
| /// status. | ||
| virtual Result<std::vector<std::string>> ListTables(const std::string& db_name) const = 0; | ||
|
|
||
| /// Drops a database. |
There was a problem hiding this comment.
This PR includes too many unimplemented APIs at once, which may make the API documentation confusing. Would it be better to submit only a few APIs with their implementations in each PR?
There was a problem hiding this comment.
Thank you for your suggestion. I will implement some of them in the filesystem catalog.
There was a problem hiding this comment.
I think @lucasfang meant that the catalog's public interface should only expose methods that are already implemented — for example, DropDatabase, DropTable, RenameTable, and GetTable in this PR.
The unimplemented interfaces can be left in TODOs and added later when they're fully implemented. This helps avoid confusion in auto-generated documentation, which might otherwise include incomplete APIs.
Just wondering — for methods like DropDatabase in catalog, since they're meant to be implemented by subclasses, would it make sense to make them pure virtual (= 0) instead of providing a default "NotImplemented" implementation?
| const std::string& table_name) | ||
| : schema_(schema), database_(database), table_name_(table_name) {} | ||
|
|
||
| virtual ~Table() = default; |
There was a problem hiding this comment.
Will Table have subclasses? It seems like many functions are declared as virtual.
There was a problem hiding this comment.
Pull request overview
This PR expands the C++ Catalog public interface to better align with the Java catalog API (issue #84), and begins implementing a subset of the newly added methods for the filesystem-backed catalog.
Changes:
- Extends
include/paimon/catalog/catalog.hwith many additional catalog APIs (databases/tables/views/partitions/version-management). - Implements and tests
GetTable,DropDatabase,DropTable, andRenameTableinFileSystemCatalog. - Introduces new
Database(public header) andTable/Viewabstractions (currently undersrc/).
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
include/paimon/catalog/catalog.h |
Adds many new catalog methods and supporting type aliases/forward declarations. |
include/paimon/catalog/database.h |
Adds a new Database interface type for catalog APIs. |
src/paimon/core/catalog/file_system_catalog.h |
Extends filesystem catalog class with new overrides (DropDatabase, DropTable, RenameTable, GetTable). |
src/paimon/core/catalog/file_system_catalog.cpp |
Implements GetTable, DropDatabase, DropTable, RenameTable. |
src/paimon/core/catalog/file_system_catalog_test.cpp |
Adds unit tests covering the newly implemented filesystem catalog methods. |
src/paimon/core/table/table.h |
Adds a basic Table abstraction used by GetTable. |
src/paimon/core/view/view.h |
Adds a View interface abstraction (not yet implemented by filesystem catalog). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| class Database; | ||
| class Table; | ||
| class View; | ||
| class Schema; | ||
| class Snapshot; | ||
| class PartitionStatistics; | ||
| class Tag; | ||
| class Identifier; |
There was a problem hiding this comment.
This public header forward-declares Table, View, Snapshot, Tag, etc., and returns/accepts them via std::shared_ptr<> in the public API. Since only include/ is installed (not src/), API consumers won’t have access to these type definitions, and destroying a std::shared_ptr<IncompleteType> (created in the library with the default deleter) typically requires the complete type in the consumer TU. Please either move these interface types into public headers under include/, or adjust the API to avoid owning shared_ptr to incomplete types (e.g., return metadata/IDs, or use an exported custom deleter pattern).
| #pragma once | ||
|
|
||
| #include <map> | ||
| #include <memory> |
There was a problem hiding this comment.
This header uses std::optional in Comment() but does not include <optional>, which can cause compilation failures depending on include order. Add #include <optional> to this header.
| #include <memory> | |
| #include <memory> | |
| #include <optional> |
| class PAIMON_EXPORT Database { | ||
| public: | ||
| /// ================== Table Metadata ===================== | ||
|
|
||
| /// A name to identify this database. | ||
| virtual std::string Name() const = 0; | ||
|
|
There was a problem hiding this comment.
Database declares virtual methods but does not declare a virtual destructor. Deleting a derived instance through a Database* (or via smart pointers) would be undefined behavior. Add virtual ~Database() = default; (or a protected non-virtual destructor if you want to prevent deletion via base pointer).
| /// Get the table-level options associated with this schema. | ||
| /// @return Options | ||
| virtual const std::map<std::string, std::string>& Options() const = 0; | ||
|
|
||
| /// Get an optional comment describing the table. | ||
| /// @return The table comment if set, or std::nullopt otherwise. |
There was a problem hiding this comment.
The docstrings here refer to “table-level options” and “table comment”, but this interface is for a database. Please update the wording (database-level options/comment) to match the actual type and avoid confusing generated API docs.
| /// Get the table-level options associated with this schema. | |
| /// @return Options | |
| virtual const std::map<std::string, std::string>& Options() const = 0; | |
| /// Get an optional comment describing the table. | |
| /// @return The table comment if set, or std::nullopt otherwise. | |
| /// Get the database-level options associated with this database. | |
| /// @return Options | |
| virtual const std::map<std::string, std::string>& Options() const = 0; | |
| /// Get an optional comment describing the database. | |
| /// @return The database comment if set, or std::nullopt otherwise. |
|
|
||
| #include <chrono> | ||
| #include <map> | ||
| #include <memory> |
There was a problem hiding this comment.
Catalog now uses std::optional in multiple method signatures (e.g., RollbackTo/CreateBranch/CreateTag) but the header does not include <optional>, which will cause compilation failures for translation units that include this header without already including <optional>. Add #include <optional> here.
| #include <memory> | |
| #include <memory> | |
| #include <optional> |
| /// @return A result containing true if commit succeeded, or an error status. | ||
| virtual Result<bool> CommitSnapshot(const Identifier& identifier, const std::string& table_uuid, | ||
| const std::shared_ptr<Snapshot>& snapshot, | ||
| const std::vector<PartitionStatistics>& statistics) { |
There was a problem hiding this comment.
PartitionStatistics is only forward-declared here and appears inside std::vector<PartitionStatistics> in the public API. Standard containers generally require the element type to be complete, so this will either fail to compile or force users to include internal src/ headers. Consider moving PartitionStatistics into an installed public header (under include/), or change the API to use a public/opaque representation (e.g., a serializable struct, shared_ptr to a public interface, etc.).
| const std::vector<PartitionStatistics>& statistics) { | |
| const std::vector<std::shared_ptr<PartitionStatistics>>& statistics) { |
| /// Get tag for table. | ||
| /// | ||
| /// @param identifier Path of the table, cannot be system name. | ||
| /// @param tag_name Tag name | ||
| /// @return A result containing the tag information, or an error status. | ||
| virtual Result<std::shared_ptr<Tag>> GetTag(const Identifier& identifier, | ||
| const std::string& tag_name) const { | ||
| return Status::NotImplemented("GetTag not implemented"); |
There was a problem hiding this comment.
The PR description lists GetTag returning TagInfo, but the public API here declares GetTag returning std::shared_ptr<Tag>. Please align the PR description and the public interface (either update the description or adjust the API) to avoid confusion and documentation drift.
| virtual std::string Query() = 0; | ||
|
|
||
| /// Loads the schema of view. | ||
| virtual std::shared_ptr<Schema> GetSchema() = 0; |
There was a problem hiding this comment.
Query() and GetSchema() are getter-style APIs but are not marked const, which prevents calling them on a const View& and is inconsistent with Name()/FullName() being const. Consider making these methods const (e.g., virtual std::string Query() const = 0; and virtual std::shared_ptr<Schema> GetSchema() const = 0;).
| virtual std::string Query() = 0; | |
| /// Loads the schema of view. | |
| virtual std::shared_ptr<Schema> GetSchema() = 0; | |
| virtual std::string Query() const = 0; | |
| /// Loads the schema of view. | |
| virtual std::shared_ptr<Schema> GetSchema() const = 0; |
| } | ||
|
|
||
| /// Loads the latest schema of table. | ||
| virtual std::shared_ptr<Schema> LatestSchema() { |
There was a problem hiding this comment.
LatestSchema() returns the cached schema pointer and does not appear to mutate state, but it is not marked const. Making it const will improve const-correctness and allow calling it through const std::shared_ptr<Table>& / const Table&.
| virtual std::shared_ptr<Schema> LatestSchema() { | |
| virtual std::shared_ptr<Schema> LatestSchema() const { |
| /// Interface of a database in a catalog. | ||
| class PAIMON_EXPORT Database { | ||
| public: | ||
| /// ================== Table Metadata ===================== |
There was a problem hiding this comment.
virtual ~Database() = default;
| if (IsSystemTable(identifier)) { | ||
| return Status::Invalid(fmt::format("Cannot drop system table {}.", identifier.ToString())); | ||
| } | ||
|
|
There was a problem hiding this comment.
Add CheckNotBranch function here to avoid drop branch
| return Status::Invalid(fmt::format("Cannot drop system table {}.", identifier.ToString())); | ||
| } | ||
|
|
||
| PAIMON_ASSIGN_OR_RAISE(bool exist, TableExists(identifier)); |
There was a problem hiding this comment.
support drop branch/external path here or leave a TODO
Purpose
Close #84
API and Format
Documentation