feat(ptodsl): Migrate TileLib from TilelangDSL to PTODSL: Daemon and metadata#894
feat(ptodsl): Migrate TileLib from TilelangDSL to PTODSL: Daemon and metadata#894ManiSadati wants to merge 6 commits into
Conversation
Codex Review该评论由 review 机器人自动更新。
SummaryReview failed at stage Findings未生成结构化 findings,因为 review 过程提前失败。 Log Tail |
… and constraint support
There was a problem hiding this comment.
Code Review
This pull request introduces the PTODSL TileLib backend, adding the InsertTemplateAttributes pass to attach legal template candidates to tile operations and updating ExpandTileOp to support both TileLang and PTODSL backends. It also implements a Unix-socket daemon server and client in Python to handle template metadata queries and specialization rendering. The review feedback is highly constructive, pointing out critical issues such as a potential denial-of-service vulnerability in socket message framing, race conditions in the template registry under concurrent access, and incorrect cache eviction logic. Additionally, the feedback addresses cross-platform compilation issues on Windows caused by POSIX-specific headers and functions, and suggests using MLIR's diagnostic infrastructure instead of direct llvm::errs() calls for better error reporting.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| def recv_message(sock) -> dict: | ||
| """Receive one length-prefixed UTF-8 JSON message.""" | ||
| length = int.from_bytes(recv_exactly(sock, 4), byteorder="big") | ||
| return json.loads(recv_exactly(sock, length).decode("utf-8")) |
There was a problem hiding this comment.
Reading the length prefix from the socket and passing it directly to recv_exactly without validation can lead to massive memory allocation or hanging if the length prefix is corrupted or maliciously set to a very large value (e.g., up to 4 GB). Adding a maximum message size limit (e.g., 100 MB) is a standard security best practice to prevent denial of service (DoS) via memory exhaustion.
def recv_message(sock) -> dict:
"""Receive one length-prefixed UTF-8 JSON message."""
max_message_size = 100 * 1024 * 1024 # 100 MB
length = int.from_bytes(recv_exactly(sock, 4), byteorder=\"big\")
if length > max_message_size:
raise ValueError(f"Message length {length} exceeds maximum allowed size")
return json.loads(recv_exactly(sock, length).decode(\"utf-8\"))There was a problem hiding this comment.
Fixed. Added a 64 MiB MAX_MESSAGE_SIZE limit and validate both received and outgoing messages. Oversized input is rejected before reading its payload. Added a regression test covering this case.
| class TileTemplateRegistry: | ||
| def __init__(self): | ||
| self._descriptors: list = [] | ||
|
|
||
| def register(self, descriptor) -> None: | ||
| # Re-registration (e.g. module reload) replaces the prior entry with the same name. | ||
| self._descriptors = [ | ||
| d for d in self._descriptors | ||
| if not (d.op == descriptor.op and d.target == descriptor.target and d.name == descriptor.name) | ||
| ] | ||
| self._descriptors.append(descriptor) |
There was a problem hiding this comment.
The daemon runs as a multi-threaded server (ThreadingUnixStreamServer), meaning multiple threads can handle requests concurrently. If different operations are requested concurrently for the first time, they will trigger lazy template loading, which calls register concurrently. Since register modifies self._descriptors (a plain list) without synchronization, this can lead to race conditions and corrupted registry state. Adding a threading.Lock protects the list from concurrent modification.
| class TileTemplateRegistry: | |
| def __init__(self): | |
| self._descriptors: list = [] | |
| def register(self, descriptor) -> None: | |
| # Re-registration (e.g. module reload) replaces the prior entry with the same name. | |
| self._descriptors = [ | |
| d for d in self._descriptors | |
| if not (d.op == descriptor.op and d.target == descriptor.target and d.name == descriptor.name) | |
| ] | |
| self._descriptors.append(descriptor) | |
| import threading | |
| class TileTemplateRegistry: | |
| def __init__(self): | |
| self._descriptors: list = [] | |
| self._lock = threading.Lock() | |
| def register(self, descriptor) -> None: | |
| # Re-registration (e.g. module reload) replaces the prior entry with the same name. | |
| with self._lock: | |
| self._descriptors = [ | |
| d for d in self._descriptors | |
| if not (d.op == descriptor.op and d.target == descriptor.target and d.name == descriptor.name) | |
| ] | |
| self._descriptors.append(descriptor) |
There was a problem hiding this comment.
Addressed by simplifying the server architecture. PTODSL now uses sequential socketserver.UnixStreamServer instead of ThreadingUnixStreamServer. Requests and lazy template registration execute on one server thread, so the registry no longer requires locking.
| if os.path.exists(args.socket): | ||
| os.unlink(args.socket) |
There was a problem hiding this comment.
If args.socket is a broken symlink, os.path.exists returns False, so os.unlink is not called. However, the symlink still exists in the filesystem, which can cause the subsequent bind to fail with Address already in use. Using a try/except FileNotFoundError block safely unlinks the socket path regardless of whether it is a broken symlink.
| if os.path.exists(args.socket): | |
| os.unlink(args.socket) | |
| try: | |
| os.unlink(args.socket) | |
| except FileNotFoundError: | |
| pass |
There was a problem hiding this comment.
Fixed. Socket cleanup now calls os.unlink() directly and catches FileNotFoundError, which also correctly removes broken symlinks. The same helper is used during startup and shutdown, with a regression test.
|
|
||
| with self._state_lock: | ||
| if len(self._cache) >= self._max_entries: | ||
| self._cache.pop(next(iter(self._cache))) | ||
| self._stats["evictions"] += 1 |
There was a problem hiding this comment.
In _instantiate, if len(self._cache) >= self._max_entries, an entry is evicted. However, if key is already present in self._cache, setting self._cache[key] = mlir_text will overwrite the existing entry and not increase the size of the cache. Evicting another entry in this case is unnecessary and reduces cache efficiency. Checking if key not in self._cache before evicting prevents this.
with self._state_lock:
if key not in self._cache and len(self._cache) >= self._max_entries:
self._cache.pop(next(iter(self._cache)))
self._stats[\"evictions\"] += 1
self._cache[key] = mlir_textThere was a problem hiding this comment.
This no longer applies after changing the daemon to sequential UnixStreamServer. _instantiate checks and updates the cache within one request before another request can execute, so the key cannot be inserted concurrently between the lookup and eviction steps. The threading lock and concurrency-specific code were removed.
| llvm::errs() << "Note: Daemon process started (pid=" << procInfo.Pid | ||
| llvm::errs() << "Note: Daemon process started (pid=" << procInfo.Pid | ||
| << ") but socket not found. Check daemon logs.\n"; | ||
| kill(procInfo.Pid, SIGTERM); |
There was a problem hiding this comment.
The use of POSIX kill and SIGTERM directly in TilelangDaemon.cpp breaks compilation on non-POSIX platforms like Windows. Wrapping it in #ifndef _WIN32 or using portable LLVM/platform-specific APIs ensures cross-platform compatibility.
| kill(procInfo.Pid, SIGTERM); | |
| #ifndef _WIN32 | |
| kill(procInfo.Pid, SIGTERM); | |
| #endif |
There was a problem hiding this comment.
TilelangDaemon.cpp was already POSIX-only and already called kill(..., SIGTERM) in DaemonManager::stop() before this PR. It also uses Unix-domain sockets, /tmp, getpid, and POSIX environment semantics.
| #include <cstdlib> | ||
| #include <signal.h> | ||
| #include <thread> | ||
| #include <unistd.h> |
There was a problem hiding this comment.
<unistd.h> was already included before this PR and is required by the existing POSIX daemon implementation, including getpid(). This component manages Unix-domain sockets.
| if (tileLibBackend == "ptodsl") { | ||
| llvm::errs() | ||
| << "ExpandTileOp: PTODSL daemon RPC failed; refusing to fall back " | ||
| "to TileLang\n"; | ||
| return nullptr; | ||
| } |
There was a problem hiding this comment.
Using llvm::errs() directly for compilation failures bypasses the MLIR diagnostic infrastructure. Since tileOp is available as a parameter, using tileOp->emitError() is preferred as it allows diagnostics to be captured properly by IDEs and other tools.
if (tileLibBackend == \"ptodsl\") {
tileOp->emitError(
\"ExpandTileOp: PTODSL daemon RPC failed; refusing to fall back \"
\"to TileLang\");
return nullptr;
}| if (tileLibBackend == "ptodsl") { | ||
| llvm::errs() << "ExpandTileOp: PTODSL backend requires its daemon\n"; | ||
| return nullptr; | ||
| } |
There was a problem hiding this comment.
Using llvm::errs() directly for compilation failures bypasses the MLIR diagnostic infrastructure. Since tileOp is available as a parameter, using tileOp->emitError() is preferred as it allows diagnostics to be captured properly by IDEs and other tools.
if (tileLibBackend == \"ptodsl\") {
tileOp->emitError(\"ExpandTileOp: PTODSL backend requires its daemon\");
return nullptr;
}
Description
PTOAS currently relies on TileLang DSL to render TileOp templates for the VPTO backend.
This PR introduces a PTODSL-native TileLib path while preserving the existing TileLang backend. It establishes the infrastructure needed to migrate templates incrementally without changing the default backend or requiring performance-based version selection yet.
What changed
PTODSL TileLib
taddtsubtmultmaxtmintdivtcolmaxtaddandtmul.PTODSL daemon
PTOAS integration
--tile-lib-backend=tilelang|ptodsl.InsertTemplateAttributes, which runs before fusion and stores legal candidates on each TileOp as an attribute.idnameloop_depthpostupdatetailExpandTileOpto consume the attached candidates and render the first remaining candidate.Current selection behavior
This PR does not introduce fusion-aware or cost-model-based version selection.
When multiple legal candidates remain,
ExpandTileOpdeterministically renders candidate index zero. A future change can filter or reorder the candidate array before expansion.Out of scope
lib/TileOps.Validation
PTODSL TileLib daemon
Run the focused daemon tests:
This covers:
PTODSL TileLib Python suite
Run all TileLib Python tests:
python3 -m unittest discover \ -s ptodsl/tests \ -p 'test_tilelib_*.py'This covers template registration, legality constraints, candidate selection,
structured MLIR rendering, elementwise templates, and daemon behavior.
PTOAS integration
Run the focused PTODSL and legacy TileLang integration tests through the
generated lit configuration:
This covers:
InsertTemplateAttributesplacement before fusion;