feat(ptodsl): Migrate TileLib from TilelangDSL to PTODSL: Daemon and metadata #894

Open
ManiSadati wants to merge 6 commits into hw-native-sys:main from afshinarefi:mani/ptodsl

ManiSadati commented Jul 1, 2026

Description

PTOAS currently relies on TileLang DSL to render TileOp templates for the VPTO backend.

This PR introduces a PTODSL-native TileLib path while preserving the existing TileLang backend. It establishes the infrastructure needed to migrate templates incrementally without changing the default backend or requiring performance-based version selection yet.

What changed

PTODSL TileLib

  • Added a PTODSL-native template registry, decorator, metadata model, legality constraints, and render runtime.
  • Added initial A5 templates for:
    • tadd
    • tsub
    • tmul
    • tmax
    • tmin
    • default-precision tdiv
    • tcolmax
  • Added four registered implementations (versions) for tadd and tmul.
  • Added lazy per-operation template loading.
  • Added version selection and structured-MLIR rendering tests.

PTODSL daemon

  • Added an independent Unix-socket daemon, client, helper, wire protocol, and render cache.
  • Added separate RPC operations for:
    • retrieving all legal candidate metadata;
    • rendering a specifically selected candidate.
  • Removed any PTODSL dependency on TileLang template paths or daemon implementation.

PTOAS integration

  • Added --tile-lib-backend=tilelang|ptodsl.
  • Kept TileLang as the default backend.
  • Added InsertTemplateAttributes, which runs before fusion and stores legal candidates on each TileOp as an attribute.
  • Candidate attributes (for now) contain only:
    • id
    • name
    • loop_depth
    • postupdate
    • tail
  • Candidates are sorted by unique ID.
  • Updated ExpandTileOp to consume the attached candidates and render the first remaining candidate.
  • Kept the TileLang expansion path unchanged.
  • PTODSL daemon failures are reported directly and do not silently fall back to TileLang.

Current selection behavior

This PR does not introduce fusion-aware or cost-model-based version selection.

When multiple legal candidates remain, ExpandTileOp deterministically renders candidate index zero. A future change can filter or reorder the candidate array before expansion.
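The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Candidate` class, the field types, and the example names `tadd_v1` and `tadd_v3` are hypothetical and are not taken from the PR; only the field names and the "sort by ID, render index zero" rule come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical mirror of the candidate attribute fields listed in the PR."""
    id: int
    name: str
    loop_depth: int      # field type is an assumption
    postupdate: bool     # field type is an assumption
    tail: bool           # field type is an assumption


def select_candidate(candidates):
    """Return the candidate at index zero after sorting by unique ID."""
    if not candidates:
        raise ValueError("no legal candidates remain for this TileOp")
    ordered = sorted(candidates, key=lambda c: c.id)
    return ordered[0]


# Hypothetical candidates, deliberately given out of order.
candidates = [
    Candidate(id=3, name="tadd_v3", loop_depth=2, postupdate=False, tail=True),
    Candidate(id=1, name="tadd_v1", loop_depth=1, postupdate=False, tail=False),
]
chosen = select_candidate(candidates)
```

A later fusion-aware pass could filter or reorder the list before this step without changing the selection function itself.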

Out of scope

  • Making PTODSL the default TileLib backend.
  • Removing TileLang or lib/TileOps.
  • Fusion-driven performance selection.
  • Complete TileOp/template parity.
  • Mixed tile/scalar template operands.
  • Context-attribute-dependent templates.
  • Cube, DMA, and specialized-layout templates.

Validation

PTODSL TileLib daemon

Run the focused daemon tests:

python3 ptodsl/tests/test_tilelib_daemon.py

This covers:

  • daemon startup and ping;
  • owner-only Unix-socket permissions;
  • legal-candidate metadata queries;
  • request-specific tail metadata;
  • named-candidate rendering;
  • cache hits, clearing, and context-sensitive cache keys;
  • oversized RPC message rejection;
  • stale/broken socket-path cleanup;
  • explicit rejection of unsupported operands and unknown operations.

PTODSL TileLib Python suite

Run all TileLib Python tests:

python3 -m unittest discover \
  -s ptodsl/tests \
  -p 'test_tilelib_*.py'

This covers template registration, legality constraints, candidate selection,
structured MLIR rendering, elementwise templates, and daemon behavior.

PTOAS integration

Run the focused PTODSL and legacy TileLang integration tests through the
generated lit configuration:

"$LLVM_BUILD_DIR/bin/llvm-lit" -sv build/test/lit \
  --filter='expand_tile_op_(ptodsl_tsub|ptodsl_tadd|tilelang_tsub)'

This covers:

  • PTODSL single-candidate expansion;
  • compact multi-candidate metadata insertion;
  • deterministic candidate-zero rendering;
  • InsertTemplateAttributes placement before fusion;
  • legacy TileLang backend compatibility.

reedhecre commented Jul 1, 2026

Codex Review

This comment is updated automatically by the review bot.

Summary

Review failed at stage codex-review: exit=1

Findings

No structured findings were generated because the review process failed early.

Log Tail

 ptodsl/ptodsl/tilelib/templates/a5/tbinop.py       | 200 ++++++++++
 ptodsl/ptodsl/tilelib/templates/a5/tcolmax.py      |  69 ++++
 ptodsl/ptodsl/tilelib/templates/a5/tdiv.py         |  42 ++
 ptodsl/ptodsl/tilelib/templates/a5/tmax.py         |  36 ++
 ptodsl/ptodsl/tilelib/templates/a5/tmin.py         |  36 ++
 ptodsl/ptodsl/tilelib/templates/a5/tmul.py         | 110 ++++++
 ptodsl/ptodsl/tilelib/templates/a5/tsub.py         |  36 ++
 ptodsl/tests/fixtures/tadd_a5_8x64_f32.golden.mlir |  38 ++
 ptodsl/tests/test_tilelib_constraints.py           |  52 +++
 ptodsl/tests/test_tilelib_daemon.py                | 217 +++++++++++
 ptodsl/tests/test_tilelib_elementwise.py           |  49 +++
 ptodsl/tests/test_tilelib_render.py                |  71 ++++
 ptodsl/tests/test_tilelib_select.py                | 142 +++++++
 scripts/ptoas_env.sh                               |   3 +
 test/lit/vpto/expand_tile_op_ptodsl_tadd.pto       |  82 ++++
 test/lit/vpto/expand_tile_op_ptodsl_tsub.pto       |  47 +++
 tools/ptoas/CMakeLists.txt                         |   9 +-
 tools/ptoas/TilelangDaemon.cpp                     |  49 ++-
 tools/ptoas/TilelangDaemon.h                       |  13 +-
 tools/ptoas/ptoas.cpp                              | 153 ++++++--
 43 files changed, 3975 insertions(+), 102 deletions(-)
===== END STAGE clone rc=0 @ 2026-07-02 04:55:33 =====

===== STAGE codex-review @ 2026-07-02 04:55:33 =====
set -euo pipefail
cd '/tmp/ptoas-pr-review-monitor/runs/20260702_045526_pr894/repo'
'codex' exec -C '/tmp/ptoas-pr-review-monitor/runs/20260702_045526_pr894/repo' -s read-only -c 'model_provider="codereview"' -c 'model="gpt-5.4"' -c 'model_reasoning_effort="xhigh"' --output-schema '/tmp/ptoas-pr-review-monitor/runs/20260702_045526_pr894/review_schema.json' -o '/tmp/ptoas-pr-review-monitor/runs/20260702_045526_pr894/codex_last_message.json' --color never - < '/tmp/ptoas-pr-review-monitor/runs/20260702_045526_pr894/review_prompt.txt'
[monitor] stage timeout: 1800s
OpenAI Codex v0.115.0 (research preview)
--------
workdir: /tmp/ptoas-pr-review-monitor/runs/20260702_045526_pr894/repo
model: gpt-5.4
provider: codereview
approval: never
sandbox: read-only
reasoning effort: xhigh
reasoning summaries: none
session id: 019f1f77-3709-7bf0-9662-db986b0bea3d
--------
user
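You are now reviewing a GitHub PR.

Repository: hw-native-sys/PTOAS
PR: #894 feat(ptodsl): Migrate TileLib from TilelangDSL to PTODSL: Daemon and metadata
Author: ManiSadati
base branch: origin/main
head branch: HEAD (currently checked out at the PR head)

Requirements:
1. Review only this PR's changes relative to origin/main; consult surrounding context files when necessary.
2. Focus on finding real correctness / regression / contract mismatch / CI / runtime / compatibility problems.
3. Do not raise purely stylistic suggestions or low-value speculation.
4. Output strictly by priority:
   - P1: highly likely to cause wrong results, compile/run failures, severe regressions, or release blockers
   - P2: important defects, behavioral regressions, missing validation/tests, significant compatibility problems
   - P3: minor but clearly fixable problems
5. If there are no problems, write the summary as exactly: "No problems were found in PR #894", and return findings=[].
6. If there are problems, summarize them concisely, and give each finding:
   - severity
   - title
   - body (explain why it is a problem, as specifically as possible)
   - file (a relative path where possible)
   - line (an integer if it can be determined, otherwise null)

Suggested starting points:
- git status --short
- git diff --stat origin/main...HEAD
- git diff --unified=80 origin/main...HEAD

The final output must strictly match the JSON schema.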

mcp startup: no servers
Reconnecting... 1/5 (unexpected status 403 Forbidden: {"code":"INSUFFICIENT_BALANCE","message":"Insufficient account balance"}, url: https://codex.0u0o.com/responses, cf-ray: a1484ff91adf312c-LAX, request id: c2d65c44-f50d-4a88-98d4-b5d264fb5318)
Reconnecting... 2/5 (unexpected status 403 Forbidden: {"code":"INSUFFICIENT_BALANCE","message":"Insufficient account balance"}, url: https://codex.0u0o.com/responses, cf-ray: a1484ffc8a90984e-LAX, request id: 6092f611-63a7-4690-a2b4-5e4f1f185267)
Reconnecting... 3/5 (unexpected status 403 Forbidden: {"code":"INSUFFICIENT_BALANCE","message":"Insufficient account balance"}, url: https://codex.0u0o.com/responses, cf-ray: a14850015dd51e0f-LAX, request id: e09fb3ee-b322-4a45-94cd-9c780a868306)
Reconnecting... 4/5 (unexpected status 403 Forbidden: {"code":"INSUFFICIENT_BALANCE","message":"Insufficient account balance"}, url: https://codex.0u0o.com/responses, cf-ray: a14850090adfa0c6-LAX, request id: 19f2f681-7245-4d36-ac17-17b5c7709022)
Reconnecting... 5/5 (unexpected status 403 Forbidden: {"code":"INSUFFICIENT_BALANCE","message":"Insufficient account balance"}, url: https://codex.0u0o.com/responses, cf-ray: a1485014ac3b135e-LAX, request id: 64a158fe-b3d3-4914-99fe-146c9e96d5db)
ERROR: unexpected status 403 Forbidden: {"code":"INSUFFICIENT_BALANCE","message":"Insufficient account balance"}, url: https://codex.0u0o.com/responses, cf-ray: a148502bccdc6e62-LAX, request id: 12c938e5-e118-4675-93e0-a929dfd8be69
Warning: no last agent message; wrote empty content to /tmp/ptoas-pr-review-monitor/runs/20260702_045526_pr894/codex_last_message.json
===== END STAGE codex-review rc=1 @ 2026-07-02 04:55:43 =====

gemini-code-assist Bot (Contributor) left a comment

Code Review

This pull request introduces the PTODSL TileLib backend, adding the InsertTemplateAttributes pass to attach legal template candidates to tile operations and updating ExpandTileOp to support both TileLang and PTODSL backends. It also implements a Unix-socket daemon server and client in Python to handle template metadata queries and specialization rendering. The review feedback is highly constructive, pointing out critical issues such as a potential denial-of-service vulnerability in socket message framing, race conditions in the template registry under concurrent access, and incorrect cache eviction logic. Additionally, the feedback addresses cross-platform compilation issues on Windows caused by POSIX-specific headers and functions, and suggests using MLIR's diagnostic infrastructure instead of direct llvm::errs() calls for better error reporting.

Comment on lines +35 to +38
def recv_message(sock) -> dict:
    """Receive one length-prefixed UTF-8 JSON message."""
    length = int.from_bytes(recv_exactly(sock, 4), byteorder="big")
    return json.loads(recv_exactly(sock, length).decode("utf-8"))

gemini-code-assist (Contributor)

security-high high

Reading the length prefix from the socket and passing it directly to recv_exactly without validation can lead to massive memory allocation or hanging if the length prefix is corrupted or maliciously set to a very large value (e.g., up to 4 GB). Adding a maximum message size limit (e.g., 100 MB) is a standard security best practice to prevent denial of service (DoS) via memory exhaustion.

def recv_message(sock) -> dict:
    """Receive one length-prefixed UTF-8 JSON message."""
    max_message_size = 100 * 1024 * 1024  # 100 MB
    length = int.from_bytes(recv_exactly(sock, 4), byteorder="big")
    if length > max_message_size:
        raise ValueError(f"Message length {length} exceeds maximum allowed size")
    return json.loads(recv_exactly(sock, length).decode("utf-8"))

ManiSadati (Author)

Fixed. Added a 64 MiB MAX_MESSAGE_SIZE limit and validate both received and outgoing messages. Oversized input is rejected before reading its payload. Added a regression test covering this case.
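For readers following the thread, here is a minimal sketch of length-prefixed framing with a size cap checked in both directions. It assumes the 4-byte big-endian prefix shown in the quoted code and the 64 MiB limit named in the reply; `send_message` and the body of `recv_exactly` are illustrative guesses, not the code that was merged.

```python
import json
import socket

MAX_MESSAGE_SIZE = 64 * 1024 * 1024  # 64 MiB, the limit named in the reply


def recv_exactly(sock, n):
    """Read exactly n bytes or raise if the peer closes early."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(min(remaining, 65536))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_message(sock, message):
    """Send one length-prefixed UTF-8 JSON message, refusing oversized payloads."""
    payload = json.dumps(message).encode("utf-8")
    if len(payload) > MAX_MESSAGE_SIZE:
        raise ValueError(f"outgoing message of {len(payload)} bytes exceeds limit")
    sock.sendall(len(payload).to_bytes(4, byteorder="big") + payload)


def recv_message(sock):
    """Receive one message; reject an oversized length before reading the payload."""
    length = int.from_bytes(recv_exactly(sock, 4), byteorder="big")
    if length > MAX_MESSAGE_SIZE:
        raise ValueError(f"incoming message of {length} bytes exceeds limit")
    return json.loads(recv_exactly(sock, length).decode("utf-8"))
```

Because the length is validated right after the 4-byte header is read, a corrupted or hostile prefix never triggers a large allocation or a long blocking read.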

Comment on lines +34 to +44
class TileTemplateRegistry:
    def __init__(self):
        self._descriptors: list = []

    def register(self, descriptor) -> None:
        # Re-registration (e.g. module reload) replaces the prior entry with the same name.
        self._descriptors = [
            d for d in self._descriptors
            if not (d.op == descriptor.op and d.target == descriptor.target and d.name == descriptor.name)
        ]
        self._descriptors.append(descriptor)

gemini-code-assist (Contributor)

high

The daemon runs as a multi-threaded server (ThreadingUnixStreamServer), meaning multiple threads can handle requests concurrently. If different operations are requested concurrently for the first time, they will trigger lazy template loading, which calls register concurrently. Since register modifies self._descriptors (a plain list) without synchronization, this can lead to race conditions and corrupted registry state. Adding a threading.Lock protects the list from concurrent modification.

Suggested change
- class TileTemplateRegistry:
-     def __init__(self):
-         self._descriptors: list = []
-
-     def register(self, descriptor) -> None:
-         # Re-registration (e.g. module reload) replaces the prior entry with the same name.
-         self._descriptors = [
-             d for d in self._descriptors
-             if not (d.op == descriptor.op and d.target == descriptor.target and d.name == descriptor.name)
-         ]
-         self._descriptors.append(descriptor)
+ import threading
+
+ class TileTemplateRegistry:
+     def __init__(self):
+         self._descriptors: list = []
+         self._lock = threading.Lock()
+
+     def register(self, descriptor) -> None:
+         # Re-registration (e.g. module reload) replaces the prior entry with the same name.
+         with self._lock:
+             self._descriptors = [
+                 d for d in self._descriptors
+                 if not (d.op == descriptor.op and d.target == descriptor.target and d.name == descriptor.name)
+             ]
+             self._descriptors.append(descriptor)

ManiSadati (Author)

Addressed by simplifying the server architecture. PTODSL now uses sequential socketserver.UnixStreamServer instead of ThreadingUnixStreamServer. Requests and lazy template registration execute on one server thread, so the registry no longer requires locking.
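The effect of choosing the sequential server can be illustrated with a small, self-contained sketch. Everything here other than `socketserver.UnixStreamServer` is hypothetical: `DemoHandler`, `REGISTRY`, the one-line text protocol, and the socket file name are stand-ins, and the background thread exists only so the demo client can run in the same process.

```python
import os
import socket
import socketserver
import tempfile
import threading

REGISTRY = {}            # hypothetical stand-in for the template registry
handler_threads = set()  # records which thread(s) ran the handler


class DemoHandler(socketserver.StreamRequestHandler):
    """Hypothetical handler: one newline-terminated op name per connection."""

    def handle(self):
        op = self.rfile.readline().decode("utf-8").strip()
        handler_threads.add(threading.get_ident())
        # "Lazy registration" on first use; no lock is taken.
        REGISTRY.setdefault(op, f"templates for {op}")
        self.wfile.write(f"ok:{op}\n".encode("utf-8"))


def ask(socket_path, op):
    """Connect, send one op name, and return the one-line reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(socket_path)
        client.sendall(op.encode("utf-8") + b"\n")
        with client.makefile("rb") as reader:
            return reader.readline().decode("utf-8").strip()


workdir = tempfile.mkdtemp()
socket_path = os.path.join(workdir, "demo.sock")
# UnixStreamServer (no threading mix-in) handles one request at a time.
server = socketserver.UnixStreamServer(socket_path, DemoHandler)
worker = threading.Thread(target=server.serve_forever, daemon=True)
worker.start()
replies = [ask(socket_path, "tadd"), ask(socket_path, "tmul")]
server.shutdown()
server.server_close()
os.unlink(socket_path)
```

Every handler invocation runs on the single `serve_forever` thread, which is why unsynchronized registry mutation is safe in this arrangement. The trade-off is that a slow request blocks all later ones.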

Comment thread ptodsl/ptodsl/tilelib/serving/daemon.py Outdated
Comment on lines +315 to +316
if os.path.exists(args.socket):
    os.unlink(args.socket)

gemini-code-assist (Contributor)

medium

If args.socket is a broken symlink, os.path.exists returns False, so os.unlink is not called. However, the symlink still exists in the filesystem, which can cause the subsequent bind to fail with Address already in use. Using a try/except FileNotFoundError block safely unlinks the socket path regardless of whether it is a broken symlink.

Suggested change
- if os.path.exists(args.socket):
-     os.unlink(args.socket)
+ try:
+     os.unlink(args.socket)
+ except FileNotFoundError:
+     pass

ManiSadati (Author)

Fixed. Socket cleanup now calls os.unlink() directly and catches FileNotFoundError, which also correctly removes broken symlinks. The same helper is used during startup and shutdown, with a regression test.
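The broken-symlink case is easy to reproduce. The sketch below assumes a POSIX filesystem; the helper name `remove_socket_path` and the file names are hypothetical, not the names used in the PR.

```python
import os
import tempfile


def remove_socket_path(path):
    """Remove path if present, including a dangling symlink; ignore absence."""
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass


workdir = tempfile.mkdtemp()
link = os.path.join(workdir, "daemon.sock")
# Create a symlink whose target does not exist.
os.symlink(os.path.join(workdir, "missing-target"), link)

exists_before = os.path.exists(link)   # follows the link, so the entry looks absent
islink_before = os.path.islink(link)   # the directory entry itself is still there

remove_socket_path(link)
remove_socket_path(link)  # a second call is a harmless no-op
```

An `os.path.exists` guard would skip the unlink here and leave the stale entry in place, whereas calling `os.unlink` directly operates on the link itself.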

Comment thread ptodsl/ptodsl/tilelib/serving/daemon.py Outdated
Comment on lines +281 to +285

        with self._state_lock:
            if len(self._cache) >= self._max_entries:
                self._cache.pop(next(iter(self._cache)))
                self._stats["evictions"] += 1

gemini-code-assist (Contributor)

medium

In _instantiate, if len(self._cache) >= self._max_entries, an entry is evicted. However, if key is already present in self._cache, setting self._cache[key] = mlir_text will overwrite the existing entry and not increase the size of the cache. Evicting another entry in this case is unnecessary and reduces cache efficiency. Checking if key not in self._cache before evicting prevents this.

        with self._state_lock:
            if key not in self._cache and len(self._cache) >= self._max_entries:
                self._cache.pop(next(iter(self._cache)))
                self._stats["evictions"] += 1
            self._cache[key] = mlir_text

ManiSadati (Author)

This no longer applies after changing the daemon to sequential UnixStreamServer. _instantiate checks and updates the cache within one request before another request can execute, so the key cannot be inserted concurrently between the lookup and eviction steps. The threading lock and concurrency-specific code were removed.
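Independent of threading, the difference the reviewer describes can be seen in a small single-threaded sketch. `RenderCache` and its methods are hypothetical; the eviction line follows the quoted snippet, which drops the oldest-inserted key. Whether the final daemon code includes the `key not in` guard is not stated in this thread.

```python
class RenderCache:
    """Hypothetical bounded cache that evicts the oldest-inserted entry."""

    def __init__(self, max_entries):
        self._max_entries = max_entries
        self._cache = {}   # dicts preserve insertion order
        self.evictions = 0

    def put(self, key, mlir_text):
        # Evict only when inserting a new key would grow the cache past its
        # limit; overwriting an existing key does not change the size.
        if key not in self._cache and len(self._cache) >= self._max_entries:
            self._cache.pop(next(iter(self._cache)))
            self.evictions += 1
        self._cache[key] = mlir_text

    def get(self, key):
        return self._cache.get(key)


cache = RenderCache(max_entries=2)
cache.put("a", "mlir-a")
cache.put("b", "mlir-b")
cache.put("a", "mlir-a2")   # overwrite at capacity: nothing is evicted
evictions_after_overwrite = cache.evictions
cache.put("c", "mlir-c")    # new key at capacity: oldest entry "a" is evicted
```

Without the guard, the overwrite of `"a"` would first evict `"a"` itself and count an eviction even though the cache size never needed to change.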

llvm::errs() << "Note: Daemon process started (pid=" << procInfo.Pid
             << ") but socket not found. Check daemon logs.\n";
kill(procInfo.Pid, SIGTERM);

gemini-code-assist (Contributor)

medium

The use of POSIX kill and SIGTERM directly in TilelangDaemon.cpp breaks compilation on non-POSIX platforms like Windows. Wrapping it in #ifndef _WIN32 or using portable LLVM/platform-specific APIs ensures cross-platform compatibility.

Suggested change
- kill(procInfo.Pid, SIGTERM);
+ #ifndef _WIN32
+ kill(procInfo.Pid, SIGTERM);
+ #endif

ManiSadati (Author)

TilelangDaemon.cpp was already POSIX-only and already called kill(..., SIGTERM) in DaemonManager::stop() before this PR. It also uses Unix-domain sockets, /tmp, getpid, and POSIX environment semantics.

#include <cstdlib>
#include <signal.h>
#include <thread>
#include <unistd.h>

gemini-code-assist (Contributor)

medium

The inclusion of <unistd.h> is POSIX-specific and will cause compilation failures on Windows. It should be conditionally included only on non-Windows platforms.

Suggested change
- #include <unistd.h>
+ #ifndef _WIN32
+ #include <unistd.h>
+ #endif

ManiSadati (Author)

<unistd.h> was already included before this PR and is required by the existing POSIX daemon implementation, including getpid(). This component manages Unix-domain sockets.

Comment on lines +1103 to +1108
if (tileLibBackend == "ptodsl") {
  llvm::errs()
      << "ExpandTileOp: PTODSL daemon RPC failed; refusing to fall back "
         "to TileLang\n";
  return nullptr;
}

gemini-code-assist (Contributor)

medium

Using llvm::errs() directly for compilation failures bypasses the MLIR diagnostic infrastructure. Since tileOp is available as a parameter, using tileOp->emitError() is preferred as it allows diagnostics to be captured properly by IDEs and other tools.

    if (tileLibBackend == "ptodsl") {
      tileOp->emitError(
          "ExpandTileOp: PTODSL daemon RPC failed; refusing to fall back "
          "to TileLang");
      return nullptr;
    }

Comment on lines +1113 to 1116
if (tileLibBackend == "ptodsl") {
  llvm::errs() << "ExpandTileOp: PTODSL backend requires its daemon\n";
  return nullptr;
}

gemini-code-assist (Contributor)

medium

Using llvm::errs() directly for compilation failures bypasses the MLIR diagnostic infrastructure. Since tileOp is available as a parameter, using tileOp->emitError() is preferred as it allows diagnostics to be captured properly by IDEs and other tools.

  if (tileLibBackend == "ptodsl") {
    tileOp->emitError("ExpandTileOp: PTODSL backend requires its daemon");
    return nullptr;
  }

ManiSadati changed the title from "Mani/ptodsl" to "feat(ptodsl): Migrate TileLib from TilelangDSL to PTODSL: Daemon and metadata" on Jul 1, 2026