pylate-onnx-export fails on several PyLate-compatible models (LFM2, Jina-v2) — multiple small bugs, proposed fixes


I'm trying to export `LiquidAI/LFM2-ColBERT-350M` (with `jinaai/jina-colbert-v2` as a fallback) to ONNX for use with ColGREP via `pylate-onnx-export` (package `colbert_export` v0.1.0, pulled from PyPI).

The exporter only produces `tokenizer.json` and `config_sentence_transformers.json` (no `model.onnx`) and exits with a series of errors. Bugs 1 through 5 below were each resolved by a small patch; the final blocker (bug 6) is architectural and affects both LFM2 and Jina's flash XLM-RoBERTa implementation.

Environment: Python 3.12, `pylate-onnx-export==0.1.0`, `torch==2.x`, Linux.

## Bug 1 — `KeyError: 'token_type_ids'` on any non-ModernBERT architecture

```
Model architecture: Lfm2Model
Uses token_type_ids: True
Saved tokenizer to: ...
Saved config to: ...
Error: 'token_type_ids'
```

`detect_model_architecture` in `colbert_export/export.py` defaults to `True` and hardcodes a single exception for ModernBERT:

```python
uses_token_type_ids = True
if "ModernBert" in model_class_name:
    uses_token_type_ids = False
```

Any non-ModernBERT backbone (LFM2, XLM-Roberta variants, Qwen, Llama-based ColBERTs…) defaults to `True`, then the exporter does `inputs["token_type_ids"]` against a tokenizer that didn't emit that key → KeyError.

**Proposed fix:** probe the tokenizer directly — authoritative for any architecture:

```python
tokenizer = pylate_model[0].tokenizer
probe = tokenizer("probe", return_tensors="pt")
uses_token_type_ids = "token_type_ids" in probe
```
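Building on the probe, the exporter could also derive its input set from the keys the tokenizer actually returned, rather than indexing `token_type_ids` unconditionally. A minimal sketch, assuming the encoding behaves like a mapping; `select_onnx_inputs` and `CANDIDATE_INPUTS` are hypothetical names, not part of `colbert_export`:

```python
from typing import Any, Mapping

# Candidate graph inputs, in the order they would be passed to the model (assumed).
CANDIDATE_INPUTS = ("input_ids", "attention_mask", "token_type_ids")


def select_onnx_inputs(encoding: Mapping[str, Any]) -> dict[str, Any]:
    """Keep only the candidate inputs that the tokenizer actually produced."""
    return {name: encoding[name] for name in CANDIDATE_INPUTS if name in encoding}


# A plain dict stands in for a tokenizer output that has no token_type_ids.
probe = {"input_ids": [[1, 2, 3]], "attention_mask": [[1, 1, 1]]}
inputs = select_onnx_inputs(probe)
print(list(inputs))  # ['input_ids', 'attention_mask']
```

The same key list could then serve as the `input_names` for the export, so the flag and the actual inputs cannot drift apart.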

## Bug 2 — new `torch.onnx` dynamo path fails on non-stock architectures

With bug 1 fixed, the export proceeds to `torch.onnx.export` and fails:

```
RuntimeError: 8*s72 (…) is not tracked with proxy for
  <torch.fx.experimental.proxy_tensor._ModuleStackTracer object at 0x...>
[torch.onnx] Obtain model graph for `ColBERTForONNX([...]` with
  `torch.export.export(..., strict=False)`... ❌
[torch.onnx] Obtain model graph for `ColBERTForONNX([...]` with
  `torch.export.export(..., strict=True)`... ❌
```

Recent PyTorch defaults the exporter to the dynamo path (`torch.export.export` → `onnxscript`), which doesn't handle some shape-dependent control flow in non-stock modeling code.

**Proposed fix:** pass `dynamo=False` to `torch.onnx.export` to fall back to the legacy TorchScript tracer, which is far more forgiving. Optionally make it configurable.
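One way to make the flag configurable while staying tolerant of PyTorch versions whose `torch.onnx.export` may not accept a `dynamo` keyword is to check the function signature first. This is only a sketch: `build_export_kwargs` is a hypothetical helper, and the two stand-in functions below are illustrative signatures, not the real PyTorch API.

```python
import inspect
from typing import Any, Callable


def build_export_kwargs(export_fn: Callable[..., Any], use_dynamo: bool = False) -> dict[str, Any]:
    """Return {'dynamo': use_dynamo} only if export_fn declares a `dynamo` parameter."""
    params = inspect.signature(export_fn).parameters
    return {"dynamo": use_dynamo} if "dynamo" in params else {}


# Stand-ins for an export function with and without the keyword (hypothetical).
def export_with_dynamo_flag(model, args, f, *, dynamo=True):
    return None


def export_without_dynamo_flag(model, args, f):
    return None


print(build_export_kwargs(export_with_dynamo_flag))     # {'dynamo': False}
print(build_export_kwargs(export_without_dynamo_flag))  # {}
```

In the exporter, the result would be splatted into the existing call, for example `torch.onnx.export(..., **build_export_kwargs(torch.onnx.export, use_dynamo=False))`, with `use_dynamo` driven by a CLI option.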

## Bug 3 — missing `onnxscript` dependency not declared

Even if you keep the dynamo path, `onnxscript` is needed and not declared in the package deps:

```
Error: No module named 'onnxscript'
```

**Proposed fix:** add `onnxscript` to `install_requires` (or at minimum document it in the README prerequisites).
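Independently of the packaging fix, a preflight check could turn the bare `No module named 'onnxscript'` into an actionable message before any model is loaded. A minimal sketch using only the standard library; `missing_modules` and `check_export_dependencies` are hypothetical helpers:

```python
import importlib.util


def missing_modules(required: list[str]) -> list[str]:
    """Return the top-level module names in `required` that cannot be found."""
    return [name for name in required if importlib.util.find_spec(name) is None]


def check_export_dependencies(required: list[str]) -> None:
    """Raise a readable error listing every missing package at once."""
    missing = missing_modules(required)
    if missing:
        raise RuntimeError(
            "Missing packages for ONNX export: " + ", ".join(missing)
            + ". Install them and retry."
        )


# Only needed when the dynamo export path is selected.
print(missing_modules(["onnxscript"]))
```

Calling `check_export_dependencies(["onnxscript"])` only when the dynamo path is selected would keep the legacy path usable without the extra package.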

## Bug 4 — no `trust_remote_code=True` for models that ship custom modeling code

When trying Jina-v2 as an alternative:

```
Error: jinaai/xlm-roberta-flash-implementation You can inspect the repository
  content at https://hf.co/jinaai/jina-colbert-v2.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
```

`pylate_models.ColBERT(...)` accepts `trust_remote_code`, but the exporter doesn't forward it. Blocks any HF model that ships custom modeling code (Jina, a bunch of community ColBERTs).

**Proposed fix:**

```python
pylate_model = pylate_models.ColBERT(
    model_name_or_path=model_name,
    device="cpu",
    do_query_expansion=False,
    trust_remote_code=True,
)
```

(Or expose it as a CLI flag, defaulting to true; the user has already opted in by running the exporter on this specific model.)
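A CLI flag along those lines could look like the sketch below. The exporter's real argument layout isn't shown in this issue, so the parser and the `model_name` positional are assumptions for illustration; only `argparse.BooleanOptionalAction` (Python 3.9+) is standard.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser: one positional model name plus an on/off trust flag."""
    parser = argparse.ArgumentParser(prog="pylate-onnx-export")
    parser.add_argument("model_name", help="Hugging Face model ID or local path")
    parser.add_argument(
        "--trust-remote-code",
        action=argparse.BooleanOptionalAction,  # also generates --no-trust-remote-code
        default=True,
        help="Allow custom modeling code shipped in the model repo to run",
    )
    return parser


args = build_parser().parse_args(["jinaai/jina-colbert-v2", "--no-trust-remote-code"])
print(args.trust_remote_code)  # False
```

The parsed value would then be forwarded as `trust_remote_code=args.trust_remote_code` in the `pylate_models.ColBERT(...)` call.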

## Bug 5 — BF16 weights cause symbolic-registry misses during ONNX graph build

On Jina-v2 with all the above fixed, the full 24-layer graph traces, then fails at `linalg_vector_norm`/`clamp_min`/`aten::add` inside `F.normalize`:

```
Error: Argument passed to at() was not in the map.
```

(This is a C++ hash-map `at()` lookup throwing `std::out_of_range`, which suggests the ONNX symbolic registry has no handler for the op/dtype combination.)

Root cause: Jina's `xlm-roberta-flash-implementation` loads in BFloat16; several ONNX symbolic handlers (at least through opset 20) have no BF16 coverage for these ops.

**Proposed fix:** force FP32 before export:

```python
model = ColBERTForONNX(pylate_model, uses_token_type_ids=arch_info["uses_token_type_ids"])
model = model.float()
model.eval()
```

After this patch, Jina-v2 still fails (see bug 6), but this unblocks any model that loads in BF16 by default.

## Bug 6 — architectural blocker on LFM2 and Jina's flash-XLM-Roberta (not actionable upstream)

With all above patches applied, both models still hit variants of `Argument passed to at() was not in the map.`:

- **LFM2 (`LiquidAI/LFM2-ColBERT-350M`):** fails inside Liquid-Foundation-Model blocks. Liquid's gated convolution and linear-recurrence ops have no ONNX equivalents at all — moving handlers to a higher opset doesn't help.
- **Jina-v2 (`jinaai/jina-colbert-v2`):** fails inside the XLM-Roberta-flash encoder at seemingly ordinary ops (`aten::linear`, `aten::add`, `aten::linalg_vector_norm`). Each successive workaround (hand-rolled L2 normalize, tensor-wrapped epsilon, skipping normalization entirely) just moves the error to the next op.
