Environment
OS: Windows 10 x64
Python: 3.12.10
ComfyUI: 0.7.0
PyTorch: 2.8.0
CUDA: 12.9
transformers: 4.x (latest)
Node: ComfyUI_LayerStyle_Advance
Feature: SegmentAnythingUltra / GroundingDINO
Problem
When running SegmentAnythingUltra / GroundingDINO, ComfyUI crashes with:
TypeError: _path_isfile: path should be string, bytes, os.PathLike or integer, not NoneType
Stack trace points to:
os.path.isfile(vocab_file)
inside transformers/models/bert/tokenization_bert.py.
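The failing call can be reproduced without GroundingDINO or transformers, since passing None to os.path.isfile raises the same kind of TypeError. This is a minimal sketch: the helper name isfile_error is made up for illustration, and the message prefix (_path_isfile in the environment above) may differ on other platforms or Python versions.

```python
import os


def isfile_error(path):
    """Return the TypeError raised by os.path.isfile(path), or None if the call succeeds."""
    try:
        os.path.isfile(path)
    except TypeError as exc:
        return exc
    return None


# None stands in for a vocab_file that was never set.
err = isfile_error(None)
print(type(err).__name__, err)

# A string path does not raise, even when the file does not exist.
print(isfile_error("does_not_exist.txt"))
```

A missing file on its own only makes os.path.isfile return False. The crash therefore points to a None path, not to a file that failed to download.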
Important Findings
Internet access is working
HuggingFace Hub is accessible
The following works correctly in the same Python environment:
from transformers import AutoTokenizer
AutoTokenizer.from_pretrained("bert-base-uncased")
However, GroundingDINO initialization fails inside the node
🧠 Root Cause
The current local_groundingdino/util/get_tokenlizer.py assumes:
AutoTokenizer.from_pretrained("bert-base-uncased")
will always return a tokenizer with a valid vocab_file path
This assumption breaks with modern transformers versions:
AutoTokenizer often returns BertTokenizerFast
Fast tokenizers do not expose vocab_file
vocab_file becomes None
That None later reaches os.path.isfile(vocab_file) inside tokenization_bert.py → crash
This issue is not related to network, cache, or missing models, but to:
incompatible assumptions between old GroundingDINO code and new transformers behavior
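To confirm this diagnosis in a given environment, it can help to inspect what the loaded tokenizer actually carries. The sketch below is a hypothetical helper (describe_tokenizer is not part of GroundingDINO or transformers) that reads attributes defensively with getattr. Which attributes a tokenizer class exposes depends on the installed transformers version, so missing ones are reported as None rather than assumed. The two stand-in classes exist only to make the example runnable without transformers.

```python
import os


def describe_tokenizer(tokenizer):
    """Summarize the attributes relevant to the crash for any tokenizer-like object."""
    vocab_file = getattr(tokenizer, "vocab_file", None)
    # Only touch the filesystem when the value is a path-like type.
    usable = isinstance(vocab_file, (str, bytes, os.PathLike)) and os.path.isfile(vocab_file)
    return {
        "class": type(tokenizer).__name__,
        "is_fast": bool(getattr(tokenizer, "is_fast", False)),
        "vocab_file": vocab_file,
        "vocab_file_usable": usable,
    }


class StandInFastTokenizer:
    """Illustrative stand-in: a fast tokenizer without a vocabulary path."""
    is_fast = True
    vocab_file = None


class StandInSlowTokenizer:
    """Illustrative stand-in: a slow tokenizer pointing at an existing file."""
    is_fast = False
    vocab_file = os.path.abspath(__file__) if "__file__" in globals() else None


print(describe_tokenizer(StandInFastTokenizer()))
print(describe_tokenizer(StandInSlowTokenizer()))
```

In the affected environment, the same helper can be given the object returned by AutoTokenizer.from_pretrained("bert-base-uncased"). If "vocab_file_usable" is False, any code that forwards tokenizer.vocab_file to os.path.isfile is at risk.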
✅ Working Fix
Force GroundingDINO to use local HF snapshot paths and explicitly initialize:
BertTokenizer with a real vocab.txt
BertModel from the same local snapshot
Patched get_tokenlizer.py
import os

import huggingface_hub
from transformers import BertModel, BertTokenizer


def _get_local_bert_path(text_encoder_type: str) -> str:
    # Download (or reuse from the HF cache) only the files BERT needs and
    # return the local snapshot directory.
    repo_path = huggingface_hub.snapshot_download(
        repo_id=text_encoder_type,
        allow_patterns=[
            "vocab.txt",
            "tokenizer_config.json",
            "config.json",
            "pytorch_model.bin",
        ],
    )
    return repo_path


def get_tokenlizer(text_encoder_type):
    repo_path = _get_local_bert_path(text_encoder_type)
    vocab_file = os.path.join(repo_path, "vocab.txt")
    # Fail early with a clear message instead of passing a bad path on.
    if not os.path.isfile(vocab_file):
        raise RuntimeError(f"vocab.txt not found: {vocab_file}")
    # Build the slow tokenizer explicitly so vocab_file is always a real path.
    # do_lower_case=True matches uncased checkpoints such as bert-base-uncased.
    tokenizer = BertTokenizer(
        vocab_file=vocab_file,
        do_lower_case=True,
    )
    return tokenizer


def get_pretrained_language_model(text_encoder_type):
    repo_path = _get_local_bert_path(text_encoder_type)
    # Load the weights from the same snapshot, with no further network access.
    model = BertModel.from_pretrained(
        repo_path,
        local_files_only=True,
    )
    return model
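If the patched module still fails, a quick check of the snapshot folder shows whether every expected file actually arrived. This is a small stdlib-only sketch. missing_snapshot_files is a hypothetical helper, and the temporary directory merely stands in for the path returned by snapshot_download.

```python
import os
import tempfile

# Same file names as the allow_patterns list in the patched module.
EXPECTED_FILES = ("vocab.txt", "tokenizer_config.json", "config.json", "pytorch_model.bin")


def missing_snapshot_files(repo_path, expected=EXPECTED_FILES):
    """Return the expected file names that are not regular files in repo_path."""
    return [name for name in expected if not os.path.isfile(os.path.join(repo_path, name))]


# Demonstration with a throwaway directory that contains only vocab.txt.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "vocab.txt"), "w", encoding="utf-8") as fh:
        fh.write("[PAD]\n[UNK]\n")
    print(missing_snapshot_files(demo_dir))
```

An empty list means the four files are in place. Anything else points to a download or cache problem, not to the tokenizer incompatibility described above.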
Result
GroundingDINO loads correctly
SegmentAnythingUltra works as expected
No _path_isfile NoneType error
Compatible with:
Python 3.12
transformers ≥ 4.x
AutoTokenizer defaulting to either a fast or a slow tokenizer (the fix no longer depends on it)
This is not a user environment issue, but a compatibility issue caused by newer transformers behavior.
Explicitly loading BERT tokenizer/model from local HF snapshots resolves the problem reliably.