Fix/neo4j nested attributes serialization #1
Merged
* fix: replace edge name with uuid in resolution debug log

  Edge names can contain PII. Use UUIDs instead in the resolve_extracted_edge debug log message.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove PII from remaining debug logs

  - nodes.py: replace entity name with uuid and char count in embedding logs
  - edges.py: replace edge fact text with uuid and char count in embedding log
  - community_operations.py: replace full object dump with uuid and edge count
  - search/search.py: remove user query from search latency log

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: convert embedding log timing from seconds to milliseconds

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
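The logging changes in the commits above (UUID and character count instead of names or fact text, timing in milliseconds) could look roughly like the sketch below. The function name `embedding_log_message` and the message wording are hypothetical and are not taken from the repository.

```python
import logging
from time import perf_counter

logger = logging.getLogger(__name__)


def embedding_log_message(uuid: str, text: str, elapsed_seconds: float) -> str:
    """Build a debug message that identifies the object without leaking its content.

    Hypothetical helper: only the UUID and the character count are logged,
    and the elapsed time is converted from seconds to milliseconds.
    """
    elapsed_ms = elapsed_seconds * 1000
    return f'embedded text for {uuid} ({len(text)} chars) in {elapsed_ms:.1f} ms'


start = perf_counter()
# ... the real embedding call would run here ...
logger.debug(embedding_log_message('node-uuid-placeholder', 'Alice Example', perf_counter() - start))
```

The point of the design is that a UUID lets a maintainer look the object up in the graph when needed, while the log line itself carries no user-supplied text.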
The Docker images were pinned to graphiti-core 0.23.1, which is 4 months behind the current release. This updates all Dockerfiles and compose files to default to 0.28.1. Also fixes the sed version-replacement patterns which only matched >= but the pyproject.toml uses ==, so the build-arg override was silently failing. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
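The failure mode described above (a replacement pattern that matches only `>=` while the file pins with `==`) can be illustrated with a small Python sketch. It uses `re` rather than the repository's actual sed expressions, which are not shown here; the function name `override_version` and the sample requirement strings are assumptions.

```python
import re

# Match "graphiti-core", optional extras in brackets, then either operator.
# A pattern that accepted only ">=" would leave "==" pins untouched and
# the override would silently do nothing.
_SPEC = re.compile(r'(graphiti-core(?:\[[^\]]*\])?(?:>=|==))[0-9A-Za-z.]+')


def override_version(pyproject_text: str, version: str) -> str:
    """Replace the pinned graphiti-core version, keeping the original operator."""
    return _SPEC.sub(lambda m: f'{m.group(1)}{version}', pyproject_text)


print(override_version('dependencies = ["graphiti-core==0.23.1"]', '0.28.1'))
```

A callable replacement is used instead of a backreference string so that a version beginning with a digit cannot be misread as part of the group number.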
* feat: add GLiNER2 hybrid LLM client

  Implements GLiNER2Client, a hybrid LLM client that uses GLiNER2 (lightweight extraction model) for entity and relation extraction while delegating reasoning tasks (deduplication, summarization, community operations) to a secondary LLMClient.

  Key features:
  - Local CPU-friendly extraction using GLiNER2
  - Message parsing to extract entity types, relations, and text from Graphiti prompts
  - Response-model-based dispatch (ExtractedEntities/ExtractedEdges → GLiNER2, others → delegated LLM)
  - Support for both local and API-based GLiNER2 modes
  - Full async integration via asyncio.to_thread()

  Includes example usage in examples/gliner2/ with Neo4j integration.

  Dependencies: gliner2>=1.2.0 (optional)

* fix: address code review feedback on GLiNER2 client

  - Use _generate_response_with_retry() for tenacity retry support
  - Case-insensitive entity matching in relation filtering
  - Add DEBUG logging for filtered relations
  - Remove redundant env var defaults in example
  - Add docstring note about synchronous model loading
  - Clarify token estimation is approximate

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add python_version>=3.11 marker to gliner2 dependency

  onnxruntime 1.24.2 (transitive dep of gliner2) dropped Python 3.10 support. CI runs `uv sync --all-extras` on Python 3.10, causing all jobs to fail. Adding a version marker ensures the gliner2 extra is only resolved on Python 3.11+.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: expand example with longer texts and multilingual episodes

  - Add detailed English political biography and mortgage settlement text
  - Add Spanish, French, and Portuguese episodes with overlapping entities (Kamala Harris, California, San Francisco, Gavin Newsom)
  - Expand JSON metadata with additional fields
  - Add multiple search queries to demonstrate retrieval
  - Fix pyright errors: use typing.Any for model type (GLiNER2 vs GLiNER2API)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: delegate edge extraction and summarization to LLM client

  GLiNER2 extracts structured triples (head, relation_type, tail) but cannot generate natural-language facts, temporal bounds, or proper relation types. This produced low-quality facts like "Kamala Harris related to San Francisco".

  Now GLiNER2 only handles entity extraction (ExtractedEntities). All other pipeline operations (edge/relation extraction, node summary, deduplication) are delegated to the LLM client which generates proper facts paraphrased from source text.

  Removed: _handle_relation_extraction, _extract_entity_names, _extract_relation_types, _EDGE_EXTRACTION_MODEL.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: parse Python repr entity types from prompt templates

  The prompt templates interpolate entity_types as Python list[dict] directly (str()), producing Python repr with single quotes and None rather than valid JSON. json.loads() fails on this format. Now tries json.loads first, then falls back to ast.literal_eval.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add custom entity types, extraction latency tracking, and detailed output

  - Add Person, Organization, Location, Initiative entity types to example
  - Pass entity_types to add_episode() for typed GLiNER2 extraction
  - Track extraction latencies in GLiNER2Client.extraction_latencies
  - Print extracted entities, edges, attributes, and summaries per episode
  - Print latency summary (mean/min/max/total) at end of example
  - Use gpt-5.2 with reasoning='none' in example

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: switch example from OpenAI to Gemini for LLM and embeddings

  - Replace OpenAIClient with GeminiClient (gemini-2.5-flash-lite)
  - Replace default OpenAI embedder with GeminiEmbedder (gemini-embedding-001)
  - Example now uses GOOGLE_API_KEY only (no OpenAI dependency)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: sort imports in gliner2 example

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add README, raise threshold to 0.7, use gliner2-large-v1 default

  - Add examples/gliner2/README.md with GLiNER2 repo, paper, and HuggingFace links
  - Mark GLiNER2Client as experimental
  - Document swappable LLM/embedding providers
  - Raise extraction threshold from 0.5 to 0.7 to reduce spurious entities
  - Switch default model to gliner2-large-v1
  - Update .env.example for Gemini (GOOGLE_API_KEY)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
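The entity-type parsing fix in the GLiNER2 commits above (try `json.loads`, then fall back to `ast.literal_eval` for Python repr text) might be sketched as follows. The function name `parse_entity_types` and the sample input are hypothetical; the actual client code may be structured differently.

```python
import ast
import json
from typing import Any


def parse_entity_types(raw: str) -> Any:
    """Parse entity types that may arrive as JSON or as a Python repr string.

    json.loads rejects single-quoted keys and the literal None, so a repr such
    as "[{'name': 'Person', 'description': None}]" needs ast.literal_eval,
    which safely evaluates Python literals without executing code.
    """
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return ast.literal_eval(raw)


print(parse_entity_types("[{'name': 'Person', 'description': None}]"))
```

`ast.literal_eval` is preferred over `eval` here because the text comes out of a prompt and should never be executed as code. Input that is neither valid JSON nor a Python literal still raises, which this sketch leaves to the caller.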
* refresh readme content
* remove readme roadmap section
* fix readme review issues
* restore readme title block
* center readme badges
* harden search filter inputs
* validate entity node labels on save
* tighten security regression coverage
* Bump graphiti-core version to 0.28.2

  Update version across pyproject.toml, MCP server, server, Docker configs, and root lock file. MCP server and server lock files will need regeneration after PyPI publish.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Revert MCP server version bump until release

  MCP server depends on graphiti-core from PyPI, so the version bump should happen after the 0.28.2 release is published.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Revert server graphiti-core requirement bump until release

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Updates mcp-server version to 1.0.2 and bumps graphiti-core dependency to >=0.28.2 to address security vulnerability (Cypher injection hardening added in 0.28.2). Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
Add a prominent 'We're Hiring!' callout to the README promoting open Engineer and Developer Relations positions at Zep, linking to the careers page.
* zep upstream

* Remove Kuzu from test infrastructure and internal Go reference

  Kuzu is being deprecated, so remove it from the test driver list and all Kuzu-specific test skips. Also remove a comment referencing an internal Go file path that should not be in the public repo.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bumps the uv group with 1 update in the / directory: [langchain-core](https://github.com/langchain-ai/langchain).
Bumps the uv group with 2 updates in the /mcp_server directory: [langchain-core](https://github.com/langchain-ai/langchain) and [cryptography](https://github.com/pyca/cryptography).

Updates `langchain-core` from 1.2.12 to 1.2.22
- [Release notes](https://github.com/langchain-ai/langchain/releases)
- [Commits](langchain-ai/langchain@langchain-core==1.2.12...langchain-core==1.2.22)

Updates `langchain-core` from 1.2.12 to 1.2.22
- [Release notes](https://github.com/langchain-ai/langchain/releases)
- [Commits](langchain-ai/langchain@langchain-core==1.2.12...langchain-core==1.2.22)

Updates `cryptography` from 46.0.5 to 46.0.6
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](pyca/cryptography@46.0.5...46.0.6)

---
updated-dependencies:
- dependency-name: langchain-core
  dependency-version: 1.2.22
  dependency-type: indirect
  dependency-group: uv
- dependency-name: langchain-core
  dependency-version: 1.2.22
  dependency-type: indirect
  dependency-group: uv
- dependency-name: cryptography
  dependency-version: 46.0.6
  dependency-type: indirect
  dependency-group: uv
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* feat: add automated PR triage system with evaluation rubric

  Add a Claude Code-powered PR triage workflow that evaluates incoming PRs against Graphiti's project principles and produces structured priority assessments. This helps maintainers quickly identify high-value PRs among the 128+ open PRs.

  Components:
  - .github/prompts/pr-triage.md: Evaluation rubric covering 5 dimensions (category, quality, alignment, slop detection, impact) with structured JSON output and human-readable PR comments
  - .github/workflows/pr-triage.yml: GitHub Action with three trigger modes: auto on PR open (pull_request_target), manual single-PR dispatch, and batch mode for all open PRs
  - .github/scripts/setup-triage-labels.sh: One-time label creation script

  Security mitigations for fork PRs:
  - Uses pull_request_target (never checks out fork code)
  - Reads diffs only via gh pr diff (GitHub API, text only)
  - Strict tool allowlist (no arbitrary Bash execution)
  - Post-step label validation removes unexpected labels
  - Explicit prompt injection warnings in evaluation prompt

  https://claude.ai/code/session_01VJPHGChKzqPEThSkPw7sqd

* harden PR triage: diff size limits, re-triage, injection defense, timeouts

  - Add diff size limit (>5000 lines skips triage, applies needs-rfc label)
  - Add synchronize trigger so updated PRs get re-triaged automatically
  - Remove stale triage labels before re-evaluation
  - Add --append-system-prompt with injection defense at system level
  - Add --max-turns (30 for single PR, 500 for batch) to prevent runaway loops
  - Add timeout-minutes: 360 to batch job
  - Gate validation step behind diff size check
  - Add wc to batch job allowed tools for diff size checking

  https://claude.ai/code/session_01VJPHGChKzqPEThSkPw7sqd

* skip triage for maintainer PRs using fork check

  Add check-fork job (same pattern as claude-code-review.yml) to skip triage on PRs from getzep/graphiti (non-fork = maintainer). Only fork PRs from external contributors get auto-triaged.

  - Auto-trigger (pull_request_target): gated on is_fork == true
  - Manual dispatch: always runs (maintainers can triage any PR)
  - Batch mode: filters out PRs where headRepository is getzep/graphiti
  - Uses always() so triage job runs even when check-fork is skipped (workflow_dispatch events skip the check-fork job)

  https://claude.ai/code/session_01VJPHGChKzqPEThSkPw7sqd

* prioritize bug fixes, require RFC for all new features/integrations

  Update triage rubric:
  - Bug fixes to existing functionality are now top priority (HIGH)
  - New features and integrations (drivers, LLM providers, embedders) require a linked RFC issue regardless of PR size
  - PRs adding new integrations without RFC get request-rfc action
  - Alignment check updated: has_rfc_if_needed applies to all features, not just >500 LOC PRs

  https://claude.ai/code/session_01VJPHGChKzqPEThSkPw7sqd

* docs: update CONTRIBUTING.md with RFC and priority rules

  - Bug fixes to existing functionality are the top priority
  - All new features and integrations (drivers, LLM providers, embedders) require an RFC issue before submitting a PR, not just >500 LOC changes
  - PRs without a linked RFC will be tagged needs-rfc and not reviewed

  https://claude.ai/code/session_01VJPHGChKzqPEThSkPw7sqd

* enable code review for fork PRs via pull_request_target

  Switch claude-code-review.yml from pull_request to pull_request_target so fork PRs get automatic code review with access to ANTHROPIC_API_KEY.

  Security model (same as pr-triage.yml):
  - Always check out the BASE repo, never the fork
  - Read diffs only via gh pr diff (GitHub API, text only)
  - Strict tool allowlist (no arbitrary Bash execution)
  - --append-system-prompt marks all PR content as untrusted
  - --max-turns 30 to prevent runaway loops
  - Explicit prompt injection warnings

  Changes to claude-code-review.yml:
  - pull_request -> pull_request_target (enables fork PR reviews)
  - Removed check-fork job (all PRs reviewed, not just internal)
  - Added concurrency group to prevent duplicate reviews
  - Switched to direct_prompt with security rules
  - Added tool restrictions matching triage workflow

  Changes to claude-code-review-manual.yml:
  - Removed unsafe `gh pr checkout` (was executing fork code)
  - Now checks out base repo and reads diff via API
  - Added same security hardening (tool allowlist, injection defense)
  - Replaced actions/github-script with simpler gh pr comment

  https://claude.ai/code/session_01VJPHGChKzqPEThSkPw7sqd

* add priority and RFC rules to triage prompt context section

  The Contribution Requirements section at the top of the triage prompt was missing the updated rules (bug fix priority, RFC for all new features/integrations). Added them so Claude sees these rules in the initial context, not just in the evaluation logic later.

  https://claude.ai/code/session_01VJPHGChKzqPEThSkPw7sqd

---------

Co-authored-by: Claude <noreply@anthropic.com>
---
updated-dependencies:
- dependency-name: aiohttp
  dependency-version: 3.13.4
  dependency-type: indirect
  dependency-group: uv
- dependency-name: aiohttp
  dependency-version: 3.13.4
  dependency-type: indirect
  dependency-group: uv
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…#1370)

fix: use prompt instead of direct_prompt in all workflows

direct_prompt is not a valid input for anthropics/claude-code-action@v1. The correct input is prompt. This caused all three workflows to receive empty instructions, making Claude do nothing useful.

Fixed in:
- .github/workflows/pr-triage.yml (2 occurrences)
- .github/workflows/claude-code-review.yml (1 occurrence)
- .github/workflows/claude-code-review-manual.yml (1 occurrence)

https://claude.ai/code/session_01VJPHGChKzqPEThSkPw7sqd

Co-authored-by: Claude <noreply@anthropic.com>
…ep#1372)

Slop detection changes:
- tests-missing alone is not slop; slop is the combination of overarchitected code + verbose/unfocused description + no tests
- Added overarchitected and verbose-unfocused-description as signals
- Replaced boilerplate-description with more specific signal
- Updated slop-detected label to require the combination, not just 3+ arbitrary signals

Triage comment changes:
- Added "Note to Author" section that tells the PR author exactly what they need to do to comply with CONTRIBUTING.md (missing RFC, missing tests, slop rework, etc.)
- Updated needs-rfc label to apply for features/integrations without RFC, not just >500 LOC

https://claude.ai/code/session_01VJPHGChKzqPEThSkPw7sqd

Co-authored-by: Claude <noreply@anthropic.com>
Neo4j was crashing when entity/edge attributes contained nested structures (Maps of Lists, Lists of Maps) because attributes were being spread as individual properties instead of serialized to JSON strings.

Changes:
- Serialize attributes to JSON for Neo4j (like Kuzu already does)
- Update read path to handle both JSON strings and legacy dict format
- Add integration tests for nested attribute structures
- Maintain backward compatibility with existing code

Fixes issue where LLM extraction with complex structured attributes would cause:
Neo.ClientError.Statement.TypeError - Property values can only be of primitive types or arrays thereof.

Modified Files:
- graphiti_core/utils/bulk_utils.py: Serialize attributes for Neo4j
- graphiti_core/nodes.py: Handle JSON string attributes in read path
- graphiti_core/edges.py: Handle JSON string attributes in read path
- graphiti_core/models/nodes/node_db_queries.py: Use n.attributes for Neo4j
- graphiti_core/models/edges/edge_db_queries.py: Use e.attributes for Neo4j

New Files:
- tests/test_neo4j_nested_attributes_int.py: Integration tests
- docs/neo4j-attributes-fix.md: Comprehensive documentation
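As a rough illustration of the write-path change described above, the sketch below serializes a nested `attributes` dict to a single JSON string before it is handed to the driver, so Neo4j only ever sees a primitive string property. The function name `prepare_entity_for_neo4j` and the shape of the entity dict are assumptions, not the code in graphiti_core/utils/bulk_utils.py.

```python
import json
from typing import Any


def prepare_entity_for_neo4j(entity: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `entity` whose `attributes` value is a JSON string.

    Neo4j properties must be primitives or arrays of primitives, so a nested
    value such as {'tags': [{'k': 'v'}]} cannot be stored as a property.
    A JSON string can.
    """
    data = dict(entity)  # shallow copy so the caller's dict is not mutated
    data['attributes'] = json.dumps(data.get('attributes') or {})
    return data


node = {'uuid': 'n1', 'name': 'Example', 'attributes': {'tags': [{'k': 'v'}]}}
print(prepare_entity_for_neo4j(node)['attributes'])
```

Storing one string property trades per-attribute queryability in Cypher for the ability to hold arbitrarily nested structures.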
…e behavior

Issues fixed:
1. Only serialize attributes for Neo4j, not FalkorDB/Neptune
2. Maintain backward compatibility with existing Neo4j data

Changes:
- Write path: Use elif to specifically target Neo4j only
- Query path: Use COALESCE and return both n.attributes and properties(n)
- Read path: Try JSON string first, fall back to spread properties
- FalkorDB/Neptune: Restore original spread behavior

This ensures:
- New Neo4j nodes: attributes as JSON string (supports nesting)
- Old Neo4j nodes: attributes spread as properties (backward compatible)
- FalkorDB/Neptune: unchanged behavior (no breaking changes)
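The read-path rule above (try the JSON string first, fall back to spread properties) might look roughly like this. The function name `read_attributes` and the `RESERVED_KEYS` set are hypothetical; the real code in graphiti_core/nodes.py and graphiti_core/edges.py may separate core fields from attributes differently.

```python
import json
from typing import Any

# Hypothetical set of core fields that are never treated as attributes.
RESERVED_KEYS = {'uuid', 'name', 'group_id', 'created_at', 'attributes'}


def read_attributes(record: dict[str, Any]) -> dict[str, Any]:
    """Recover attributes from either the new or the legacy storage layout."""
    raw = record.get('attributes')
    if isinstance(raw, str):
        # New layout: attributes stored as one JSON string.
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            parsed = None
        if isinstance(parsed, dict):
            return parsed
    if isinstance(raw, dict):
        # Already a dict (for example from a backend that returns maps).
        return dict(raw)
    # Legacy layout: attributes were spread as individual properties.
    return {k: v for k, v in record.items() if k not in RESERVED_KEYS}


print(read_attributes({'uuid': 'n1', 'name': 'A', 'attributes': '{"tags": [{"k": "v"}]}'}))
print(read_attributes({'uuid': 'n2', 'name': 'B', 'age': 30}))
```

Checking the JSON string before the spread properties means existing nodes keep working unchanged while newly written nodes can carry nested values.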
Pin all workflow actions to full-length commit SHAs

Pin all 44 external action references across 13 workflow files to full-length commit SHAs for supply chain security, preventing compromised tags from injecting malicious code. Original version tags are preserved as inline comments for readability.

https://claude.ai/code/session_01QfWs95xMGKUKGH5ppgNGgh

Co-authored-by: Claude <noreply@anthropic.com>
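One way to check the pinning policy described above is to scan workflow text for `uses:` references that do not end in a 40-character hexadecimal commit SHA. This is a minimal sketch and not part of the PR; the function name `unpinned_actions` is hypothetical, it assumes unquoted `uses:` values, and the sample SHA is a placeholder.

```python
import re

_USES = re.compile(r'^\s*(?:-\s*)?uses:\s*(\S+)')
_SHA_REF = re.compile(r'@[0-9a-f]{40}$')


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return every external `uses:` reference not pinned to a full commit SHA."""
    unpinned = []
    for line in workflow_text.splitlines():
        match = _USES.match(line)
        if not match:
            continue
        ref = match.group(1)  # a trailing "# v4" comment is not captured
        if ref.startswith('./'):
            continue  # local actions carry no version ref
        if not _SHA_REF.search(ref):
            unpinned.append(ref)
    return unpinned


sample = '\n'.join([
    'steps:',
    '  - uses: actions/checkout@v4',
    '  - uses: actions/checkout@' + 'a' * 40 + ' # v4',
])
print(unpinned_actions(sample))
```

A tag such as `v4` can be moved to point at different code after the fact, whereas a full commit SHA cannot, which is why only the second reference in the sample passes.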
remove OIDC and harden tool allowlist
Summary
Brief description of the changes in this PR.
Type of Change
Objective
For new features and performance improvements: Clearly describe the objective and rationale for this change.
Testing
Breaking Changes
If this is a breaking change, describe:
Checklist
make lint passes
Related Issues
Closes #[issue number]