A Clojure utility for analyzing and resolving duplicate nodes in Knowledge Graph Exchange (KGX) formatted knowledge graphs, with Tablassert integration and Biolink Model compliance. It updates Tablassert-integrated "table_configs" through a CLI.
- Duplicate Detection: Identifies duplicate entries in TSV files
- Interactive Resolution: CLI prompts for handling conflicts
- YAML Configuration: Generates config files to prevent future duplicates
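To make the duplicate detection idea concrete, here is a minimal sketch of how repeated node IDs could be found in a KGX-style nodes TSV. It is not the code in `processing/duplicates.clj`; the function names `parse-tsv` and `find-duplicates`, the choice of `id` as the duplicate key, and the sample rows are all assumptions made for illustration.

```clojure
(require '[clojure.string :as str])

(defn parse-tsv
  "Hypothetical helper: turns TSV text into a seq of maps keyed by the header row."
  [text]
  (let [[header & rows] (map #(str/split % #"\t" -1) (str/split-lines text))]
    (map #(zipmap header %) rows)))

(defn find-duplicates
  "Hypothetical helper: returns a map of key value -> rows for every
  value of column `k` that appears in more than one row."
  [k rows]
  (into {}
        (filter (fn [[_ group]] (> (count group) 1)))
        (group-by #(get % k) rows)))

;; Illustrative sample data only, using the KGX node columns id, name, category.
(def sample-nodes
  (str "id\tname\tcategory\n"
       "CHEBI:15377\twater\tbiolink:SmallMolecule\n"
       "MONDO:0005148\ttype 2 diabetes mellitus\tbiolink:Disease\n"
       "CHEBI:15377\tH2O\tbiolink:SmallMolecule\n"))

;; Rows sharing an id are grouped together; unique ids are dropped.
(def duplicates (find-duplicates "id" (parse-tsv sample-nodes)))
```

Grouping by a single column keeps the sketch short. A real resolver would presumably also need to decide which of the conflicting rows to keep, which is where the interactive prompts come in.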
- Clojure 1.10+
- Java JDK 8+
- Leiningen
For arguments...
# With Leiningen
lein run -h
To resolve duplicates...
# With Leiningen
lein run -n nodes.tsv -e edges.tsv
To test the application...
# With Leiningen
lein test

curie-clean/
├── src/
│   └── duplicate_utility/
│       ├── io/
│       │   ├── tsv.clj
│       │   └── yaml.clj
│       ├── processing/
│       │   ├── duplicates.clj
│       │   └── resolution.clj
│       ├── core.clj
│       └── validation.clj
└── test/
    └── duplicate_utility/
        ├── io/
        │   ├── tsv_test.clj
        │   └── yaml_test.clj
        ├── processing/
        │   ├── duplicates_test.clj
        │   └── resolution_test.clj
        ├── core_test.clj
        ├── test_utils.clj
        └── validation_test.clj