feat: semantic deduplication #1

Open

kcelestinomaria opened on Nov 11, 2025

Add embedding-based near-duplicate detection using FAISS or Annoy, so that rows that are redundant in meaning but not textually identical can be detected.

A configurable "similarity score" threshold should decide which rows count as redundant and get removed.
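
A rough sketch of the thresholding idea, to make the request concrete. It is not a proposed implementation: it uses a brute-force cosine-similarity scan in plain Python where FAISS or Annoy would do the neighbor search, and the names `dedupe_rows` and `cosine_similarity`, the default threshold of 0.95, and the toy two-dimensional embeddings are all assumptions made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def dedupe_rows(rows, embeddings, threshold=0.95):
    """Keep the first row of each near-duplicate group.

    A row is dropped when its embedding has cosine similarity >= threshold
    with any row that was already kept. The inner scan is brute force
    (hypothetical stand-in for a FAISS or Annoy nearest-neighbor query).
    """
    kept_indices = []
    for i, emb in enumerate(embeddings):
        is_duplicate = any(
            cosine_similarity(emb, embeddings[j]) >= threshold
            for j in kept_indices
        )
        if not is_duplicate:
            kept_indices.append(i)
    return [rows[i] for i in kept_indices]


# Toy data: the embeddings are invented, not produced by a real model.
rows = ["red apple", "a red apple", "blue car"]
embeddings = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]]
unique_rows = dedupe_rows(rows, embeddings, threshold=0.95)
print(unique_rows)
```

With these toy vectors the second row is close enough to the first to be dropped, while the third is kept. Raising the threshold makes the filter stricter about what counts as a duplicate, so fewer rows are removed; lowering it removes more.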
