Simple Dart library for string similarity and substring search. Pure Dart, null-safe, with configurable engines and caching for repeated comparisons.
- Similarity metrics: Levenshtein, Jaro-Winkler, Cosine, Jaccard, and more.
- Search algorithms: KMP, Boyer-Moore, Rabin-Karp, and a standard wrapper.
- Instance-based engines with configurable normalization and caching.
- Compiled patterns for efficient repeated substring searches.
- Extension methods for ergonomic usage on String.
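
As a rough illustration of what a similarity score represents, here is a minimal, standalone sketch of a normalized Levenshtein similarity. It is not the package's implementation: the function names `levenshteinDistance` and `levenshteinSimilarity` are hypothetical, and the normalization `1 - distance / max(length)` is an assumption that may differ from what the library computes.

```dart
import 'dart:math' as math;

/// Two-row dynamic-programming Levenshtein distance (illustrative only).
int levenshteinDistance(String a, String b) {
  if (a.isEmpty) return b.length;
  if (b.isEmpty) return a.length;

  // `previous` is the DP row for the shorter prefix of `a`;
  // `current` is the row being filled in.
  var previous = List<int>.generate(b.length + 1, (j) => j);
  var current = List<int>.filled(b.length + 1, 0);

  for (var i = 1; i <= a.length; i++) {
    current[0] = i;
    for (var j = 1; j <= b.length; j++) {
      final cost = a.codeUnitAt(i - 1) == b.codeUnitAt(j - 1) ? 0 : 1;
      current[j] = math.min(
        math.min(current[j - 1] + 1, previous[j] + 1), // insert / delete
        previous[j - 1] + cost, // substitute or match
      );
    }
    // Reuse the two buffers instead of allocating a new row.
    final swap = previous;
    previous = current;
    current = swap;
  }
  return previous[b.length];
}

/// Assumed normalization: 1.0 means identical, 0.0 means nothing in common.
double levenshteinSimilarity(String a, String b) {
  final longest = math.max(a.length, b.length);
  if (longest == 0) return 1.0;
  return 1.0 - levenshteinDistance(a, b) / longest;
}

void main() {
  print(levenshteinDistance('kitten', 'sitting')); // 3
  print(levenshteinSimilarity('kitten', 'sitting').toStringAsFixed(3)); // 0.571
}
```

Under this assumed normalization, the score always falls between 0 and 1, which is what makes a threshold such as a minimum score meaningful across strings of different lengths.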
Add this to your pubspec.yaml:

dependencies:
  string_search_algorithms: ^1.0.1

import 'package:string_search_algorithms/string_search_algorithms.dart';

void main() {
  final score = StringSimilarity.compare(
    'Dwayne',
    'Duane',
    algorithm: SimilarityAlgorithm.jaroWinkler,
  );
  print('Similarity: $score');

  final index = StringSearch.indexOf(
    'The quick brown fox jumps over the lazy dog',
    'brown',
    algorithm: SearchAlgorithm.boyerMoore,
  );
  print('Index: $index');
}

final score = StringSimilarity.compare(
  'kitten',
  'sitting',
  algorithm: SimilarityAlgorithm.levenshtein,
);

final details = StringSimilarity.compareWithDetails(
  'Dwayne',
  'Duane',
  algorithm: SimilarityAlgorithm.jaroWinkler,
);

Use StringSimilarityEngine for per-instance configuration and caching.

final engine = StringSimilarityEngine(
  options: const SimilarityOptions(
    normalization: NormalizationOptions(
      toLowerCase: true,
      removeAccents: true,
      removeSpecialChars: true,
      trimWhitespace: true,
    ),
    cache: CacheOptions(
      enabled: true,
      normalizedCapacity: 1000,
      bigramCapacity: 1000,
      ngramCapacity: 1000,
    ),
    algorithms: AlgorithmOptions(
      ngramSize: 3,
      tverskyAlpha: 0.5,
      tverskyBeta: 0.5,
      jaroWinklerPrefixScale: 0.1,
      jaroWinklerBoostThreshold: 0.7,
    ),
  ),
);

final score = engine.compare(
  'Cafe!',
  'cafe',
  algorithm: SimilarityAlgorithm.levenshtein,
);

final candidates = ['apple', 'banana', 'orange', 'grape'];
final matches = StringSimilarity.findMatches(
  'appel',
  candidates,
  minScore: 0.5,
);

for (final match in matches) {
  print('${match.value}: ${match.score}');
}

final text = 'The quick brown fox jumps over the lazy dog';
final index = StringSearch.indexOf(
  text,
  'brown',
  algorithm: SearchAlgorithm.boyerMoore,
);

final pattern = StringSearch.compile(
  'fox',
  algorithm: SearchAlgorithm.kmp,
);

if (pattern.containsIn(text)) {
  print('Found!');
}

for (final match in pattern.findAllIn(text)) {
  print('Found at ${match.index}');
}

- NormalizationOptions controls trimming, case folding, accent removal, and custom preprocessors/postprocessors.
- CacheOptions sizes the normalized string, bigram, and n-gram caches.
- AlgorithmOptions tunes Jaro-Winkler, N-gram size, and Tversky parameters.

See the API docs for full option details.
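
To make the interplay between normalization and caching more concrete, here is a small standalone sketch. Everything in it is hypothetical and written for illustration: `normalize` and `LruCache` are not names from this package, the normalizer is ASCII-only and does not attempt accent removal, and least-recently-used eviction is an assumption about how a capacity-bounded cache might behave rather than a description of the library's internals.

```dart
import 'dart:collection';

/// Hypothetical normalizer: trims, lowercases, and drops every character
/// that is not an ASCII letter, digit, or whitespace.
String normalize(String input) {
  final trimmed = input.trim().toLowerCase();
  return trimmed.replaceAll(RegExp(r'[^a-z0-9\s]'), '');
}

/// Hypothetical capacity-bounded cache with least-recently-used eviction.
class LruCache<K, V> {
  LruCache(this.capacity);

  final int capacity;

  // LinkedHashMap iterates in insertion order, so the first key is the oldest.
  final LinkedHashMap<K, V> _entries = LinkedHashMap<K, V>();

  V putIfAbsent(K key, V Function() compute) {
    final existing = _entries.remove(key);
    if (existing != null) {
      _entries[key] = existing; // re-insert to mark as most recently used
      return existing;
    }
    final value = compute();
    _entries[key] = value;
    if (_entries.length > capacity) {
      _entries.remove(_entries.keys.first); // evict the oldest entry
    }
    return value;
  }

  int get length => _entries.length;
}

void main() {
  final cache = LruCache<String, String>(2);
  var computations = 0;

  String cached(String s) => cache.putIfAbsent(s, () {
        computations++;
        return normalize(s);
      });

  print(cached('  Cafe! ')); // cafe
  print(cached('  Cafe! ')); // cafe (served from the cache)
  print(computations); // 1
}
```

The point of the sketch is that repeated comparisons against the same inputs only pay the normalization cost once, and a capacity setting keeps memory use bounded when many distinct strings pass through.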
Benchmark scripts live in benchmark/:

dart run benchmark/similarity_benchmark.dart
dart run benchmark/search_benchmark.dart

API docs will be available on pub.dev: https://pub.dev/documentation/string_search_algorithms/latest/
Contributions are welcome. Please read CONTRIBUTING.md and open a PR.
Licensed under the MIT License. See LICENSE.