Merged
34 commits
bd2686f
Merge pull request #25 from Build5Nines/main
crpietschmann Feb 22, 2025
2e8e1df
added database info (schema, version) to exported vector database file
crpietschmann Feb 22, 2025
2fa0262
add another save/load test
crpietschmann Feb 22, 2025
b142351
added BenchmarkDotNet tests for save/load vector database
crpietschmann Feb 22, 2025
c923da7
Update build-release.yml
crpietschmann Feb 22, 2025
767b220
Update build-release.yml
crpietschmann Feb 22, 2025
f41cc5b
Update build-release.yml
crpietschmann Feb 22, 2025
94dd8da
Update build-release.yml
crpietschmann Feb 22, 2025
d28d347
Update build-release.yml
crpietschmann Feb 22, 2025
b68a6d9
Update build-release.yml
crpietschmann Feb 22, 2025
139330f
Update build-release.yml
crpietschmann Feb 22, 2025
4d3d2c5
Update build-release.yml
crpietschmann Feb 22, 2025
8cf70e1
Update build-release.yml
crpietschmann Feb 22, 2025
8223928
output perf results to GitHub Action step summary
crpietschmann Feb 22, 2025
d2cd4c6
add CHANGELOG.md
crpietschmann Feb 22, 2025
513e67c
Update build-release.yml
crpietschmann Feb 22, 2025
d28682f
Update build-release.yml
crpietschmann Feb 22, 2025
4b57bd6
Update build-release.yml
crpietschmann Feb 22, 2025
8eac458
Update build-release.yml
crpietschmann Feb 22, 2025
e740b9e
Update Build5Nines.SharpVector.csproj
crpietschmann Feb 22, 2025
5964a34
Update build-release.yml
crpietschmann Feb 22, 2025
f780fe4
Update VectorDatabaseTests.cs
crpietschmann Feb 22, 2025
7c72b5c
Update VectorDatabaseTests.cs
crpietschmann Feb 22, 2025
72e2420
Add DatabaseFile.Load static methods and DatabaseFile.LoadDatabaseInf…
crpietschmann Feb 22, 2025
8f85142
add code coverage
crpietschmann Feb 22, 2025
40a1bf2
Update dotnet-tests.yml
crpietschmann Feb 22, 2025
e374fbf
Update dotnet-tests.yml
crpietschmann Feb 22, 2025
16798e7
Update dotnet-tests.yml
crpietschmann Feb 22, 2025
8137925
added a couple Chinese character unit tests
crpietschmann Feb 22, 2025
923af23
Comment out the Chinese character tests for now due to issue #8
crpietschmann Feb 22, 2025
1b84a65
Found a fix for Chinese language/character support (#8)
crpietschmann Feb 23, 2025
ff0c1c7
update references for SharpVector 2.0.0
crpietschmann Feb 23, 2025
21ebf9d
Update OnnxRuntime references (think I have it working)
crpietschmann Feb 23, 2025
dbc8776
Update SharpVector.OpenAI to 2.0.0 with save/load functionality support
crpietschmann Feb 23, 2025
18 changes: 17 additions & 1 deletion .github/workflows/build-release.yml
Original file line number Diff line number Diff line change
@@ -4,9 +4,11 @@ on:
push:
branches:
- main
- dev
pull_request:
branches:
- main
- dev
workflow_dispatch:

jobs:
@@ -30,6 +32,9 @@ jobs:

- name: Build
run: dotnet build --configuration Release --no-restore

- name: Performance Test
run: dotnet run --project SharpVectorPerformance --configuration Release

# - name: Publish
# run: dotnet publish --configuration Release --output ./publish --no-build
@@ -40,7 +45,18 @@ jobs:
# name: release-build
# path: ./publish

- name: Upload artifact
- name: Performance Results
run: |
echo "## Performance Results" > $GITHUB_STEP_SUMMARY
cat ./BenchmarkDotNet.Artifacts/results/SharpVectorPerformance.MemoryVectorDatabasePerformance-report-github.md >> $GITHUB_STEP_SUMMARY

- name: Upload Performance artifact
uses: actions/upload-artifact@v4
with:
name: performance-results
path: './src/BenchmarkDotNet.Artifacts/*'

- name: Upload Nuget artifact
uses: actions/upload-artifact@v4
with:
name: nuget-package
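The `Performance Results` step above writes a heading to `$GITHUB_STEP_SUMMARY` and then appends the BenchmarkDotNet GitHub-flavored Markdown report. A minimal sketch of the same idea as a standalone script follows. It assumes the report path is relative to the step's working directory, which this diff does not show. The `write_summary` helper, the missing-report fallback message, and the `mktemp` fallback for local runs are illustrative additions and not part of the workflow.

```shell
#!/bin/sh
set -eu

# Hypothetical helper (not in the workflow): write a heading plus the
# benchmark report to a summary file, or a short note if the report is missing.
write_summary() {
  report_path="$1"
  summary_path="$2"
  echo "## Performance Results" >> "$summary_path"
  if [ -f "$report_path" ]; then
    cat "$report_path" >> "$summary_path"
  else
    echo "_No benchmark report found at ${report_path}_" >> "$summary_path"
  fi
}

# Report path copied from the workflow step; override it with the first argument.
report="${1:-./BenchmarkDotNet.Artifacts/results/SharpVectorPerformance.MemoryVectorDatabasePerformance-report-github.md}"

# GITHUB_STEP_SUMMARY is set by the Actions runner; fall back to a temp file
# so the script can also be tried locally (an assumption for illustration).
summary="${GITHUB_STEP_SUMMARY:-$(mktemp)}"

write_summary "$report" "$summary"
```

Guarding on the report file keeps the job summary informative instead of failing the step with a `cat` error when the benchmark output lands somewhere else, for example under `./src/`, the location the upload step reads from.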
15 changes: 13 additions & 2 deletions .github/workflows/dotnet-tests.yml
@@ -1,9 +1,14 @@
name: .NET Core Tests

on:
push:
branches:
- main
- dev
pull_request:
branches:
- main
- dev
workflow_dispatch:

jobs:
@@ -28,5 +33,11 @@ jobs:
- name: Build
run: dotnet build --no-restore

- name: Run tests
run: dotnet test --no-build --verbosity normal
- name: Run tests with code coverage
run: dotnet test --no-build --verbosity normal --results-directory "./TestResults/Coverage/" --collect:"XPlat Code Coverage"

- name: Upload test results artifact
uses: actions/upload-artifact@v4
with:
name: test-results
path: '**/TestResults/**'
3 changes: 3 additions & 0 deletions .gitignore
@@ -3,3 +3,6 @@ obj
bin

.DS_Store

BenchmarkDotNet.Artifacts/
TestResults/
66 changes: 66 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,66 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## v2.0.0

Added:

- Add data persistence capability to save/load from a file or to/from a `Stream` (Both SharpVector and SharpVector.OpenAI)
- Add Chinese language/character support

Breaking Change:

- Refactor `IVocabularyStore` to be used within `MemoryDictionaryVectorStoreWithVocabulary`. This simplifies implementation of `MemoryVectorDatabaseBase`, and helps to enable data persistence capability.

Notes:

- The breaking change only applies if the base classes are being used. If the `BasicMemoryVectorDatabase` is being used, this will likely not break applications that depend on this library. However, in some instances where consuming code explicitly depends on `VectorTextResult` and its properties (without using `var`), minor code changes might be needed when migrating from previous versions of the library.

## v1.0.1 (2025-02-06)

- Upgrade to .NET 8 or higher

### v1.0.0 (2024-05-24)

Added:

- Simplify object model by combining Async and non-Async classes; `BasicMemoryVectorDatabase` now supports both synchronous and asynchronous operations.
- Refactored to remove unnecessary classes where the `Async` versions will work just fine.
- Improve async/await and multi-threading use

### v0.9.8-beta (2024-05-20)

Added:

- Added `Async` version of classes to support multi-threading
- Metadata is no longer required when calling `.AddText()` and `.AddTextAsync()`
- Refactor `IVectorSimilarityCalculator` to `IVectorComparer` and `CosineVectorSimilarityCalculatorAsync` to `CosineSimilarityVectorComparerAsync`
- Add new `EuclideanDistanceVectorComparerAsync`
- Fix `MemoryVectorDatabase` to no longer require the unused `TId` generic type
- Rename `VectorSimilarity` and `Similarity` properties to `VectorComparison`

### v0.9.5-beta (2024-05-18)

Added:

- Add `TextDataLoader` class to provide support for different methods of text chunking when loading documents into the vector database.

### v0.9.0-beta (2024-05-18)

Added:

- Introduced the `BasicMemoryVectorDatabase` class as the basic Vector Database implementation that uses a Bag of Words vectorization strategy, with Cosine similarity, a dictionary vocabulary store, and a basic text preprocessor.
- Add more C# Generics use, so the library is more customizable when used, and custom vector databases can be implemented if desired.
- Added `VectorTextResultItem.Similarity` so consuming code can inspect similarity of the Text in the vector search results.
- Update `.Search` method to support search result paging and threshold support for similarity comparison
- Add some basic Unit Tests

### v0.8.0-beta (2024-05-17)

Added:

- Initial release - let's do this!
48 changes: 0 additions & 48 deletions README.md
@@ -138,54 +138,6 @@ Here's a screenshot of the test console app running:

![](assets/build5nines-sharpvector-console-screenshot.jpg)

## Change Log

## v2.0.0 (In Progress)

Feature:
- Add data persistence capability

Breaking Change:
- Refactor `IVocabularyStore` to be used within `MemoryDictionaryVectorStoreWithVocabulary`. This simplifies implementation of `MemoryVectorDatabaseBase`, and helps to enable data persistence capability.

Notes:
- The breaking change only applies if the base classes are being used. If the `BasicMemoryVectorDatabase` is being used, this will likely not break applications that depend on this library. However, in some instances where consuming code explicitly depends on `VectorTextResult` and its properties (without using `var`), minor code changes might be needed when migrating from previous versions of the library.

## v1.0.1 (2025-02-06)

- Upgrade to .NET 8 or higher

### v1.0.0 (2024-05-24)

- Simplify object model by combining Async and non-Async classes; `BasicMemoryVectorDatabase` now supports both synchronous and asynchronous operations.
- Refactored to remove unnecessary classes where the `Async` versions will work just fine.
- Improve async/await and multi-threading use

### v0.9.8-beta (2024-05-20)

- Added `Async` version of classes to support multi-threading
- Metadata is no longer required when calling `.AddText()` and `.AddTextAsync()`
- Refactor `IVectorSimilarityCalculator` to `IVectorComparer` and `CosineVectorSimilarityCalculatorAsync` to `CosineSimilarityVectorComparerAsync`
- Add new `EuclideanDistanceVectorComparerAsync`
- Fix `MemoryVectorDatabase` to no longer require the unused `TId` generic type
- Rename `VectorSimilarity` and `Similarity` properties to `VectorComparison`

### v0.9.5-beta (2024-05-18)

- Add `TextDataLoader` class to provide support for different methods of text chunking when loading documents into the vector database.

### v0.9.0-beta (2024-05-18)

- Introduced the `BasicMemoryVectorDatabase` class as the basic Vector Database implementation that uses a Bag of Words vectorization strategy, with Cosine similarity, a dictionary vocabulary store, and a basic text preprocessor.
- Add more C# Generics use, so the library is more customizable when used, and custom vector databases can be implemented if desired.
- Added `VectorTextResultItem.Similarity` so consuming code can inspect similarity of the Text in the vector search results.
- Update `.Search` method to support search result paging and threshold support for similarity comparison
- Add some basic Unit Tests

### v0.8.0-beta (2024-05-17)

- Initial release - let's do this!

## Maintained By

The **Build5Nines SharpVector** project is maintained by [Chris Pietschmann](https://pietschsoft.com?utm_source=github&utm_medium=sharpvector), founder of [Build5Nines](https://build5nines.com?utm_source=github&utm_medium=sharpvector), Microsoft MVP, HashiCorp Ambassador, and Microsoft Certified Trainer (MCT).
5 changes: 3 additions & 2 deletions samples/genai-rag-onnx/Program.cs
@@ -162,19 +162,20 @@ static async Task Main(string[] args)
var generatorParams = new GeneratorParams(model);
generatorParams.SetSearchOption("max_length", maxPromptLength);
generatorParams.SetSearchOption("past_present_share_buffer", false);
generatorParams.SetInputSequences(tokens);
//generatorParams.SetInputSequences(tokens);

// Generate the response
Console.WriteLine("AI is thinking...");
var generator = new Generator(model, generatorParams);
generator.AppendTokenSequences(tokens);

// show in console that the assistant is responding
Console.WriteLine("");
Console.Write("Assistant: ");

// Output the response as each token is generated
while (!generator.IsDone()) {
generator.ComputeLogits();
//generator.ComputeLogits();
generator.GenerateNextToken();
var output = GetOutputTokens(generator, tokenizer);
Console.Write(output);
10 changes: 5 additions & 5 deletions samples/genai-rag-onnx/genai-rag-onnx.csproj
@@ -8,11 +8,11 @@
<Nullable>enable</Nullable>
</PropertyGroup>

<ItemGroup>
<PackageReference Include="Build5Nines.SharpVector" Version="1.0.0" />
<PackageReference Include="Microsoft.ML.OnnxRuntime" Version="1.17.3" />
<PackageReference Include="Microsoft.ML.OnnxRuntimeGenAI" Version="0.2.0-rc7" />
<PackageReference Include="Microsoft.ML.OnnxRuntimeGenAI.Cuda" Version="0.2.0-rc7" />
<ItemGroup>
<PackageReference Include="Build5Nines.SharpVector" Version="2.0.0" />
<PackageReference Include="Microsoft.ML.OnnxRuntime" Version="1.20.1" />
<PackageReference Include="Microsoft.ML.OnnxRuntimeGenAI" Version="0.6.0" />
<PackageReference Include="Microsoft.ML.OnnxRuntimeGenAI.Cuda" Version="0.6.0" />
</ItemGroup>

</Project>
Original file line number Diff line number Diff line change
@@ -24,7 +24,7 @@
</ItemGroup>

<ItemGroup>
<PackageReference Include="Build5Nines.SharpVector" Version="[1.0.0,2.0.0)" />
<PackageReference Include="Build5Nines.SharpVector" Version="[2.0.0,3.0.0)" />
<PackageReference Include="OpenAI" Version="2.1.0" />
</ItemGroup>
</Project>