pyrun-cloud/s3vectors-pipeline

S3 VECTORS PIPELINE

This pipeline uses S3 Vectors to store and query vector representations of data. It indexes vectors in an S3 vector bucket and runs similarity searches against them. It also creates the S3 vector bucket and the other resources the process requires.

STEPS TO EXECUTE PIPELINE

Use of a custom Dataset

If you use your own dataset, set all the required parameters in the "Initializations" section.
Make sure your files (dataset, queries, and true_neighbors) follow the format described in the notebook.

1 - Upload CSV Files

Extract files.zip and upload the CSV files it contains to an S3 bucket of your choice.
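
The upload can also be scripted. The following is a minimal sketch, not part of the notebook: `upload_csvs` and its `prefix` parameter are hypothetical names, and the client is assumed to be a standard boto3 S3 client (`boto3.client("s3")`), of which only `put_object` is used.

```python
import zipfile
from pathlib import Path


def upload_csvs(s3_client, zip_path, bucket, prefix=""):
    """Upload every CSV inside the archive to `bucket`; return the object keys.

    `s3_client` is assumed to behave like boto3.client("s3").
    `zip_path` may be a path or a file-like object.
    """
    uploaded = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            if not member.lower().endswith(".csv"):
                continue  # skip folders and non-CSV files
            # Drop any folder structure from the archive; keep only the file name.
            key = prefix + Path(member).name
            s3_client.put_object(Bucket=bucket, Key=key, Body=archive.read(member))
            uploaded.append(key)
    return uploaded
```

With boto3 installed and AWS credentials configured, a call such as `upload_csvs(boto3.client("s3"), "files.zip", "my-bucket")` would upload the files; the bucket name here is a placeholder.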


2 - Run Initializations

Execute the "Initializations" section to:

  • Import the necessary packages
  • Create the required S3 vector resources

Note:

If the resources already exist, creating them again fails with an error.
Run "Clean environment" first to delete the existing resources.
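
As a rough illustration of what creating the resources may look like outside the notebook, the sketch below assumes a boto3 `s3vectors` client and its `create_vector_bucket` and `create_index` operations. The function name and the default distance metric are assumptions; the notebook may use different settings.

```python
def create_vector_resources(client, bucket_name, index_name, dimension,
                            distance_metric="cosine"):
    """Create a vector bucket and one index inside it.

    `client` is assumed to behave like boto3.client("s3vectors").
    Both calls are expected to raise if the resource already exists,
    which matches the note above.
    """
    client.create_vector_bucket(vectorBucketName=bucket_name)
    client.create_index(
        vectorBucketName=bucket_name,
        indexName=index_name,
        dataType="float32",              # assumed element type of the vectors
        dimension=dimension,             # must match the dataset's vector length
        distanceMetric=distance_metric,  # assumed default; adjust to the dataset
    )
```

The bucket is created first because the index lives inside it; the reverse order would fail.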


3 - Run Vectors Indexing

Execute the "Vectors Indexing" section to insert vectors into S3 Vectors.

You have two options:

  • Insert the entire dataset at once
  • Insert vectors one by one

Important:

Specify the name of the S3 bucket that holds the CSV files (the bucket used in step 1).
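
The two insertion options differ only in how many vectors go into each request. The sketch below is an illustration under assumptions, not the notebook's code: it assumes a boto3 `s3vectors` client with a `put_vectors` operation, and the default batch size of 500 is assumed to match the per-request limit, which should be checked against the service documentation. Setting `batch_size=1` corresponds to inserting vectors one by one.

```python
def batched(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def index_vectors(client, bucket_name, index_name, keys, embeddings,
                  batch_size=500):
    """Insert vectors in batches; return the number of requests sent.

    `client` is assumed to behave like boto3.client("s3vectors").
    """
    records = [
        {"key": str(key), "data": {"float32": [float(x) for x in vector]}}
        for key, vector in zip(keys, embeddings)
    ]
    requests_sent = 0
    for batch in batched(records, batch_size):
        client.put_vectors(
            vectorBucketName=bucket_name,
            indexName=index_name,
            vectors=batch,
        )
        requests_sent += 1
    return requests_sent
```

Larger batches mean fewer round trips, which is why inserting the whole dataset at once is usually much faster than inserting one vector per request.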


4 - Run Querying

Execute the "Querying" section to perform queries on the indexed dataset.
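
For reference, a single similarity query might look like the sketch below. It assumes a boto3 `s3vectors` client and its `query_vectors` operation; the function name and the default `top_k` are hypothetical.

```python
def query_neighbors(client, bucket_name, index_name, query_vector, top_k=10):
    """Return the nearest neighbors of `query_vector` as (key, distance) pairs.

    `client` is assumed to behave like boto3.client("s3vectors").
    """
    response = client.query_vectors(
        vectorBucketName=bucket_name,
        indexName=index_name,
        queryVector={"float32": [float(x) for x in query_vector]},
        topK=top_k,
        returnDistance=True,  # ask the service to include distances
    )
    return [(item["key"], item.get("distance")) for item in response["vectors"]]
```

The query vector must have the same dimension as the vectors stored in the index.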


5 - Run Query Recall

Execute the "Query Recall" section to calculate the recall of the queries, that is, the fraction of the true neighbors (from true_neighbors) that each query returned.
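
The metric can be sketched in a few lines. The version below uses the common recall@k definition (size of the overlap between the returned keys and the true neighbors, divided by the number of true neighbors considered); the notebook's exact formula may differ, and the function names are hypothetical.

```python
def recall_at_k(retrieved, true_neighbors, k):
    """Fraction of the first k true neighbors found among the first k results."""
    if k <= 0:
        raise ValueError("k must be positive")
    found = set(retrieved[:k])
    truth = set(true_neighbors[:k])
    if not truth:
        return 0.0
    return len(found & truth) / len(truth)


def mean_recall(all_retrieved, all_true_neighbors, k):
    """Average recall@k over all queries."""
    scores = [
        recall_at_k(retrieved, truth, k)
        for retrieved, truth in zip(all_retrieved, all_true_neighbors)
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

For example, if a query returns `["a", "b", "c"]` and the true neighbors are `["a", "c", "d"]`, two of the three true neighbors were found, so recall@3 is 2/3.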


6 - (Optional) Get Vectors

Execute "Get Vectors" to retrieve a specific vector from the dataset.
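
A lookup by key might look like the following sketch, which assumes a boto3 `s3vectors` client and its `get_vectors` operation; the function name is hypothetical.

```python
def get_vector(client, bucket_name, index_name, key):
    """Fetch one vector by key; return its record, or None if it is not found.

    `client` is assumed to behave like boto3.client("s3vectors").
    """
    response = client.get_vectors(
        vectorBucketName=bucket_name,
        indexName=index_name,
        keys=[str(key)],
        returnData=True,      # include the vector values
        returnMetadata=True,  # include any stored metadata
    )
    vectors = response.get("vectors", [])
    return vectors[0] if vectors else None
```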


7 - (Optional) Clean Environment

Execute "Clean environment" to delete both local and S3 vector resources.
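
A cleanup along these lines is sketched below. It assumes a boto3 `s3vectors` client with `delete_index` and `delete_vector_bucket` operations; the function name and the `local_paths` parameter are hypothetical, and which local files the notebook actually removes is not specified here.

```python
from pathlib import Path


def clean_environment(client, bucket_name, index_name, local_paths=()):
    """Delete the index, then the vector bucket, then any listed local files.

    `client` is assumed to behave like boto3.client("s3vectors").
    Returns the local paths that were actually removed.
    """
    # The index is deleted first because it lives inside the bucket.
    client.delete_index(vectorBucketName=bucket_name, indexName=index_name)
    client.delete_vector_bucket(vectorBucketName=bucket_name)

    removed = []
    for path in map(Path, local_paths):
        if path.is_file():
            path.unlink()
            removed.append(str(path))
    return removed
```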


Recommended order:
Initializations → Vectors Indexing → Querying → Query Recall
(Optional: Get Vectors, Clean environment)
