Comparing hashes of data to prevent saving the same data (1): hashing data file #7

_Team members: Jayesh, Kajoyrie, Jason
Sprint 4: 6/19-6/26_

**Overall Goal:** 
When we download data within the archiver, we key and hash the data, then check the registry to see whether the new hash differs from the hash of the last version of the data.

**What does success look like?**
- We first want a function in `archiver.js` that takes a data file and a registry id and returns true if the two hashes described here are the same. The function applies an MD5 hash to the data file, searches the registry for the data assigned to that registry id, finds the stored data hash (which we will later implement to be stored in the registry too), and compares the two hashes.
- If the data hash or registry id does not exist in the registry, return false.
- We will then want to store the data hash in the registry as well whenever we archive data.
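
The comparison described above could be sketched roughly as follows. This is a minimal sketch, not the actual `archiver.js` code: the function names `md5Hex` and `isSameAsLastVersion`, the `dataHash` field, and the registry being a plain object keyed by registry id are all assumptions made for illustration, and the data is passed in as an already-read `Buffer` rather than a file path.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: MD5 of a Buffer (or string), as a hex string.
function md5Hex(data) {
  return crypto.createHash('md5').update(data).digest('hex');
}

// Hypothetical registry shape (an assumption, not the real registry):
//   { [registryId]: { dataHash: '<md5 hex string>' } }
// Returns true only when the registry id exists, has a stored data hash,
// and that hash equals the MD5 hash of the new data.
function isSameAsLastVersion(data, registryId, registry) {
  const entry = registry[registryId];
  if (!entry || !entry.dataHash) {
    return false; // unknown registry id, or no hash stored yet
  }
  return md5Hex(data) === entry.dataHash;
}
```

Under these assumptions, the archiver would skip saving when `isSameAsLastVersion(...)` returns true, and otherwise save the data and write the new hash into the registry entry.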

**Comments:**
- Ideally, we will compute the hash when we download the data initially so that we do not have to read the data twice.
- Hashing: we can use MD5 and compute the hash incrementally while reading the contents of the file, instead of loading the whole file into memory.
    - Good resource to look at when starting: archiving the file name in Ethan's archive demo
    - It will be interesting to see whether the hash is the same when we read the file as bytes vs. as text.
- We want to only do this for data files because the about info files would change very frequently without much benefit for us.
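
The incremental hashing idea and the bytes vs. text question from the comments above can be tried out with a small sketch like the one below. The function names `md5OfFileInChunks` and `hashesMatchAsBytesAndText`, the 64 KiB default chunk size, and the choice of UTF-8 as the text encoding are assumptions for illustration only. It reads the file synchronously in fixed-size chunks; in the archiver itself the same `hash.update(chunk)` call could instead be fed from the download stream.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');

// Hypothetical helper: hash a file chunk by chunk so the whole file
// never has to be held in memory at once.
function md5OfFileInChunks(filePath, chunkSize = 64 * 1024) {
  const hash = crypto.createHash('md5');
  const buffer = Buffer.alloc(chunkSize);
  const fd = fs.openSync(filePath, 'r');
  try {
    let bytesRead;
    // position = null means "continue from the current file position"
    while ((bytesRead = fs.readSync(fd, buffer, 0, chunkSize, null)) > 0) {
      hash.update(buffer.subarray(0, bytesRead));
    }
  } finally {
    fs.closeSync(fd);
  }
  return hash.digest('hex');
}

// Hypothetical experiment: does hashing the raw bytes give the same result
// as decoding them to a UTF-8 string and hashing that string?
function hashesMatchAsBytesAndText(bytes) {
  const fromBytes = crypto.createHash('md5').update(bytes).digest('hex');
  const text = bytes.toString('utf8');
  const fromText = crypto.createHash('md5').update(text, 'utf8').digest('hex');
  return fromBytes === fromText;
}
```

With this sketch, the two hashes agree when the bytes are valid UTF-8, but can differ when they are not, because invalid sequences are replaced during decoding. Hashing the raw bytes avoids that dependence on encoding.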
