Modified module ziparchives to alternatively open zip files as byte strings. #90
nervecenter wants to merge 4 commits into guzba:master
…een read in as byte strings. This allows extracting from recursive archives in-memory.
…s to obtain reader pointer and length. Added wrapper procs to open an archive as either a file or a byte string. Removed module ziparchives_inmem.
|
Why not |
|
@quantimnot Primarily just because |
|
Hey, sorry for not getting back to this. Regarding the question from @quantimnot, ZipArchiveReader must have the actual backing data around for its lifetime in order to pull files out of it later, so an openarray parameter will just require me to do a … An important requirement of ZipArchiveReader is that it does not decompress all of the files into memory up front. There are good reasons for not doing that. Supporting a string is fine, but it will need to get stored. The parameter could be a sink parameter to potentially avoid a copy if it is not already. I don't mind the idea of a ptr + len version of this, where it is the lib user's job to keep the data somewhere so the pointer stays valid, but that's another thing. |
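To illustrate the point about storing the string and taking it as a sink parameter, here is a minimal sketch. The names `OwnedReader`, `openFromString`, and `byteAt` are hypothetical and are not part of zippy; the sketch only shows a reader that keeps its backing bytes alive for its own lifetime.

```nim
type
  OwnedReader = ref object  # hypothetical stand-in, not zippy's ZipArchiveReader
    data: string            # backing bytes, owned by the reader for its lifetime

proc openFromString(data: sink string): OwnedReader =
  ## With `sink`, the compiler is allowed to move `data` into the reader
  ## instead of copying it when the caller no longer needs the string.
  OwnedReader(data: data)

proc byteAt(r: OwnedReader, i: int): char =
  ## Reads from the stored bytes on demand, long after the open call returned.
  r.data[i]

var payload = "PK\x05\x06"  # signature of a zip end-of-central-directory record
let owned = openFromString(move(payload))
```

Because the reader owns the string, later reads such as `owned.byteAt(0)` never depend on the caller keeping anything alive, which is the difference from a ptr + len variant.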
|
Also, I do agree with @nervecenter that string is simply better than seq[byte] or whatever. Every API in Nim that actually deals with bytes takes string, so I gave up fighting this years ago. Just embrace string and never ever ever use seq[byte]; it's just a trap (there is no actually good easy conversion, casting is not actually safe, so it's a copyMem to convert, yay). |
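For reference, the copyMem conversion mentioned above might look roughly like this. `toString` and `toBytes` are hypothetical helper names used only for illustration; they are not taken from zippy or the standard library.

```nim
proc toString(bytes: openArray[byte]): string =
  ## Copies the bytes into a freshly allocated string.
  result = newString(bytes.len)
  if bytes.len > 0:
    copyMem(addr result[0], unsafeAddr bytes[0], bytes.len)

proc toBytes(s: string): seq[byte] =
  ## Copies the string's bytes into a freshly allocated seq[byte].
  result = newSeq[byte](s.len)
  if s.len > 0:
    copyMem(addr result[0], unsafeAddr s[0], s.len)

# "PK\x03\x04" is the signature that starts a zip local file header.
let raw = @[byte 0x50, 0x4B, 0x03, 0x04]
let asString = toString(raw)
let roundTrip = toBytes(asString)
```

Both directions allocate and copy, which is the cost being complained about; the length checks avoid taking the address of element 0 of an empty buffer.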
|
@guzba I'm fairly certain I'm only reading the archive into memory as-is, and the procs I added decompress individual contents of the archive one at a time on demand in memory. If there's something extra going on behind the scenes, I apologize if that's causing issues. It's worked quite well for me in my in-production project. |
The module `ziparchives` features a modified `ZipArchiveReader` object type. It now contains a `string`-based alternative to a `memfile`, and a new `ZipArchiveReaderMode` to determine which field to read from. There are two new utility procs, `getDataPtr()` and `getDataLen()`, which, depending on the mode of the input reader, get the cast data pointer or the length.

Most of `openZipArchive()` was moved into a new `openZipArchiveInternal()`, which contains all of the internal zip archive reading logic. The proc `openZipArchive*()` is now a wrapper for initializing a reader in `MemfileMode`. The proc `openZipArchiveBytes*()` is a wrapper for opening a zip archive in `StringMode`, and takes a string of bytes; the returned reader can treat those bytes as a .zip file, performing all operations in-memory.

Correspondingly, `extractAll*()` had an alternative spun out, `extractAllBytes*()`, which extracts a byte-string archive to the chosen directory. The common internal logic was given its own proc, `extractAllInternal()`, and the directory check was given its own proc, `checkExtractDestination()`.

There are also two extra files present at a higher level. The `inner_test.zip` archive has three internal .zip files, each containing three internal text files, where the filename is a number and the contents are that number's corresponding whole English word. This is a test artifact for `test_ziparchives_inmem.nim`, which can be run from its own directory with `nim r test_ziparchives_inmem.nim`. This test extracts all the text files flatly to the working directory. There is an alternative that extracts each inner archive to its own directory using `extractAllBytes*()`. Note that running this may conflict with any nimble-installed versions of zippy, so it should be run in isolation.
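As a rough illustration of the mode-based dispatch described for `getDataPtr()` and `getDataLen()`, here is a minimal sketch. It is not the code from this PR: the type `SketchReader`, its field names, and the return type of `getDataPtr` are assumptions made for the example, and only the mode names and proc names come from the description above.

```nim
import std/memfiles

type
  ReaderMode = enum           # stands in for ZipArchiveReaderMode
    MemfileMode, StringMode

  SketchReader = ref object   # hypothetical stand-in for ZipArchiveReader
    mode: ReaderMode
    memFile: MemFile          # read when mode == MemfileMode (field name assumed)
    data: string              # read when mode == StringMode (field name assumed)

proc getDataLen(r: SketchReader): int =
  ## Length of the backing data, whichever field holds it.
  case r.mode
  of MemfileMode: r.memFile.size
  of StringMode: r.data.len

proc getDataPtr(r: SketchReader): ptr UncheckedArray[uint8] =
  ## Pointer to the first byte of the backing data; nil for an empty string.
  case r.mode
  of MemfileMode:
    result = cast[ptr UncheckedArray[uint8]](r.memFile.mem)
  of StringMode:
    if r.data.len > 0:
      result = cast[ptr UncheckedArray[uint8]](addr r.data[0])

# Only StringMode is exercised here so the sketch runs without a file on disk.
let sketch = SketchReader(mode: StringMode, data: "PK\x05\x06")
let firstBytes = sketch.getDataPtr()
```

With the two accessors in place, shared parsing logic can work purely in terms of a pointer and a length, without caring whether the bytes come from a memory-mapped file or from a string.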