Description
Hello, I love this project!!! I was reviewing the verification mechanisms for WACZ files at https://specs.webrecorder.net/wacz-auth/0.1.0/#proof-of-authenticity and noticed some peculiar wording in the spec's requirements definition (emphasis mine):
Proving web archive authenticity can be difficult. Ideally, proof of authenticity could guarantee that any web server served a particular URL at a particular point in time. Unfortunately, this is not currently possible with existing web standards, as even TLS does not provide "non-repudiation".
The rest of the document goes on to describe the difficulty of verifying timestamps (and its great mechanism for addressing that), which I understand is necessary. However, if I created a WACZ archive of some pages, is it currently possible for the website operator to simply claim that the WACZ was falsified when it was first created (e.g. that the archive was captured from a fake site using the same domain)? Even if the HTTPS certificate doesn't verify timestamps, it seems extremely useful to be able to say "this website definitely served this content at some point" by incorporating the HTTPS certificate, and then to rely on the rest of the authenticity spec for additional confirmation of the archive's contents.
Do I understand this problem correctly, or does WACZ already incorporate the HTTPS certificate at the time of archive creation in a way that verifies the content was actually downloaded from the remote server? It seems like the wacz-auth spec tries very hard to solve the more specific problem of timestamping and may have missed the opportunity to add this extra layer of verification, but I'm not sure.