
# HTML Archives

Self-extracting HTML archives that stream large files using the browser's sequential parsing.

## How it works

Files are base64-encoded and split across a series of `<script>` tags. As the browser parses each script, it decodes that chunk, accumulates the data, and automatically downloads each file once it is complete. Spent script tags are removed from the DOM to keep memory usage low.

The idea is that since the archive can be very large, chunks get evaluated incrementally as the HTML is parsed, instead of waiting for the whole document to load.
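The per-chunk decode-and-accumulate step can be sketched as follows. This is a hypothetical reconstruction, not the code in `src`: the browser-only parts (`atob`, `Blob` creation, the auto-download click, removing spent script tags) are replaced with Node equivalents and a callback, and all names are made up for illustration.

```javascript
// Each generated <script> tag would call one of these two functions.

// Per-file state: chunks accumulated for the file currently streaming in.
const state = { name: null, parts: [], expected: 0, received: 0 };

// A header <script> announces the next file before its chunks arrive.
function beginFile(name, byteLength) {
  state.name = name;
  state.parts = [];
  state.expected = byteLength;
  state.received = 0;
}

// Each chunk <script> delivers one base64 slice of the file.
function appendChunk(b64, onComplete) {
  const bytes = Buffer.from(b64, "base64"); // in the browser: atob()
  state.parts.push(bytes);
  state.received += bytes.length;
  if (state.received >= state.expected) {
    // In the browser: new Blob(parts), URL.createObjectURL, and a
    // programmatic click on an <a download> element instead.
    onComplete(state.name, Buffer.concat(state.parts));
  }
}

// Example: a two-chunk file, as consecutive script tags would deliver it.
beginFile("hello.txt", 11);
appendChunk(Buffer.from("hello ").toString("base64"), () => {});
appendChunk(Buffer.from("world").toString("base64"), (name, data) => {
  console.log(name, data.toString()); // hello.txt hello world
});
```

The header/chunk split matters because each `<script>` body must stay small enough for the parser to hand it to the JS engine promptly; the file boundary metadata cannot live inside the chunk payload itself.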

In practice this doesn't perform well: Chrome still buffers the whole download in memory. An earlier iteration (whose code I lost) used the Streams API (ReadableStream, WritableStream) to write the download out while chunks were still being parsed, but it made no difference; in fact, throughput got worse.

## Usage

```sh
cargo run archive.html file1.txt image.jpg document.pdf
# Open archive.html in browser, files auto-download
```

## Demo

```sh
./run-demo.sh 3 1024 # Creates GB-sized test files and archives them
```

## License

Public domain.