streaming_zip

This is a Deno library for streaming encoding and decoding of zip files. Streaming encoding and decoding is useful when you don't have random access to read or write a zip file. This can be the case if you want to decode a zip file while it's still being downloaded, or if you want to start sending a zip file while it's still being generated, with minimal buffering and latency.

This library supports reading and writing zip files with the zip64 and extended timestamps extensions.

Limitations

The cases where this library is useful are expected to be limited. This library is mostly released for educational purposes.

This library is not the right choice for reading zip files from disk (where random access is available) unless you intend to read all of the files in order.

This library is not the right choice for writing zip files to disk unless the files are uncompressed or pre-compressed and you know all of their compressed and uncompressed sizes and their CRC checksums ahead of time. The main realistic case for this is expected to be transforming a pre-existing zip file (a sketch of that case is included at the end of the Usage section below).

Zip files are generally not conducive to being created as a stream, because the size of each (optionally compressed) item and its CRC checksum must be known before the item can be written into the zip file. (The zip file format does have an option to put sizes after the file data to better allow zip files to be encoded as a stream, but zip files using that option are not decodable as a stream, so this library does not support it.)
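
If you do want to stream-write dynamically generated content despite this, one workaround (at the cost of buffering everything in memory) is to collect each file's bytes first, compute its size and CRC, and only then yield the entry. The following is a minimal sketch of that approach, not part of this library's API: it assumes the write() function and WriteEntry type shown under Usage below, uses a third-party crc32 module, and the bufferedFileEntry helper is purely illustrative.

import { write, type WriteEntry } from "https://deno.land/x/streaming_zip@v1.0.0/write.ts";
import { readableStreamFromIterable } from "https://deno.land/std@0.135.0/streams/conversion.ts";
import { Buffer } from "https://deno.land/std@0.135.0/streams/buffer.ts";
import { crc32 } from "https://deno.land/x/crc32@v0.2.2/mod.ts";

// Illustrative helper: buffer a whole stream in memory so that the size and CRC
// required by write() can be computed before the entry is yielded.
async function bufferedFileEntry(
  name: string,
  contents: ReadableStream<Uint8Array>,
): Promise<WriteEntry> {
  const buffer = new Buffer();
  await contents.pipeTo(buffer.writable);
  const bytes = buffer.bytes();
  return {
    type: "file",
    name,
    body: {
      stream: readableStreamFromIterable([bytes]),
      originalSize: bytes.byteLength,
      originalCrc: parseInt(crc32(bytes), 16),
    },
  };
}

Buffering like this defeats much of the point of streaming for large files, which is why knowing sizes and checksums ahead of time is listed as a requirement above.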

Usage

read()

import { read } from "https://deno.land/x/streaming_zip@v1.0.0/read.ts";
import { Buffer } from "https://deno.land/std@0.135.0/streams/buffer.ts";

const res = await fetch("https://example.com/somefile.zip");
for await (const entry of read(res.body!)) {
  // Each entry object is of ReadEntry type:
  /*
  export type ReadEntry = {
    type: "file";
    name: string;
    extendedTimestamps?: ExtendedTimestamps;
    originalSize: number;
    compressedSize: number;
    crc: number;
    body: OptionalStream;
  } | {
    type: "directory";
    name: string;
    extendedTimestamps?: ExtendedTimestamps;
  };
  */

  if (entry.type === "file") {
    if (entry.name.endsWith(".txt")) {
      const buffer = new Buffer();
      await entry.body.stream().pipeTo(buffer.writable);
      const contents = new TextDecoder().decode(buffer.bytes());
      console.log(`contents of ${entry.name}: ${contents}`);
    } else {
      console.log(`ignoring non-txt file ${entry.name}`);
      // Every file entry must either have entry.body.stream() or entry.body.autodrain() called.
      entry.body.autodrain();
    }
  } else {
    console.log(`directory found: ${entry.name}`);
  }
}

write()

import { write, type WriteEntry } from "https://deno.land/x/streaming_zip@v1.0.0/write.ts";
import { readableStreamFromIterable } from "https://deno.land/std@0.135.0/streams/conversion.ts";
import { crc32 } from "https://deno.land/x/crc32@v0.2.2/mod.ts";
import { Buffer } from "https://deno.land/std@0.135.0/streams/buffer.ts";

async function* entryGenerator(): AsyncGenerator<WriteEntry> {
  yield {
    type: "directory",
    name: "some-subdir/",
    extendedTimestamps: {
      modifyTime: new Date("2022-03-28T05:37:04.000Z"),
    },
  };
  const fileBuf = new TextEncoder().encode("Text file contents here.\n");
  yield {
    type: "file",
    name: "fortune.txt",
    extendedTimestamps: {
      modifyTime: new Date("2022-03-29T05:37:04.000Z"),
    },
    body: {
      stream: readableStreamFromIterable([fileBuf]),
      originalSize: fileBuf.byteLength,
      originalCrc: parseInt(crc32(fileBuf), 16),
    },
  };
}

const stream = write(entryGenerator());
const buffer = new Buffer();
await stream.pipeTo(buffer.writable);
console.log(buffer.bytes());

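As noted under Limitations, the main realistic case for streaming writes is transforming a pre-existing zip file, which can be done by piping read() entries into write(). The following is a rough sketch, not a documented recipe: it assumes that an entry's body stream yields the decompressed bytes (so the original size and CRC from the read entry can be reused) and that write() fully consumes each body before pulling the next entry; the URL and the .log filter are only illustrative.

import { read } from "https://deno.land/x/streaming_zip@v1.0.0/read.ts";
import { write, type WriteEntry } from "https://deno.land/x/streaming_zip@v1.0.0/write.ts";
import { Buffer } from "https://deno.land/std@0.135.0/streams/buffer.ts";

// Copy a zip stream entry by entry, dropping ".log" files along the way.
async function* transformEntries(): AsyncGenerator<WriteEntry> {
  const res = await fetch("https://example.com/somefile.zip");
  for await (const entry of read(res.body!)) {
    if (entry.type === "directory") {
      yield {
        type: "directory",
        name: entry.name,
        extendedTimestamps: entry.extendedTimestamps,
      };
      continue;
    }
    if (entry.name.endsWith(".log")) {
      // Skipped file bodies must still be drained.
      entry.body.autodrain();
      continue;
    }
    yield {
      type: "file",
      name: entry.name,
      extendedTimestamps: entry.extendedTimestamps,
      body: {
        // Assumption: entry.body.stream() yields the decompressed bytes, so the
        // size and CRC recorded in the source zip can be reused as-is.
        stream: entry.body.stream(),
        originalSize: entry.originalSize,
        originalCrc: entry.crc,
      },
    };
  }
}

const buffer = new Buffer();
await write(transformEntries()).pipeTo(buffer.writable);
// buffer.bytes() now contains the transformed zip file.
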
Documentation

This library is published on deno.land at https://deno.land/x/streaming_zip, and generated documentation is available at https://doc.deno.land/https://deno.land/x/streaming_zip/mod.ts.