brotli-rs: an implementation of Brotli decompression in Rust

Brotli-rs - Brotli decompression in pure, safe Rust

Works on stable, beta, and nightly Rust.

Documentation

Brotli-rs provides a struct implementing Read that wraps a Brotli-compressed stream. A consumer is thus able to read the decompressed stream with the usual convenience.
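
A minimal usage sketch, assuming a compressed file at data/monkey.compressed (the file used by the reference benchmark below); Decompressor::new accepts any Read, as described in the changelog:

    extern crate brotli;

    use std::fs::File;
    use std::io::Read;
    use brotli::Decompressor;

    fn main() {
        // Wrap any Read source that yields a Brotli-compressed stream.
        let compressed = File::open("data/monkey.compressed").unwrap();
        let mut decompressed = Vec::new();
        Decompressor::new(compressed)
            .read_to_end(&mut decompressed)
            .expect("failed to decompress stream");
        println!("decompressed {} bytes", decompressed.len());
    }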

Changelog

v0.3.22 -> v0.3.23


Fixed a bug where the literal buffer was not being populated when processing an uncompressed metablock, and a bug where a valid stream could be rejected too early as oversized if its last bytes happened to be shortened by an OMIT-type transformation after the early check. Both reported and fixed by Daniel.

v0.3.21 -> v0.3.22


Bug with metablock structure not getting fully reset when encountering a new metablock in the brotli stream, reported and fixed by Daniel.

v0.3.20 -> v0.3.21


Bug with multiple metablocks, reported and fixed by Daniel.

v0.3.19 -> v0.3.20


Worked around a feature-gate issue on nightly. (Thanks, Corey!)

v0.3.18 -> v0.3.19


Removed the 64k pseudo-code lookup table, because creating this trivial table probably took more time than doing the calculation on the fly. (Bench tests seem to suggest a ~1% time saving without the lookup table.)

v0.3.17 -> v0.3.18


Fixed case where a simple prefix code could have duplicate symbols.

v0.3.16 -> v0.3.17


Fixed case where a complex prefix code could have an incorrect checksum on its runlength code.

v0.3.15 -> v0.3.16


  • Fixed incorrect calculation of the alphabet size for the distance code.
  • Fixed an evaluation where streams with excessive insert/copy lengths could be rejected too early.

v0.3.14 -> v0.3.15


Fixed injection of invalid symbols in simple prefix code.

v0.3.13 -> v0.3.14


Fixed invalid block-type in switch command. (Thanks, Corey!).

v0.3.12 -> v0.3.13


Fixed uncaught non-positive distances. (Thanks, Corey!).

v0.3.11 -> v0.3.12


Fixed uncaught zero-byte in word transformation. (Thanks, Corey!).

v0.3.10 -> v0.3.11


Fixed possible arithmetic overflow in word transformation. (Thanks, Corey!).

v0.3.9 -> v0.3.10


Fixed incorrect type for runlength code. (Thanks, Corey!).

v0.3.8 -> v0.3.9


Fixed incorrect array index bound check in tree lookup. (Thanks, Corey!).

v0.3.7 -> v0.3.8


Fixed some value range checks on block types and ntree*. (Thanks, Corey!).

v0.3.6 -> v0.3.7


Went over "unreachable!()" statements, analyzed, and handled error condition properly, if they were reachable through invalid data.

v0.3.5 -> v0.3.6


Fixed a case where an invalid prefix code with all-zero codelengths could create an index-out-of-bounds panic. (Thanks, Corey!).

v0.3.4 -> v0.3.5


Fixed a case where an invalid insert-and-copy-length-code would produce a panic. (Thanks, Corey!).

v0.3.1 -> v0.3.4


A fair amount of small internal improvements to code quality. Fixed a couple of cases where invalid streams would lead to panics and/or infinite loops. (Thanks, Corey!)

v0.3.0 -> v0.3.1


This is only a minor version bump, with no breakage in usage, but it's exciting nonetheless!

In Brotli, a lot of work is done with and by prefix codes. A change in the internal representation of prefix codes made it possible to speed up the reference benchmark by a factor of ~7; a rough sketch of the array-based idea follows the benchmark numbers below. The benchmark decompresses the contents of the file data/monkey.compressed.

  • With linked-list-based, recursive tree implementation:
    test bench_monkey ... bench: 866,888 ns/iter (+/- 58,119)

  • With array-based, iterative tree implementation, before max-depth constraint:
    test bench_monkey ... bench: 704,282 ns/iter (+/- 220,068)

  • With array-based, iterative tree implementation, with max-depth constraint:
    test bench_monkey ... bench: 120,745 ns/iter (+/- 16,627)
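
For illustration, here is a rough sketch of the array-based idea (the names and layout here are assumptions for exposition, not the crate's actual code): the tree lives in a flat vector, the children of node i sit at 2*i + 1 and 2*i + 2, and decoding walks the vector one bit at a time instead of chasing heap-allocated child pointers.

    // Illustrative sketch only; not the crate's actual implementation.
    enum Node {
        Empty,        // unreachable slot in the flattened tree
        Leaf(u16),    // decoded symbol
        Inner,        // children live at 2*i + 1 and 2*i + 2
    }

    struct PrefixCode {
        nodes: Vec<Node>, // complete binary tree, flattened into an array
    }

    impl PrefixCode {
        // Walk the flattened tree one bit at a time until a leaf is reached.
        fn lookup<F: FnMut() -> bool>(&self, mut next_bit: F) -> Option<u16> {
            let mut i = 0;
            loop {
                match self.nodes.get(i)? {
                    Node::Leaf(symbol) => return Some(*symbol),
                    Node::Inner => {
                        // A 0-bit selects the left child, a 1-bit the right child.
                        i = 2 * i + 1 + if next_bit() { 1 } else { 0 };
                    }
                    Node::Empty => return None, // invalid code
                }
            }
        }
    }

Bounding the maximum code depth keeps the flattened vector small, which is presumably where the additional speed-up from the max-depth constraint comes from.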

v0.2.0 -> v0.3.0


  • renamed crate compression -> brotli
  • restructured modules to avoid redundant paths like brotli::brotli::Decompressor (now it's just brotli::Decompressor)

v0.1.0 -> v0.2.0


  • Decompressor::new() now accepts a Read, as opposed to a BitReader.

Comments

  • Performance 10x worse than C implementation

    Mar 11, 2016

    I've been timing the C implementation versus the Rust implementation, and I generally notice about a factor of 8-10x difference.

    Do you know offhand any obvious performance tradeoffs that were made in the design of this version?

    Do you have any ideas about various strategies we could employ to bring it within a factor of two, or ideally to the same speed as the C version especially for multi-megabyte files?

    I noticed no inline assembly in the C version, so I'm hoping it is possible to bring the rust version to parity.

    Have you done any profiling of the existing code or compared it with the C code?

  • Fix #21 By removing redundant (incorrect) early check about copy length

    Mar 10, 2016

    This checks the insert length, but waits until the dictionary transforms have been performed to do the copy-length check against the metablock length.

  • Uncompressed data is not fed into the literal_buf

    Mar 11, 2016

    Brotli expects the literal_buf to be prepopulated with the uncompressed data, if such data is encountered before something needing context bytes.

    Currently the literal_buf is left as [0, 0] even though an uncompressed meta-block has been encountered.

  • Doesn't build on Rust nightly

    Dec 5, 2015

       Compiling brotli v0.3.19 (file:///Users/coreyf/Development/rust/brotli-rs)
    src/lib.rs:686:18: 686:33 error: const indexing is an unstable feature
    src/lib.rs:686              sum += 32 >> code_lengths[i];
                                             ^~~~~~~~~~~~~~~
    src/lib.rs:686:18: 686:33 help: in Nightly builds, add `#![feature(const_indexing)]` to the crate attributes to enable
    src/lib.rs:1512:10: 1512:60 error: const indexing is an unstable feature
    src/lib.rs:1512                 1 << BROTLI_DICTIONARY_SIZE_BITS_BY_LENGTH[copy_length]
                                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    src/lib.rs:1512:10: 1512:60 help: in Nightly builds, add `#![feature(const_indexing)]` to the crate attributes to enable
    src/lib.rs:1518:34: 1518:84 error: const indexing is an unstable feature
    src/lib.rs:1518             let transform_id = word_id >> BROTLI_DICTIONARY_SIZE_BITS_BY_LENGTH[copy_length];
                                                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    src/lib.rs:1518:34: 1518:84 help: in Nightly builds, add `#![feature(const_indexing)]` to the crate attributes to enable
    error: aborting due to 3 previous errors
    Could not compile `brotli`.
    
    To learn more, run the command again with --verbose.
    
    [email protected] ~/D/r/brotli-rs (master) [101]> rustc -Vv
    rustc 1.6.0-nightly (d49e36552 2015-12-05)
    binary: rustc
    commit-hash: d49e365528e026df6f56fe5eb001e81e2383fbf5
    commit-date: 2015-12-05
    host: x86_64-apple-darwin
    release: 1.6.0-nightly
    
  • Access to the 'brotli' crates.io name

    Feb 7, 2016

    https://crates.io/crates/brotli

    Servo has ownership of this name right now, but we can grant you access if you want. You just need to log in to (and by doing so create) your crates.io account.

  • Panic when decompressing

    Oct 25, 2015

    extern crate brotli;
    
    use std::io::{self, Read};
    use brotli::Decompressor;
    
    fn main() {
        let mut input = vec![];
        let _ = Decompressor::new(&b"\x1b\x3f\xff\xff\xdb\x4f\xe2\x99\x80\x12".to_vec() as &[u8]).read_to_end(&mut input);
    }
    

    Crash discovered using afl.rs

  • zip file compressed with brotli doesn't roundtrip

    Mar 10, 2016

    https://www.dropbox.com/s/ubfjbfe0oowfvtl/svg.zip?dl=0, when compressed with brotli, becomes https://www.dropbox.com/s/9ifkxzaxblfhz50/svg.zip.compressed?dl=0,

    and it fails to roundtrip with the following message:

    "data/svg.zip.compressed":
    output length = 65703
    res = Err(Error { repr: Custom(Custom { kind: InvalidData, error: StringError("More uncompressed bytes than expected in meta-block") }) })
    ===========
    

    However, the copy and insert lengths appear to be the same as with the Google-provided decompressor: https://www.dropbox.com/s/gc6rjqtca4mi4cx/svg.zip.cerrlog.txt?dl=0

    brotli's log:

    https://www.dropbox.com/s/vptya6rxmks5f97/svg.zip.errlog.txt?dl=0

    Here's the tail end of the log from the brotli-rs binary

    Insert And Copy Length = 137
    (m_len, self.meta_block.count_output, insert_length, copy_length) = (87897, 87883, 1, 3)
    btype = 0
    [p1, p2] = RingBuffer { buf: [0, 0], pos: 0, cap: 2 }
    Context Mode = 3
    Lit = 0 0
    Count output 87884
    Count output 87887
    Insert And Copy Length = 182
    (m_len, self.meta_block.count_output, insert_length, copy_length) = (87897, 87887, 6, 8)
    Fatal (m_len, a, b, c) = (87897, 87887, 87893, 87901)
    output length = 65703
    res = Err(Error { repr: Custom(Custom { kind: InvalidData, error: StringError("More uncompressed bytes than expected in meta-block") }) })
    ===========
    
    OK ()
    