Rust-Rav1e v0.3.5: rav1e — The fastest and safest AV1 encoder.

Latest Release: v0.3.5


The fastest and safest AV1 encoder.

Overview

rav1e is an AV1 video encoder. It is designed to eventually cover all use cases, though in its current form it is most suitable for cases where libaom (the reference encoder) is too slow.

Features

  • Intra, inter, and switch frames
  • 64x64 superblocks
  • 4x4 to 64x64 RDO-selected square and 2:1/1:2 rectangular blocks
  • DC, H, V, Paeth, smooth, and all directional prediction modes
  • DCT, (FLIP-)ADST and identity transforms (up to 64x64, 16x16 and 32x32 respectively)
  • 8-, 10- and 12-bit depth color
  • 4:2:0 (full support), 4:2:2 and 4:4:4 (limited) chroma sampling
  • 11 speed settings (0-10)
  • Near real-time encoding at high speed levels
  • Constant quantizer and target bitrate (single- and multi-pass) encoding modes
  • Still picture mode

Releases

For the foreseeable future, a weekly pre-release of rav1e will be published every Tuesday.

Windows builds

Automated AppVeyor builds can be found here. Click on a build (it is recommended you select a build based on "master"), then click ARTIFACTS to reveal the rav1e.exe download link.

Building

NASM

Some x86_64-specific optimizations require a recent version of NASM and are enabled by default.

Install nasm

Ubuntu 20.04

sudo apt install nasm

Ubuntu 18.04

sudo apt install nasm-mozilla
# link nasm into $PATH
sudo ln /usr/lib/nasm-mozilla/bin/nasm /usr/local/bin/

Fedora 31, 32

sudo dnf install nasm

Windows

Have a NASM binary in your system PATH.

Release binary

To build a release binary in target/release/rav1e, run:

cargo build --release

Unstable features

Experimental API and Features can be enabled by using the unstable feature.

cargo build --features unstable

These features and APIs are bound to change and evolve. Do not rely on them staying the same across releases.

Target-specific builds

The Rust autovectorizer can produce a binary that is about 6%-7% faster if it can use AVX2 in the general code. You may allow it by issuing:

RUSTFLAGS="-C target-cpu=native" cargo build --release

or

RUSTFLAGS="-C target-feature=+avx2,+fma" cargo build --release

The resulting binary will not work on CPUs that lack any of the SIMD extensions enabled at build time.
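
Before deploying such a build, it can help to check the target machine from a small, separately compiled program. The following is a minimal sketch using the standard library's feature-detection macro; the function name host_supports_avx2_fma is hypothetical and not part of rav1e.

```rust
/// Reports whether the CPU this program runs on advertises both AVX2 and FMA.
/// Hypothetical helper for illustration only; it is not part of rav1e.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn host_supports_avx2_fma() -> bool {
    std::arch::is_x86_feature_detected!("avx2")
        && std::arch::is_x86_feature_detected!("fma")
}

/// On other architectures these x86 extensions never exist.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn host_supports_avx2_fma() -> bool {
    false
}
```

The checker itself should be built without the extra target features, otherwise it may fault on the very machines it is meant to screen.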

Building the C-API

rav1e provides a C-compatible library, header, and pkg-config file.

To build and install it you can use cargo-c:

cargo install cargo-c
cargo cinstall --release

Compressing video

Input videos must be in y4m format. The monochrome color format is not supported.

cargo run --release --bin rav1e -- input.y4m -o output.ivf
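
To check an input before handing it to the encoder, the y4m stream header can be inspected. The sketch below parses the first line of a y4m file and rejects monochrome input. It is not rav1e's parser: the names Y4mHeader and parse_y4m_header are hypothetical, only the W, H and C tags are read, and the 4:2:0 default for a missing C tag reflects common y4m practice rather than anything stated here.

```rust
/// Subset of the fields found in a y4m stream header (illustrative only).
#[derive(Debug, PartialEq)]
struct Y4mHeader {
    width: usize,
    height: usize,
    /// Colorspace tag without the leading 'C', e.g. "420jpeg" or "444".
    colorspace: String,
}

/// Parses a header line such as "YUV4MPEG2 W352 H288 F30:1 Ip A1:1 C420jpeg".
fn parse_y4m_header(line: &str) -> Result<Y4mHeader, String> {
    let mut tokens = line.trim_end().split(' ');
    if tokens.next() != Some("YUV4MPEG2") {
        return Err("missing YUV4MPEG2 signature".to_string());
    }
    let mut width: Option<usize> = None;
    let mut height: Option<usize> = None;
    // Assumption: treat a missing C tag as 4:2:0.
    let mut colorspace = String::from("420");
    for token in tokens {
        if let Some(value) = token.strip_prefix('W') {
            width = value.parse().ok();
        } else if let Some(value) = token.strip_prefix('H') {
            height = value.parse().ok();
        } else if let Some(value) = token.strip_prefix('C') {
            colorspace = value.to_string();
        }
        // Frame rate, interlacing, aspect ratio and comments are ignored.
    }
    if colorspace.starts_with("mono") {
        return Err("monochrome input is not supported".to_string());
    }
    Ok(Y4mHeader {
        width: width.ok_or("missing or invalid W tag")?,
        height: height.ok_or("missing or invalid H tag")?,
        colorspace,
    })
}
```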

Decompressing video

Encoder output should be compatible with any AV1 decoder compliant with the v1.0.0 specification. You can build compatible aomdec using the following:

mkdir aom_test
cd aom_test
cmake /path/to/aom -DAOM_TARGET_CPU=generic -DCONFIG_AV1_ENCODER=0 -DENABLE_TESTS=0 -DENABLE_DOCS=0 -DCONFIG_LOWBITDEPTH=1
make -j8
./aomdec ../output.ivf -o output.y4m

Configuring

rav1e has several optional features that can be enabled by passing --features to cargo test. Passing --all-features is discouraged.

  • asm - enabled by default. When enabled, assembly is built for the platforms supporting it.
    • It requires nasm on x86_64.
    • It requires gas on aarch64.

NOTE: SSE2 is always enabled on x86_64 and NEON is always enabled on aarch64. You may set the environment variable RAV1E_CPU_TARGET to rust to disable all the assembly-optimized routines at runtime.

Using the AOMAnalyzer

Local Analyzer

  1. Download the AOM Analyzer.
  2. Download inspect.js and inspect.wasm and save them in the same directory.
  3. Run the analyzer: AOMAnalyzer path_to_inspect.js output.ivf

Online Analyzer

If your .ivf file is hosted somewhere (and CORS is enabled on your web server) you can use:

https://arewecompressedyet.com/analyzer/?d=https://people.xiph.org/~mbebenita/analyzer/inspect.js&f=path_to_output.ivf

Design

The file structure and design of the encoder are explained in more detail in the Structure document.

Contributing

Please read our guide to contributing to rav1e.

Getting in Touch

Come chat with us on the IRC channel #daala on Freenode! If you don't have IRC set up you can easily connect from your web browser.

Comments

  • Update IVF Header with number of frames in file

    Oct 22, 2021

    Currently, we are not writing FrameNumber/number of frames in the file during the final bitstream creation. Would be beneficial if we find a mechanism to do it.

    ~Right now when the final bitstream is muxed to ivf, we are not writing bitrate, and duration.~

    Ref: https://github.com/xiph/rav1e/blob/master/ivf/src/lib.rs
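
One possible mechanism, sketched here under assumptions: write the 32-byte IVF file header with a zero frame count, then seek back and patch the count once encoding finishes. The field layout follows the commonly documented IVF header, with the frame count at byte offset 24. The function names are hypothetical and this does not use the API of rav1e's ivf crate.

```rust
use std::io::{self, Seek, SeekFrom, Write};

/// Byte offset of the 32-bit frame-count field in the 32-byte IVF header.
const IVF_FRAME_COUNT_OFFSET: u64 = 24;

/// Writes an IVF file header with the frame count left at zero.
fn write_ivf_header<W: Write>(
    out: &mut W,
    width: u16,
    height: u16,
    timebase_den: u32,
    timebase_num: u32,
) -> io::Result<()> {
    out.write_all(b"DKIF")?; // signature
    out.write_all(&0u16.to_le_bytes())?; // version
    out.write_all(&32u16.to_le_bytes())?; // header length in bytes
    out.write_all(b"AV01")?; // codec FourCC
    out.write_all(&width.to_le_bytes())?;
    out.write_all(&height.to_le_bytes())?;
    out.write_all(&timebase_den.to_le_bytes())?;
    out.write_all(&timebase_num.to_le_bytes())?;
    out.write_all(&0u32.to_le_bytes())?; // frame count, patched later
    out.write_all(&0u32.to_le_bytes())?; // unused
    Ok(())
}

/// Overwrites the frame-count field, then restores the write position.
fn patch_ivf_frame_count<W: Write + Seek>(out: &mut W, frames: u32) -> io::Result<()> {
    let end = out.stream_position()?;
    out.seek(SeekFrom::Start(IVF_FRAME_COUNT_OFFSET))?;
    out.write_all(&frames.to_le_bytes())?;
    out.seek(SeekFrom::Start(end))?;
    Ok(())
}
```

This only works when the output is seekable. When writing to a pipe or stdout, the count would have to be known up front or left at zero.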

  • Hide full search speed setting from public api

    Nov 5, 2021

    Thoughts?

                                                                                                                                                                                                           
  • "sample value too large"', src/api/lookahead.rs:154:47

    Nov 21, 2021

    Describe the bug: Encoding a 4K HEVC video to AV1 fails with the error "sample value too large", src/api/lookahead.rs:154:47.

    To Reproduce

    ffmpeg -i 20210213_121844.mp4 \
     -c:v librav1e -qp 80 -speed 4 \
     -tile-columns 2 -tile-rows 2 \
      -c:a libfdk_aac -b:a 128k \
       20210213_121844_av1.mp4
    

    Expected behavior Encode the video

    Required Information

    cargo 1.56.0 (4ed5d137b 2021-10-04)
    rustc 1.56.1 (59eed8a2a 2021-11-01)
    NASM version 2.15.05 compiled on Nov  5 2021
    

    Version:

    rav1e 0.5.0-beta (0.5.0-beta) (release)
    

    Operating system:

    Linux  5.11.0-40-generic #44-Ubuntu SMP Wed Oct 20 16:16:42 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
    

    Console Output

    ffmpeg -i 20210213_121844.mp4
     -c:v librav1e -qp 80 -speed 4
     -tile-columns 2 -tile-rows 2
      -c:a libfdk_aac -b:a 128k
       20210213_121844_av1.mp4
    ffmpeg version N-104465-g08a501946f Copyright (c) 2000-2021 the FFmpeg developers
      built with gcc 10 (Ubuntu 10.3.0-1ubuntu1)
      configuration: --prefix=/home/rub/ffmpeg_build --pkg-config-flags=--static --extra-cflags=-I/home/rub/ffmpeg_build/include --extra-ldflags=-L/home/rub/ffmpeg_build/lib --extra-libs='-lpthread -lm' --ld=g++ --bindir=/home/rub/bin --enable-gpl --enable-gnutls --enable-libaom --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libsvtav1 --enable-libdav1d --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-librav1e --enable-nonfree
      libavutil      57.  7.100 / 57.  7.100
      libavcodec     59. 12.100 / 59. 12.100
      libavformat    59.  8.100 / 59.  8.100
      libavdevice    59.  0.101 / 59.  0.101
      libavfilter     8. 16.101 /  8. 16.101
      libswscale      6.  1.100 /  6.  1.100
      libswresample   4.  0.100 /  4.  0.100
      libpostproc    56.  0.100 / 56.  0.100
    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '20210213_121844.mp4':
      Metadata:
        major_brand     : mp42
        minor_version   : 0
        compatible_brands: isommp42
        creation_time   : 2021-02-13T11:20:22.000000Z
        com.android.version: 11
        com.android.capture.fps: 30.000000
      Duration: 00:01:37.12, start: 0.000000, bitrate: 53890 kb/s
      Stream #0:0[0x1](eng): Video: hevc (Main 10) (hvc1 / 0x31637668), yuv420p10le(tv, bt2020nc/bt2020/smpte2084), 3840x2160, 53632 kb/s, SAR 1:1 DAR 16:9, 29.95 fps, 30 tbr, 90k tbn (default)
        Metadata:
          creation_time   : 2021-02-13T11:20:22.000000Z
          handler_name    : VideoHandle
          vendor_id       : [0][0][0][0]
        Side data:
          displaymatrix: rotation of -90.00 degrees
      Stream #0:1[0x2](eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 256 kb/s (default)
        Metadata:
          creation_time   : 2021-02-13T11:20:22.000000Z
          handler_name    : SoundHandle
          vendor_id       : [0][0][0][0]
    File '20210213_121844_av1.mp4' already exists. Overwrite? [y/N] y
    Stream mapping:
      Stream #0:0 -> #0:0 (hevc (native) -> av1 (librav1e))
      Stream #0:1 -> #0:1 (aac (native) -> aac (libfdk_aac))
    Press [q] to stop, [?] for help
    Output #0, mp4, to '20210213_121844_av1.mp4':
      Metadata:
        major_brand     : mp42
        minor_version   : 0
        compatible_brands: isommp42
        com.android.capture.fps: 30.000000
        com.android.version: 11
        encoder         : Lavf59.8.100
      Stream #0:0(eng): Video: av1 (av01 / 0x31307661), yuv420p10le(tv, bt2020nc/bt2020/smpte2084, progressive), 2160x3840 [SAR 1:1 DAR 9:16], q=2-31, 30 fps, 15360 tbn (default)
        Metadata:
          creation_time   : 2021-02-13T11:20:22.000000Z
          handler_name    : VideoHandle
          vendor_id       : [0][0][0][0]
          encoder         : Lavc59.12.100 librav1e
        Side data:
          displaymatrix: rotation of -0.00 degrees
      Stream #0:1(eng): Audio: aac (mp4a / 0x6134706D), 48000 Hz, stereo, s16, 128 kb/s (default)
        Metadata:
          creation_time   : 2021-02-13T11:20:22.000000Z
          handler_name    : SoundHandle
          vendor_id       : [0][0][0][0]
          encoder         : Lavc59.12.100 libfdk_aac
    thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: "sample value too large"', src/api/lookahead.rs:154:47
    stack backtrace:
       0:     0x5595ac3d7aec - <unknown>
       1:     0x5595ac43430c - <unknown>
       2:     0x5595ac3c8b75 - <unknown>
       3:     0x5595ac3db050 - <unknown>
       4:     0x5595ac3dac07 - <unknown>
       5:     0x5595ac3db844 - <unknown>
       6:     0x5595ac3db320 - <unknown>
       7:     0x5595ac3d7fb4 - <unknown>
       8:     0x5595ac3db289 - <unknown>
       9:     0x5595aa7f2311 - <unknown>
      10:     0x5595aa7f2403 - <unknown>
      11:     0x5595ac2fbc97 - <unknown>
      12:     0x5595ac4ede7b - <unknown>
      13:     0x5595ac4fa290 - <unknown>
      14:     0x5595aa7eedc1 - <unknown>
      15:     0x5595ac395753 - <unknown>
      16:     0x5595ac398261 - <unknown>
      17:     0x5595ac52d3eb - <unknown>
      18:     0x5595ac3e7eb3 - <unknown>
      19:     0x7fc5d4ff9450 - start_thread
                                   at ./nptl/./nptl/pthread_create.c:473:8
      20:     0x7fc5d4306d53 - clone
      21:                0x0 - <unknown>
    fatal runtime error: failed to initiate panic, error 5
    Abortado (`core' generado)
    
    CPU AMD Ryzen 9 5900X +21Gb of free RAM

    Video mediainfo

    mediainfo 20210213_121844.mp4 
    General
    Complete name                            : 20210213_121844.mp4
    Format                                   : MPEG-4
    Format profile                           : Base Media / Version 2
    Codec ID                                 : mp42 (isom/mp42)
    File size                                : 624 MiB
    Duration                                 : 1 min 37 s
    Overall bit rate                         : 53.9 Mb/s
    Encoded date                             : UTC 2021-02-13 11:20:22
    Tagged date                              : UTC 2021-02-13 11:20:22
    com.android.version                      : 11
    
    Video
    ID                                       : 1
    Format                                   : HEVC
    Format/Info                              : High Efficiency Video Coding
    Format profile                           : Main [email protected]@Main
    HDR format                               : SMPTE ST 2094 App 4, Version 1, HDR10+ Profile B compatible
    Codec ID                                 : hvc1
    Codec ID/Info                            : High Efficiency Video Coding
    Duration                                 : 1 min 37 s
    Source duration                          : 1 min 37 s
    Bit rate                                 : 53.7 Mb/s
    Width                                    : 3 840 pixels
    Height                                   : 2 160 pixels
    Display aspect ratio                     : 16:9
    Rotation                                 : 90°
    Frame rate mode                          : Variable
    Frame rate                               : 29.970 (29970/1000) FPS
    Minimum frame rate                       : 10.000 FPS
    Maximum frame rate                       : 30.040 FPS
    Color space                              : YUV
    Chroma subsampling                       : 4:2:0
    Bit depth                                : 10 bits
    Bits/(Pixel*Frame)                       : 0.216
    Stream size                              : 621 MiB (100%)
    Source stream size                       : 621 MiB (100%)
    Title                                    : VideoHandle
    Language                                 : English
    Encoded date                             : UTC 2021-02-13 11:20:22
    Tagged date                              : UTC 2021-02-13 11:20:22
    Color range                              : Limited
    Color primaries                          : BT.2020
    Transfer characteristics                 : PQ
    Matrix coefficients                      : BT.2020 non-constant
    Mastering display color primaries        : Display P3
    Mastering display luminance              : min: 0.0050 cd/m2, max: 1000 cd/m2
    mdhd_Duration                            : 97123
    Codec configuration box                  : hvcC
    
    Audio
    ID                                       : 2
    Format                                   : AAC LC
    Format/Info                              : Advanced Audio Codec Low Complexity
    Codec ID                                 : mp4a-40-2
    Duration                                 : 1 min 37 s
    Bit rate mode                            : Constant
    Bit rate                                 : 256 kb/s
    Channel(s)                               : 2 channels
    Channel layout                           : L R
    Sampling rate                            : 48.0 kHz
    Frame rate                               : 46.875 FPS (1024 SPF)
    Compression mode                         : Lossy
    Stream size                              : 2.96 MiB (0%)
    Title                                    : SoundHandle
    Language                                 : English
    Encoded date                             : UTC 2021-02-13 11:20:22
    Tagged date                              : UTC 2021-02-13 11:20:22
    
    bug 
  • doctests break on rust 1.57.0 (release build)

    Dec 5, 2021

    Describe the bug

    Two of 4 tests fail after upgrading from rust 1.56.0 to rust 1.57.0 in https://github.com/NixOS/nixpkgs/pull/148358#issuecomment-986081970

    To Reproduce

    $ cargo test --release
    failures:
         src/api/context.rs - api::context::Context<T>::receive_packet (line 204)
         src/api/context.rs - api::context::Context<T>::receive_packet (line 229)
     test result: FAILED. 4 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out; finished in 14.40s
    

    Required Information Toolchain (if is a build problem):

    $ cargo --version -> 1.57.0
    $ rustc --version -> 1.57.0
    $ nasm --version # if on x86_64 -> 2.15.05
    

    Version:

    $ rav1e --version -> 0.5.0
    

    Operating system:

    $ uname -a
    Linux turingmachine 5.15.6-zen2 #1-NixOS ZEN SMP Tue Jan 1 00:00:00 UTC 1980 x86_64 GNU/Linux
    
    bug 
  • Heavy artefact in the middle of the video

    Dec 11, 2021

    Describe the bug 0.5.0 produces heavy artefacts in the middle of the video, as shown in the sample below.

    To Reproduce Watch or download the encoded result here: https://drive.google.com/file/d/1kaQCIcGPAzmjIS-4zb8JsALle7o8hoa0/view?t=460 at timestamp 07:41 the picture starts to be very crippled by artefacts for ~3 seconds

    This was produced from this h264 source video: link with the following command: ffmpeg -i dst.mp4 -c:v librav1e -b:v 100k -rav1e-params speed=5 -acodec copy rav12.mkv

    Expected behavior No heavy artifacts should be present in the output

    Version: 0.5.0 Operating system: Linux localhost.localdomain 5.14.6-1-default #1 SMP Mon Sep 20 07:02:13 UTC 2021 (6131a3c) x86_64 x86_64 x86_64 GNU/Linux

    Console Output

    $ ffmpeg -i dst.mp4 -c:v librav1e -b:v 100k -rav1e-params speed=5 -acodec copy rav12.mkv
    ffmpeg version 4.4.1 Copyright (c) 2000-2021 the FFmpeg developers
      built with gcc 11 (SUSE Linux)
      configuration: --prefix=/usr --libdir=/usr/lib64 --shlibdir=/usr/lib64 --incdir=/usr/include/ffmpeg --extra-cflags='-O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -Werror=return-type -flto=auto -ffat-lto-objects -g' --optflags='-O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -Werror=return-type -flto=auto -ffat-lto-objects -g' --disable-htmlpages --enable-pic --disable-stripping --enable-shared --disable-static --enable-gpl --enable-version3 --enable-libsmbclient --disable-openssl --enable-avresample --enable-gnutls --enable-ladspa --enable-vulkan --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcelt --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libdc1394 --enable-libdrm --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librav1e --enable-librubberband --enable-libsvtav1 --enable-libsoxr --enable-libspeex --enable-libssh --enable-libsrt --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libv4l2 --enable-libvpx --enable-libwebp --enable-libxml2 --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lto --enable-lv2 --enable-libmfx --enable-vaapi --enable-vdpau --enable-version3 --enable-libfdk-aac-dlopen --enable-nonfree --enable-libvo-amrwbenc --enable-libx264 --enable-libx265 --enable-librtmp --enable-libxvid
      libavutil      56. 70.100 / 56. 70.100
      libavcodec     58.134.100 / 58.134.100
      libavformat    58. 76.100 / 58. 76.100
      libavdevice    58. 13.100 / 58. 13.100
      libavfilter     7.110.100 /  7.110.100
      libavresample   4.  0.  0 /  4.  0.  0
      libswscale      5.  9.100 /  5.  9.100
      libswresample   3.  9.100 /  3.  9.100
      libpostproc    55.  9.100 / 55.  9.100
    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'dst.mp4':
      Metadata:
        major_brand     : isom
        minor_version   : 512
        compatible_brands: isomiso2avc1mp41
        encoder         : Lavf57.83.100
      Duration: 00:15:20.10, start: 0.000000, bitrate: 278 kb/s
      Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv, bt709), 640x480, 242 kb/s, 15 fps, 50 tbr, 16k tbn, 30 tbc (default)
        Metadata:
          handler_name    : VideoHandler
          vendor_id       : [0][0][0][0]
      Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 16000 Hz, mono, fltp, 32 kb/s (default)
        Metadata:
          handler_name    : SoundHandler
          vendor_id       : [0][0][0][0]
    Stream mapping:
      Stream #0:0 -> #0:0 (h264 (native) -> av1 (librav1e))
      Stream #0:1 -> #0:1 (copy)
    Press [q] to stop, [?] for help
    Output #0, matroska, to 'rav12.mkv':
      Metadata:
        major_brand     : isom
        minor_version   : 512
        compatible_brands: isomiso2avc1mp41
        encoder         : Lavf58.76.100
      Stream #0:0(und): Video: av1 (AV01 / 0x31305641), yuv420p(tv, bt709, progressive), 640x480, q=2-31, 100 kb/s, 15 fps, 1k tbn (default)
        Metadata:
          handler_name    : VideoHandler
          vendor_id       : [0][0][0][0]
          encoder         : Lavc58.134.100 librav1e
      Stream #0:1(und): Audio: aac (LC) ([255][0][0][0] / 0x00FF), 16000 Hz, mono, fltp, 32 kb/s (default)
        Metadata:
          handler_name    : SoundHandler
          vendor_id       : [0][0][0][0]
    frame=13798 fps=4.9 q=-0.0 Lsize=   15060kB time=00:15:20.06 bitrate= 134.1kbits/s speed=0.324x    
    video:11259kB audio:3638kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.092567%
    
    bug 
  • Try using perf c2c on a heavily tile threaded video

    Jan 3, 2022

    perf has a tool to measure cross-core contention:

    https://joemario.github.io/blog/2016/09/01/c2c-blog/

    It would be interesting to run this while in tile threaded mode to see if we have any accidental extra dependencies.

  • Desync with speed 0 and 1

    Nov 19, 2019

    Tested with commit: acb7b69227ab29b42f8258b66d5c0dfee9a32470

    ./target/release/rav1e ~/Downloads/crowd_run_2160p50.y4m -o test.ivf -r test_rec.y4m --quantizer 128 --speed=1 --limit=3 --low_latency --tune=Psnr

    ../aom_build/aomdec test.ivf -o test_dec.y4m -v
    ../aom_build/aomdec test.ivf --rawvideo -o test_dec.yuv -v
    ffmpeg -i test_rec.y4m test_rec.yuv -v 0
    cmp test_rec.yuv test_dec.yuv

    test_rec.yuv test_dec.yuv differ: char 8324073, line 1 ==> difference in 2nd frame

    Similar desync with speed 0. The 1st frame is fine. Checked that the desync first happens at the 2nd frame when two frames are encoded, and does not happen if only one frame is encoded. Speeds 2, 6 and 10 were fine.

    bug high priority Chef's Choice desync cursed 
  • Allow inclusion in static builds

    Mar 12, 2021

    I'm trying to build a static version of ffmpeg that includes rav1e, but it seems one can't compile a static binary that links -lgcc_s, because there's only a shared version of it.

    gcc-10 -L<prefix>/lib -I<prefix>/lib -L/usr/lib/gcc/x86_64-linux-gnu/10 -static -Wl,--as-needed -Wl,-z,noexecstack -I<prefix>/include/rav1e -L<prefix>/lib -o /tmp/ffconf.EpJq5glV/test /tmp/ffconf.EpJq5glV/test.o -lrav1e -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -lm -lpthread
    /usr/bin/ld: cannot find -lgcc_s
    collect2: error: ld returned 1 exit status
    ERROR: rav1e >= 0.1.0 not found using pkg-config
    

    I'm not much into C, so I'm not sure whether this can be changed at all, or whether there is something I couldn't find that allows it (although I searched quite thoroughly, and all of the "solutions" simply didn't work or weren't applicable).

  • [WIP] Add horz/vert (2:1 only) partitions to top-down partition rdo

    Nov 6, 2018

    Not complete yet.

    TODO :

    • [x] Required to add rectangular transforms, i.e. 8x4, 4x8, 16x8, 8x16, 32x16, 16x32, 32x64, 64x32
    • [x] Check the rectangular transforms work correctly with existing helper functions
    • [x] Check currently enabled intra prediction code can also work for rectangular cases
    • [x] Test top-down partition rdo with new partitions added
    • [x] Check with each of minimum block size
    • [x] Check that both intra and inter modes work correctly with rectangular block sizes
    WorkInProgress 
  • Don't loop through all ref frames when multiref disabled

    Sep 18, 2019

    AWCY no change

    Closes #1212

  • Tile encoding

    Mar 18, 2019

    (description updated on 16 april 2019)

    This PR implements tile encoding (#631).

    1. The (many) first commits introduce tiling structures, which make it possible to expose simultaneous tiled regions of the whole frame data.
    2. Following commits use the tiling structures where necessary in the whole codebase.
    3. Then, command line arguments are added, the encoder encodes tiles separately, and separate tiles are written to the bitstream.
    4. Finally, parallelization is enabled (spoiler: 1 line of code).

    Context

    Encoding a frame first involves frame-wise accesses (initialization, etc.), then tile-wise accesses (to encode tiles in parallel), then frame-wise accesses using the results of tile-encoding (deblocking, cdef, …):

                                    \
          +----------------+         |
          |                |         |
          |                |         |  Frame-wise accesses
          |                |          >
          |                |         |   - FrameState<T>
          |                |         |   - Frame<T>
          +----------------+         |   - Plane<T>
                                    /    - ...
    
                  ||   tiling views
                  \/
                                    \
      +---+  +---+  +---+  +---+     |
      |   |  |   |  |   |  |   |     |  Tile encoding (possibly in parallel)
      +---+  +---+  +---+  +---+     |
                                     |
      +---+  +---+  +---+  +---+     |  Tile-wise accesses
      |   |  |   |  |   |  |   |      >
      +---+  +---+  +---+  +---+     |   - TileStateMut<'_, T>
                                     |   - TileMut<'_, T>
      +---+  +---+  +---+  +---+     |   - PlaneRegionMut<'_, T>
      |   |  |   |  |   |  |   |     |
      +---+  +---+  +---+  +---+     |
                                    /
    
                  ||   vanishing of tiling views
                  \/
                                    \
          +----------------+         |
          |                |         |
          |                |         |  Frame-wise accesses
          |                |          >
          |                |         |  (deblocking, CDEF, ...)
          |                |         |
          +----------------+         |
                                    /
    

    Tiling

    As you know, in Rust, it is not sufficient not to read/write the same memory from several threads, it must be impossible to write (safe) code that could do it. More precisely, a mutable reference may not alias any other reference to the same memory.

    That's the reason why, as a preliminary step, I replaced accesses using the whole plane as a raw slice in addition to the stride information by PlaneSlice (#1035) and PlaneMutSlice (#1043).

    But Plane(Mut)Slice still borrows the whole plane slice, so it does not, in itself, solve the problem.
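
The aliasing rule can be illustrated with plain slices. In this minimal sketch (not code from the PR; the function name is hypothetical), chunks_mut hands out non-overlapping mutable sub-slices, so the borrow checker accepts giving one to each thread.

```rust
use std::thread;

/// Fills each horizontal band of a row-major plane from its own thread.
/// `plane` holds `stride * rows` samples; `band_rows` is the rows per band.
fn fill_bands_in_parallel(plane: &mut [u8], stride: usize, band_rows: usize) {
    thread::scope(|s| {
        // Each chunk is a distinct `&mut [u8]`; no two threads can reach
        // the same sample, and the compiler can prove it.
        for (band_index, band) in plane.chunks_mut(stride * band_rows).enumerate() {
            s.spawn(move || band.fill(band_index as u8));
        }
    });
}
```

This only works because a band of full rows is contiguous in memory. A tile that covers part of each row is not contiguous, so it cannot be expressed as a single sub-slice, which is the gap the region views described below are meant to fill. Note that `stride * band_rows` must be non-zero, since chunks_mut panics on a chunk size of zero.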

    There are several structures to be tiled, which form a tree:

     +- FrameState → TileState
     |  +- Frame → Tile
     |  |  +- Plane → PlaneRegion 
     |  +- RestorationState → TileRestorationState
     |  |  +- RestorationPlane → TileRestorationPlane
     |  |     +- FrameRestorationUnits → TileRestorationUnits
     |  +- FrameMotionVectors → TileMotionVectors
     +- FrameBlocks → TileBlocks
    

    Most of them exist both in const and mutable version (e.g. PlaneRegion and PlaneRegionMut).

    Tiling structures

    PlaneRegion

    This is a view of a bounded region of a Plane. It is similar to PlaneSlice, except that it does not borrow the whole underlying raw slice. That way, it is possible to get several non-overlapping regions simultaneously.

    In the end, we should probably merge it with PlaneSlice, but it requires more work because some frame-wise code still uses PlaneSlice in the code base.

    It is possible to retrieve a subregion of a region (which may not exceed its parent). In theory, a subregion is defined by a rectangle (for example: x, y, width, height), but in practice, we need more flexibility. For example, we often need to retrieve a region from an offset, using the same bottom-right corner as its parent without providing width and height.

    For that purpose, I propose a specific Area structure (actually, a Rust enum) to describe subregion bounds. Here are some usage examples:

    let region = plane.region(Area::Rect { x: 32, y: 32, width: 512, height: 512 });
    
    // the area is relative to the parent region
    let subregion = region.subregion(Area::StartingAt { x: 128, y: 128 });
    // it is equivalent to
    let subregion = region.subregion(Area::Rect { x: 128, y: 128, width: 384, height: 384 });
    // or
    let subregion = plane.region(Area::Rect { x: 160, y: 160, width: 384, height: 384 });
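
The equivalence of the three calls above comes down to offset arithmetic. The sketch below is an illustrative stand-in, not the PR's implementation: it models only two Area variants, and the Bounds type and resolve function are hypothetical names.

```rust
/// Illustrative stand-in for the `Area` enum (two variants only).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Area {
    Rect { x: usize, y: usize, width: usize, height: usize },
    StartingAt { x: usize, y: usize },
}

/// A rectangle expressed in plane coordinates (hypothetical helper type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Bounds {
    x: usize,
    y: usize,
    width: usize,
    height: usize,
}

/// Resolves `area`, which is relative to `parent`, into plane coordinates.
/// Panics if the result would exceed the parent.
fn resolve(parent: Bounds, area: Area) -> Bounds {
    let (x, y, width, height) = match area {
        Area::Rect { x, y, width, height } => (x, y, width, height),
        // Keep the parent's bottom-right corner.
        Area::StartingAt { x, y } => (
            x,
            y,
            parent.width.checked_sub(x).expect("x outside parent"),
            parent.height.checked_sub(y).expect("y outside parent"),
        ),
    };
    assert!(x + width <= parent.width && y + height <= parent.height);
    Bounds { x: parent.x + x, y: parent.y + y, width, height }
}
```

With a parent region at (32, 32) of size 512x512, StartingAt { x: 128, y: 128 } resolves to the plane rectangle at (160, 160) of size 384x384, matching the Rect forms in the example.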
    

    Retrieving a subregion from a BlockOffset is so common across the code base that I decided to expose it directly:

    let bo = BlockOffset { x: 2, y: 3 };
    let subregion = region.subregion(Area::BlockStartingAt { bo });
    

    Like Plane(Mut)Slice, it provides operator[] and iterators over its rows:

    let row5 = &region[5];
    let value = region[3][4];
    for row in region.rows_iter() {
        let _first_four_values = &row[..4];
    }
    

    The mutable versions of the structure (PlaneRegionMut) and methods are also provided.
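
As a rough illustration of how row indexing on a region can work, here is a read-only sketch built on std::ops::Index. It is an assumption-laden simplification: RegionView is a hypothetical name, and unlike PlaneRegion it holds a borrow of the whole plane slice, which is acceptable only because it is immutable.

```rust
use std::ops::Index;

/// Illustrative read-only view of a rectangle inside a row-major plane.
struct RegionView<'a> {
    data: &'a [u16], // samples of the whole plane
    stride: usize,   // samples per plane row
    x: usize,
    y: usize,
    width: usize,
    height: usize,
}

impl<'a> Index<usize> for RegionView<'a> {
    type Output = [u16];

    /// Returns row `row` of the region, clipped to the region's width.
    fn index(&self, row: usize) -> &[u16] {
        assert!(row < self.height, "row outside region");
        let start = (self.y + row) * self.stride + self.x;
        &self.data[start..start + self.width]
    }
}

impl<'a> RegionView<'a> {
    /// Iterates over the region's rows from top to bottom.
    fn rows_iter(&self) -> impl Iterator<Item = &[u16]> + '_ {
        (0..self.height).map(move |row| &self[row])
    }
}
```

Because each row is returned as a slice of exactly `width` samples, indexing such as region[3][4] is bounds-checked against the region rather than the plane.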

    Tile

    A Tile is a view of 3 colocated plane regions (Tile is to a PlaneRegion as a Frame is to a Plane).

    The mutable version (TileMut) is also provided.

    TileState

    The way the FrameState fields are mapped in TileState depends on how they are accessed tile-wise and frame-wise.

    Some fields (like qc) are only used during tile-encoding, so they are only stored in TileState.

    Some other fields (like input or segmentation) are not written tile-wise, so they just reference the matching field in FrameState.

    Some others (like rec) are written tile-wise, but must be accessible frame-wise once the tile views vanish (e.g. for deblocking).

    It contains 2 tiled views: TileRestorationState and a vector of TileMotionVectorsMut (a tiled view of FrameMotionVectors).

    This structure is only provided as mutable (TileStateMut). A const version is not necessary, and would require instantiating a const version of all its embedded tiled views.

    TileBlocks

    TileBlocks is a tiled view of FrameBlocks. It exposes the blocks associated to the tile.

    The mutable version (TileBlocksMut) is also provided.

    Splitting into tiles

    A TilingInfo structure computes all the details about tiling from the frame width and height and the (log2 of the) number of tile columns and rows. The details are accessible for initializing data or writing into the bitstream.

    It provides an iterator over tiles (yielding one TileStateMut and one TileBlocksMut for each tile).

    Frame offsets vs tile offsets

    In encode_tile(), super-block, block and plane offsets are expressed relative to the tile. The tiling views expose their data relative to the tile:

    • plane_region[y][x] is pixel (x, y) relative to the plane region,
    • tile_blocks[boy][box] contains the Block at (box, boy) relative to the tile.

    TileStateMut exposes some references to frame-level data stored in FrameState:

    • input is a reference to the whole frame,
    • input_hres and input_qres are references to the whole planes.

    When accessing these frame-level data, tile offsets are converted to frame offsets, for example by:

    let frame_bo = ts.to_frame_block_offset(bo);
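
Conceptually, the conversion adds the tile's origin to the tile-relative offset. A minimal sketch under that assumption follows; TileOrigin is a hypothetical type, and the real TileStateMut carries much more state.

```rust
/// Offset measured in blocks (illustrative definition).
#[derive(Clone, Copy, Debug, PartialEq)]
struct BlockOffset {
    x: usize,
    y: usize,
}

/// Hypothetical holder of the tile's top-left corner, in frame block units.
struct TileOrigin {
    bo: BlockOffset,
}

impl TileOrigin {
    /// Converts a tile-relative block offset into a frame-relative one.
    fn to_frame_block_offset(&self, tile_bo: BlockOffset) -> BlockOffset {
        BlockOffset { x: self.bo.x + tile_bo.x, y: self.bo.y + tile_bo.y }
    }
}
```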
    

    Current state

    It works.

    Need more tests and reviews.

    Usage

    Pass the requested log2 number of tiles, with --tile-cols-log2 and --tile-rows-log2. For example, to request 2x2 tiles:

    rav1e video.y4m -o video.ivf --tile-cols-log2 1 --tile-rows-log2 1
    

    Currently, the number of tiles is passed in log2 (like in libaom, even if the aomenc options are called --tile-columns and --tile-rows), to avoid any confusion. Maybe we could find a correct user-friendly option later.

    Note that the actual number of tiles may be smaller (e.g. if the image size has fewer super-blocks).
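
To see why the count can come out smaller, here is a sketch of the arithmetic. It assumes 64x64 superblocks and the uniform tile spacing rule as I understand it from the AV1 specification. Whether TilingInfo computes exactly this is an assumption, and the function name tiles_along is hypothetical.

```rust
/// Superblock size in pixels (64x64 superblocks assumed).
const SB_SIZE: usize = 64;

/// Number of tiles actually produced along one dimension when
/// `1 << tiles_log2` tiles are requested for a frame of `frame_len` pixels.
fn tiles_along(frame_len: usize, tiles_log2: u32) -> usize {
    assert!(frame_len > 0, "frame dimension must be non-zero");
    // Frame length in superblocks, rounded up.
    let sb_count = (frame_len + SB_SIZE - 1) / SB_SIZE;
    // Tile length in superblocks, rounded up so every superblock is covered.
    let tile_len_sb = (sb_count + (1 << tiles_log2) - 1) >> tiles_log2;
    // Tiles needed to cover the frame with that tile length.
    (sb_count + tile_len_sb - 1) / tile_len_sb
}
```

For example, a 1920-pixel-wide frame has 30 superblock columns, so a log2 of 1 yields 2 tile columns as requested. A 320-pixel-wide frame has only 5, so a log2 of 2 yields 3 tile columns instead of 4. This sketch ignores any limits on maximum tile size.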

  • Don't restrict partition sizes to exactly fit on the right or bottom frame boundaries

    Feb 12, 2020

    • Current block partitioning approach on right or bottom frame boundaries in rav1e
    • Both the top-down and bottom-up functions keep splitting the input SB (superblock) until no partitioned block straddles the right or bottom frame boundary. (Whether to keep splitting is decided by 'must_split', which is set to true if the current partition straddles the frame boundary, or under other conditions such as the current size being larger than the desired maximum partition size, as shown in the code: https://github.com/xiph/rav1e/blob/8f273bcbde77e5f3711138a57c13dec9dc793973/src/encoder.rs#L2546)
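
The condition described above can be paraphrased as follows. This is a simplified sketch, not the code at the linked line: it assumes square blocks measured in pixels, and the parameter names are hypothetical.

```rust
/// Returns true when a square block of side `bsize` at pixel position
/// (`x`, `y`) has to be split further under the current approach.
fn must_split(
    x: usize,
    y: usize,
    bsize: usize,
    frame_w: usize,
    frame_h: usize,
    max_bsize: usize,
) -> bool {
    // The block extends past the right or bottom edge of the frame.
    let straddles_boundary = x + bsize > frame_w || y + bsize > frame_h;
    // The block is larger than the desired maximum partition size.
    let too_large = bsize > max_bsize;
    straddles_boundary || too_large
}
```

For a 1920x1080 frame, the superblock row starting at y = 1024 extends to 1088, so every 64x64 block in it is forced to split regardless of what RDO would prefer.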

    Affected areas of codebase can be:

    • [x] Partition search functions should not restrict partition sizes to exactly fit on the right or bottom frame boundaries
    • [x] Distortion compute functions to use visible area, which can be any size not defined by av1 partition sizes
    • ~~[x] Intra prediction to predict only visible pixels only~~
    • [x] Ref pixels for intra pred (i.e. intra edge pixels) should not use invisible pixels.
    • ~~[ ] Inter prediction (Motion Estimation) to predict only visible pixels only~~
    • ~~[ ] Define the pixel values for invisible area as input to forward transforms, i.e. what kind of padding to use for input and reconstructed frame? Already defined that extension of directly adjacent and last available pixel.~~
    enhancement high priority compression performance 