• Huffman: encode symbols as paths down a binary tree. Build the tree by repeatedly merging the two lowest-probability nodes under a new parent.

    • Letter-oriented Huffman compresses well when letter frequencies are skewed, but it cannot exploit correlations between consecutive letters (common words and syllables).

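The tree build can be sketched with a min-heap (Python; illustrative only, the function name and symbol/frequency example are my own):

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Build a Huffman code from a {symbol: frequency} map."""
    tie = count()  # tie-breaker so the heap never compares the dicts
    heap = [(f, next(tie), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)  # two lowest-probability nodes...
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))  # ...grouped under a parent
    return heap[0][2]

codes = huffman_codes({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5})
```

The most frequent symbol gets the shortest path, and the codes are prefix-free because every symbol sits at a leaf.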

  • Lempel Ziv


    • both terms LZ77 and LZSS tend to be used very loosely, so they do not really imply very specific algorithms
      • in LZ77 the dictionary reference could actually be longer than the string it was replacing. In LZSS, such references are omitted if the length is less than the "break even" point.
      • LZSS is a family of algorithms that generalizes LZ77 (source)
    • LZ77 is good at detecting and compressing these kinds of common words and syllables that occur far more often than you might guess from the letter frequencies alone.
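A minimal greedy LZSS-style pass can be sketched as follows (Python; the window size, `min_match` break-even threshold, and token format are illustrative choices, not any specific wire format):

```python
def lz77_compress(data, window=4096, min_match=3):
    """Greedy LZSS-style pass: emit literals, or ("copy", distance, length)
    items when a match of at least min_match is found in the window."""
    out, i = [], 0
    while i < len(data):
        best_len, best_dist = 0, 0
        for j in range(max(0, i - window), i):
            k = 0
            # Matches may overlap the current position (valid in LZ77).
            while i + k < len(data) and data[j + k] == data[i + k]:
                k += 1
            if k > best_len:
                best_len, best_dist = k, i - j
        if best_len >= min_match:  # below this, a reference costs more than literals
            out.append(("copy", best_dist, best_len))
            i += best_len
        else:
            out.append(("lit", data[i]))
            i += 1
    return out
```

On `"abcabcabc"` this emits three literals followed by a single overlapping copy item, which is exactly the kind of repeated-substring structure letter-frequency coding cannot see.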
  • Deflate compression: LZSS → Huffman

    • The LZ stage compresses the original file into an intermediate sequence of literal letters and "copy items" (references to earlier text). Huffman coding then further compresses that intermediate sequence.
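This pipeline is what Python's stdlib `zlib` module implements directly, so the round trip can be demonstrated without writing either stage:

```python
import zlib

# Deflate = LZ-style matching followed by Huffman coding of the tokens.
data = b"the quick brown fox jumps over the lazy dog. " * 100
packed = zlib.compress(data, level=9)

assert zlib.decompress(packed) == data  # lossless round trip
assert len(packed) < len(data) // 10    # repetitive input compresses heavily
```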
  • Zstd

  • JPEG compression

    • Algo

      • Convert from RGB to the YCbCr color space, which separates luminance (Y) from chrominance (Cb, Cr)

      • Downsample the chrominance channels; quite a lot of downsampling is imperceptible

      • Divide into 8x8 blocks

      • Center values around 0 by subtracting 128 from each 8-bit sample

      • Perform the DCT, so the block becomes a linear combination of 64 cosine basis patterns, with coefficients between -1024 and 1024. Most large coefficients sit in the low frequencies ("energy compaction")


      • Divide each integer coefficient by the corresponding integer in a quantization table and round; a quality level of 50 uses a standard table. The result is that many values go to 0, especially the higher-frequency entries


      • Run-length encode the result, reading coefficients in a zigzag pattern, since this keeps the high-frequency 0s together


      • Huffman compress
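The zigzag-scan and zero-run steps can be sketched as follows (Python; the token layout is simplified relative to real JPEG symbols, which cap zero runs at 15 with ZRL codes):

```python
def zigzag_order(n=8):
    """Index pairs for the zigzag scan of an n x n block: walk the
    anti-diagonals, alternating direction so the path snakes."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def rle_zeros(seq):
    """(zero_run, value) pairs, with an end-of-block marker when the
    sequence ends in zeros (simplified relative to real JPEG)."""
    out, run = [], 0
    for v in seq:
        if v == 0:
            run += 1
        else:
            out.append((run, v))
            run = 0
    if run:
        out.append("EOB")
    return out
```

Because quantization zeroes out most high-frequency entries, the zigzag ordering turns the tail of each block into one long zero run that collapses to a single EOB token.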

    • DCT: can be computed as a plain matmul, B = C A Cᵀ. Because the DCT matrix C is orthonormal, its transpose is its inverse, so the inverse transform is A = Cᵀ B C.

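A sketch of that matrix form (Python, pure stdlib; `C` here is the standard orthonormal DCT-II matrix):

```python
import math

N = 8
# Orthonormal DCT-II matrix: C[k][n] = a(k) * cos((2n + 1) k pi / (2N)),
# with a(0) = sqrt(1/N) and a(k) = sqrt(2/N) for k > 0.
C = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos((2 * n + 1) * k * math.pi / (2 * N))
      for n in range(N)] for k in range(N)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(r) for r in zip(*A)]

def dct2(block):
    """2D DCT of an 8x8 block: B = C A C^T (invert with A = C^T B C)."""
    return matmul(matmul(C, block), transpose(C))
```

Orthonormality means C Cᵀ = I, which is why the decoder can undo the transform with a transpose instead of a matrix inversion. A flat block of 128s lands entirely in the DC coefficient (value 1024), illustrating both energy compaction and the ±1024 coefficient range.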

    • Weakness: images with lots of high frequencies, like text, where quantization produces ringing artifacts around the sharp edges


  • PNG

    • Focus on lossless compression

    • Filters (like UP, AVG, PAETH, etc.) → deflate

    • Filter arithmetic is unsigned, modulo 256 (byte values wrap around)

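The UP filter and the Paeth predictor are small enough to sketch (Python; the modulo-256 wraparound matches the PNG spec's unsigned byte arithmetic):

```python
def paeth(a, b, c):
    """PNG Paeth predictor: a = left, b = above, c = upper-left.
    Pick the neighbour closest to the linear estimate a + b - c."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    if pb <= pc:
        return b
    return c

def filter_up(row, prev_row):
    """UP filter: each byte minus the byte directly above it, mod 256."""
    return bytes((x - y) & 0xFF for x, y in zip(row, prev_row))
```

The filters don't compress anything themselves; they just turn smooth gradients into rows of small, repetitive residuals that deflate handles much better than the raw pixels.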


  • MPEG

    • I-frames: independent frames
      • DCT. Basically JPEGs. 8x8 blocks.
    • P-frames: predicted frames, depending only on earlier I and P frames and not B frames
      • Encode just the differences from the prior decoded frame, then DCT; the DCT is more effective on the smaller magnitudes
    • B-frames: bidirectionally predicted frames
      • Because they reference both earlier and later frames, B-frames are decoded out of display order, and may be stored out of order by the container format
    • Literature on MPEG-1 suggests a typical I-frame uses about 1 bit per pixel, a typical P-frame about 0.1 bits per pixel, and a typical B-frame about 0.015 bits per pixel
    • Motion compensation
      • Operate on macroblocks (16x16)

      • Want to find blocks that result in smaller encoding

      • Could just search a local area, which is good for capturing real movement

      • But to find the best match anywhere in the frame (purely to optimize the encoding), search a larger area: an iterative, coarse-to-fine 2D search, with a fixed-depth beam search keeping the top 25% of candidates


      • Rank candidate blocks by an error metric (e.g., sum of absolute differences or mean squared error)

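Exhaustive local block matching with SAD can be sketched as follows (Python; the frame, block size, and search radius are toy-sized for illustration rather than real 16x16 macroblocks):

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def best_match(ref, block, top, left, radius):
    """Search a (2*radius+1)^2 window of the reference frame around
    (top, left) for the candidate position minimizing SAD."""
    n = len(block)
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if 0 <= y and y + n <= len(ref) and 0 <= x and x + n <= len(ref[0]):
                cand = [row[x:x + n] for row in ref[y:y + n]]
                cost = sad(cand, block)
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best  # (cost, motion vector dy, dx)
```

The returned (dy, dx) is the motion vector; the encoder then transmits that vector plus the DCT of the (hopefully near-zero) residual instead of the raw block.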
