Interface SkippableIntegerCODEC

All Known Implementing Classes:
BinaryPacking, FastPFOR, FastPFOR128, GroupSimple9, JustCopy, Kamikaze, NewPFD, NewPFDS16, NewPFDS9, OptPFD, OptPFDS16, OptPFDS9, Simple16, Simple9, SkippableComposition, VariableByte

public interface SkippableIntegerCODEC
Interface describing a standard CODEC for compressing integers. It is a variation on the IntegerCODEC interface intended for random access: given a large array, you can compress it in segments and later decode only the segment you need. The main difference is that the caller must specify the number of integers to uncompress. Because the compressed data does not necessarily record this count, it should be stored elsewhere. This interface was designed by the Terrier team for their search engine.
Author:
Daniel Lemire
  • Method Summary

    Modifier and Type, Method, Description

    void headlessCompress(int[] in, IntWrapper inpos, int inlength, int[] out, IntWrapper outpos)
        Compress data from an array to another array.

    void headlessUncompress(int[] in, IntWrapper inpos, int inlength, int[] out, IntWrapper outpos, int num)
        Uncompress data from an array to another array.

    int maxHeadlessCompressedLength(IntWrapper compressedPositions, int inlength)
        Compute the maximum number of integers that might be required to store the compressed form of a given input array segment, without headers.
  • Method Details

    • headlessCompress

      void headlessCompress(int[] in, IntWrapper inpos, int inlength, int[] out, IntWrapper outpos)
      Compress data from an array to another array. Both inpos and outpos are updated to reflect how much data was read and written. For example, if 12 ints (inlength = 12) are compressed to 3 ints, then inpos is incremented by 12 and outpos by 3. IntWrapper is used to pass these positions by reference. Implementation note: unlike IntegerCODEC.compress(int[], IntWrapper, int, int[], IntWrapper), this method may skip writing information about the number of encoded integers.
      Parameters:
      in - input array
      inpos - where to start reading in the array
      inlength - how many integers to compress
      out - output array
      outpos - where to write in the output array
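
The position accounting described above (12 ints in, 3 ints out) can be sketched without the library. In this minimal sketch, IntBox is a hypothetical stand-in for IntWrapper, and the packing scheme (four values in [0, 255] per output int) is a toy assumption chosen only so that 12 inputs yield 3 outputs. It is not one of the implementing classes.

```java
class HeadlessCompressSketch {
    /** Hypothetical stand-in for the library's IntWrapper: a mutable int box. */
    static final class IntBox {
        int value;
        IntBox(int value) { this.value = value; }
    }

    /**
     * Toy headless scheme (an assumption, not a library codec): packs four
     * values in [0, 255] into each output int. inlength must be a multiple
     * of 4. No header with the number of encoded integers is written.
     */
    static void headlessCompress(int[] in, IntBox inpos, int inlength, int[] out, IntBox outpos) {
        int src = inpos.value;
        int dst = outpos.value;
        for (int i = 0; i < inlength; i += 4) {
            out[dst++] = (in[src] & 0xFF)
                    | ((in[src + 1] & 0xFF) << 8)
                    | ((in[src + 2] & 0xFF) << 16)
                    | ((in[src + 3] & 0xFF) << 24);
            src += 4;
        }
        inpos.value = src;   // advanced by the number of ints read
        outpos.value = dst;  // advanced by the number of ints written
    }

    /** Compresses 12 ints and returns {final inpos, final outpos}. */
    static int[] demo() {
        int[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
        int[] out = new int[3];
        IntBox inpos = new IntBox(0);
        IntBox outpos = new IntBox(0);
        headlessCompress(data, inpos, data.length, out, outpos);
        return new int[] {inpos.value, outpos.value};
    }

    public static void main(String[] args) {
        int[] positions = demo();
        System.out.println("inpos=" + positions[0] + " outpos=" + positions[1]);
    }
}
```

Because both positions are advanced in place, a caller can compress consecutive segments into the same output array by reusing the same two wrappers.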
    • headlessUncompress

      void headlessUncompress(int[] in, IntWrapper inpos, int inlength, int[] out, IntWrapper outpos, int num)
      Uncompress data from an array to another array. Both inpos and outpos parameters are modified to indicate new positions after read/write.
      Parameters:
      in - array containing data in compressed form
      inpos - where to start reading in the array
      inlength - length of the compressed data (ignored by some schemes)
      out - array where to write the uncompressed output
      outpos - where to start writing the uncompressed output in out
      num - number of integers to decode; may be less than the number of integers actually compressed
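
The random-access use case can be illustrated as follows: compress an array in segments, keep each segment's start offset and length outside the compressed data, then decode only the segment (or the prefix of it) that is needed. This is a sketch under stated assumptions: IntBox is a hypothetical stand-in for IntWrapper, the byte-packing scheme is a toy, and segmentStart is an illustrative name for the externally stored offsets. Real implementing classes may place constraints on num that this toy does not.

```java
class HeadlessUncompressSketch {
    /** Hypothetical stand-in for the library's IntWrapper. */
    static final class IntBox {
        int value;
        IntBox(int value) { this.value = value; }
    }

    /** Toy scheme (assumption): four values in [0, 255] per output int, no header. */
    static void headlessCompress(int[] in, IntBox inpos, int inlength, int[] out, IntBox outpos) {
        for (int i = 0; i < inlength; i++) {
            int word = outpos.value + i / 4;
            if (i % 4 == 0) {
                out[word] = 0; // start a fresh output word
            }
            out[word] |= (in[inpos.value + i] & 0xFF) << (8 * (i % 4));
        }
        inpos.value += inlength;
        outpos.value += (inlength + 3) / 4;
    }

    /** Decodes exactly num values; inlength is ignored by this toy scheme. */
    static void headlessUncompress(int[] in, IntBox inpos, int inlength, int[] out, IntBox outpos, int num) {
        for (int i = 0; i < num; i++) {
            out[outpos.value + i] = (in[inpos.value + i / 4] >>> (8 * (i % 4))) & 0xFF;
        }
        inpos.value += (num + 3) / 4;
        outpos.value += num;
    }

    /** Compresses 16 values as two segments of 8, then decodes num values of one segment. */
    static int[] decodeSegment(int segment, int num) {
        int[] data = new int[16];
        for (int i = 0; i < data.length; i++) {
            data[i] = 10 + i;
        }
        int segmentLength = 8;
        int[] compressed = new int[4];
        int[] segmentStart = new int[2]; // kept outside the compressed data
        IntBox inpos = new IntBox(0);
        IntBox outpos = new IntBox(0);
        for (int s = 0; s < 2; s++) {
            segmentStart[s] = outpos.value;
            headlessCompress(data, inpos, segmentLength, compressed, outpos);
        }
        int[] decoded = new int[num];
        IntBox readPos = new IntBox(segmentStart[segment]);
        int available = compressed.length - segmentStart[segment];
        headlessUncompress(compressed, readPos, available, decoded, new IntBox(0), num);
        return decoded;
    }

    public static void main(String[] args) {
        int[] secondSegment = decodeSegment(1, 8);
        System.out.println(secondSegment[0] + " .. " + secondSegment[7]);
    }
}
```

Calling decodeSegment(1, 3) decodes only the first three values of the second segment, which corresponds to passing a num smaller than the number of compressed integers.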
    • maxHeadlessCompressedLength

      int maxHeadlessCompressedLength(IntWrapper compressedPositions, int inlength)
      Compute the maximum number of integers that might be required to store the compressed form of a given input array segment, without headers.

      This is useful to pre-allocate the output buffer before calling headlessCompress(int[], IntWrapper, int, int[], IntWrapper).

      Parameters:
      compressedPositions - output parameter. Since not all schemes compress every input integer, it reports how many input integers will actually be compressed. This is useful when composing multiple schemes.
      inlength - number of integers to be compressed
      Returns:
      the maximum number of integers needed in the output array
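
One way the compressedPositions parameter could support composition is sketched below: a block scheme reports how many inputs it covers, and a fallback scheme is asked to bound the remainder. Everything here is an assumption for illustration: IntBox stands in for IntWrapper, both schemes are toys, and the choice to add to the wrapper rather than overwrite it is not stated in the description above.

```java
class MaxLengthSketch {
    /** Hypothetical stand-in for the library's IntWrapper. */
    static final class IntBox {
        int value;
        IntBox(int value) { this.value = value; }
    }

    static final int BLOCK = 4;

    /** Toy block scheme (assumption): compresses only whole blocks of 4, one output int per block. */
    static int blockMaxLength(IntBox compressedPositions, int inlength) {
        int covered = inlength - inlength % BLOCK;
        compressedPositions.value += covered; // report how many inputs are handled
        return covered / BLOCK;
    }

    /** Toy fallback scheme (assumption): copies every input verbatim. */
    static int copyMaxLength(IntBox compressedPositions, int inlength) {
        compressedPositions.value += inlength;
        return inlength;
    }

    /** Upper bound for the block scheme followed by the fallback on the leftover inputs. */
    static int composedMaxLength(int inlength) {
        IntBox covered = new IntBox(0);
        int bound = blockMaxLength(covered, inlength);
        int leftover = inlength - covered.value;
        bound += copyMaxLength(covered, leftover);
        return bound;
    }

    public static void main(String[] args) {
        int inlength = 14;
        // Pre-allocate the output buffer before compressing.
        int[] out = new int[composedMaxLength(inlength)];
        System.out.println("allocated " + out.length + " ints for " + inlength + " inputs");
    }
}
```

For 14 inputs the toy block scheme covers 12 of them in 3 output ints and the fallback needs 2 more, so the buffer holds 5 ints.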