java.lang.Object
me.lemire.integercompression.FastPFOR
All Implemented Interfaces:
IntegerCODEC, SkippableIntegerCODEC

public class FastPFOR extends Object implements IntegerCODEC, SkippableIntegerCODEC
FastPFOR is a patching scheme designed for speed. It encodes integers in blocks within pages of up to 65536 integers. To get good compression and good performance, use sizeable arrays (more than 1024 integers). For arrays whose length is not divisible by BLOCK_SIZE, use it in conjunction with another CODEC: IntegerCODEC ic = new Composition(new FastPFOR(), new VariableByte()).
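
To make the divisibility requirement concrete, the sketch below splits an input length into the part a block-based codec can handle and the leftover tail that a second codec would have to encode. The class BlockSplit and its method names are hypothetical helpers, not part of the library, and the block size is passed in as a parameter rather than read from BLOCK_SIZE.

```java
/** Hypothetical helper, not part of the library: splits a length into whole blocks and a tail. */
final class BlockSplit {
    /** Number of leading integers covered by whole blocks of blockSize. */
    static int alignedLength(int length, int blockSize) {
        return length / blockSize * blockSize;
    }

    /** Number of trailing integers that do not fill a whole block. */
    static int tailLength(int length, int blockSize) {
        return length % blockSize;
    }
}
```

For example, with an assumed block size of 256, an array of 1000 integers splits into 768 block-aligned integers and a tail of 232 integers.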

For details, please see:

Daniel Lemire and Leonid Boytsov, Decoding billions of integers per second through vectorization, Software: Practice & Experience. http://onlinelibrary.wiley.com/doi/10.1002/spe.2203/abstract http://arxiv.org/abs/1209.2137

For sufficiently compressible and long arrays, it is faster and compresses better than other PFOR schemes.

Note that this CODEC does not use differential coding: if you are working with sorted lists, you should first compute deltas (see me.lemire.integercompression.differential.Delta#delta). For multi-threaded applications, each thread should use its own FastPFOR object.
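
As a rough illustration of what computing deltas on a sorted list means, the sketch below replaces each value with its difference from the previous one and restores the original with a running sum. DeltaSketch is a hypothetical stand-alone class written for this example; it is not the library's Delta class and makes no claim about how that class is implemented.

```java
/** Illustrative sketch of differential coding; not the library's Delta class. */
final class DeltaSketch {
    /** Replaces each value with its difference from the previous value, in place. */
    static void delta(int[] data) {
        int prev = 0;
        for (int i = 0; i < data.length; i++) {
            int current = data[i];
            data[i] = current - prev;
            prev = current;
        }
    }

    /** Inverts delta() with a running prefix sum, in place. */
    static void inverseDelta(int[] data) {
        int sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
            data[i] = sum;
        }
    }
}
```

On a sorted list the deltas are small non-negative integers, which is the kind of input a bit-packing scheme compresses well.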
Author:
Daniel Lemire
  • Field Details

  • Constructor Details

    • FastPFOR

      public FastPFOR()
      Constructs the FastPFOR CODEC with default parameters.
  • Method Details

    • headlessCompress

      public void headlessCompress(int[] in, IntWrapper inpos, int inlength, int[] out, IntWrapper outpos)
      Compress data in blocks of BLOCK_SIZE integers (if fewer than BLOCK_SIZE integers are provided, nothing is done).
      Specified by:
      headlessCompress in interface SkippableIntegerCODEC
      Parameters:
      in - input array
      inpos - where to start reading in the array
      inlength - how many integers to compress
      out - output array
      outpos - where to write in the output array
    • headlessUncompress

      public void headlessUncompress(int[] in, IntWrapper inpos, int inlength, int[] out, IntWrapper outpos, int mynvalue)
      Uncompress data in blocks of integers. In this particular case, the inlength parameter is ignored: it is deduced from the compressed data.
      Specified by:
      headlessUncompress in interface SkippableIntegerCODEC
      Parameters:
      in - array containing data in compressed form
      inpos - where to start reading in the array
      inlength - length of the compressed data (ignored by some schemes)
      out - array where to write the uncompressed output
      outpos - where to start writing the uncompressed output in out
      mynvalue - number of integers we want to decode. May be less than the actual number of compressed integers
    • maxHeadlessCompressedLength

      public int maxHeadlessCompressedLength(IntWrapper compressedPositions, int inlength)
      Description copied from interface: SkippableIntegerCODEC
      Compute the maximum number of integers that might be required to store the compressed form of a given input array segment, without headers.

      This is useful to pre-allocate the output buffer before calling SkippableIntegerCODEC.headlessCompress(int[], IntWrapper, int, int[], IntWrapper).

      Specified by:
      maxHeadlessCompressedLength in interface SkippableIntegerCODEC
      Parameters:
      compressedPositions - since not all schemes compress every input integer, this parameter returns how many input integers will actually be compressed. This is useful when composing multiple schemes.
      inlength - number of integers to be compressed
      Returns:
      the maximum number of integers needed in the output array
    • compress

      public void compress(int[] in, IntWrapper inpos, int inlength, int[] out, IntWrapper outpos)
      Description copied from interface: IntegerCODEC
      Compress data from an array to another array. Both inpos and outpos are modified to represent how much data was read and written to. If 12 ints (inlength = 12) are compressed to 3 ints, then inpos will be incremented by 12 while outpos will be incremented by 3. We use IntWrapper to pass the values by reference.
      Specified by:
      compress in interface IntegerCODEC
      Parameters:
      in - input array
      inpos - where to start reading in the array
      inlength - how many integers to compress
      out - output array
      outpos - where to write in the output array
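
The position-update convention described above (12 integers read, 3 integers written) can be sketched with a toy example. Everything here is an assumption made for illustration: Cursor is a hypothetical stand-in for a mutable position holder, not the library's IntWrapper, and ToyBytePacker is a trivial packer for values in [0, 255], not FastPFOR.

```java
/** Hypothetical stand-in for a mutable position holder; not the library's IntWrapper. */
final class Cursor {
    int value;

    Cursor(int value) {
        this.value = value;
    }
}

/** Toy codec for illustration only (not FastPFOR): packs four values in [0, 255] into one int. */
final class ToyBytePacker {
    static void compress(int[] in, Cursor inpos, int inlength, int[] out, Cursor outpos) {
        // Only whole groups of four integers are consumed.
        int groups = inlength / 4;
        for (int g = 0; g < groups; g++) {
            int packed = 0;
            for (int k = 0; k < 4; k++) {
                packed |= (in[inpos.value + k] & 0xFF) << (8 * k);
            }
            out[outpos.value] = packed;
            inpos.value += 4;  // advance past the integers that were read
            outpos.value += 1; // advance past the integer that was written
        }
    }
}
```

Calling the toy compress method with inlength = 12 and both cursors at 0 leaves the input cursor at 12 and the output cursor at 3, so the caller can read both new positions from the objects it passed in.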
    • uncompress

      public void uncompress(int[] in, IntWrapper inpos, int inlength, int[] out, IntWrapper outpos)
      Description copied from interface: IntegerCODEC
      Uncompress data from an array to another array. Both inpos and outpos parameters are modified to indicate new positions after read/write.
      Specified by:
      uncompress in interface IntegerCODEC
      Parameters:
      in - array containing data in compressed form
      inpos - where to start reading in the array
      inlength - length of the compressed data (ignored by some schemes)
      out - array where to write the uncompressed output
      outpos - where to start writing the uncompressed output in out
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • makeBuffer

      protected ByteBuffer makeBuffer(int sizeInBytes)
      Creates a new buffer of the requested size. In case you need a different way to allocate buffers, you can override this method with a custom behavior. The default implementation allocates a new Java direct ByteBuffer on each invocation.
      Parameters:
      sizeInBytes - size of the buffer to allocate, in bytes
      Returns:
      the newly allocated buffer
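
The override pattern described for makeBuffer can be sketched as follows. CodecWithBuffer is a hypothetical stand-in base class written for this example, not the library class; only the java.nio.ByteBuffer calls are real. The subclass swaps the direct allocation for a heap allocation, which is one possible custom behavior among many.

```java
/** Hypothetical stand-in for a codec with an overridable buffer factory; not the library class. */
class CodecWithBuffer {
    /** Mirrors the default described above: a new direct buffer on each call. */
    protected java.nio.ByteBuffer makeBuffer(int sizeInBytes) {
        return java.nio.ByteBuffer.allocateDirect(sizeInBytes);
    }
}

/** Subclass that allocates on the Java heap instead of off-heap. */
class HeapBufferCodec extends CodecWithBuffer {
    @Override
    protected java.nio.ByteBuffer makeBuffer(int sizeInBytes) {
        return java.nio.ByteBuffer.allocate(sizeInBytes);
    }
}
```

The same approach would apply to a subclass of the real class, assuming its constructor and the protected method are accessible from the subclass.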