« Hufint » : différence entre les versions

De Lillois Fractale Wiki
Aller à la navigation Aller à la recherche
Contenu ajouté Contenu supprimé
Aucun résumé des modifications
Ligne 3 : Ligne 3 :
A Hufint is a variable-length-coded integer, with an optimal compression perspective.
A Hufint is a variable-length-coded integer, with an optimal compression perspective.


Hufint stands for Huffman integer, or Huffman coded integer.
Hufint stands for Huffman integer, or Huffman coded integer.


The contract of this coding tool is:
The contract of this coding tool is:
Ligne 9 : Ligne 9 :
*allow to encode any positive integer (including 0) a a set of bits
*allow to encode any positive integer (including 0) a a set of bits
*allow a non-ambiguous decoding algorithm
*allow a non-ambiguous decoding algorithm
*use less bits for smaller integer
*use less bits for smaller integer
*allow non-ambiguous encoding/decoding for arrays of integers.<br>


Hufint are used in the Matscape projects, wherever
Hufint are used in the Matscape projects, wherever
Ligne 15 : Ligne 16 :
*integer values are likely to be small (example: number of child)
*integer values are likely to be small (example: number of child)
*integer values are allowed to be large (but rather seldom)
*integer values are allowed to be large (but rather seldom)
*integer statistical distribution is roughly an exponential decrease
*integer statistical distribution is roughly an exponential decrease
*encoding size is critical
*encoding size is critical
*speed for math operations on the values is not critical
*speed for math operations on the values is not critical

Version du 22 janvier 2014 à 09:15

Purpose

A Hufint is a variable-length-coded integer, with an optimal compression perspective.

Hufint stands for Huffman integer, or Huffman coded integer.

The contract of this coding tool is:

  • allow to encode any positive integer (including 0) a a set of bits
  • allow a non-ambiguous decoding algorithm
  • use less bits for smaller integer
  • allow non-ambiguous encoding/decoding for arrays of integers.

Hufint are used in the Matscape projects, wherever

  • integer values are likely to be small (example: number of child)
  • integer values are allowed to be large (but rather seldom)
  • integer statistical distribution is roughly an exponential decrease
  • encoding size is critical
  • speed for math operations on the values is not critical

Hufint base

The Hufint base is the slice size for the Hufint bitset representation.

Matscape uses Hufint-6 representations (Hufint base = 6), where

  • numbers from 0 to 2^5-1 (31) are represented on 6 bits
  • numbers from 2^5 (32) to 2^10-1 (1023) are represented on 12 bits
  • numbers from 2^10 (1024) to 2^15-1 (1023) are represented on 18 bits
  • etc...

Algorithm

The algorithm is rather starightforward but not documented here.

Two public functions (in a java lib) are defined:

int f(BitSet) /* conversion of bitSet into a positive integer - decoding */

BitSet f(int) /* converison of a positive integer into bitSet - encoding */