Download text file
$if 0
 After much searching and testing these HUFFMAN string (de)compression
 functions are about the smallest, fastest I can come up with. I've done
 away with the "hashing" functions the routines I've found used and just
 created the bit masks mathmatically. They work the same and look the same
 so "if it looks like a duck and quacks like a duck then....".

 Anyhow, what Huffman compression does is counts all the incidences of byte
 values used in a string (file), then stores the used vales in an array and
 sorts that array in decending order. (highest count first) Then each
 character is assigned a bit mask. The bit masks are then layed into a buffer
 and viola! compressed data. Maybe! There is an outside chance that the data
 won't comperss because there are too many varied and oft used odd-ball
 characters but... this would be rare for text in a single language.

 So lets take "PETER PIPER PICKED A PECK OF PICKLED PEPPERS." as a test
 string. After the count, sort and bit map create you get a set of arrays
 that look like this

   "P"  CharVal(01) = 80  : Count%(01) =  9  : BitCode??(1) =     "00"
   "E"  CharVal(02) = 69  : Count%(02) =  8  : BitCode??(2) =    "010"
   " "  CharVal(02) = 32  : Count%(02) =  7  : BitCode??(2) =    "011"
   "C"  CharVal(02) = 67  : Count%(02) =  3  : BitCode??(2) =   "1000"
   "R"  CharVal(02) = 82  : Count%(02) =  3  : BitCode??(2) =   "1001"
   "I"  CharVal(02) = 73  : Count%(02) =  3  : BitCode??(2) =   "1010"
   "K"  CharVal(02) = 75  : Count%(02) =  3  : BitCode??(2) =   "1011"
   "D"  CharVal(02) = 68  : Count%(02) =  2  : BitCode??(2) =  "11000"
   "F"  CharVal(02) = 70  : Count%(02) =  1  : BitCode??(2) =  "11001"
   "A"  CharVal(02) = 65  : Count%(02) =  1  : BitCode??(2) =  "11010"
   "."  CharVal(02) = 46  : Count%(02) =  1  : BitCode??(2) =  "11011"
   "O"  CharVal(02) = 79  : Count%(02) =  1  : BitCode??(2) = "111000"
   "L"  CharVal(02) = 76  : Count%(02) =  1  : BitCode??(2) = "111001"
   "S"  CharVal(02) = 83  : Count%(02) =  1  : BitCode??(2) = "111010"
   "T"  CharVal(02) = 84  : Count%(02) =  1  : BitCode??(2) = "111011"

  Packed up, "PETER" looks like this "000101110110101001" and takes up
  18 bits or 2.25 bytes vs the original 5 bytes.

  Unpacking requires that you keep thinking in BITS left to right (that's
  backwards to the computer remember). When you read in the first bit and
  it is OFF then you're dealing with either the first, second or third
  character in the list ("P", "E" or " " ). If the next bit is, again
  OFF then the coded letter is "P" and so on for the next two bits if
  it is ON.

  If that frist bit is ON then you have to keep checking until you get an
  OFF bit and adding 4 for each ON bit you encounter. When you hit the OFF
  bit the next two bits add to your counter to determine which character
  in the list you're to use.
         ÚÄÄÄÄÄÄ + 4
         ³ÚÄÄÄÄÄ + 4
         ³³ÚÄÄÄÄ + 4
         ³³³ÚÄÄÄÄÄÄÄÄÄ skip this bit
         ³³³³ÚÄÄ + 2
         ³³³³³ÚÄ + 1
  "T" = "111011"
                 total = 15  so CharVal?(15) = 84

  Here's the whole word "PETER" layed out with count and logic.

   ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ the first bit is OFF so
   ³ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ   the next bit is OFF too so ============ "P"
   ³³ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ the first bits OFF so
   ³³³ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ   the next bit is ON so at least 2
   ³³³³ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ   the next bit is OFF so it's a 2 ======= "E"
   ³³³³³ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ the first bit is ON so at least 4
   ³³³³³³ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄ   the next bit is ON so +4 for  8
   ³³³³³³³ÚÄÄÄÄÄÄÄÄÄÄÄÄÄ   the next bit is ON so +4 for 12
   ³³³³³³³³ÚÄÄÄÄÄÄÄÄÄÄÄÄ   this bit is OFF so end of 4's
   ³³³³³³³³³ÚÄÄÄÄÄÄÄÄÄÄÄ     this bit is ON so   +2 for 14
   ³³³³³³³³³³ÚÄÄÄÄÄÄÄÄÄÄ     this bit is ON so   +1 for 15 ======= "T"
   ³³³³³³³³³³³ÚÄÄÄÄÄÄÄÄÄ the first bits OFF so
   ³³³³³³³³³³³³ÚÄÄÄÄÄÄÄÄ   the next bit is ON so at least 2
   ³³³³³³³³³³³³³ÚÄÄÄÄÄÄÄ   the next bit is OFF so it's a 2 ======= "E"
   ³³³³³³³³³³³³³³ÚÄÄÄÄÄÄ this first bit is ON so at least 4
   ³³³³³³³³³³³³³³³ÚÄÄÄÄÄ   the next bit is OFF so end of 4's
   ³³³³³³³³³³³³³³³³ÚÄÄÄÄ   this bit is OFF so add 0
   ³³³³³³³³³³³³³³³³³ÚÄÄÄ   this bit is ON so +1 for 5 ============ "R"
   000101110110101001


  Credit is due to that unknown individual who provided the HUFFMAN.TIP
  file that came with one of the earlier versions of PowerBASIC. From his
  or her notes I was able to decifer the coding system and create these
  functions. And so I now pass them onto you for your use.

  Don Schullian
  d83@DASoftVSS.com
$endif

DECLARE FUNCTION fHuffPack   (O AS STRING) AS STRING
DECLARE FUNCTION fHuffUnPack (P AS STRING) AS STRING

FUNCTION PBmain ()

  LOCAL Orig   AS STRING
  LOCAL Packed AS STRING
  LOCAL UnPked AS STRING

  CLS
  Orig = "PETER PIPER PICKED A PECK OF PICKLED PEPPERS."

  Packed = fHuffPack( Orig )
  PRINT FORMAT$( LEN(Orig  ), "Len = ## : >>" ); Orig  ; "<<"
  PRINT FORMAT$( LEN(Packed), "Len = ## : >>" ); Packed; "<<"

  PRINT STRING$(79,45)

  UnPked = fHuffUnPack( Packed )
  PRINT "Orig: >>"; Orig  ; "<<"
  PRINT "UnPk: >>"; UnPked; "<<"
  WaitKey$

END FUNCTION

'=========================================================================
'==  fHuffPack( InString )
'==
'==   PARAMS: InString = the data to be packed
'==  RETURNS: Packed data string
'==                  2 bytes   = original length
'==                  1 byte    = number of unique characters
'==                  x bytes   = unique characters
'==                  remainder = encoded data
'==           NULL if packed version of InString would exceed the
'==                 string size limitations set with STRING
'=========================================================================

FUNCTION fHuffPack( InString AS STRING ) AS STRING

  DIM   Count  (256) AS LONG                             ' count of all chars
  DIM   CharVal(256) AS LONG                             ' assigned values
  DIM   BitCode(256) AS LONG                             ' bit values
  LOCAL Packed       AS STRING                           ' working string buffer
  LOCAL InLen        AS LONG                             ' length of incoming
  LOCAL CharCount    AS LONG                             ' # of used chars
  LOCAL I_ptr        AS BYTE PTR                         ' ptr to incoming
  LOCAL P_ptr        AS BYTE PTR                         ' ptr to working buffer
  LOCAL BitBuf       AS LONG                             ' working buffer of bits
  LOCAL BitCount     AS LONG                             ' current bit position
  LOCAL X            AS LONG                             ' loop counter
  LOCAL B            AS LONG                             ' counter
  LOCAL F            AS LONG                             ' counter
  LOCAL P            AS LONG                             ' counter
  LOCAL S            AS LONG                             ' counter
                                                         '
  I_ptr  = STRPTR( InString )                            ' set incoming pointer
  InLen = LEN( InString )                                ' set incoming length
                                                         '
  FOR X = 1 TO InLen                                     ' count each char in
    INCR Count(@I_ptr)                                   '  the incoming string
    INCR I_ptr                                           '
  NEXT                                                   '
  I_ptr = I_ptr - InLen                                  ' reset pointer to start
                                                         '
  FOR X = 1 TO 256                                       ' assign and count the #
    IF Count(X) > 0 THEN                                 '  of used characters
      INCR CharCount                                     '
      CharVal(X) = X                                     '
    END IF                                               '
  NEXT                                                   '
                                                         ' gets the char values in
  ARRAY SORT Count(), TAGARRAY CharVal(), DESCEND        ' highest first order
  ARRAY INSERT CharVal(0)                                ' (1) is now highest
                                                         '
  BitCode(2) = 2                                         ' create a set of bit
  BitCode(3) = 3                                         ' values
  F          = 8                                         '  1 =    "00"
  FOR X = 4 TO CharCount                                 '  2 =   "010"
    BitCode(X) = F + B                                   '  3 =   "011"
    IF B < 3 THEN                                        '  4 =  "1000"
        INCR B                                           '  5 =  "1001"
      ELSE                                               '  6 =  "1010"
        B = F                                            '  7 =  "1011"
        SHIFT LEFT F, 1                                  '  8 = "11000"
        F = ( F OR B )                                   ' 12 = "111000"
        B = 0                                            '
    END IF                                               '
  NEXT                                                   '
                                                         '
  B = CharCount + 3                                      ' count number of bytes
  FOR X = 1 TO InLen                                     ' required to hold the
    ARRAY SCAN CharVal(1), = @I_ptr, TO P                ' 3 byte header +
    INCR I_ptr                                           ' number of characters +
    SELECT CASE P                                        ' the packed data
      CASE 1    : BitCount = BitCount + 2                '  (see list above for
      CASE 2, 3 : BitCount = BitCount + 3                '     bit counts)
      CASE ELSE : BitCount = BitCount + (P \ 4) + 3      '
    END SELECT                                           '
    IF BitCount => 32000 THEN                            ' lay off 4000 bytes
      B = B + 4000                                       '
      BitCount = BitCount - 32000                        '
    END IF                                               '
  NEXT                                                   '
  IF BitCount > 0 THEN                                   ' clean-up leftover
    BitCount = BitCount + 7                              '  bits + whatever is
    B = B + ( BitCount \ 8 )                             '  needed to get a full
  END IF                                                 '  bit
  I_ptr = I_ptr - InLen                                  ' reset incoming pointer
  Packed = STRING$( B, 0 )                               ' create working buffer
  MID$( Packed, 1, 2 ) = MKI$(InLen)                     ' put len of original
  P_ptr  = STRPTR( Packed ) + 2                          ' set the working pointer
  @P_ptr = CharCount                                     ' byte 3 = # of chars
  FOR X = 1 TO CharCount                                 ' bytes 4 -> are the
    INCR P_ptr                                           '  char values that
    @P_ptr = CharVal(X)                                  '  correspond to the bit
  NEXT                                                   '  codes
                                                         '
  BitCount = 0                                           ' start processing bytes
  FOR X = 1 TO InLen                                     '
    ARRAY SCAN CharVal(1), = @I_ptr, TO P                ' find array pos of next
    INCR I_ptr                                           '   char / bump pointer
    SELECT CASE P                                        ' set # of bits required
      CASE 1    : S = 2                                  '  to hold the bit-code
      CASE 2, 3 : S = 3                                  '
      CASE ELSE : S = (P \ 4) + 3                        '
    END SELECT                                           '
    SHIFT LEFT BitBuf, S                                 ' shift buffer # new bits
    BitBuf = ( BitBuf OR BitCode(P) )                    ' OR new bits into buffer
    BitCount = BitCount + S                              ' incr bit counter
    IF BitCount => 16 THEN GOSUB fHuffPack1              ' if the buffer is full
  NEXT                                                   '
  IF BitCount > 0 THEN                                   ' clean-up leftover bits
    IF BitCount > 8 THEN GOSUB fHuffPack1                '   more than 1 byte
    IF BitCount > 0 THEN                                 '   still not done
      SHIFT LEFT BitBuf, 8 - BitCount                    '   move bits to left
      INCR P_ptr                                         '   next/last pos in buf
      @P_ptr = ( BitBuf AND 255 )                        '   put it there, dude!
    END IF                                               '
  END IF                                                 '
                                                         '
  FUNCTION = Packed                                      ' RETURN packed data
  EXIT FUNCTION                                          '
  '======================================================'========================
  fHuffPack1:                                            ' remove full bytes from
    DO                                                   '  the bit buffer and put
      BitCount = BitCount - 8                            '  them in the work buf
      ROTATE RIGHT BitBuf, BitCount                      ' move the left most
      INCR P_ptr                                         '  bit to the right
      @P_ptr = ( BitBuf AND 255 )                        '  stuff it into the
      SHIFT RIGHT BitBuf, 8                              ' strip right most bits
      ROTATE LEFT BitBuf, BitCount + 8                   ' move bits back in order
    LOOP UNTIL BitCount < 8                              '
  RETURN                                                 '
                                                         '
END FUNCTION

'=========================================================================
'==  fHuffUnPack( Packed )
'==
'==   PARAMS: Packed
'==             SEE: fHuffPack for header details
'==  RETURNS: Unpacked data
'==     NOTE: There is NO error checking etc. as it is assumed that what
'==           you send in Packed was created with fHuffPack
'=========================================================================

FUNCTION fHuffUnPack( Packed AS STRING ) AS STRING

  DIM   CharVal(256) AS LONG                             ' char values from
  LOCAL UnPacked     AS STRING                           ' working buffer
  LOCAL InLen        AS LONG                             ' length of incoming
  LOCAL OutLen       AS LONG                             ' final length of out
  LOCAL I_ptr        AS BYTE PTR                         ' pointer to incoming
  LOCAL O_ptr        AS BYTE PTR                         ' pointer to outgoing
  LOCAL BitBuf       AS LONG                             ' buffer to work bits
  LOCAL FirstBit     AS LONG                             ' current bit position
  LOCAL S            AS LONG                             ' counter
  LOCAL P            AS LONG                             ' counter
                                                         '
  OutLen   = CVI( Packed )                               ' final returned len
  UnPacked = STRING$( OutLen + 4,0 )                     ' create working  buf
  O_ptr    = STRPTR( UnPacked )                          ' set working  ptr
  I_ptr    = STRPTR( Packed   ) + 2                      ' set incoming  ptr
  S        = @I_ptr                                      ' # of chars used
  InLen    = LEN( Packed ) - ( S + 2 )                   ' # of packed bytes
  FirstBit = -1                                          ' starts down here
                                                         '
  FOR P = 1 TO S                                         ' load char values
    INCR I_ptr                                           '  into array
    CharVal(P) = @I_ptr                                  '
  NEXT                                                   '
                                                         '
  DO                                                     ' start processing
    WHILE ( FirstBit < 15 ) AND ( InLen > 0 )            '  if bit buffer empty
      DECR InLen                                         '  knock off a byte
      INCR I_ptr                                         '  next byte in incoming
      FirstBit = FirstBit + 8                            '  loading 8 bits
      SHIFT LEFT BitBuf, 8                               '  make room in bit buf
      BitBuf = ( BitBuf OR @I_ptr )                      '  stuff new byte
    WEND                                                 '
    IF BIT( BitBuf, FirstBit ) = 0 THEN                  ' if the bit is 0
        DECR FirstBit                                    '  next bit position
        IF BIT(BitBuf,FirstBit) = 0 THEN                 '  if still 0 then
            P = 1                                        '   this is code "00"
          ELSE                                           '  the bit is ON
            DECR FirstBit                                '   next bit position
            P = 2 + BIT(BitBuf,FirstBit)                 '   "01" or "11"
        END IF                                           '
      ELSE                                               ' else the bit is 1
        P = 0                                            '  clear code value
        DO                                               '  check next bits
          P = P + 4                                      '   each ON = 4
          DECR FirstBit                                  '
        LOOP UNTIL BIT(BitBuf,FirstBit) = 0              '  until we hit an OFF
        FOR S = 2 TO 1 STEP -1                           '  skip next bit then
          DECR FirstBit                                  '
          IF BIT(BitBuf,FirstBit) THEN P = P + S         '   add 2 and/or 1
        NEXT                                             '
    END IF                                               '
    DECR FirstBit                                        ' next bit position
    @O_ptr = CharVal(P)                                  ' set the decoded char
    INCR O_ptr                                           ' move working buf ptr
  LOOP UNTIL ( InLen = 0 ) AND ( FirstBit < 0 )          ' got to get them all
                                                         '
  FUNCTION = LEFT$( UnPacked, OutLen )                   ' RETURN truncated
                                                         '
END FUNCTION