Roman Black on 2001-04-10 02:56:18 AM had some really good points and presented some good ideas.

Most data compression and decompression algorithms don't use any multiplies or divides, so it's no big deal that PICs have slow multiplies.

The *fastest* data compression routine I've seen is RLE (which is useless on English text). The next fastest I've seen is LZRW1A: http://www.ross.net/compression/lzrw1a.html (both the compression and decompression source code are there). But it needs a block of "history" RAM when decompressing, and the bigger this block, the better the compression. Does the PIC have enough RAM for "adequate" compression?

Some other "small" data compression ideas were discussed on Deja (now Google Groups) under the subject headings "compression on a PDA" and "Compressing the Bible for a PDA":
http://groups.google.com/groups?q=%22Compressing+the+Bible+for+a+PDA%22&hl=en&lr=&safe=off&btnG=Google+Search&meta=site%3Dgroups%26group%3Dcomp.compression.*
http://groups.google.com/groups?q=%22compression+on+a+PDA%22&hl=en&lr=&safe=off&btnG=Google+Search&meta=site%3Dgroups%26group%3Dcomp.compression.*
(be sure to view the whole threads).

I hear that spell checkers compress their lists of spelling words by storing, for each word: a few bits indicating how many letters it has in common with the start of the previous word (zero for the first word, for the first word starting with B, etc.); a few bits indicating how far to increment the previous word's letter at that position to get this word's next letter (omitted when the previous word has no letter at that position); followed by the rest of the word. For example,

  ex fear feared fearing fed feed

(26 bytes, not counting end-of-word indicators) would be encoded

  <0> e x  <0> <1> e a r  <4> e d  <4> <4> n g  <2> <3>  <2> <1> d

(If you just put the letters into bytes and the numbers into nybbles, rather than something more efficient, this gives 15 bytes, not counting end-of-word indicators.)
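The spell-checker scheme above can be sketched in a few lines of Python (hypothetical code; the function names are mine and not from any particular spell checker, and the token stream here mixes small integers with literal letters where a real implementation would pack the integers into a few bits each):

```python
# "Front coding" of a sorted word list: each word stores how many letters
# it shares with the previous word, optionally how far to increment the
# previous word's next letter, then the remaining letters.

def front_encode(words):
    tokens, prev = [], ""
    for w in words:
        n = 0                            # letters shared with start of prev
        while n < min(len(prev), len(w)) and prev[n] == w[n]:
            n += 1
        tokens.append(n)
        rest = w[n:]
        if n < len(prev):
            # prev has a letter at position n; in sorted order this word
            # must continue past the shared prefix, so store the increment
            tokens.append(ord(rest[0]) - ord(prev[n]))
            rest = rest[1:]
        tokens.extend(rest)
        prev = w
    return tokens

def front_decode(tokens):
    words, prev, i = [], "", 0
    while i < len(tokens):
        n = tokens[i]
        i += 1
        w = prev[:n]
        if n < len(prev):                # an increment token follows
            w += chr(ord(prev[n]) + tokens[i])
            i += 1
        while i < len(tokens) and isinstance(tokens[i], str):
            w += tokens[i]               # literal letters end the word
            i += 1
        words.append(w)
        prev = w
    return words
```

Encoding the six example words reproduces the token stream quoted above (ten letters plus ten small numbers, i.e. the quoted 15 bytes if each number fits in a nybble), and decoding round-trips back to the word list.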
I was playing with a very simple compression algorithm based on the letter frequencies in
http://www.piclist.com/techref/method/compress/embedded.htm and
http://www.piclist.com/techref/method/compress/etxtfreq.htm
(which really needs an "up" link to http://www.piclist.com/techref/method/compress.htm ) that decodes like this:

  100: space
  101: e
  110: t
  111: n
  0100: r
  0101: o
  0110: a
  0111: i
  00xxxx_xxxx: all other (8-bit) values

I think I could decode this with a pretty compact program on a PIC. I think I could even *encode* this with a small subroutine on the PIC. It has the feature that I can encode *any* sequence of raw bytes, so I can send funky control codes.

If I didn't need all possible 8-bit values, I could combine this with Roman Black's idea and further decode some of those "unused" 10-bit symbols (obviously I don't need the ones for "_etnroai") into letter pairs, triplets, or entire words.

True Morse code compresses almost as well as the above scheme. If I use "1" for dot, "01" for dash, and "00" for end-of-letter, I get

  00: space
  100: e
  0100: t
  01100: n
  101100: r
  01010100: o
  10100: a
  1100: i
  11100: s
  ...

for the most common letters.

-- 
David Cary

-- 
http://www.piclist.com hint: To leave the PICList
mailto:piclist-unsubscribe-request@mitvma.mit.edu
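Both bit codes described in the post are easy to sketch in Python (hypothetical code, just to check the bit patterns; the symbol tables are copied straight from the post, everything else is mine):

```python
# The 8-symbol prefix code from the post: 3-bit codes for the four most
# common characters, 4-bit codes for the next four, and a 2-bit "00"
# escape followed by the raw 8-bit byte for everything else.
PREFIX = {' ': '100', 'e': '101', 't': '110', 'n': '111',
          'r': '0100', 'o': '0101', 'a': '0110', 'i': '0111'}

def prefix_encode(text):
    return ''.join(PREFIX.get(ch, '00' + format(ord(ch), '08b'))
                   for ch in text)

def prefix_decode(bits):
    rev = {v: k for k, v in PREFIX.items()}
    out, i = [], 0
    while i < len(bits):
        if bits.startswith('00', i):     # escape: next 8 bits are a raw byte
            out.append(chr(int(bits[i + 2:i + 10], 2)))
            i += 10
        elif bits[i] == '1':             # "1xx": 3-bit code
            out.append(rev[bits[i:i + 3]])
            i += 3
        else:                            # "01xx": 4-bit code
            out.append(rev[bits[i:i + 4]])
            i += 4
    return ''.join(out)

# The Morse-style code: "1" for dot, "01" for dash, "00" for
# end-of-letter; a space is simply an empty letter (a bare "00").
MORSE = {'e': '.', 't': '-', 'n': '-.', 'r': '.-.', 'o': '---',
         'a': '.-', 'i': '..', 's': '...'}

def morse_encode(text):
    out = []
    for ch in text:
        for mark in MORSE.get(ch, ''):
            out.append('1' if mark == '.' else '01')
        out.append('00')
    return ''.join(out)
```

The prefix code round-trips any byte sequence through `prefix_decode(prefix_encode(...))`, and `morse_encode` reproduces the bit strings listed in the post (e.g. "e" gives 100 and "o" gives 01010100).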