> > If you just fed random numbers into your system as described you
> > would get bad data slipping through every 1 in 256 times. As your
> > checksum could hold any of 256 combinations and one of them must be
> > correct. Because of this I have used 16bit checksums to cut down on
> > the chances of this happening.
>
>         I wonder how this method would do for a specfic bit-error rate, say 1
> error in 1000 bits (10E-3)? I'm actually using more than 8 bits because
> I look for four good frames in a row to get frame sync and so random
> data would be unlikly to cause it to acquire frame sync. If one frame is
> bad after frame sync is aquired, that frame's data is discarded but it
> doesn't give up on frame sync until it has 4 bad frames.

If your data rate is one error per 1000 bits, and all bit errors are
independent (i.e. every bit has a 0.001 chance of being bad, regardless
of whether any other bits are bad), then the likelihood of a wrong packet
passing a checksum/xorsum [assuming 56-bit data, 8-bit check] will be
about equal to the likelihood that at least one bit position will have two
faults out of eight.  There are 56*8 ways that this can happen, and the
likelihood of any particular one of them occuring is (literally) one in a
million, so the likelihood of 'any' of them occuring is about 450/million,
or 1 in 2200.

With an eight-bit CRC, using error *detection*, there must be at least
four bits in error for a bad packet to slip through.  The probability of
that happening is about 1 in 1.5 million.  Since a CRC will detect the
majority of 4 bit errors (albeit not all) the likelihood of a bad packet
slipping through are reduced even further.

If the application requires error *correction*, then any packet which
contains a single bit error will be correctable, those containing
double-bit errors will always be flagged as invalid, and those containing
triple-bit errors may be misinterpreted.  This gives us approximately...

1 in 500 probability of an unrecoverable packet (double-bit failure or
worse)
1 in 24,000 probability of a potential falsely-recognized packet (triple
bit failure or worse).


If the error rate on the input data stream is 1 in 1000 bits, then use of
anything less than an 8-bit CRC (not a mere checksum) is foolish if the
data is of any importance.  A CRC will also permit probable reconstruction
of damaged packets, assuming all errors to be independent, albeit with a
higher risk of regarding a 'bad' packet as good.  If there is any
likelihood of correlated bit errors (e.g. modem line noise) then unless
there are other mechanisms for detecting garbled transmissions (e.g.
framing errors or protocol violations) the dominant risk factor may be
that of a "random" packet (large numbers of random bits).  CRC's,
checksums, and xor'sums all do about equally well in this case.  If the
data contains eight or fewer consecutive bits which have been randomized,
all three methods are guaranteed to detect the error, but if nine or
more consecutive bits have been randomized, any method will have a 1 in
256 chance of passing a bad packet.

> > I would be more inclined to XOR all the numbers in the frame together
> > but otherwise I have used a 16bit version of what you show above.

There are pros and cons to checksum versus XORsum.  Both methods are
substantially inferior to CRC at dealing with random errors, and all three
methods fare equally against cluster errors.

>         It seems like adding would be better than XORing because I'm
> "spreading" the information within the 8 bits by the carries

The spreading of information via carries helps *a little*, but not
much.  A CRC is much more effective at spreading information.

> > I have also heard that CRC is more robust. It is supposed to be less
> > susceptible to added leading zeros on your message. Though this would
> > not apply in your fixed message length case. But as far as hard
> > facts regarding how sensitive CRC is at detecting errors - I am as
> > interested as you - If anyone out there knows this I would like to
> > hear.

Essentially, every bit in a CRC'd message will change about half the bits
in the CRC (though each bit will change a different set of CRC bits).  In
a good CRC, every bit change will change an *ODD* number of bits in the
CRC; consequently, changing an odd number of bits will alter the parity of
the message and be detected.  Further, because every bit in the message
will change more than one bit in the CRC (actually, at least three bits)
and no two bits in the message alter the same set of CRC bits, it's also
impossible for any double-bit error to escape detection.  CRC's are truly
an excellent method of error detection and on a PIC they can be coded
quite efficiently.

BTW, you may sometimes see reference to "Fletcher's Checksum".  It's
computed as:

csum1 = csum2 = 0

for each byte of message:
  csum1 = (csum1 + data) mod 255
  csum2 = (csum2 + csum1) mod 255

at end:
  send csum1 and csum2

Fletcher's checksum does look appealing, and would appear at first glance
to be robust.  In fact, however, it has a number of weaknesses:

[1] Because 255 equals 3*5*17, every third, fifth, and seventeenth byte
from the end of the packet will be checksummed weakly by csum2 (i.e. may
be altered to one or more alternative values without altering csum2).  For
example, if the sixth-to-last byte (6 is a multiple of 3) is supposed to
be a 5, then adding 85 or 170 (85 == 255/3) will not alter csum2.

[2] For packets of length 255 or greater, the 255th, 510th, etc. bytes
from the end will have no effect whatsoever on csum2 [this is just an
extension of problem 1]

[3] Problem [1] may be solved by using 251 (a prime number) as the modulus
instead of 255.  In fact, for packets of length 250 or less, Fletcher's
checksum (mod 251) isn't all that bad.  Even that, however, still has the
weakness that any data byte of 0, 1, 2, 3, or 4 may be substituted for
251, 252, 253, 254, or 255 without altering the checksum; this is perhaps
not as severe a weakness as with modulus 255 where a byte of 00 and FF
will checksum identically, but it's still not good.

If 16 bits are available for a check field (as Fletcher's checksum would
require) a CRC-16 is a much better choice.  There is, however, from what
I've read at least one telecom standard that uses Fletcher's (mod 255 no
less!) so you should be aware of it.