Data Link Error Detection / Correction Methods

Compression@

Errors introduced by communications faults, noise or other failures into valid data, especially compressed data were redundancy has been removed as much as possible, can be detected and/or corrected by introducing redundancy into the data stream.

Safety in Redundancy: More redundancy detects more errors, at the cost of more data transmitted. We could simply send each message 3 times, and discard any copy that disagrees with the other two. This is a simple example of a "perfect" code, although it is far from perfect, it is called that because it adds exactly enough redundancy to detect or correct some number of errors. In this case, not that many. Note that although it is unlikely, it IS possible that the two identical copies both happened to have the exact same error, and the third copy is the correct one. If each copy arrived different in some way, we might have detected two errors. Error detection and correction systems are rated by how much redundancy they cost, and how many errors they can detect or correct. In this example three "symbols" are used, each is the length of the original message. It can detect two errors, and correct one. We can do much better.

Parity: For example, if we send some specific sequence of ones and zeros, and then count the number of ones that we sent and send an extra 1 if that count is odd or an extra 0 if that count is even, then we have introduced a small amount of redundancy into the transmission. The extra bit is called the parity bit, it is even parity because it makes the total number of 1's in the transmission become an even number, and is a simple example of what is called an "extended" code. The receiver can then count up the number of 1 bits they recieved, perform the same calculation, and if the result does not match the extra bit we sent them, they will know that an error occurred. They have no way of knowing which bit was wrong, and if two bits were changed in value (a 1 to a 0 and a 0 to a 1) then the errors would pass undetected.

Cost / Benefit: Our first example, sending a message three times, costs a lot more than our second example of even parity, and doesn't do much more. A slightly more complex parity system will give us the same advantages of the 3x repetition and for 'x' bits, cost only a few extra bits.

If we divide our bits up into rows and columns, which is pretty normal anyway, we can calculate a parity for each row, and for each column. Column parity is commonly referred to as an LRC. When a single bit error occurs, we can tell which row it's in by checking the parity bit for that row, and the bit of that byte which is wrong will be shown by the incorrect parity of that column. For 8 bytes of data, the 64 bits have been protected against any 1 bit error, and a 2 bit error would be detectable, if not correctable. The cost is 8 column parity bits (one for each bit in a byte) and 9 row parity bits (one for each byte, and one more for the byte of column parity bits) for a total of 17 bits or 26% redundancy.

This would be written as [81,64,3] meaning that 81 bits are used to transmit 64 bits of data and changing any one bit in the data must change the extra error checking bits by at least 3. Note: Flipping 2 bits must cause the value to change by a value of at least three.

Binary Matrix: We can do better even than that, but to understand how, we need to start representing the calculation of the redundant bits via binary matrix multiplication which is done like this:

x   a b c
y * d e f
z   g h i
   -------
    X Y Z

X = (x & a) ^ (y & d) ^ (z & g)
Y = (x & b) ^ (y & e) ^ (z & h)
Z = (x & c) ^ (y & f) ^ (z & i)

Where each letter is a bit, x, y, and z are the input data, a thru i are the matrix bits defined by the error correcting code, and X, Y, and Z are the outputs. Instead of the standard multiplication and addition operators, we use binary operators: The "&" operator is a binary AND and the "^" is a binary XOR; if you aren't sure what those mean, review our digital logic tutorial, and/or the C language reference description of bitwise operators.

In general, each data bit on the left is ANDed with each bit in the matrix row to it's right, and then those results are XOR'd down with the results of the other rows to make the total at the bottom. An animation would make that so much more clear, please send us one?

A matrix can be used to show the simple parity example above. This is NOT the row/column parity we talked about above, just the single byte parity; the rows and columns here are of the matrix, not the data. To keep it simple, let us limit ourselves to just 3 bits of data. The Matrix is 3 rows of 4 columns; one row for each bit of data, and one column for each resulting bit of data and then the parity bit.

x   1 0 0 1
y * 0 1 0 1
z   0 0 1 1
    -------
    X Y Z P

X = x
Y = y
z = z
P = x ^ y ^ z  =  1 & ( x + y + z )  = is the total of the 1's odd?

As before, X is (x AND the top left bit of the matrix), XOR (y AND the left bit in the second row)... you should notice that all the rest of the bits in that left hand column are zero, which always returns a zero no matter what you AND it with, so in the end, X just equals x. The same thing happens with Y and Z. Our parity bit, P, ends up being x ^ y ^ z, which, it turns out is a 1 if there are an odd number of 1's in the data or a 0 if that count is even so that XYZP always has an even number of 1's. Lets show an example, and the operations we did to get the result.

1   1 0 0 1    1&1=1   1&0=0   1&0=0   1&1=1
0 * 0 1 0 1    0&0=0   0&1=0   0&0=0   0&1=0
1   0 0 1 1    1&0=0   1&0=0   1&1=1   1&1=1
    -------        -       -       -       -
    1 0 1 0        1^0^0=1 0^0^0=0 0^0^1=1 1^0^1=0

The advantage of this matrix system is that it makes it easy to visualize much more complex error detection and correction codes. For example, this matrix for a Hamming code will detect 2 or correct 1 error in 26 data bits while adding only 5 extra check bits. That is a 20% cost, far better than row and column parity, although it does require a bit more logic to encode or decode.

1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0
0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0
0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1
0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 1 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 0 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1 1 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1

Glossary:

Syndrome
The XOR of the expected error-checking bits with those actually received is called the syndrome of a received codeword.
Hamming distance
The minimum change in value of extra check bits that must occur when any one bit is changed in the data section. An x-error-correcting code must have a minimum distance of at least 2x+1.

See also:

Books

Patents

Scott Dattalo

How to measure linearity error.

1) Collect a whole bunch of data and record a) the voltage applied and b) and the converted digital value. You also should collect several samples at each voltage value and use the mean of the digital values to represent converted digital value. There's another test for measuring the spread.

2) Find the least square's curve fit of the data. In other words, find the line that best fits the data.

3) Find the sample that most deviates from this line.

The line found by the least square's curve fit is of the form:

   y = m*x + b

where
 x = true value of the input signal (the applied voltage)

 y = expected value when the true value is applied (the actual value is
     what you measure)

 m & b are the slope and intercept of the line and are calculated:


m = [s(x) * s(x * d) - s(x*x) s(d)] / den

b = [s(x) * s(d) - n * s(x * d) ] / den

den = s(x)*s(x) - n * s(x*x)

where,

d = digitized value
n = total number of samples

and
          n
        -----
        \
s( ) =  /     (   )
        -----
         i=1

Once you have m and b, you can perform step 3 by running the applied voltage through the line representing the data and comparing the expected value output with the value you actually measured. The Linearity Error is then:


L = max( ( d - y ) / x_fr ) * 100%

where x_fr is the full range of the data, or 4096 for a 12-bit A/D.

Like I said earlier, I've got an octave program to do this. If anyone wants it, drop me a line. You probably can find a lsf function in excel too.

See also:

See:

Archive:

Questions:

Interested: