[OT]:: Shonky compression systems <- Re: [EE]:: Octarine hole - you have been warned - vast classic computing and electronics information store.

On 15 August 2013 07:50, John Ferrell wrote:

> My experience with OCR is really old but I find it hard to believe
> that substitution rates are not considered in modern equipment.

The problem that was mentioned is not substitution of characters in the OCR sense but substitution of graphical data blocks as a means of achieving high compression rates. A block of "graphics" is examined and found to have a close to 100% match, by whatever metric is used, with another block of "graphics", so the 1st block is assigned a name and the second and subsequent occurrences are assigned pointers to that name.

So, ...

The graphics around a noisy character "6" are scanned and assigned the name "Fred". The graphics are sent as is (or with some form of low-loss or lossless compression), but the name "Fred" is noted by transmitter and receiver.

Then the graphics around a noisy character "8" are scanned and found to also adequately match the graphics block known as "Fred" (even though they shouldn't), so instead of the data being sent, "use Fred" is sent.

The receiver now shows a noisy 6 where the original had a noisy 8. "They will not know." But, alas, they do.

No OCR has occurred - just graphics block recognition. You could as easily add or remove full stops, or the dots above i's or j's. In a serif'd font you will not swap 1 and i, but in a sans-serif'd font you may.

___________________

Vaguely related:

Systems dealing with numbers with fractional parts which must ALWAYS be able to be "recombined" correctly must work in the number base of the target numbers or a multiple thereof, or take special care in dealing with the fractional parts. [That's poorly put and not even quite right, but let the example provide the means to clarify it.]

The calculation (1 / 3) x 3 'comes to grief' if, e.g., binary is used without special attention to what is being done.
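A minimal Python sketch of the (1 / 3) x 3 point, under stated assumptions. The 16-bit binary fixed-point format (`FRAC_BITS`, `ONE`) is hypothetical and chosen only for illustration. Truncating fixed point is used because IEEE double-precision happens to round (1 / 3) x 3 back to exactly 1.0, which would hide the effect. `Fraction` and `Decimal` stand in for "taking special care" and "working in the base of the target numbers".

```python
from decimal import Decimal
from fractions import Fraction

# Assumed format for illustration: binary fixed point, 16 fractional bits.
FRAC_BITS = 16
ONE = 1 << FRAC_BITS                 # the value 1.0 in this format (65536)

# 1/3 cannot be represented exactly in binary, so it gets truncated.
third_fixed = ONE // 3               # 21845
recombined_fixed = third_fixed * 3   # 65535: one unit short of ONE

# "Special care": keep the value as an exact rational instead.
exact = Fraction(1, 3) * 3

# The same trouble with money: 0.10 and 0.20 are not exact in binary,
# but they are exact in a decimal representation.
float_sum = 0.1 + 0.2
decimal_sum = Decimal("0.10") + Decimal("0.20")

print(recombined_fixed == ONE)          # False
print(exact == 1)                       # True
print(float_sum == 0.3)                 # False
print(decimal_sum == Decimal("0.30"))   # True
```

Note that decimal does not rescue 1/3 either, since 1/3 is not exact in base 10. Only the rational form recombines exactly, which is roughly the "not even quite right" caveat above.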
Do this sort of thing a number of times throughout a number-processing system dealing with money and people will start complaining.

Long ago I heard the CEO of a large organisation complaining that customers were complaining because their invoices did not quite add up correctly. Rather than fix a clearly undesirable customer data presentation system, he was 'kicking against the pricks' and trying to insist that people should not mind if the company was apparently 'a cent or few' out in their totals.

In fact the totals were correct to the nearest cent! What was happening was that the subtotals were also rounded (as they must be), each correctly to the nearest cent, but the subsequent addition of tax (GST) to the end result spread the results further, so that the sum of the displayed parts did not necessarily match the correctly calculated total, because the rounding errors themselves could not be seen.

Fix it they had to and fix it they did. I'm not sure quite how, BUT it was probably done by incorrectly adjusting some of the subtotals to fudge the result :-). I.e. an account with some incorrectly arrived-at subtotals gave a more apparently correct result.

Russell
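The invoice effect described above can be reproduced in a few lines of Python. This is only a sketch: the line amounts and the 15% GST rate are made up for illustration, and the final "adjustment" step is one plausible fudge of the kind guessed at above, not what that company actually did.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")
GST_RATE = Decimal("0.15")   # assumed rate, for illustration only


def to_cents(amount):
    """Round a Decimal amount to the nearest cent, halves rounding up."""
    return amount.quantize(CENT, rounding=ROUND_HALF_UP)


# Hypothetical invoice: three lines of 1.03 each, before tax.
net_lines = [Decimal("1.03")] * 3

# Each displayed line is rounded on its own: 1.1845 -> 1.18.
gross_lines = [to_cents(n * (1 + GST_RATE)) for n in net_lines]
sum_of_rounded = sum(gross_lines)

# The total is computed from unrounded values and rounded once: 3.5535 -> 3.55.
rounded_total = to_cents(sum(net_lines) * (1 + GST_RATE))

print(sum_of_rounded)   # 3.54
print(rounded_total)    # 3.55

# One possible fudge: push the leftover cent onto the last line so the
# displayed lines add up to the displayed total.
adjusted = list(gross_lines)
adjusted[-1] += rounded_total - sum_of_rounded
print(sum(adjusted) == rounded_total)   # True
```

Every figure here is correct to the nearest cent, yet the customer sees lines that sum to 3.54 against a total of 3.55. After the adjustment the last line reads 1.19, which is "wrong" on its own but makes the invoice add up.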