On Wed, Oct 28, 2009 at 10:46 PM, Dwayne Reid wrote:
> I recall recent discussion (and past discussions) about the problem
> of multiple writes causing eventual read failures of static eeprom
> values (static as in they don't change often or at all).

If the data doesn't change, don't initiate an EEPROM write. Only write it on change. This should eliminate this concern.

However, in high reliability scenarios, five copies is perhaps overkill, and the way you are doing the write may be prone to problems related to power-off conditions (i.e., since all five bytes are being written, you may run into issues regarding the validity of the data if a write is in progress when the unit loses power). You must separate your writes in both time and space - the copies should not be immediately adjacent, and the writes should not occur at the same time.

Further, you would be better off adopting a CRC and a counter for each redundant value (or redundant datablock). This gives you confidence that the value is correct (error checking), and that it's the latest value stored (coherence).

For example: in one high reliability, safety critical industry the most critical information is kept in three locations, and everything else is stored in only two locations. Each time the value is changed the counter is incremented, the CRC is calculated over both the value and the counter, and the first redundant copy is written. If anything fails during this write, the second copy still has the older value, and the CRC ensures that the partially written first copy won't be used. Once the first copy is finished writing, a routine verifies it. Then the counter is incremented, the CRC recalculated, and the second copy is written and verified. Again, if anything fails during this write then the first copy is known good, and the counter verifies that it's the latest. If a third copy is made, then the same process happens all over again.

On boot up both (or all three) values are checked to see if the CRC is ok.
For those items where the CRC is ok, the counters are checked to see which one is the latest, and that value is used.

During normal operation, another periodic routine goes over all the EEPROM, checking CRCs and comparing values in redundant copies. It simply sets a flag if something goes wrong, and the module may report the error for user replacement before the other locations go bad. Depending on the project requirements it may either 1) not attempt to 'fix' or re-write anything, merely reporting the inconsistencies it finds, or 2) if a CRC is bad, or values don't match, take the latest good value and attempt a re-write of the bad copy. Option 2 is harder than option 1, because you then have to store more information in EEPROM to limit the number of times you attempt to re-write the data. Further, if the EEPROM is going bad it's often better to baby it and avoid writes, so fixing it may actually exacerbate the problem.

This removes most problems with power down, bad EEPROM cells, etc. It doesn't really perform EEPROM wear leveling, which is something you must consider if you believe you're going to be doing more than 1k writes to any given cell in the EEPROM over the life of the unit. Even though the EEPROM is rated to 10k, 100k, or more, that's an MTBF - statistically calculated - and any one cell can easily fail well before then and still be within the statistical curve of its rating. Wear leveling needs a bit more thought put into it with regard to infrequently changing values vs frequently changing values, and whether redundant copies are actually spaced apart in EEPROM.

Simply copying the values 5 times and checking for correlation is a very poor method for error and coherence checking. CRC for error checking, and counters for coherence, are a _significantly_ better option.
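
To make the CRC-plus-counter scheme above concrete, here is a rough host-side simulation in Python (not PIC code). It is only a sketch under assumptions of my own: the record layout (1-byte value, 2-byte counter, 2-byte CRC), the copy addresses, the CRC-16 from Python's binascii.crc_hqx, the FALLBACK_VALUE default returned when no copy passes its CRC, and the fail_after parameter used to fake a power loss are all hypothetical. Counter wraparound is not handled.

```python
import binascii
import struct

# All names, addresses and sizes here are illustrative assumptions.
COPY_ADDRS = (0x00, 0x40, 0x80)   # three copies, spaced apart in the EEPROM
RECORD_SIZE = 5                   # 1-byte value, 2-byte counter, 2-byte CRC
FALLBACK_VALUE = 0x55             # hypothetical hard-coded default


def crc16(data):
    """CRC-16 (CCITT polynomial, initial value 0xFFFF)."""
    return binascii.crc_hqx(data, 0xFFFF)


def pack_record(value, counter):
    """Build one copy: value and counter, then a CRC over both."""
    body = struct.pack(">BH", value, counter)
    return body + struct.pack(">H", crc16(body))


def read_record(eeprom, addr):
    """Return (value, counter) if the copy's CRC checks out, else None."""
    raw = bytes(eeprom[addr:addr + RECORD_SIZE])
    value, counter, stored_crc = struct.unpack(">BHH", raw)
    if stored_crc != crc16(raw[:3]):
        return None
    return value, counter


def load_latest(eeprom):
    """Boot-time selection: among CRC-valid copies, take the highest counter."""
    records = (read_record(eeprom, addr) for addr in COPY_ADDRS)
    good = [rec for rec in records if rec is not None]
    if not good:
        return FALLBACK_VALUE, 0
    return max(good, key=lambda rec: rec[1])


def store(eeprom, value, fail_after=None):
    """Write the copies one at a time, bumping the counter for each copy.

    fail_after simulates losing power after that many bytes were written.
    Returns True only if every copy was written and verified.
    """
    _, counter = load_latest(eeprom)
    written = 0
    for addr in COPY_ADDRS:
        counter += 1                      # sketch only: no wraparound handling
        record = pack_record(value, counter)
        for offset, byte in enumerate(record):
            if fail_after is not None and written >= fail_after:
                return False              # simulated power loss mid-write
            eeprom[addr + offset] = byte
            written += 1
        if read_record(eeprom, addr) != (value, counter):
            return False                  # read-back verification failed
    return True


eeprom = bytearray([0xFF] * 256)          # erased EEPROM reads as 0xFF
store(eeprom, 0x12)                       # all three copies written
store(eeprom, 0x34, fail_after=2)         # power lost two bytes into copy 0
print(load_latest(eeprom))                # copy 0 fails its CRC, so (18, 3)
```

In the demo at the bottom, the interrupted write leaves the first copy with a bad CRC, so the older value 0x12 is still recovered from the copy with the highest counter. If the interruption happens after the first copy has been verified, the new value wins instead, because its counter is higher than that of any older copy.
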
But if you need to stick with the 5 bytes for whatever reason (perhaps the above is overkill for your situation), make sure the writes are done in separate operations, and space them out throughout the EEPROM if possible. In the PIC you mention it doesn't matter, but on other devices, if there are multiple EEPROM 'blocks', put separate copies into different blocks of the EEPROM.

Also, you'll need a hard-coded fallback value if all the copies fail their CRC checks and your device is required to continue operation (either full or partial) given complete EEPROM failure. You may need to do this in your scenario if three or four cells have failed (and only two or no values match).

I hope this is useful.

-Adam