|There is no reason why you can't reset some of the software during your
|"Error" handler, essentially putting the system into its "fail Safe" mode.
|After all, if you set it up that way, the only way your system is going to
|get to the error handler, is if it jumps into the middle of one of the
|code spaces you left unprogrammed.

It depends why you got into the failed state in the first place. Often,
the only probable mechanism for reaching unprogrammed code space is an
electrical glitch; such a glitch is probably better recovered via a reset
than via a warm start. Better still would be to power cycle the chip, but
that would require external hardware.

|I fail to see the difference between reacting to a jump failure, and
|letting the hardware react to the same jump failure, except perhaps the
|necessity of going through the "Hardware Reset" for what may be a glitch
|that only affects the software one time.

The $10,000,000 question here is what caused the jump table failure (if
that's what killed the system). Suppose the only way the failure could
occur is by the CPU executing code incorrectly, e.g. if I code:

; Table evaluator [starting at address $30]
Xlate:
        clrf    PCLATH
        addwf   PC
        db      1,2,4,8, 3,5,7,9, 2,4,6,8, 4,3,2,1

; Later on...
MainLoop:
        movf    PORTB,w
        andlw   $0F
        movwf   LatchB
; Do some munging...
; Later on
        movf    LatchB,w
        call    Xlate
        movwf   Result
; ...
        goto    MainLoop

If no code writes to INDF, and if the only write to LatchB is as above,
there is no way the computed jump near the beginning should ever fail.
Nonetheless, if the chip gets glitched, it's possible that the value in
LatchB might get corrupted. Of course, if this DOES happen there's not
much guarantee that anything else will be as it should be either...

> If the principle of least astonishment is voided, you're best off to
> trust nothing, run a (at least partial) hardware test, and RESTART
> otherwise, i.e. either do a power-up restart or a Watchdog restart.
> Then, you at least know that your hardware's set correctly, etc. -
> because YOU JUST SET IT CORRECTLY. (You might think of a state machine
> for your project - occasionally when everything tests OK, save state
> "Checkpoint dump", should you end up in psychotic code space, TRUST
> NOTHING, restart, and load your last checkpoint dump and work forwards
> from there. At least that way if you crash, you don't have to duplicate
> ALL your work from scratch... Also, the checkpoint dump can give you an
> idea on what's going on )

I like that idea... I'm just not sure I really want to perform a HARDWARE
reset every time the software glitches. It makes sense to drop back to a
checkpoint, especially if you checkpoint after you write to an external
device, so you don't end up sending the same message twice.

>
> When you work with embedded hardware that controls electronics that
> can quite literally blow up when over-driven, ASSUMING that things are
> safe just isn't a good idea at all. (Say you were controlling a piece
> of high powered pulsed RF transmitter with a PIC part, the transmitter's
> turned on at super high power, and just before it is to be turned off,
> the software crashes; You then assume that the transmitter's off and go
> ahead and process the next 15 minutes worth of received data in the PIC,
> setting up for the next transmission, as the transmitter not-so-slowly
> melts into $25,000 worth of slag, but your job's secure and the boss
> will be happy - you assumed it was safe, so it was, right? ) This is
> different than a lamp dimmer or soundmaker where an occasional "oops"
> just makes the light bulb flash a little brighter or the sound a little
> different than expected; I for one find that the way you GET good
> habits, is to always be very aware of what you're doing, when you're
> -developing- habits

I wouldn't be having this discussion if I didn't want to get advice on
the best way to achieve my goals.
I just don't necessarily want the quick and easy answers, like doing a
hardware reset every time you glitch. It might take a little longer to
first try to resurrect the software from the checkpoint without doing a
complete reset, and then do the reset only if you end up going through
the error routine with the same checkpoint a second time. But at least
you don't have to spend the time on a reset just because you assumed the
hardware was having problems when it was only a temporary glitch.

> (You can bet that any sane person who had that one, would make sure
> that his power-up / watchdog software turned off the transmitter, first
> thing. THEN went on to other things...)
>
> Mark
>

Being sane, your error routine would do the same thing.

In essence, I am wondering why you need the WDT at all if your code is
designed to be fail-safe. Maybe there is something I don't know about
PICs that makes them different, but when I want to reset my computer, I
usually do a WARM BOOT FIRST rather than pushing the reset, if only
because some of the drives don't reset well from the hardware reset, but
they all accept a warm boot.

GREY
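
P.S. For what it's worth, here is a rough sketch of the "warm restart first, hardware reset only if the same checkpoint fails twice" idea discussed above. It is written in C as a desktop model rather than in PIC assembly, purely so the decision logic can be run and checked off-chip. Every name in it (checkpoint_t, recovery_state_t, save_checkpoint, on_error, outputs_safe) is made up for illustration and is not from anyone's actual code. It also assumes the recovery record itself survives the glitch, which nobody can promise on real hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of application state; fields are illustrative. */
typedef struct {
    uint8_t id;       /* sequence number of this checkpoint */
    uint8_t latch_b;  /* example of state worth saving */
} checkpoint_t;

typedef enum { RECOVER_WARM, RECOVER_HARD_RESET } recovery_t;

/* Hypothetical bookkeeping kept across warm restarts. */
typedef struct {
    checkpoint_t saved;   /* last known-good checkpoint */
    bool have_failed;     /* has the error handler run before? */
    uint8_t failed_id;    /* checkpoint id active at that failure */
    bool outputs_safe;    /* stand-in for "transmitter forced off" */
} recovery_state_t;

/* Call whenever everything tests OK, e.g. right after writing to an
 * external device, so a replay does not send the same message twice. */
void save_checkpoint(recovery_state_t *rs, checkpoint_t cp)
{
    rs->saved = cp;
}

/* Error handler decision: always make the outputs safe first, then
 * escalate to a hard reset only if this checkpoint already failed once. */
recovery_t on_error(recovery_state_t *rs)
{
    rs->outputs_safe = true;  /* first thing: turn the dangerous stuff off */

    if (rs->have_failed && rs->failed_id == rs->saved.id) {
        return RECOVER_HARD_RESET;  /* second failure at same checkpoint */
    }

    rs->have_failed = true;         /* remember where we failed */
    rs->failed_id = rs->saved.id;
    return RECOVER_WARM;            /* reload rs->saved and carry on */
}
```

On a real part, RECOVER_HARD_RESET would presumably mean spinning until the watchdog fires, and RECOVER_WARM would mean reloading the saved state and jumping back to the main loop. Note that the transmitter gets turned off in both cases, which is the one point everybody in this thread seems to agree on.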