GRAEME SMITH                         email: grysmith@freenet.edmonton.ab.ca
YMCA Edmonton

Address has changed with little warning!
(I moved across the hall! :) )

Email will remain constant... at least for now.

On Mon, 15 Feb 1999, Mark Willis wrote:

> Graeme Smith wrote:
> >
> > (un-programmed state) with a jump to preknown state, to redirect
> > errant programs back into the main loop, in a known safe state.
> >
> > GREY
>
> What I think everyone here's saying, is that they've looked for such a
> beastie, and there isn't such a beast as a "known safe state", once you
> ended up in an unknown state through some unknown means.
>

Well, you seem to be assuming that a WDT gives you a "Known Safe State"
(as long as you do the fail-safe restart code). There is no reason why you
can't reset some of the software during your "Error" handler, essentially
putting the system into its "fail-safe" mode. After all, if you set it up
that way, the only way your system is going to get to the error handler is
if it jumps into the middle of one of the code spaces you left unprogrammed.

I fail to see the difference between reacting to a jump failure in software
and letting the hardware react to the same jump failure, except perhaps the
necessity of going through the "Hardware Reset" for what may be a glitch
that only affects the software one time.

> If the principle of least astonishment is voided, you're best off to
> trust nothing, run a (at least partial) hardware test, and RESTART
> otherwise, i.e. either do a power-up restart or a Watchdog restart.
> Then, you at least know that your hardware's set correctly, etc. -
> because YOU JUST SET IT CORRECTLY. (You might think of a state machine
> for your project - occasionally when everything tests OK, save state
> "Checkpoint dump", should you end up in psychotic code space, TRUST
> NOTHING, restart, and load your last checkpoint dump and work forwards
> from there. At least that way if you crash, you don't have to duplicate
> ALL your work from scratch...
> Also, the checkpoint dump can give you an
> idea on what's going on )

I like that idea... I'm just not sure I really want to perform a HARDWARE
reset every time the software glitches. It makes sense to drop back to a
checkpoint, especially if you checkpoint after you write to an external
device, so you don't end up sending the same message twice.

>
> When you work with embedded hardware that controls electronics that
> can quite literally blow up when over-driven, ASSUMING that things are
> safe just isn't a good idea at all. (Say you were controlling a piece
> of high powered pulsed RF transmitter with a PIC part, the transmitter's
> turned on at super high power, and just before it is to be turned off,
> the software crashes; You then assume that the transmitter's off and go
> ahead and process the next 15 minutes worth of received data in the PIC,
> setting up for the next transmission, as the transmitter not-so-slowly
> melts into $25,000 worth of slag, but your job's secure and the boss
> will be happy - you assumed it was safe, so it was, right? ) This is
> different than a lamp dimmer or soundmaker where an occasional "oops"
> just makes the light bulb flash a little brighter or the sound a little
> different than expected; I for one find that the way you GET good
> habits, is to always be very aware of what you're doing, when you're
> -developing- habits

I wouldn't be having this discussion if I didn't want to get advice on the
best way to achieve my goals. I just don't necessarily want the quick and
easy answers, like doing a hardware reset every time you glitch. It might
take a little longer to try to resurrect the software from the checkpoint
first, without doing a complete reset, and then do the reset only if you end
up going through the error routine with the same checkpoint a second time.
But at least you don't have to spend the time on a full reset just because
you assumed the hardware was having problems when it was only a temporary
glitch.
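
To make that concrete, here is roughly the decision I have in mind, as a
minimal C sketch rather than tested PIC code. The names (choose_recovery,
error_log_t, the checkpoint id) are made up for illustration, and it assumes
the small error log sits in RAM that the soft restart does not clear.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical recovery actions; these names are illustrative only. */
typedef enum { RECOVER_FROM_CHECKPOINT, RECOVER_HARDWARE_RESET } recovery_t;

/* Assumed to live in RAM that survives a soft restart. */
typedef struct {
    uint16_t last_failed_id;  /* checkpoint that was active at the last error */
    bool     has_failed;      /* false until the first error is recorded */
} error_log_t;

/* Called from the error handler with the id of the most recent checkpoint. */
recovery_t choose_recovery(error_log_t *log, uint16_t checkpoint_id)
{
    if (log->has_failed && log->last_failed_id == checkpoint_id) {
        /* Second trip through the handler with no progress since the
           last one: stop trusting the soft restart and escalate. */
        log->has_failed = false;
        return RECOVER_HARDWARE_RESET;
    }
    /* First failure at this checkpoint: remember it and try a soft restart. */
    log->has_failed = true;
    log->last_failed_id = checkpoint_id;
    return RECOVER_FROM_CHECKPOINT;
}
```

If the program gets as far as a newer checkpoint before failing again, the
ids differ and it gets another soft restart. It only escalates when it fails
twice from the same checkpoint. How the hardware reset is actually triggered
(for example, by letting the WDT expire) is left out here on purpose.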
> (You can bet that any sane person who had that one, would make sure
> that his power-up / watchdog software turned off the transmitter, first
> thing. THEN went on to other things...)
>
> Mark
>

Being sane, your error routine would do the same thing.

In essence, I am wondering why you need the WDT at all if your code is
designed to be fail-safe. Maybe there is something I don't know about PICs
that makes them different, but when I want to reset my computer, I usually
do a WARM BOOT FIRST rather than pushing the reset button, if only because
some of the drives don't reset well from the hardware reset, but they all
accept a warm boot.

GREY