On Mon, 10 May 2010, Dwayne Reid wrote:

> At 07:49 PM 5/10/2010, sergio masci wrote:
> 
> >Regarding robustness, if you use a real state machine, you are better able
> >to resume operation if an external event causes the MCU to lock up and the
> >watchdog is able to restart it. Think of it as a watchdog event that can
> >occur during any state. You would use this event to trigger a state
> >change to a special state (which could be one of many such states) much
> >like any other event used by your state machine.
> >
> >At the start of many of your states you initialise things like variables
> >and output ports. In other states you update variables and output ports.
> >You would tend to use the watchdog event to take you back to a state (in
> >the current chain) that does the initialising or setting hardware to a
> >well defined condition.
> 
> I understand what you are saying but I must respectfully disagree with you.
> 
> The problem is that you don't know WHAT has been corrupted.  It may 
> be a single register, it might be many registers.  You just don't know.

Yes I understand this. A very big problem with many state machine 
implementations is the way programmers "hide" some state information in 
variables and use it to communicate between states. Then you have a 
situation where simply jumping to a state doesn't mean the state machine 
is actually in that state (because some other information important to 
that state is invalid).

Consider a state machine that uses NO RAM variables at all - except for a 
handful of variables the executive needs to manage the state machine. 
Such a state machine could be used to implement quite complex control 
functions yet it would have a trivial number of RAM locations that need to 
be protected from corruption. Ok you can't actually put a force field 
around these RAM locations but you can do other things to try to ensure 
the values they hold are valid. You might make backup copies elsewhere in 
RAM, you might add error correcting code, you might only update them in 
safe ways. Ok you won't catch and fix every possible type of error that 
could occur but you won't be any worse off than simply detecting a fault 
and doing a cold boot.
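
As a rough sketch of the "backup copy" idea in C: the executive keeps its 
state number together with a bitwise complement and checks the pair after 
a watchdog restart. The state names, exec_vars_t and the exec_* functions 
are all made up for illustration, and a complement is only one of several 
possible checks.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical state numbering, for illustration only. */
enum { ST_COLD_BOOT = 0, ST_IDLE, ST_HEATING, ST_COOLING, ST_COUNT };

/* The only RAM the executive relies on: the state and its complement. */
typedef struct {
    uint8_t state;
    uint8_t state_inv;
} exec_vars_t;

/* Write both copies together. A reset between the two writes leaves
 * them inconsistent, which the validity check below then catches. */
void exec_set_state(exec_vars_t *ev, uint8_t s)
{
    ev->state = s;
    ev->state_inv = (uint8_t)~s;
}

/* Valid only if the state is in range and the complement still matches. */
bool exec_state_valid(const exec_vars_t *ev)
{
    return ev->state < ST_COUNT &&
           (uint8_t)(ev->state ^ ev->state_inv) == 0xFFu;
}

/* Call after a watchdog restart: keep the state if it checks out,
 * otherwise fall back to a cold boot. */
uint8_t exec_recover(exec_vars_t *ev)
{
    if (!exec_state_valid(ev)) {
        exec_set_state(ev, ST_COLD_BOOT);
    }
    return ev->state;
}
```

If the check fails you end up doing the cold boot you would have done 
anyway, so nothing is lost by trying.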

So if you progress from this to a state machine where specific RAM 
locations are only important to corresponding specific states then the 
specific state is responsible for initialising the RAM when it is entered. 
By definition you don't care about the rest of the RAM because you can 
only be in one state at a time. This other RAM can get corrupted but it 
doesn't matter. If your MCU now gets hammered by some external event and 
your state machine executive determines that your state machine is still 
valid it will enter the state again and by definition initialise its RAM 
again. So you've protected the state machine executive RAM and said the 
rest doesn't matter because we can fix it if it gets corrupted.
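
A minimal C sketch of a state that owns its RAM and initialises it on 
entry, so that re-entering the state after a watchdog restart repairs it. 
The states, the pulse counter and the function names are hypothetical, 
and the validity check on current_state is assumed to have been done 
already.

```c
#include <stdint.h>

/* Hypothetical states, for illustration only. */
enum { ST_IDLE = 0, ST_COUNTING, ST_COUNT };

/* RAM that only matters while in ST_COUNTING. */
uint8_t pulses_seen;

/* Executive variable; assumed to be protected separately. */
uint8_t current_state = ST_IDLE;

static void idle_enter(void)     { /* no RAM of its own to set up */ }
static void counting_enter(void) { pulses_seen = 0; }

typedef void (*entry_fn)(void);
static const entry_fn on_entry[ST_COUNT] = { idle_enter, counting_enter };

/* Entering a state always runs its entry code, which initialises
 * whatever RAM that state depends on. */
void enter_state(uint8_t s)
{
    if (s >= ST_COUNT) {
        s = ST_IDLE;            /* out-of-range state: fall back */
    }
    current_state = s;
    on_entry[s]();
}

/* Event handler: only meaningful in ST_COUNTING. */
void on_pulse(void)
{
    if (current_state == ST_COUNTING) {
        pulses_seen++;
    }
}

/* After a watchdog restart, with the executive state judged valid,
 * simply enter the same state again. */
void resume_after_watchdog(void)
{
    enter_state(current_state);
}
```

The count is lost when the state is re-entered, but the machine is back 
in a condition where the state and its RAM agree with each other.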

Ok, progressing again, let's say you have a state machine that has a group 
of states that use a small well defined self contained group of RAM 
locations to communicate between them. One of these states WILL cause 
these RAM locations to be initialised. Again, you don't care about the rest 
of the RAM, it can get corrupted but it doesn't matter. If your MCU now 
gets hammered by some external event (while in one of the states within 
this group) and your state machine executive determines that your state 
machine is still valid it will enter the initialising state again and by 
definition initialise its RAM again. This is like all of the states within 
the GROUP responding to a watchdog event by jumping to the initialising 
state of that GROUP.
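
One way to picture this in C is a lookup table that gives, for every 
state, the initialising state of its group. The fill/drain states and the 
watchdog_target() name are invented for the example, and a table is just 
one possible encoding.

```c
#include <stdint.h>

/* Two hypothetical groups of states, each starting with its
 * initialising state. */
enum {
    ST_FILL_INIT = 0, ST_FILL_OPEN_VALVE, ST_FILL_WAIT_LEVEL,
    ST_DRAIN_INIT, ST_DRAIN_PUMP_ON, ST_DRAIN_WAIT_EMPTY,
    ST_COUNT
};

/* For each state, the initialising state of the group it belongs to. */
static const uint8_t group_init_state[ST_COUNT] = {
    ST_FILL_INIT,  ST_FILL_INIT,  ST_FILL_INIT,
    ST_DRAIN_INIT, ST_DRAIN_INIT, ST_DRAIN_INIT
};

/* State to enter when a watchdog event arrives in the given state.
 * An out-of-range state is treated like a cold boot (assumed here to
 * start at ST_FILL_INIT). */
uint8_t watchdog_target(uint8_t state)
{
    if (state >= ST_COUNT) {
        return ST_FILL_INIT;
    }
    return group_init_state[state];
}
```

So a watchdog event in ST_DRAIN_PUMP_ON would take the machine back to 
ST_DRAIN_INIT, which sets up the group's shared RAM again.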

I/O ports could be treated the same way as a RAM location shared by a 
group of states, the only fly in the ointment being that here a group of 
states might be tied to a single pin on the port while another group of 
states might be tied to another pin on the SAME port. We can get around 
this problem a few ways but by far the most elegant would be to use 
multiple state machines running concurrently each looking after their own 
I/O pins and having their own set of initialisation states.
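
A small C sketch of two such machines sharing one port, each touching 
only the bits in its own mask. port_latch stands in for a real port 
register, and the three-state on/off machine is deliberately trivial and 
entirely hypothetical.

```c
#include <stdint.h>

/* Stand-in for a real output port register. */
uint8_t port_latch;

enum { PM_INIT = 0, PM_OFF, PM_ON };

/* One small machine per group of pins on the shared port. */
typedef struct {
    uint8_t pin_mask;   /* the only port bits this machine may touch */
    uint8_t state;
} pin_machine_t;

/* Change only this machine's own bits, leaving the others alone. */
static void write_pins(const pin_machine_t *m, uint8_t on)
{
    if (on) {
        port_latch |= m->pin_mask;
    } else {
        port_latch &= (uint8_t)~m->pin_mask;
    }
}

/* One step of the machine: init, then alternate on and off. */
void machine_step(pin_machine_t *m)
{
    switch (m->state) {
    case PM_OFF:
        write_pins(m, 1);
        m->state = PM_ON;
        break;
    case PM_ON:
        write_pins(m, 0);
        m->state = PM_OFF;
        break;
    default:                /* PM_INIT, or anything unexpected */
        write_pins(m, 0);   /* put own pins in a known condition */
        m->state = PM_OFF;
        break;
    }
}

/* Watchdog event: this machine goes back to its own init state. */
void machine_watchdog(pin_machine_t *m)
{
    m->state = PM_INIT;
}
```

Each machine can be sent back to its own initialisation state without 
disturbing the pins that belong to the other.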

Seriously, the biggest problem with state machines is that programmers 
tend to think of states and events as a way of getting around a 
programming problem rather than seeing states and events as a replacement 
for lines of code. It is very common to find huge programs that are 
described as "state machines" with (literally) only a few defined states 
and events.

> 
> I've always had the luxury of being able to restart the whole machine 
> right from the beginning (cold-boot) and that is the approach I've 
> always taken.

And having done this have you ever squirrelled away some special flag 
somewhere that enabled you to kind-of-restart (from the beginning 
possibly) one of several tasks or functions?

I'm not saying state machines provide a bulletproof way of coping with an 
MCU lockup and watchdog reset. What I am saying is that at best they 
provide a way of handling the fault gracefully and possibly recovering 
completely, and at worst they are no worse than a cold boot.

Regards
Sergio Masci