BYTE magazine Oct 1979
David E Cortesi, 2340 Tasso St, Palo Alto CA 94301
I was pleased to see a good introductory article on the use of finite state machines appear in BYTE (see "Designing a Command Language" by G A Van den Bout, BYTE, June 1979, page 176). I have found the finite state machine (or finite state automaton, or just FSA) to be a valuable tool in my programmer's toolkit. The finite state machine is an aid to organizing one's thoughts while designing, a good way of producing a really unam- biguous specification document, and as an implemented program it can yield very efficient and reliable code. The finite state machine has long been a plaything of the theoreticians of computer science; you can find it described and analyzed in any textbook on compiler design (it is a good textbook if you can understand the description!). Unfortunately the finite state machine rarely moves out of the textbook and into practical programs. I would like to extend Van den Bout's article with 2 examples from my own experience as a professional programmer that show how the finite state machine solved difficult programming problems in the real world. The first case arose during the design of a timesharing system that was to have a large number of commands. The syntax of the command language was laid down ear- ly in the project, but the specification of the commands themselves kept changing. If I and my colleagues had tried to write detailed code to parse each of the many commands and operands, especially in the face of chang- ing specifications, we would have been swamped. We had to do something to systematize the command-parsing code. We hit on the idea of using finite state machines represented as directed graphs (like the figures in the previous BYTE article). Since we were using a macro-assembler, we created NODE and ARC macroinstructions so that we could "draw" the graph of a command by writing a series of macro calls. Listing 1 shows how some of the chess game commands in the prior article might look in such a macrolanguage. Each macroinstruction assembled to a small group of constants. We thought of these groups as the machine language of an imaginary finite state computer. We then wrote a finite state interpreter which could process these machine instructions. This interpreter program took as its input: (1) the top node of a graph; (2) the tokenized command line from the user; and (3) a small working storage area where semantic routines could leave their Listing 1: A graph representation of a finite state machine as it might look drawn with a macroassembler. The macroinstructions would assemble to machine language for a hypothetical finite state computer; that in turn would be simulated by an interpreter.
Sl ANODE ; TOP NODE, ARCS SELECT VERB-TOKENS ARC TOKEN=KWD.VALUE='MOVE',NEXT=S2M ARC TOKEN=KWD.VALUE='CAP',NEXT=S2C ARC TOKEN=KWD.VALUE='TAKE',NEXT-Sll ZNODE 'MOVE, CAP, OR TAKE?' S2M ANODE VERB=1 ; SET VERB-CODE OF MOVE ARC TOKEN=ANY,NEXT=S2 S2C ANODE VERB=E ; SET VER3-CODE OF CAP ; COMMON GRAPH F0R MOVE AND CAP S2 ANODE ARC TOKEN=KWD,VALUE='FROM',NEXT=S3 ARC TOKEN=KWD,VALUE='TO',NEXT=58 ZNODE '?? PLEASE SAY TO OR FROM' ; GRAPH OF 'FROM XX TO YY' PART S3 ANODE ARC TOKEN=POS,SEMACT=FRPOS,NEXT=S4 ZNODE 'A POSITION MUST FOLLOW FROM' S4 ANODE ARC TOKEN=KWD.VALUE='TO',NEXT=S5 ZNODE 'FROM XX -- EXPECTING TO' S5 ANODE ARC TOKEN=POS,SEMACT=TOPOS,NEXT=S6 ZNODE 'A POSITION MUST FOLLOW TO' ; GRAPH OF 'TO XX FROM YY' VARIANT S8 ANODE ARC TOKEN=POS,SEMACT=TOPOS,NEXT=S9 ZNODE 'A POSITION MUST FOLLOW TO' S9 ANODE ARC TOKEN=KWD.VALUE='FROM',NEXT=Sl0 ZNODE 'TO XX -- EXPECTING FROM' S10 ANODE ARC TOKEN=POS,SEMACT=FRPOS,NEXT=S6 ZNODE 'A POSITION MUST FOLLOW FROM' ; END-CHECK FOR MOVE AND CAP ARC TOKEN=END ; OMITTED tlEXT- MEANS 'ALL DONE' ZNODE 'EXTRA OPERAND' Sll ANODE VERB=3 ; SET VERB-CODE OF TAKE ... etc, etc, etc.
Table 1: A finite state machine for processing numeric constants, represented as an array. Each row is a state of the machine; a column is selected by the next input token. At the intersection is the row number for the next step, and the name of an action to be done.
A0: do nothing Al: note negitive A2: collect intrger digit A3: note rational A4: note exponential A5: collect fraction digit A6: note negitive exponent A7: collect exponent digit El: number(?) is <E>.. E2: number is null E3: <sign><sign>... E4: <sign><E>.. E5: <sign><end> E6: 0..<sign>.. E7: <digit>..<sign>.. E8: ..<decimal><sign>.. E9: ..<decimal><decimal>.. E10: ..<E><decimal>.. Ell: ..<E><E>.. E12: ..<e><end> E13: ..<E><sign><sign>.. Zl: exit, value is zero ZZ: exit, value is integral Z3: exit for rational Z4: exit for exponential
row | Input token | ||||||
"+" | "-" | "0" | "1...9" | "." | "E" | <end> | |
1 | 2/A0 | 2/A1 | 3/A0 | 4/A2 | 5/A3 | 0/E1 | 0/E2 |
2 | 0/E3 | 0/E3 | 3/A0 | 4/A2 | 5/A3 | 0/E4 | 0/E5 |
3 | 0/E6 | 0/E6 | 3/A0 | 4/A2 | 5/A3 | 6/A4 | 0/Z1 |
4 | 0/E7 | 0/E7 | 4/A2 | 4/A2 | 5/A3 | 6/A4 | 0/Z2 |
5 | 0/E8 | 0/E8 | 5/A5 | 5/A5 | 0/E9 | 6/A4 | 0/Z3 |
6 | 7/A0 | 7/A6 | 7/A0 | 7/A7 | 0/E10 | 0/E11 | 0/E12 |
7 | 0/E13 | 0/E13 | 7/A7 | 7/A7 | 0/E10 | 0/E11 | 0/Z4 |