One Minute to Learn Programming: Finite Automata
Today, we're gonna talk about an incredibly simple programming system that has a ton of real-world applications: Finite State Machines (FSMs). You're gonna see them a lot in
both theory and systems courses, so this is a very important lecture.
(By the way - for those of you who've learned about FSMs before; everything in this lecture refers to Deterministic Finite Automata (DFAs). We use the terms FSM and DFA interchangeably.)
FSMs take a string as input and output either
YES or
NO. By a string, we mean a finite sequence of characters, such as
abaca or
10011101. The below diagram represents a typical FSM. The circles represent states, the double-circles represent accepting states, and the arrows represent transitions; the initial or starting state is indicated by the external arrow:
Let's say we fed the string
0111 to the above automata. What would happen?
- We start at the leftmost circle, or state (this is indicated by the external arrow).
- The first character is a
0. The transition arrow for that state says to stay at the same state.
- The second character is a
1. We move to the top state in accordance with the transition.
- The third character is a
1. We move to the right state (actually, we'd have done that even if it was a 0).
- The fourth character is a
1. We stay at the same state (right).
- Since we ended up at an accepting state (states with two circles are accepting states), we respond
YES.
On the other hand,
001011110100 is not accepted. Why?
FSMs have two major components: states and transitions. A state stores information about the past. In other words, it reflects the input changes from the system start to the present moment. A transition indicates a state change and is described by a condition that would need to be fulfilled to enable the transition. A defining characteristic of an FSM is that the next state of the machine is completely determined by the current state and the current input.
Another way to think about finite automata is in terms of programs without any variables. All such a program can do is look at each incoming character to determine what line to go to, and eventually return true or false (depending on whether it thinks the string matches or doesn't). Nonetheless, such simple machines can display complex behavior.
Formal Definition
We define a finite state machine in terms of a finite set of states, a specific start state, a set of accepting states (a subset of state sets), a finite alphabet, and state transition instructions:
- Finite set of states
.
- A finite alphabet
. In the example DFA above, the alphabet is
.
- A start state
.
- A set of accepting states
.
- State transition function
where
means that when the machine is in state
, it will transition to
given input character
. That is, we have a transition which looks like this:
. Since
is a function, transitions must be defined for every state and alphabet symbol combination. (This is not to say that DFAs cannot "crash," but any DFA which crashes can be made into one which doesn't crash by adding a single "garbage" state.)
Formally, let

be a FSM.

starts in state

. In operation, machine

reads one letter at a time from the input string (going from left to right). If

is in state

and reads some letter

,

moves to state

(if

is undefined, then the machine stops/crashes).
-
accepts the string
if, after
finishes reading
, it ends in an accepting state.
-
rejects the string
if, after
finishes reading
, it ends in a non-accepting state.
The set of inputs accepted by

is:

is called the
language of the machine M. (

is defined as the set of all strings over the alphabet

.)
Examples
What are the languages accepted by these machine?
In this case, the initial state (arrow symbol) is also the accepting state (double circle). The arrow represents the transition a, b. Any finite string of a's and b's ends up in the accepting state. So, the language

(all finite strings of a's and b's.)
The string
aa doesn't end up in an accepting state. Neither does
abba. However, the strings
a and
aba do. In general, only strings that have an odd number of characters end up in the accepting state. Strings of even length are rejected. Hence, the language accepted by this machine

.
In this case, the strings
a,
aaba, and
bbba don't work, whereas
ab,
aabb, and
bbbab do.

.

.
Designing FSMs
(Most of the below content is from Sipser's
Introduction to the Theory of Computation.)
We can also do the reverse: try to find a machine that accepts a given language. For example, constructing a machine for the language L = {all strings in {a,b}* with at least one a}.
In general, constructing a FSM is more a creative process than simple formula. However, the following approach might be somewhat helpful: put
yourself in the place of the machine you are trying to design and then see how you would go about performing the machine's task.
Suppose that you are given some language and want to design a FSM that recognizes it. Pretending to be the automaton, you receive an input string and must determine whether it is a member of the language the automaton is supposed to recognize. You get to see the symbols in the string one by one. After each symbol, you must decide whether the string seen so far is in the language. The reason is that you, like the machine, don't know when the end of the string will come, so you must always be ready with the answer.
Also, you need to decide what you need to remember about the string as you are reading it. Remember, you're designing a
finite state machine, and thus you
cannot remember all that you have seen to date (the input may be thousands and thousands of characters, after all.) Fortunately, you only need to remember the crucial information about the input to date - what that information is varies from language to language.
Here are a couple of languages. What might we want to keep track of?
- L = {all strings in {0,1}* with an odd number of 1's}. We just need to keep track of whether the number of 1s we've read to date is odd or even.
- L = {all strings in {a,b}* that contain at least one
a }. We just need a flag that becomes true if we've seen an a.
Can you design FSMs for the above? Here are a few more exercises:
- L = {all strings in {a,b}* with an even number of
ab pairs}.
- L = {all strings in {a,b}* containing
ababb as a consecutive substring}.
Here's a harder one, but a good test:
- L = {all strings in {a,b}* that end in
abb }.
In general, finite automata act as detectors, each one picking up a specified pattern or sequence. These are pattern-seeking machines.
A good strategy when designing FSMs is to come up with "names" for your states. For example, suppose that you wanted to build a machine which accepts a string from {0,1}* if and only if it is of length 1 modulo 3. You could do this in three states where your states could be named "string so far is of length 0 mod 3," "string so far is of length 1 mod 3," and "string so far is of length 2 mod 3." Whenever you are told to construct a machine for a given language, think about all the intermediate steps necessary to verify that a string will be accepted.
Languages
Let's take a quick diversion from FSMs and talk about the concept of a
language. Any

is defined to be a language. Basically,

is just a set of strings (it is called a language for historical reasons.) Some examples of languages:
- The set of all strings ending in a b where
.
- The set of all strings of the form
: some number of a's followed by an equal number of b's where
.
- The set of all palindromes where
.
- The set of all graphs (encoded in binary) with a Hamiltonian tour
.

is called a
regular language if there is some finite automaton that accepts

. In this chapter, we have already seen many regular languages, including

, strings of even length, strings containing
ababb, and others. In the list above, only the top example is regular. Let's try to get a grip on which languages are regular.
Theorem. Any finite language is regular.
Proof. Make a machine with a "path" for each string in the language. Basically, we can pick a tree to generate all strings in our finite `set (with an accepting state at the end of each string path.)
Here's an example with

(
Note: I'm being slightly lazy in the transitions - see exercise below)
Note that if we have an infinite list of words, the technique will not work.
Exercise
The above FSM is actually not complete - we don't show every possible transition. Certain classes you take will allow this: if a transition is undefined, the FSM will
crash (obviously the input will not be accepted if that happens.) It doesn't actually matter if you allow this: as an exercise, explain how you can transform an FSM like the above into one without undefined transitions.
Theorem. The union of two regular languages

is regular (we assume that they have the same alphabet)
(The union of two languages is the set of all strings present in
either language.) How can we prove this? Here's what we'll do - since

and

are regular, there exist FSMs

that accept them. Let's construct a
new FSM based on

that accepts the union of the two languages. This new FSM will just
simulate 
and

simultaneously, seeing whether either accept at the end.
Proof. Let

,

be FSMs that accept

respectively. We define a new FSM

to accept

.
- The states of
are just the pairs of states from
.
- The alphabet
is identical.
- The transition function
: on input state
and character
, output
.
- The initial state
is just the state corresponding to
.
- The set of accept states
just corresponds to
.
We claim that this machine accepts

. Consider a member

. The construction is symmetric, so we assume that

. Imagine running

and

simultaneously on

. We have constructed the DFA

such that at any point during the process,

is in some state

iff

is in some tuple state of the form

(i.e. the left part of the tuple will match the state

is in. Also, since

,

must end up in an accepting state. However, our construction guarantees that if

is an accepting state for

, all states of the form

are accepting states for

. Therefore

accepts this string.
We also need to show that if

accepts a string, then that string must be in

or

. Let

be the accepting state

ends up in after reading

. Our construction guarantees that either

was an accepting state for

or

was an accepting state for

. WLOG, assume the former. However, we know that

would only be in some state

after reading some word

if

were in state

after reading

, so

will accept

.
In general, regular languages are closed under the following operations.
- Union:
.
- Intersection:
.
- Reverse:
.
- Negation:
.
- Concatenation:
.
- Star:
.
Nonregular languages
How powerful are finite automata? In other words, are all languages regular?
Consider the language

(remember that

is just the symbol which denotes the empty string). In other words, we have a bunch of a's followed by an equal number of b's. No finite automaton accepts this language. Can we prove this?
Well, we could try saying that no machine has enough states to keep track of the number of a's it might encounter. But this alone is a fairly weak argument.
To see why this is, consider the following example: L = {strings where the number of occurrences of the pattern
ab is equal to the number of occurrences of the pattern
ba }. By the above argument this can't be regular: no machine has enough states to keep track of the number of occurrences of ab. But, as the following diagram shows, this language is regular!
The ABA automaton shown above accepts only the strings with an equal number of
ab occurrences and ba occurrences. For example, it accepts
aba,
aabba, and

, but rejects
baaa. So, we can compare two quantities without counting.
We need a professional-strength proof that

is not regular. Here are two
very handy principles to keep in mind when developing a proof.
- Pigeonhole principle: Given
boxes and
objects, at least one box must contain more than one object.
- Letterbox principle: If the average number of letters per box is
, then some box will have at least
letters. (Similarly, some box has at most
.)
Theorem. The language

is not regular.
Proof. Assume that the language

is regular. Then there is some machine

with

states that accepts it. For each

, let

be the state of

after reading

. By the pigeonhole principle, there exists some

such that

, but

. Therefore, the machine will erroneously accept

, which is a contradiction. Thus, there is no such machine and the language is not regular.
Again, make sure you understand this argument - its slightly tricky! If we assume that this language is regular, then there is some automata with

states that accepts it. Let's try feeding

to this machine, keeping track of the states that it visits. There are only

states, so while its still just reading a's, the machine will have to be in the same state twice: let's say that its in state

after reading

a's and also in state

after reading

a's. This means, though, that if the machine accepts

and

(which it's supposed to), then it will also accept

and

- this is a contradiction! Therefore, the language is not regular.
Intuitively, we can say that finite automata can't count!
There's tons more to learn about finite automata. Much of that material is covered in advanced courses, such as 15-354: Computational Discrete Math or 15-453: Formal Languages, Automata, and Computation (FLAC).
Extra Remark. (You are not responsible for knowing this.) Since

is a countably infinite set, the total number of languages is
uncountable. However, there are only countably many FSMs and therefore, the number of regular languages is
countable. We may therefore conclude that there are uncountably many languages which are
not regular.
Application: String Matching
The Unix operating system has a command called
grep. The
grep command allows you to search for a specified pattern in one or more files, printing any lines which contain the specified set of characters.
grep uses regular expressions to specify the search pattern and a finite automaton to implement the search itself.
Given input text

of length

and a string

of length

: does

appear inside the text

?
Naively, we can take string

and compare it to the text

, looking for a match as we step through the text reading it character by character. In computational terms, we would need to perform on the order of

comparisons: Cost:

comparisons. This is fairly wasteful - we aren't taking advantage of the information we're learning while comparing (consider trying to find
aaaab in
aaaaaaaaab.)
The automata solution is to build a special-purpose machine

to find a given pattern that accepts any string with

as a consecutive substring. We then feed the text to M. In this case, the cost is

comparisons plus the time to build M. It solves the problem reading the characters only once. As it happens, we can use the so-called Knuth-Morris-Pratt algorithm to build M quickly. Moreover, it turns out that looking for a given string can actually be done, even in the worst case, with fewer than t comparisons. We don't have to look at every symbol. We can save time by reading only a portion of the text.
15-211 covers the implementation of this algorithm, so we'll pass on doing it in this class.
The
grep command or utility in Unix is just one of many real-world applications of finite state machines. Such mechanisms underlie or govern vending machines, fridge thermostats, elevators, train track switches, and lexical analyzers for parsers. In several of these instances, more complex approaches have replaced finite state machines as the underlying controllers. For example, a computer chip and a little Java program instead of a finite state machine now typically handles change in a vending machine. Similarly, control logic governs thermostats.
One place where finite state machines are still used is in train track switches, mainly because they are simple and can be analyzed mathematically. We can prove theorems about them. So the design is verifiable to ensure that the devices operate safely.
References
Introduction to the Theory of Computation, first edition. Chapter 1.1
Editor Notes
Diagrams were created using Microsoft Powerpoint 2007.