The Algebra of Regular Languages
So far, we have seen that every regular language is finitely presentable, formally \(\mathsf{Reg} \subseteq \mathsf{Fin}\). Continuing our journey towards the proof of Kleene's Theorem, which states that \(\mathsf{Reg} = \mathsf{Fin}\), we need to gain a bit more proficiency with regular expressions. The most important step in the proof of the reverse containment is to show that certain systems of equations involving regular expressions can be solved. This process of reasoning with equations between regular expressions is called the algebra of regular expressions, and can be really fun once you get used to it. If solving systems of equations doesn't sound like algebra to you, then I'm not sure what will.
Let's get a bit more formal about what all this is about.
It may come as a surprise to you that different regular expressions can be language equivalent. It helps to think of arithmetic: \(5 + 2 = 2 + 5\), even though those two arithmetic expressions are different. One might call these two arithmetic expressions number equivalent. In a similar fashion, the regular expressions \(a + b\) and \(b + a\) (over an alphabet containing \(a\) and \(b\)) are language equivalent. Indeed, \[ \mathcal L(a + b) = \{a, b\} = \{b, a\} = \mathcal L(b + a) \] Not too crazy, right?
Exercise: which of the following regular expressions are language equivalent to one another?
- \((ab + b) + c\)
- \(c + (a + \varepsilon) b\)
- \(a(b + c)\)
- \((b + c)a\)
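None of these expressions uses the Kleene star, so each denotes a finite language, and we can compare those languages directly. Here is a small Python sketch of that check (the helper `cat` and all variable names are my own, not from the lecture):

```python
def cat(X, Y):
    """Concatenation of languages: {xy | x in X, y in Y}."""
    return {x + y for x in X for y in Y}

EPS = {""}  # the language of the expression ε

e1 = (cat({"a"}, {"b"}) | {"b"}) | {"c"}   # L((ab + b) + c)
e2 = {"c"} | cat({"a"} | EPS, {"b"})       # L(c + (a + ε)b)
e3 = cat({"a"}, {"b"} | {"c"})             # L(a(b + c))
e4 = cat({"b"} | {"c"}, {"a"})             # L((b + c)a)

assert e1 == e2            # the first two are language equivalent
assert e3 != e4            # the last two are not
```

The first two expressions look quite different on paper, yet both denote \(\{ab, b, c\}\).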
Unions
Here is another example of language equivalent regular expressions: \[ a + (b + c) =_{\mathcal L} (a + b) + c \] Again, remember that regular expressions are just sequences of symbols, so the two regular expressions that appear above are not exactly the same. However, it's not hard to prove that they are language equivalent: on the one hand, we have \[\begin{aligned} \mathcal L(a + (b + c)) &= \mathcal L(a) \cup \mathcal L(b + c) \\ &= \mathcal L(a) \cup (\mathcal L(b) \cup \mathcal L(c)) \\ &= \{a\} \cup (\{b\} \cup \{c\}) \\ &= \{a\} \cup \{b,c\} \\ &= \{a,b,c\} \end{aligned}\] On the other hand, a similar calculation gives \[ \mathcal L((a + b) + c) = \{a, b, c\} \hspace{5em} (\star) \] Therefore, \(\mathcal L((a + b) + c) = \mathcal L(a + (b + c))\), and we can write \((a + b) + c =_{\mathcal L} a + (b + c)\).
More generally, the following equations hold for any regular expressions \(r, r_1, r_2, r_3 \in \mathit{RExp}\):
- \(r_1 + r_2 =_{\mathcal L} r_2 + r_1\)
- \(r + \emptyset =_{\mathcal L} r\)
- \(\emptyset + r =_{\mathcal L} r\)
- \(r + r =_{\mathcal L} r\)
- \(r_1 + (r_2 + r_3) =_{\mathcal L} (r_1 + r_2) + r_3\)
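Since the union of languages is just set union, all five equations are inherited directly from set theory. A quick Python sanity check on sample languages (the sample sets are arbitrary choices of mine):

```python
r1, r2, r3 = {"a", "ab"}, {"b"}, {"ab", "c"}  # arbitrary sample languages
empty = set()                                  # the language of ∅

assert r1 | r2 == r2 | r1                # r1 + r2 = r2 + r1
assert r1 | empty == r1                  # r + ∅ = r
assert empty | r1 == r1                  # ∅ + r = r
assert r1 | r1 == r1                     # r + r = r
assert r1 | (r2 | r3) == (r1 | r2) | r3  # r1 + (r2 + r3) = (r1 + r2) + r3
```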
Sequential Composition
Here is another example of a pair of language equivalent regular expressions: \(a(bc) =_{\mathcal L} (ab)c\). Here, \[\begin{aligned} \mathcal L(a (bc)) &= \mathcal L(a) \cdot \mathcal L(bc) \\ &= \mathcal L(a) \cdot (\mathcal L(b) \cdot \mathcal L(c)) \\ &= \{a\} \cdot (\{b\} \cdot \{c\}) \\ &= \{a\} \cdot (\{bc\}) \\ &= \{abc\} \\ \end{aligned}\] This is what we should expect; the language semantics of a word, represented as a regular expression, should be the set containing only that word. But there is also the other way of forming the word "\(abc\)", namely \((ab)c\), which also has the language semantics \[\mathcal L((a b)c) = \{abc\}\] Therefore, \(a(bc) =_{\mathcal L} (ab)c\).
There are a few more equations to do with sequential composition that are going to be useful later.
- \(r \cdot \varepsilon =_{\mathcal L} r\) and \(\varepsilon \cdot r =_{\mathcal L} r\)
- \(r\cdot \emptyset =_{\mathcal L} \emptyset\) and \(\emptyset \cdot r =_{\mathcal L} \emptyset\)
- \(r_1 \cdot (r_2 \cdot r_3) =_{\mathcal L} (r_1 \cdot r_2) \cdot r_3\)
- \(r_1 \cdot (r_2 + r_3) =_{\mathcal L} (r_1 \cdot r_2) + (r_1 \cdot r_3)\)
- \((r_1 + r_2) \cdot r_3 =_{\mathcal L} (r_1 \cdot r_3) + (r_2 \cdot r_3)\)
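Each of these can likewise be checked on concrete finite languages. In the sketch below, `cat` models the concatenation of languages (the helper name and sample sets are my own):

```python
def cat(X, Y):
    """Concatenation of languages: {xy | x in X, y in Y}."""
    return {x + y for x in X for y in Y}

r1, r2, r3 = {"a", "ab"}, {"b", ""}, {"c"}  # arbitrary sample languages
eps, empty = {""}, set()                     # languages of ε and ∅

assert cat(r1, eps) == r1 and cat(eps, r1) == r1      # ε is a unit
assert cat(r1, empty) == empty == cat(empty, r1)      # ∅ annihilates
assert cat(r1, cat(r2, r3)) == cat(cat(r1, r2), r3)   # associativity
assert cat(r1, r2 | r3) == cat(r1, r2) | cat(r1, r3)  # left distributivity
assert cat(r1 | r2, r3) == cat(r1, r3) | cat(r2, r3)  # right distributivity
```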
Kleene Star
So far, we have dealt with unions and sequential composition. The last operation on our list to deal with is the Kleene star, which is... let's just say, a lot less familiar. The gist of the equations we are about to see is this: The Kleene star of a language consists of the empty word (if it is not already there), as well as any concatenation of the words in the language (including repetitions).
For example, for a letter \(a \in A\), unraveling the definition of the language semantics of \(a^*\) gives \[\mathcal L(a^*) = \{a^n \mid n \in \mathbb N\}\] For \(n = 0\), \(a^n = a^0 = \varepsilon\). For \(n > 0\), \(a^n = a a^{n-1}\). Unraveling the equation above, we can write down the following calculation: \[\begin{aligned} \mathcal L(a^*) &= \{a^0\} \cup \{a^n \mid n \in \mathbb N \text{ and } n > 0\} \\ &= \{a^0\} \cup \{aa^{n-1} \mid n \in \mathbb N \text{ and } n > 0\} \\ &= \{\varepsilon\} \cup \{aa^n \mid n \in \mathbb N\} \\ &= \{\varepsilon\} \cup (\{a\} \cdot \{a^n \mid n \in \mathbb N\}) \\ &= \mathcal L(\varepsilon) \cup (\mathcal L(a) \cdot \mathcal L(a^*)) \\ &= \mathcal L(\varepsilon + (a \cdot a^*)) \\ \end{aligned}\] In other words, \( a^* =_{\mathcal L} \varepsilon + aa^* \).
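We cannot compare the infinite languages \(\mathcal L(a^*)\) and \(\mathcal L(\varepsilon + aa^*)\) exhaustively in code, but we can compare them up to a length bound. A Python sketch (the bounded `star` helper is my own device, not part of the lecture):

```python
def cat(X, Y):
    """Concatenation of languages: {xy | x in X, y in Y}."""
    return {x + y for x in X for y in Y}

def star(X, n):
    """All words of X* of length at most n."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = {w for w in cat(frontier, X) if len(w) <= n} - result
        result |= frontier
    return result

n = 6
a_star = star({"a"}, n)                           # L(a*) up to length n
unrolled = {""} | cat({"a"}, star({"a"}, n - 1))  # L(ε + a·a*) up to length n
assert a_star == unrolled
assert a_star == {"a" * k for k in range(n + 1)}  # {ε, a, aa, ..., a^6}
```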
The exercise and example above point to the first two of the following equations; the remaining ones make for good practice, and can be verified in a similar way.
- \(\varepsilon + rr^* =_{\mathcal L} r^*\)
- \(\varepsilon + r^*r =_{\mathcal L} r^*\)
- \((\varepsilon + r)^* =_{\mathcal L} r^*\)
- \((\varepsilon + a)^* =_{\mathcal L} (\emptyset + a)^*\)
- \(\varepsilon + a^* =_{\mathcal L} a^*\)
- \(\emptyset^* =_{\mathcal L} \varepsilon\)
- \((a + b)^* =_{\mathcal L} a(a + b + \varepsilon)^* + b(\emptyset + b + a)^* + \varepsilon\)
- for any \(r_i,s_i \in \mathit{RExp}\), \((r_1 + r_2)(s_1 + s_2) =_{\mathcal L} r_1s_1 + r_1s_2 + r_2s_1 + r_2s_2\)
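The starred equations can be spot-checked the same way, comparing both sides up to a length bound. In the sketch below (helpers and sample languages are my own), `r` plays the role of a sample regular expression \(ab + b\):

```python
def cat(X, Y):
    """Concatenation of languages: {xy | x in X, y in Y}."""
    return {x + y for x in X for y in Y}

def star(X, n):
    """All words of X* of length at most n."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = {w for w in cat(frontier, X) if len(w) <= n} - result
        result |= frontier
    return result

n = 5
r = {"ab", "b"}                        # sample language, L(ab + b)
assert star({""} | r, n) == star(r, n)  # (ε + r)* = r*
assert star(set(), n) == {""}           # ∅* = ε
assert {""} | star(r, n) == star(r, n)  # ε + r* = r*

ab = {"a", "b"}                         # checking the (a + b)* equation
rhs = cat({"a"}, star(ab | {""}, n - 1)) | cat({"b"}, star(ab, n - 1)) | {""}
assert {w for w in rhs if len(w) <= n} == star(ab, n)
```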
Arden's Rule
There is one equation we are missing from our toolset so far. It's not quite an equation so much as it is a rule, since it only applies in certain situations. The situations it applies to have to do with the empty word property.
A regular expression \(r\) has the empty word property if \(\varepsilon \in \mathcal L(r)\). So, for example, the regular expression \(\varepsilon\) does have the empty word property, while \(\emptyset\) does not. And for any \(a \in A\), we have \(\mathcal L(a) = \{a\}\), so \(a\) does not have the empty word property. On the other hand, for every regular expression \(r \in \mathit{RExp}\), \(r^*\) does have the empty word property, since \(\varepsilon \in \mathcal L(r)^* = \mathcal L(r^*)\).
- (Left Rule) if \(r\) does not have the empty word property and \(s =_{\mathcal L} t + r\cdot s\), then \(s =_{\mathcal L} r^* \cdot t\)
- (Right Rule) if \(r\) does not have the empty word property and \(s =_{\mathcal L} t + s\cdot r\), then \(s =_{\mathcal L} t \cdot r^*\)
Typically, Arden's Left Rule is just called "Arden's Rule". Arden's Left Rule is all that's needed for the next lecture, but the Right Rule is going to make things a lot easier for you in the exercises!
Remember the equation \(r^* =_{\mathcal L} \varepsilon + rr^*\) from the Basics of the Kleene Star. This equation reveals to us that \(r^*\) solves the following equation for an unknown variable \(x\): \[ x =_{\mathcal L} \varepsilon + r \cdot x \] Arden's Rule tells us more: it says that if \(r\) does not have the empty word property, then \(r^*\) is the only solution to the equation above.
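As a small worked example (my own, not from the text): consider the equation \[ x =_{\mathcal L} b + a \cdot x \] in the unknown \(x\). The regular expression \(a\) does not have the empty word property, so Arden's Left Rule applies with \(r = a\) and \(t = b\), giving the unique solution \[ x =_{\mathcal L} a^* \cdot b \] That is, the solution is the language \(\{a^n b \mid n \in \mathbb N\}\) of words consisting of some number of \(a\)'s followed by a single \(b\).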
- \(aa^* =_{\mathcal L} a^* a\)
- \(a^* =_{\mathcal L} (aa)^*(a + \varepsilon)\).
- \(a^*a^* =_{\mathcal L} a^*\)
- \((a + b)^* =_{\mathcal L} b^*(ab^*)^*\)
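The last equivalence can also be spot-checked up to a bounded word length, which is a good way to build confidence before attempting the algebraic proof (the helpers below are my own):

```python
def cat(X, Y):
    """Concatenation of languages: {xy | x in X, y in Y}."""
    return {x + y for x in X for y in Y}

def star(X, n):
    """All words of X* of length at most n."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = {w for w in cat(frontier, X) if len(w) <= n} - result
        result |= frontier
    return result

n = 6
# aa* = a*a, both sides bounded at length n
assert cat({"a"}, star({"a"}, n - 1)) == cat(star({"a"}, n - 1), {"a"})

# (a + b)* = b*(ab*)*, both sides bounded at length n
lhs = star({"a", "b"}, n)
b_star = star({"b"}, n)
rhs = {w for w in cat(b_star, star(cat({"a"}, b_star), n)) if len(w) <= n}
assert lhs == rhs
```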
Proof of Arden's Rule
We are now ready to prove Arden's Rule, but it is worth mentioning that this is an important place where strong induction comes up.
Remember that the principle of induction states that if a subset \(S \subseteq \mathbb N\) contains \(0\) and is upwards-closed (meaning that \(n \in S\) implies \(n + 1 \in S\)), then \(S = \mathbb{N}\). There are actually two ways of establishing that such a set \(S\) is upwards-closed and contains \(0\):
- The first way is what you are used to: showing that \(0 \in S\) and also that if \(n \in S\), then \(n + 1 \in S\) as well. This way is ordinary induction.
- The second way is maybe more convoluted-feeling, but equivalent: the second way is to show that for any \(n \in \mathbb N\), if \(m \in S\) for all \(m < n\), then \(n \in S\). To see why this works, first note that there are no natural numbers \(m < 0\), so vacuously \(0 \in S\). Next, \(m \in S\) for all \(m < 1\), because \(0\) is the only natural number less than \(1\), so \(1 \in S\). Likewise, if \(0, 1, 2, \dots, n \in S\), then \(n + 1 \in S\) as well. This shows that \(S\) is upwards-closed, so by induction, \(S = \mathbb N\). This way is strong induction.
Let's set up some notation first: suppose \(r, s, t \in \mathit{RExp}\) satisfy \(s =_{\mathcal L} t + r \cdot s\), where \(r\) does not have the empty word property, and write \(L = \mathcal L(s)\), \(U = \mathcal L(r)\), and \(V = \mathcal L(t)\), so that \(L = V \cup U \cdot L\). We are going to show that (1) \(L \subseteq U^* \cdot V\) and (2) \(L \supseteq U^* \cdot V\).

Let's start by proving (1). Let \(w \in L\). We need to prove that \(w \in U^*\cdot V\). We are going to do this by strong induction on the length of \(w\).
A proof by strong induction starts with the induction hypothesis: suppose that for any word \(u \in L\) with \(|u| < |w|\), we have \(u \in U^* \cdot V\). We now need to show that it follows from this supposition that \(w \in U^* \cdot V\). Since \(L = V \cup U\cdot L\), \(w\) is of one of two forms: either \(w \in V\) or \(w \in U\cdot L\). This means there are two cases to consider.
- In the first case, \(w \in V\), and we need to show that \(w \in U^*\cdot V\). This follows from the definition of the Kleene star for a language: \(\varepsilon \in U^*\), so \(\varepsilon w \in U^* \cdot V\). Since \(\varepsilon w = w\), \(w \in U^*\cdot V\) and we are done with this case.
- In the second case, \(w \in U \cdot L\). In this case, \(w\) is of the form \(w = uv\) for some \(u \in U\) and \(v \in L\). Again, we need to argue that \(w \in U^*\cdot V\). But \(u \in U\) implies that \(u \neq \varepsilon\), because \(r\) does not have the empty word property and \(U = \mathcal L(r)\). This means that \(|u| > 0\), so \(|v| < |u| + |v| = |uv| = |w|\). By the induction hypothesis, \(v \in L\) and \(|v| < |w|\) together imply that \(v \in U^* \cdot V\). Therefore, there is a \(u' \in U^*\) and a \(v' \in V\) such that \(v = u'v'\). Then \(w = uu'v'\). Since \(u \in U \subseteq U^*\) and \(u' \in U^*\), the concatenation \(uu' \in U^*\) as well. It follows that \(w = uu'v' \in U^*\cdot V\), and we are done with this case.
Now we show (2), that \(L \supseteq U^* \cdot V\). Let \(w \in U^* \cdot V\). Then \(w = uv\) for some \(u\in U^*\) and \(v \in V\). We are going to show that \(uv \in L\) by strong induction on the length of \(u\). This means that our induction hypothesis is that \(u'v \in L\) for all \(u' \in U^*\) with \(|u'| < |u|\), and we must show that \(uv \in L\).
If \(u = \varepsilon\), then \(uv = v \in V \cup U\cdot L = L\), so \(w = uv \in L\). Otherwise, \(u = u_1u_2\) for some \(u_1 \in U\) and \(u_2 \in U^*\), where \(|u_1| > 0\) because \(r\) does not have the empty word property. This means that \(u_2v \in U^*\cdot V\) and \(|u_2| < |u|\), so the induction hypothesis tells us that \(u_2v \in L\). But concatenating with \(u_1\), we get \[ w = uv = u_1u_2v \in U \cdot L \subseteq V \cup U \cdot L = L \] so that \(w \in L\). This shows that \(L \supseteq U^* \cdot V\).
Since (1) \(L \subseteq U^* \cdot V\) and (2) \(L \supseteq U^* \cdot V\), we conclude that \(L = U^* \cdot V\). Therefore, \(s =_{\mathcal L} r^* \cdot t\).
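The fixed-point flavour of this proof can be mirrored computationally, at least up to a length bound: starting from \(\emptyset\) and repeatedly applying \(x \mapsto V \cup U \cdot x\) stabilises on \(U^* \cdot V\). A sketch with sample languages of my own choosing, where \(U = \mathcal L(ab + b)\) has no empty word and \(V = \mathcal L(c)\):

```python
def cat(X, Y):
    """Concatenation of languages: {xy | x in X, y in Y}."""
    return {x + y for x in X for y in Y}

def star(X, n):
    """All words of X* of length at most n."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = {w for w in cat(frontier, X) if len(w) <= n} - result
        result |= frontier
    return result

n = 8
U, V = {"ab", "b"}, {"c"}  # ε not in U: r = ab + b has no empty word property

# Iterate L ↦ V ∪ U·L starting from ∅ until it stabilises (on words ≤ n).
L = set()
while True:
    step = {w for w in V | cat(U, L) if len(w) <= n}
    if step == L:
        break
    L = step

expected = {w for w in cat(star(U, n), V) if len(w) <= n}  # U*·V up to length n
assert L == expected
```

Because every word of \(U\) is non-empty, each iteration only needs strictly shorter members of \(L\), which is exactly the role the length-based strong induction plays in the proof.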