Nonregular Languages: Pumping Lemma

“The pumping lemma, that \forall \exists \forall\exists\forall monstrosity, is equivalent to the statement ‘every sufficiently-long walk over a finite graph contains a cycle.’ Not only is that intuitive, it’s obvious, unlike the pumping lemma.”
Hillel Wayne

We now consider a different method for proving a language LL is not regular. It starts with a definition that may seem odd, but turns out to be useful.

Definition

Given a language LL and a “pumping length” p1p\geq 1,
we say a string sLs\in L (with sp|s| \geq p) can be pumped if

  1. (there’s a substring yy inside ss)
    s=xyzs = xyz
  2. (yy is nonempty and not too far from the front)
    yϵy \not= \epsilon and xyp|xy| \leq p
  3. (xzxz, xyzxyz, xyyzxyyz, xyyyzxyyyz, … are all in LL)
    xyizLxy^iz \in L for every i0i\geq 0

Example

If L:={ anbm  n,m0 }L := \{\ a^nb^m\ |\ n,m\geq 0\ \} and p:=3p := 3,
then   s:=abbbbbb  ~~s:= \mathrm{abbbbbb~~} can be pumped.

There are many ways to show this, including

  • Set x:=ax:=\mathrm{a}, y:=bby:=\mathrm{bb}, and z:=bbbbz:=\mathrm{bbbb}
  • Set x:=abx:=\mathrm{ab}, y:=by:=\mathrm{b}, and z:=bbbbz:=\mathrm{bbbb}
  • Set x:=ax:=\mathrm{a}, y:=by:=\mathrm{b}, and z:=bbbbbz:=\mathrm{bbbbb}
  • Set x:=ϵx:=\epsilon, y:=ay:=\mathrm{a}, and z:=bbbbbbz:=\mathrm{bbbbbb}

In all cases xyizLxy^iz\in L for every i0i\geq 0.

Example

If L:={ w{0,1}  w has even 0s and odd 1s }L := \{\ w\in\{0,1\}^*\ |\ w \mathrm{\ has\ even\ 0's\ and\ odd\ 1's}\ \} and p:=4p := 4
then   s:=01000100111  ~~s := 01000100111~~ can be pumped.

There is exactly one way to do so, namely

  • Set x:=01x:=\mathrm{01}, y:=00y:=\mathrm{00}, and z:=0100111z:=\mathrm{0100111}

Then xyizLxy^iz\in L for every i0i\geq 0.

Example

If LL is the set of strings of balanced parentheses, and p:=4p := 4
then   s:=(()())  ~~s := (()())~~ can be pumped.

There are two ways to do so

  • Set x:=(x:=\mathrm{(}, y:=()y:=\mathrm{()}, and z:=())z:=\mathrm{())}
  • Set x:=((x:=\mathrm{((}, y:=)(y:=\mathrm{)(}, and z:=)))z:=\mathrm{)))}

In both caes, xyizLxy^iz\in L for every i0i\geq 0.

The Pumping Lemma

The utility of this definition comes from the following fact:

Lemma (Pumping Lemma)

If LL is a regular language, then there exists p1p\geq 1 such that every string sLs\in L with sp|s|\geq p can be pumped.

More importantly the contrapositive holds; if for every pp\geq there is a string sLs\in L with sp|s|\geq p that cannot be pumped, then LL is not regular.

That is, in a regular language, every string in that language (larger than some fixed size) can be pumped.

Equivalently, if a language contains arbitrarily long unpumpable strings (i.e., unpumpable strings with more than pp characters for any choice of pp), then that language cannot be regular. We can use this fact to prove that many nonregular languages are nonregular.

(Warning: The Pumping Lemma does not say “if and only if.” The presence of long unpumpable strings proves a language is not regular, but the absence of such strings tells us nothing. There are regular and nonregular languages where all long strings are pumpable.)

Example

The language L:={ anbn  n0  }L := \{\ a^nb^n\ |\ n \geq 0\ \ \} is not regular.

Proof.
Let p1p \geq 1 be given.
Put s:=apbps := a^pb^p. Note that sLs\in L and sp|s|\geq p.
To show: ss is not pumpable.
Let s:=xyzs := xyz be any partition of ss into three parts with yϵy\not=\epsilon and xyp|xy|\leq p.
Then yy must consist of one or more a‘s.
So xy2zxy^2z will have more a ’s than b’s, and xy2z∉Lxy^2z\not\in L.
Thus no partition works to pump ss.
By the Pumping Lemma, LL is not regular.

Example

The language L:={ anbn  n0  }L := \{\ a^nb^n\ |\ n \geq 0\ \ \} is not regular.

Alternate Proof (with a less clever choice of ss).
Let p1p \geq 1 be given.
Put s:=ap/2bp/2s := a^{\lceil p/2\rceil}b^{\lceil p/2\rceil}. Note that sLs\in L and sp|s|\geq p.
To show: ss is not pumpable.
Let s:=xyzs := xyz be any partition of ss into three parts with yϵy\not=\epsilon and xyp|xy|\leq p.

There are three cases to consider.

  • If yy consists of one or more a’s, then xy2zxy^2z will have more a ’s than b’s, and xy2z∉Lxy^2z\not\in L
  • If yy consists of one or more b’s, then xy2zxy^2z will have more b’s than a’s, and xy2z∉Lxy^2z\not\in L
  • If yy consists of both a’s and b’s, then xy2zxy^2z will not have all its a’s before all its b’s, and xy2z∉Lxy^2z\not\in L

Thus no partition works to pump ss.
By the Pumping Lemma, LL is not regular.

Example

The language L:={ w{a,b}   w has the same number ofas and bs}L := \{\ w\in\{a,b\}^{*}\ \ |\ w \mathrm{\ has\ the\ same\ number\ of\\ a's\ and\ b's} \} is not regular.

Proof.
Let p1p \geq 1 be given.
Put s:=apbps := a^pb^p. Note that sLs\in L and sp|s|\geq p.
To show: ss is not pumpable.
Let s:=xyzs := xyz be any partition of ss into three parts with yϵy\not=\epsilon and xyp|xy|\leq p.
Then yy must consist of one or more a‘s.
So xy2zxy^2z will have more a ’s than b’s, and xy2z∉Lxy^2z\not\in L.
Thus no partition works to pump ss.
By the Pumping Lemma, LL is not regular.

Example

The language L:={ 0n1m  nm}L := \{\ 0^n1^m\ |\ n\leq m\} is not regular.

Proof.
Let p1p \geq 1 be given.
Put s:=0p1ps := 0^p1^p. Note that sLs\in L and sp|s|\geq p.
To show: ss is not pumpable.
Let s:=xyzs := xyz be any partition of ss into three parts with yϵy\not=\epsilon and xyp|xy|\leq p.
Then yy must consist of one or more 0‘s.
So xy2zxy^2z will have more 0 ’s than 1’s, and xy2z∉Lxy^2z\not\in L.
Thus no partition works to pump ss.
By the Pumping Lemma, LL is not regular.

Example

The language L:={ 0n1m  nm}L := \{\ 0^n1^m\ |\ n\geq m\} is not regular.

Proof.
Let p1p \geq 1 be given.
Put s:=0p1ps := 0^p1^p. Note that sLs\in L and sp|s|\geq p.
To show: ss is not pumpable.
Let s:=xyzs := xyz be any partition of ss into three parts with yϵy\not=\epsilon and xyp|xy|\leq p.
Then yy must consist of one or more 0‘s.
So xzxz will have fewer 0 ’s than 1’s, and xz∉Lxz\not\in L.
Thus no partition works to pump ss.
By the Pumping Lemma, LL is not regular.

Example

The language L:={ anbncn  n0 }L := \{\ a^nb^nc^n\ |\ n \geq 0\ \} is not regular.

Proof.
Let p1p \geq 1 be given.
Put s:=apbpcps := a^pb^pc^p. Note that sLs\in L and sp|s|\geq p.
To show: ss is not pumpable.
Let s:=xyzs := xyz be any partition of ss into three parts with yϵy\not=\epsilon and xyp|xy|\leq p.
Then yy must consist of one or more a‘s.
So xy2zxy^2z will have more a ’s than b’s or c’s, and xy2z∉Lxy^2z\not\in L.
Thus no partition works to pump ss.
By the Pumping Lemma, LL is not regular.

Example

The language L:={ w{a,b}   w is a palindrome }L := \{\ w\in\{a,b\}^{*}\ \ |\ w \mathrm{\ is\ a\ palindrome}\ \} is not regular.

Proof.
Let p1p \geq 1 be given.
Put s:=apbaps := a^pba^p. Note that sLs\in L and sp|s|\geq p.
To show: ss is not pumpable.
Let s:=xyzs := xyz be any partition of ss into three parts with yϵy\not=\epsilon and xyp|xy|\leq p.
Then yy must consist of one or more a‘s.
So xy2zxy^2z will have more a ’s before the b than after, and xy2z∉Lxy^2z\not\in L.
Thus no partition works to pump ss.
By the Pumping Lemma, LL is not regular.

Conclusion

  • The Pumping Lemma can be used to prove a language is not regular, but it cannot be used to prove a language is regular.
  • To use the Pumping Lemma to prove a language LL is not regular, we need to find a string that is a member of LL (or at least one such string for every value of pp). This is different from the distinguishable-strings method, where the elements of our set SS are generally not members of LL.

Extra: Justifying the Pumping Lemma

We don’t need to know the proof of the Pumping Lemma in order to use it effectively. But it is not hard to see why it is true, once you know the trick

Lemma (Pumping Lemma)

If LL is a regular language, then there exists p1p\geq 1 such that every string sLs\in L with sp|s|\geq p can be pumped.

Proof.
Suppose LL is a regular language.
Then there is a DFA whose language is LL.
Let pp be the number of states in that DFA.
Let ss be an arbitrary string in LL such that sp|s|\geq p.

Obviously the DFA must accept ss.

But running ss through the DFA visits at least p+1p+1 states, which means that some state was visited at least twice (and at least one such repetition happened within the first pp characters of the input).

That is, the string ss went through a loop on the way from the start state to an accepting state.

Call the part of ss before the first loop xx, the part of ss that takes us around the first loop yy, and call the rest of ss zz.

Then the strings xyizxy^iz take us through the DFA from the start state to the same accepting state for every i0i\geq 0, because these strings take the same path as ss except they avoid the loop entirely, or take the loop multiple times.

Since all these strings are accepted, they must all belong to LL.
Thus, we have shown that ss is pumpable.

Since ss was an arbitrary string in LL with length p\geq p,
all such strings are pumpable.