The Context-Free Languages include the Regular languages and more, but not all languages are Context-Free.
For example, is Context-Free (but not Regular), but is neither Regular nor Context-Free.
Similarly, the language of all even-length binary palindromes is Context-Free (but not Regular), but is neither Regular nor Context-Free.
This makes a certain intuitive sense; is context free because a PDA can use the stack to store the ’s, and then match them up with the ‘s. But is not context-free because once we’ve matched the ’s with the ’s, the stack is empty and we don’t know how to check the number of ‘s.
But that’s not a proof; how do we know there’s no cleverer way to store ’s on the stack that lets us match the number of ’s and ’s? To get a proof, we need to find a property shared by all context-free languages, and then show that the lacks that property.
There is no nice analogue of “Distinguishable Strings” for context-free languages. but context-free languages still have a property similar to (but slightly different from) the pumping property of regular languages, and so there is a pumping lemma for context-free languages.
Given a language and a “pumping length” ,
we say a string (with ) can be
CF-pumped if
If and ,
then can be CF-pumped.
There are many ways to show this, including
In all cases and for every .
The utility of this definition comes from the following fact:
If is a context-free language, then there exists such that every string with can be CF-pumped.
More importantly the contrapositive holds; if for every there is a string with that cannot be CF-pumped, then is not context-free.
The language is not context-free.
Proof.
Let be given.
Put . Note that and .
To show: is not CF-pumpable.
Let be any partition of into five parts with
and .\
There are five cases for :
cannot contain ’s and ’s and ’s, because and the ’s and ’s have ’s in-between them. In all five cases, it is clear that will increase the count of one or two letters (whatever letters occur in ) but not all three, and so .\
Thus no partition works to pump .
By the Pumping Lemma for Context-Free Languages, is not regular.
The language is not context-free.
Proof.
Let be given.
Put . Note that and .
To show: is not CF-pumpable.
Let be any partition of into five parts with
and .\
There are three cases for :
The fact that rules out any other possibility.
Thus no partition works to pump .
By the Pumping Lemma for Context-Free Languages, is not regular.
We don’t need to know the proof of the Pumping Lemma for CFLsin order to use it effectively. But it is not hard to see why it is true, once you know the trick.
If is a context-free language, then there exists such that every string with can be CF-pumped.
Proof.
Suppose is a context-free language with grammar
.
Then there is a value (depending on the number of nonterminals in the grammar and the maximum “branchiness” of parse trees in this grammar) such that every string with is so long, that its parse tree is so tall, that there is a path with at least nonterminals.
By the Pigeonhole Principle, this means that given an arbitrary , a parse tree for has a path where some nonterminal appears more than once:

So there is a big legal subtree with at the root, and a small legal subtree with at the root.
When a parse tree contains a node , any valid subtree (anything the grammar allows us to derive from ) can appear below it. It follows (by “tree surgery”) that infinitely many other legal parse trees must exist, including:

Since these are all valid parse trees, the corresponding strings (of the form ) must also be in the language .