2.1 Predicate Logic

“Todd, trust math. As in Matics, Math E. First-order predicate logic. Never fail you.” - David Foster Wallace

We argued that natural language could be misleading, and that therefore we wanted to think about valid arguments abstractly. However, only fairly simple arguments can be modeled using propositional logic and we will need something more.

For example, we can model this argument and verify its correctness:

\[ \begin{array}{ll} \hbox{If it is raining, then I take an umbrella.} & P\to Q \\ \hbox{It is raining.} & P \\ \hbox{Therefore, I take an umbrella.} & \therefore Q\\ \end{array} \]

but for this next argument (which involves three atomic propositions), the best we can do in formalizing the argument is to propose a conclusion that doesn’t follow from the premises:

\[ \begin{array}{ll} \hbox{All men are mortal.} & P\to Q \\ \hbox{Socrates is a man.} & P \\ \hbox{Therefore, Scorates is mortal.} & \therefore R\\ \end{array} \]

Moving to Predicate Logic

Predicate Logic (also known as Predicate Calculus or First-Order Logic) extends Propositional Logic with:

Specifically, we can formalize the second argument above as:

\[ \begin{array}{ll} \hbox{All men are mortal.} & \forall x\ (\, P(x)\to Q(x)\,) \\ \hbox{Socrates is a man.} & P(s) \\ \hbox{Therefore, Socrates is mortal.} & \therefore Q(s)\\ \end{array} \]

Terms

Definition

A term is an expression whose value is an individual. Terms can involve constants, variables, and/or function symbols.

More formally, the terms of Predicate Logic are defined inductively as follows:

Example

Here are three terms. Note that terms represent individuals (entities), and do not have a truth value. >\[ >\mathrm{canada} > \quad\qquad > \mathrm{max}(x,\mathrm{sqrt}(7)) > \qquad\qquad > \mathrm{fatherOf}(\mathrm{mary}) > \]

Extending Well-Formed Formulas

Definition

The Well-Formed Formulas (WFFs) of Predicate Logic extend the WFFs of Propositional Logic with predicates and relations on terms, as well as universal and existential quantifiers. > More formally, the WFFs of Predicate Logic are defined inductively as follows: > >- If \(P\) is an n-ary predicate symbol and \(t_1,\ldots, t_n\) are terms, then \(P(t_1,\ldots, t_n)\) is a WFF. >- \(\top\) and \(\bot\) are WFFs. >- If \({\cal A}\) is some WFF, then so is \(\lnot {\cal A}\) >- If \({\cal A}\) and \({\cal B}\) are WFFs, so is \(({\cal A}\land {\cal B})\) >- If \({\cal A}\) and \({\cal B}\) are WFFs, so is \(({\cal A}\lor {\cal B})\) >- If \({\cal A}\) and \({\cal B}\) are WFFs, so is \(({\cal A}\to {\cal B})\) >- If \({\cal A}\) is a WFF, so are \(\forall x\ {\cal A}\) and \(\exists x\ {\cal A}\). > As a special case, we consider propositional-logic variables \(P\) to be 0-ary predicate symbols (i.e., taking no arguments).

Example

Here are some WFFs of Predicate Logic. As in Propositional Logic, WFFs are propositions that can be true or false. > >- \(\lnot\mathrm{inhabited}(\mathrm{canada})\) >- \(7 = \mathrm{max}(x, \mathrm{sqrt}(7))\) >- \(\exists x\ (P(x) \land Q(x^2+1))\) >- \(\forall y\ (\exists z\ (z > y))\) >- \(\forall x\ (\mathrm{knows}(x, \mathrm{bob})\ \to\ \mathrm{age}(x) < 99)\)

English vs. Predicate Logic

The notation of Predicate Logic should be fairly familiar from other math and computer science courses. We therefore immediately jump into some examples comparing English-language predicates to Predicate Logic formulas.

Example

Suppose \(G(x)\) means that \(x\) is a grutor, \(S(x)\) means that \(x\) is a student, and \(L(x)\) means that \(x\) is a lecture, and that the constant \(m\) represents the student Mary. Then we can translate formulas as follows: > >1. \(S(m) \land \lnot G(m)\) > > * Mary is a student and not a grutor. > * Or: Mary is a student but not a grutor. > >2. \(\exists x\ L(x)\) > > * There is a lecture. > * Or: There is at least one lecture. > * Or: There exists a lecture.

Example

Now suppose \(K(x,y)\) means that \(x\) knows \(y\), \(A(x,y)\) means \(x\) attended \(y\), and again \(G(x)\) means that \(x\) is a grutor, \(S(x)\) means that \(x\) is a student, and \(L(x)\) means that \(x\) is a lecture, and that the constant \(m\) represents the student Mary. We can translate sentences as follows: > >1. Mary knows herself. > > * \(K(m,m)\) > >2. Every grutor knows Mary. > > * \(\forall x\ (G(x)\to K(x,m))\) > >3. Some grutor knows Mary. > > * \(\exists x\ (G(x)\land K(x,m))\) > >4. Mary knows every grutor. > > * \(\forall x\ K(m,x)\)

There are a couple of points that should be noted about these examples.

First, “Mary knows every grutor” can’t be translated as \(K(m, \forall x\ G(x))\) in an attempt to apply the \(K\) predicate to two arguments representing “Mary” and “every grutor” respectively. From a programming perspective, this formula involves a type error: the arguments to the \(K\) predicate must be terms representing individuals, whereas \(\forall x\ G(x)\) is not an individual but rather a logical predicate (stating that everthing in the universe is a grutor!).

Second, there is a subtle difference between the translation of “every grutor” \[ \forall x\ (G(x)\to \cdots) \] and “some grutor” \[ \exists x\ (G(x)\land \cdots).\]In “Every grutor knows Mary,” the formula is quantifying over every \(x\) in the universe, and saying if an \(x\) happens to be a grutor then \(x\) knows Mary. But in “Some grutor knows Mary” the formula is saying that there is some \(x\) in the universe that both is a grutor and knows Mary.

Trying to use the same logical connective in both translations would be incorrect. For example, \(\forall x\ (G(x)\land K(x,m))\) states that “everything in the universe is a grutor who also knows Mary.”

And \(\exists x\ (G(x)\to K(x,m))\) states that “there is something in the universe such that if it is a grutor then it knows Mary.” The problem is that the implication is immediately true as soon as we find an \(x\) that is not a grutor, so saying some \(x\) exists such that if they are a grutor then they know Mary doesn’t tell use whether or not any grutors know Mary.

Example

Now suppose \(K(x,y)\) mean that \(x\) knows \(y\), \(A(x,y)\) means \(x\) attended \(y\), and again \(G(x)\) means that \(x\) is a grutor, \(S(x)\) means that \(x\) is a student, and \(L(x)\) means that \(x\) is a lecture, and that the constant \(m\) represents the student Mary. We can translate sentences as follows. (There are often several equivalent possibilities.) > >1. Mary attended every lecture. > > * \(\forall x\ A(m,x)\) >
>2. No student attended every lecture. > > * \(\lnot\exists x\ (S(x)\land \forall y\ (L(y)\to A(x,y)))\) > * \(\forall x\ \bigl(S(x)\to \exists y\ (L(y)\land \lnot A(x,y))\bigr)\) > >3. No lecture was attended by every student. > > * \(\lnot\exists x\ (L(x)\land \forall y\ (S(y)\to A(y,x)))\) > * \(\forall x\ \bigl(L(x)\to \exists y\ (S(y)\land \lnot A(y,x))\bigr)\) > >4. No lecture was attended by any student. > > * \(\forall x\ \forall y\ \lnot A(x,y)\) > * \(\lnot\exists x\ \exists y\ A(x,y)\)

Free and Bound Variables

Definition

A use of a variable in a term is said to be bound if it inside a quantifier for that variable, and free otherwise.

Example

In the formula \[P(x) \land \forall x\ (R(x) \to T(x,y))\] >- the \(x\) in the \(P(x)\) is free; >- the \(x\) in the \(\forall x\) introduces a local variable \(x\) rather than > uses the variable \(x\) so this \(x\) is neither free nor bound; >- the \(x\) in the \(R(x)\) is bound; >- the \(x\) in the \(T(x,y)\) is bound >- but the \(y\) in the \(T(x,y)\) is free.

Why do we care? For two reasons.

  1. Formulas with no free variables are simply true or false (depending on the model, as we’ll see shortly) \[ \forall x\ \exists s\ (x\in s) \]In contrast, formulas with free variables are true or false, depending on the model and the values of the free variables. So the following formula is not a simple true/false proposition, but rather a proposition “about \(s\)\[ \exists x\ (x\in s) \]
  2. Secondly, the names of free variables matter, but the names of bound variables (mostly) don’t matter: \[ \begin{array}{c} \exists x\ (x > w) \\ = \\ \exists y\ (y > w) \\ = \\ \exists z\ (z > w) \\ \not= \\ \exists w\ (w > w) \\ \end{array} \qquad \begin{array}{c} \exists x\ (x > y) \\ \not= \\ \exists x\ (x > n) \\ \not= \\ \exists x\ (x > m) \\ \not= \\ \exists w\ (w > i) \\ \end{array} \]The first three formulas on the left side are ways of saying “there’s something bigger than \(w\),” but the last formula on the left says “there’s something bigger than itself” (if we accidentally choose the wrong name for our bound variable and capture a free variable), and the formulas on the right column are different claims, i.e., that there’s something bigger than \(y\), that there’s something bigger than \(n\), etc.

First-Order Logic?

What makes this a first-order logic? As defined, we can quantify only over individuals \[ \forall x\ (P(x)\lor\lnot P(x)). \] Higher-Order Logic can quantify over predicates/relations as well: \[ \forall x\ \forall y\ (\, x=y \ \leftrightarrow \ \forall P\ (P(x)\leftrightarrow P(y))\,) \] In practice, first-order logic is enough for most CS purposes.

Models

Suppose we want to know whether \(\forall x\ P(c,x)\) is true. This will depend on our model: - If the domain of individuals is the natural numbers, the constant \(c\) represents the number 0, and \(P\) is the “less-than-or-equal” relation, then \(\forall x\ P(c,x)\) is true (because every natural number is greater than or equal to zero). - if the domain of individuals is people currently alive, \(c\) is Prof. Stone’s cousin Ken, and \(P\) is the “lives-next-door-to” relation, then \(\forall x\ P(c,x)\) is false (because not everyone lives next door to Ken).

Definition

A model in Predicate Logic consists of: > >1. The domain, a nonempty set of individuals being quantified over >2. For each function or constant symbol, a corresponding mathematical function on the domain or constant taken from the domain >3. For each predicate or relation symbol, a corresponding mathematical predicate or relation on the domain (specifying which individuals have the property, or are in the relation)

Example

Suppose we want to know whether the following are true: > >1. \(R(c, k(d))\) >2. \(R(c, d)\lor R(d,c)\) >3. \(\exists x\ R(x, x)\) >4. \(\forall x\ R(x, x)\) > >The answer will depend on our model. > >— > >Suppose we choose a model where the domain is the mathematical integers; the constant \(c\) and the constant \(d\) both represent the integer zero; the unary function \(k\) is the successor (a.k.a. add-one) function; and the binary relation \(R\) is the less-than-or-equal relation. In this model, > >1. \(R(c, k(d))\) is true (0 is less-than-or-equal to 1) >2. \(R(c, d)\lor R(d,c)\) is true (0 is less-than-or-equal to 0) >3. \(\exists x\ R(x, x)\) is true (some integer is less-than-or-equal to itself) >4. \(\forall x\ R(x, x)\) is true (all integers are less-than-or-equal to themselves) > >— > >Suppose instead that we choose the model where the domain is books in the library, \(c\) is “Logic in Computer Science”, \(d\) is “Introduction to the Theory of Computation”, \(k\) is the function that given a book, returns the first edition of that book, and \(R\) is the relation on books that is true if both books were authored by a single person. In this model, > >1. \(R(c, k(d))\) is false (“Logic in Computer Science” has two authors) >2. \(R(c, d)\lor R(d,c)\) is false (“Logic in Computer Science” has two authors) >3. \(\exists x\ R(x, x)\) is true (“Introduction to the Theory of Computation” has a single author) >4. \(\forall x\ R(x, x)\) is false (not all books have a single author)

Example

Suppose we define a model where: > >- The domain is just the numbers 0 and 1; >- \(P(0)\) and \(P(1)\) are true; >- \(Q(0)\) and \(Q(1)\) are false. > In this model, it’s easy to see that \(\forall x\ P(x)\) and \(\exists y\ (P(y)\lor Q(y))\) are both true, but \(\exists x\ Q(x)\) and \(\exists y\ (P(y)\to Q(y))\) are both false. > We could consider even smaller domains, just one element. (That’s as small as we can go because in classical logic, domains have to be nonempty by definition.) Then we’d have to specify whether predicates are true or false for this one value. > Or, we could define bigger models For example, this model: > >- The domain is the set of all integers \(\geq\) 2 >- \(P(n)\) is true when \(n\) is prime >- \(Q(n)\) is true when \(n\) is a perfect square > is a model where \(\forall x\ P(x)\) is false but \(\exists x\ Q(x)\) and \(\exists y\ (P(y)\to Q(y))\) are both true.

Entailment

Definition

We say that assumptions \(A_1,\ldots,A_n\) entail a conclusion \({\cal B}\) if in every situation (model) where the assumptions are all true, the conclusion is true as well. In this case we write \[ A_1,\ldots,A_n \ \vDash \ {\cal B} \]

The definition of entailment remains unchanged, except now we have much more freedom in defining our models (e.g., any nonempty set of individuals as the domain), so checking entailment directly now means checking infinitely many models!

Logical Equivalences

The definition of logical equivalence is also unchanged, except now we are using the new Predicate-Logic version of entailment:

Definition

WFFs \({\cal A}\) and \({\cal B}\) are logically equivalent if \({\cal A} \vDash {\cal B}\) and \({\cal B} \vDash {\cal A}\). In this case, we write \[ {\cal A} \equiv {\cal B}.\] Another way to say this is that \({\cal A} \equiv {\cal B}\) whenever \({\cal A}\) and \({\cal B}\) have the same truth value in every model.

Example

The following equivalences are important, and should be memorized. > >- \(\forall x\ \forall y\ P(x,y) \ \ \equiv \ \ \forall y\ \forall x\ P(x,y)\) >- \(\exists x\ \exists y\ P(x,y) \ \ \equiv \ \ \exists y\ \exists x\ P(x,y)\) >- \(\lnot (\forall x\ P(x)) \ \ \equiv \ \ \exists x\ \lnot P(x)\) >- \(\lnot (\exists x\ P(x)) \ \ \equiv \ \ \forall x\ \lnot P(x)\)

Note that the first two equivalences let us swap two instances of the same quantifier, but in general \(\forall x\ \exists y\) is different from \(\exists y\ \forall x\).

Previous: Natural Deduction - Classical Rules

Next: 2.2 Predicate Formulae