2.1 Predicate Logic

We argued that natural language could be misleading, and that therefore we wanted to think about valid arguments abstractly. However, only fairly simple arguments can be modeled using propositional logic and we will need something more.

\[ \begin{array}{ll} \hbox{If it is raining, then I take an umbrella.} & P\to Q \\ \hbox{It is raining.} & P \\ \hbox{Therefore, I take an umbrella.} & \therefore Q\\ \end{array} \]

but for this next argument (which involves three atomic propositions), the best we can do in formalizing the argument is to propose a conclusion that doesn’t follow from the premises:

\[ \begin{array}{ll} \hbox{All men are mortal.} & P\to Q \\ \hbox{Socrates is a man.} & P \\ \hbox{Therefore, Scorates is mortal.} & \therefore R\\ \end{array} \]

Moving to Predicate Logic

Predicate Logic (also known as Predicate Calculus or First-Order Logic) extends Propositional Logic with:

\[ \begin{array}{ll} \hbox{All men are mortal.} & \forall x\ (\, P(x)\to Q(x)\,) \\ \hbox{Socrates is a man.} & P(s) \\ \hbox{Therefore, Socrates is mortal.} & \therefore Q(s)\\ \end{array} \]

Terms

Definition

A term is an expression whose value is an individual. Terms can involve constants, variables, and/or function symbols.

More formally, the terms of Predicate Logic are defined inductively as follows:

Any constant symbol \(c, d, \ldots\) is a term.
Any variable \(x, y, \ldots\) is a term.
If \(f\) is an \(n\)-ary function symbol and \(t_1,\ldots, t_n\) are terms, then \(f(t_1,\ldots, t_n)\) is a term.

Here are three terms. Note that terms represent individuals (entities), and do not have a truth value. >\[ >\mathrm{canada} > \quad\qquad > \mathrm{max}(x,\mathrm{sqrt}(7)) > \qquad\qquad > \mathrm{fatherOf}(\mathrm{mary}) > \]

Extending Well-Formed Formulas

The Well-Formed Formulas (WFFs) of Predicate Logic extend the WFFs of Propositional Logic with predicates and relations on terms, as well as universal and existential quantifiers. > More formally, the WFFs of Predicate Logic are defined inductively as follows: > >- If \(P\) is an n-ary predicate symbol and \(t_1,\ldots, t_n\) are terms, then \(P(t_1,\ldots, t_n)\) is a WFF. >- \(\top\) and \(\bot\) are WFFs. >- If \({\cal A}\) is some WFF, then so is \(\lnot {\cal A}\) >- If \({\cal A}\) and \({\cal B}\) are WFFs, so is \(({\cal A}\land {\cal B})\) >- If \({\cal A}\) and \({\cal B}\) are WFFs, so is \(({\cal A}\lor {\cal B})\) >- If \({\cal A}\) and \({\cal B}\) are WFFs, so is \(({\cal A}\to {\cal B})\) >- If \({\cal A}\) is a WFF, so are \(\forall x\ {\cal A}\) and \(\exists x\ {\cal A}\). > As a special case, we consider propositional-logic variables \(P\) to be 0-ary predicate symbols (i.e., taking no arguments).

Here are some WFFs of Predicate Logic. As in Propositional Logic, WFFs are propositions that can be true or false. > >- \(\lnot\mathrm{inhabited}(\mathrm{canada})\) >- \(7 = \mathrm{max}(x, \mathrm{sqrt}(7))\) >- \(\exists x\ (P(x) \land Q(x^2+1))\) >- \(\forall y\ (\exists z\ (z > y))\) >- \(\forall x\ (\mathrm{knows}(x, \mathrm{bob})\ \to\ \mathrm{age}(x) < 99)\)

English vs. Predicate Logic

The notation of Predicate Logic should be fairly familiar from other math and computer science courses. We therefore immediately jump into some examples comparing English-language predicates to Predicate Logic formulas.

Suppose \(G(x)\) means that \(x\) is a grutor, \(S(x)\) means that \(x\) is a student, and \(L(x)\) means that \(x\) is a lecture, and that the constant \(m\) represents the student Mary. Then we can translate formulas as follows: > >1. \(S(m) \land \lnot G(m)\) > > * Mary is a student and not a grutor. > * Or: Mary is a student but not a grutor. > >2. \(\exists x\ L(x)\) > > * There is a lecture. > * Or: There is at least one lecture. > * Or: There exists a lecture.

Now suppose \(K(x,y)\) means that \(x\) knows \(y\), \(A(x,y)\) means \(x\) attended \(y\), and again \(G(x)\) means that \(x\) is a grutor, \(S(x)\) means that \(x\) is a student, and \(L(x)\) means that \(x\) is a lecture, and that the constant \(m\) represents the student Mary. We can translate sentences as follows: > >1. Mary knows herself. > > * \(K(m,m)\) > >2. Every grutor knows Mary. > > * \(\forall x\ (G(x)\to K(x,m))\) > >3. Some grutor knows Mary. > > * \(\exists x\ (G(x)\land K(x,m))\) > >4. Mary knows every grutor. > > * \(\forall x\ K(m,x)\)

First, “Mary knows every grutor” can’t be translated as \(K(m, \forall x\ G(x))\) in an attempt to apply the \(K\) predicate to two arguments representing “Mary” and “every grutor” respectively. From a programming perspective, this formula involves a type error: the arguments to the \(K\) predicate must be terms representing individuals, whereas \(\forall x\ G(x)\) is not an individual but rather a logical predicate (stating that everthing in the universe is a grutor!).

Second, there is a subtle difference between the translation of “every grutor” \[ \forall x\ (G(x)\to \cdots) \] and “some grutor” \[ \exists x\ (G(x)\land \cdots).\]In “Every grutor knows Mary,” the formula is quantifying over every \(x\) in the universe, and saying if an \(x\) happens to be a grutor then \(x\) knows Mary. But in “Some grutor knows Mary” the formula is saying that there is some \(x\) in the universe that both is a grutor and knows Mary.

Trying to use the same logical connective in both translations would be incorrect. For example, \(\forall x\ (G(x)\land K(x,m))\) states that “everything in the universe is a grutor who also knows Mary.”

And \(\exists x\ (G(x)\to K(x,m))\) states that “there is something in the universe such that if it is a grutor then it knows Mary.” The problem is that the implication is immediately true as soon as we find an \(x\) that is not a grutor, so saying some \(x\) exists such that if they are a grutor then they know Mary doesn’t tell use whether or not any grutors know Mary.

Now suppose \(K(x,y)\) mean that \(x\) knows \(y\), \(A(x,y)\) means \(x\) attended \(y\), and again \(G(x)\) means that \(x\) is a grutor, \(S(x)\) means that \(x\) is a student, and \(L(x)\) means that \(x\) is a lecture, and that the constant \(m\) represents the student Mary. We can translate sentences as follows. (There are often several equivalent possibilities.) > >1. Mary attended every lecture. > > * \(\forall x\ A(m,x)\) >
>2. No student attended every lecture. > > * \(\lnot\exists x\ (S(x)\land \forall y\ (L(y)\to A(x,y)))\) > * \(\forall x\ \bigl(S(x)\to \exists y\ (L(y)\land \lnot A(x,y))\bigr)\) > >3. No lecture was attended by every student. > > * \(\lnot\exists x\ (L(x)\land \forall y\ (S(y)\to A(y,x)))\) > * \(\forall x\ \bigl(L(x)\to \exists y\ (S(y)\land \lnot A(y,x))\bigr)\) > >4. No lecture was attended by any student. > > * \(\forall x\ \forall y\ \lnot A(x,y)\) > * \(\lnot\exists x\ \exists y\ A(x,y)\)

Free and Bound Variables

A use of a variable in a term is said to be bound if it inside a quantifier for that variable, and free otherwise.

In the formula \[P(x) \land \forall x\ (R(x) \to T(x,y))\] >- the \(x\) in the \(P(x)\) is free; >- the \(x\) in the \(\forall x\) introduces a local variable \(x\) rather than > uses the variable \(x\) so this \(x\) is neither free nor bound; >- the \(x\) in the \(R(x)\) is bound; >- the \(x\) in the \(T(x,y)\) is bound >- but the \(y\) in the \(T(x,y)\) is free.

First-Order Logic?

What makes this a first-order logic? As defined, we can quantify only over individuals \[ \forall x\ (P(x)\lor\lnot P(x)). \] Higher-Order Logic can quantify over predicates/relations as well: \[ \forall x\ \forall y\ (\, x=y \ \leftrightarrow \ \forall P\ (P(x)\leftrightarrow P(y))\,) \] In practice, first-order logic is enough for most CS purposes.

Models

Suppose we want to know whether \(\forall x\ P(c,x)\) is true. This will depend on our model: - If the domain of individuals is the natural numbers, the constant \(c\) represents the number 0, and \(P\) is the “less-than-or-equal” relation, then \(\forall x\ P(c,x)\) is true (because every natural number is greater than or equal to zero). - if the domain of individuals is people currently alive, \(c\) is Prof. Stone’s cousin Ken, and \(P\) is the “lives-next-door-to” relation, then \(\forall x\ P(c,x)\) is false (because not everyone lives next door to Ken).

A model in Predicate Logic consists of: > >1. The domain, a nonempty set of individuals being quantified over >2. For each function or constant symbol, a corresponding mathematical function on the domain or constant taken from the domain >3. For each predicate or relation symbol, a corresponding mathematical predicate or relation on the domain (specifying which individuals have the property, or are in the relation)

Suppose we want to know whether the following are true: > >1. \(R(c, k(d))\) >2. \(R(c, d)\lor R(d,c)\) >3. \(\exists x\ R(x, x)\) >4. \(\forall x\ R(x, x)\) > >The answer will depend on our model. > >— > >Suppose we choose a model where the domain is the mathematical integers; the constant \(c\) and the constant \(d\) both represent the integer zero; the unary function \(k\) is the successor (a.k.a. add-one) function; and the binary relation \(R\) is the less-than-or-equal relation. In this model, > >1. \(R(c, k(d))\) is true (0 is less-than-or-equal to 1) >2. \(R(c, d)\lor R(d,c)\) is true (0 is less-than-or-equal to 0) >3. \(\exists x\ R(x, x)\) is true (some integer is less-than-or-equal to itself) >4. \(\forall x\ R(x, x)\) is true (all integers are less-than-or-equal to themselves) > >— > >Suppose instead that we choose the model where the domain is books in the library, \(c\) is “Logic in Computer Science”, \(d\) is “Introduction to the Theory of Computation”, \(k\) is the function that given a book, returns the first edition of that book, and \(R\) is the relation on books that is true if both books were authored by a single person. In this model, > >1. \(R(c, k(d))\) is false (“Logic in Computer Science” has two authors) >2. \(R(c, d)\lor R(d,c)\) is false (“Logic in Computer Science” has two authors) >3. \(\exists x\ R(x, x)\) is true (“Introduction to the Theory of Computation” has a single author) >4. \(\forall x\ R(x, x)\) is false (not all books have a single author)

Suppose we define a model where: > >- The domain is just the numbers 0 and 1; >- \(P(0)\) and \(P(1)\) are true; >- \(Q(0)\) and \(Q(1)\) are false. > In this model, it’s easy to see that \(\forall x\ P(x)\) and \(\exists y\ (P(y)\lor Q(y))\) are both true, but \(\exists x\ Q(x)\) and \(\exists y\ (P(y)\to Q(y))\) are both false. > We could consider even smaller domains, just one element. (That’s as small as we can go because in classical logic, domains have to be nonempty by definition.) Then we’d have to specify whether predicates are true or false for this one value. > Or, we could define bigger models For example, this model: > >- The domain is the set of all integers \(\geq\) 2 >- \(P(n)\) is true when \(n\) is prime >- \(Q(n)\) is true when \(n\) is a perfect square > is a model where \(\forall x\ P(x)\) is false but \(\exists x\ Q(x)\) and \(\exists y\ (P(y)\to Q(y))\) are both true.

Entailment

We say that assumptions \(A_1,\ldots,A_n\) entail a conclusion \({\cal B}\) if in every situation (model) where the assumptions are all true, the conclusion is true as well. In this case we write \[ A_1,\ldots,A_n \ \vDash \ {\cal B} \]

The definition of entailment remains unchanged, except now we have much more freedom in defining our models (e.g., any nonempty set of individuals as the domain), so checking entailment directly now means checking infinitely many models!

Logical Equivalences

The definition of logical equivalence is also unchanged, except now we are using the new Predicate-Logic version of entailment:

WFFs \({\cal A}\) and \({\cal B}\) are logically equivalent if \({\cal A} \vDash {\cal B}\) and \({\cal B} \vDash {\cal A}\). In this case, we write \[ {\cal A} \equiv {\cal B}.\] Another way to say this is that \({\cal A} \equiv {\cal B}\) whenever \({\cal A}\) and \({\cal B}\) have the same truth value in every model.

The following equivalences are important, and should be memorized. > >- \(\forall x\ \forall y\ P(x,y) \ \ \equiv \ \ \forall y\ \forall x\ P(x,y)\) >- \(\exists x\ \exists y\ P(x,y) \ \ \equiv \ \ \exists y\ \exists x\ P(x,y)\) >- \(\lnot (\forall x\ P(x)) \ \ \equiv \ \ \exists x\ \lnot P(x)\) >- \(\lnot (\exists x\ P(x)) \ \ \equiv \ \ \forall x\ \lnot P(x)\)

Note that the first two equivalences let us swap two instances of the same quantifier, but in general \(\forall x\ \exists y\) is different from \(\exists y\ \forall x\).