First-order logic, semantically speaking

Peter Susanszky

To save space and time, we will immediately formulate our notions for sets of formulas. As noted above, this is not a problem since single formulas are just special sets of formulas. Namely, for any formula $X$ , there is a singleton set $\{X\}$ .

Satisfiability

As before, we will start with the notion of satisfiability for sets of formulas (including singleton sets), and define validity using the notion of satisfiability. Thus, we have:

Definition 8.1 (Satisfiability). A set $S$ of formulas of $\mathcal{L}_1$ is first-order satisfiable or semantically consistent iff there is a structure $\mathbf{S}$ and an assignment $\mathbf{a}$ such that $\mathbf{S} \models X[\mathbf{a}]$ for each $X \in S$ . We say $S$ is first-order unsatisfiable or semantically inconsistent provided it is not semantically consistent.

Exercise 8.1. By the above, it follows that a set of formulas is semantically inconsistent provided there is no structure $\mathbf{S}$ and assignment $\mathbf{a}$ under which every formula in the set is simultaneously satisfied. Explain why this is the case in your own words using the definition above.

Checking whether a set of formulas is first-order satisfiable is checking whether there is a structure and an assignment that assigns a member of the domain of that structure to the variables of the language such that every formula in the set comes out satisfied. This sounds very complicated, but as usual, the underlying idea is very simple.

Take, for example, the following formula (considered as a singleton set): $P(x) \wedge \forall y P(y)$

Is this formula satisfiable? To answer this in the positive, we need to find a structure and a variable assignment in which it is satisfied. In fact, there are many structures and assignments that satisfy this formula, so it is satisfiable. Here is one.

Let $\mathbf{S}$ be a structure such that $\mathbf{D}=\{\mathsf{a}\}$ , and $\mathbf{I}(P)=\{\mathsf{a}\}$ . For any variable $x$ , let $\mathbf{a}(x)=\mathsf{a}$ . So there is a single object $\mathsf{a}$ in the structure, and that single object is in the set of things that are $P$ . Moreover, every variable is assigned $\mathsf{a}$ under the variable assignment $\mathbf{a}$ .

Based on this, we have that $\mathbf{S} \models P(x)[\mathbf{a}]$ , since $\mathbf{a}(x) \in \mathbf{I}(P)$ . Moreover, we also have that $\mathbf{S} \models \forall y P(y)[\mathbf{a}]$ , since under every $y$ -variant assignment to $\mathbf{a}$ , $P(y)$ is satisfied. In other words (and some symbols), $\mathbf{S} \models P(y)[\mathbf{a}']$ for every $y$ -variant assignment $\mathbf{a}'$ , simply because every assignment points to $\mathsf{a}$ (as there is nothing else in the domain). So since both are satisfied under $\mathbf{a}$ in $\mathbf{S}$ , their conjunction $P(x) \wedge \forall y P(y)$ is also satisfied in the same structure under the same assignment. So the formula $P(x) \wedge \forall y P(y)$ is satisfiable.
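For readers who like to experiment, the check just carried out by hand can be sketched in Python. Everything about the encoding is an assumption made for illustration only: formulas are nested tuples, the function name `satisfies` is hypothetical, and the assignment lists only the variables $x$ and $y$ that actually occur in the formula.

```python
# Hypothetical encoding (not part of the text): formulas are nested tuples,
# a structure is a domain plus an interpretation dict, and an assignment
# is a dict from variable names to objects of the domain.

def satisfies(domain, interp, formula, assignment):
    """Return True iff the structure satisfies `formula` under `assignment`."""
    op = formula[0]
    if op == "atom":                      # ("atom", "P", "x") encodes P(x)
        _, pred, var = formula
        return assignment[var] in interp[pred]
    if op == "and":
        return (satisfies(domain, interp, formula[1], assignment)
                and satisfies(domain, interp, formula[2], assignment))
    if op == "all":                       # ("all", "y", body) encodes "for all y, body"
        _, var, body = formula
        # Every var-variant of the assignment must satisfy the body.
        return all(satisfies(domain, interp, body, {**assignment, var: d})
                   for d in domain)
    raise ValueError(f"unknown operator: {op}")

# The structure S and the assignment a from the text.
S_domain = {"a"}
S_interp = {"P": {"a"}}
a = {"x": "a", "y": "a"}                  # only the variables we need

P_x = ("atom", "P", "x")
all_y_P_y = ("all", "y", ("atom", "P", "y"))
conjunction = ("and", P_x, all_y_P_y)

result = satisfies(S_domain, S_interp, conjunction, a)
print(result)  # True
```

The quantifier clause mirrors the definition: it builds each $y$ -variant of the assignment and checks the body under it.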

There are many other formulas that are satisfiable because they are satisfied in the above structure $\mathbf{S}$ and assignment $\mathbf{a}$ .

Exercise 8.2. Show that each of the following formulas is first-order satisfiable by showing that it is satisfied in $\mathbf{S}$ under $\mathbf{a}$ , as defined above.

$\exists x_1 P(x) \wedge \neg \forall y \neg P(y)$
$\forall x(P(x) \vee \neg P(x))$
$\forall x P(x) \vee \neg P(x)$
$\neg P(y) \rightarrow \neg P(y)$
$\forall x (P(x) \rightarrow \exists y P(y))$

Remark 8.1. Note that this entails that the set of formulas $\{\exists x_1 P(x) \wedge \neg \forall y \neg P(y), \forall x(P(x) \vee \neg P(x)), \forall x P(x) \vee \neg P(x), \neg P(y) \rightarrow \neg P(y), \forall x (P(x) \rightarrow \exists y P(y))\}$ is satisfiable, since you just showed that they are each satisfied in the same structure under the same assignment.

Exercise 8.3. Write five formulas using only the predicate $P$ and no constants such that they are each satisfied in $\mathbf{S}$ under $\mathbf{a}$ , and thus satisfiable in first-order logic.

Of course, some sets of formulas require more elaborate structures and assignments to be satisfied. And in general, the fact that a set of formulas is not satisfied in a given structure under a given assignment does not entail that it is unsatisfiable. To show the latter, you would have to show that there is no structure and assignment whatsoever under which every formula in the set is satisfied.

Take, for example, the formula $P(x) \wedge \neg P(y)$ . This is not satisfied in $\mathbf{S}$ under any variable assignment $\mathbf{a}^*$ , simply because it requires an object to not be $P$ . So $\mathbf{S} \not\models P(x) \wedge \neg P(y)[\mathbf{a}^*]$ for any $\mathbf{a}^*$ . On the other hand, this does not mean that $P(x) \wedge \neg P(y)$ is unsatisfiable. Indeed, it is easy to think of a structure and an assignment that satisfies it.

Let $\mathbf{W}$ be the structure such that $\mathbf{D}=\{\textsf{a}, \textsf{b}\}$ , $\mathbf{I}(P)=\{\textsf{a}\}$ . Moreover, let $\mathbf{b}(x)=\textsf{a}$ and $\mathbf{b}(y)=\textsf{b}$ . Now $\mathbf{W} \models P(x) \wedge \neg P(y)[\mathbf{b}]$ . Why? It is because $\mathbf{W} \models P(x)[\mathbf{b}]$ , for $\mathbf{b}(x) \in \mathbf{I}(P)$ , and $\mathbf{W} \models \neg P(y)[\mathbf{b}]$ , since $\mathbf{b}(y) \notin \mathbf{I}(P)$ . So $P(x) \wedge \neg P(y)$ is satisfiable, since there is a structure and an assignment that satisfy it, as just demonstrated.
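The contrast between $\mathbf{S}$ and $\mathbf{W}$ can be replayed with a small Python sketch. The tuple encoding of formulas and the function name `satisfies` are assumptions made for this illustration; they are not notation from the text.

```python
from itertools import product

def satisfies(domain, interp, formula, assignment):
    """Return True iff the structure satisfies `formula` under `assignment`."""
    op = formula[0]
    if op == "atom":                      # ("atom", "P", "x") encodes P(x)
        _, pred, var = formula
        return assignment[var] in interp[pred]
    if op == "not":
        return not satisfies(domain, interp, formula[1], assignment)
    if op == "and":
        return (satisfies(domain, interp, formula[1], assignment)
                and satisfies(domain, interp, formula[2], assignment))
    raise ValueError(f"unknown operator: {op}")

# P(x) and not P(y)
formula = ("and", ("atom", "P", "x"), ("not", ("atom", "P", "y")))

# In S, try every way of assigning x and y to objects of the domain.
S_domain, S_interp = ["a"], {"P": {"a"}}
in_S = [satisfies(S_domain, S_interp, formula, {"x": vx, "y": vy})
        for vx, vy in product(S_domain, repeat=2)]
print(any(in_S))  # False: no assignment in S works

# In W, the assignment b from the text does the job.
W_domain, W_interp = ["a", "b"], {"P": {"a"}}
b = {"x": "a", "y": "b"}
in_W = satisfies(W_domain, W_interp, formula, b)
print(in_W)  # True
```

Failing in $\mathbf{S}$ says nothing about satisfiability in general; one success in $\mathbf{W}$ under $\mathbf{b}$ settles the question.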

Note that when it comes to open formulas, assignments matter! On the other hand, when it comes to sentences, we have an easier time. You may remember that a sentence is either satisfied under every assignment in a structure or under none, and is therefore true or false in that structure. So the only thing you need to care about in these cases is whether the sentence is true in the structure or not. This is because if the sentence is true in a structure, it is satisfied under every assignment in that structure by definition, so it is satisfiable, as per our definition of satisfiability. Not coincidentally, this is exactly how zeroth-order satisfiability was defined, as zeroth-order languages do not have open formulas.

Here is a simple example. Suppose you have a first-order sentence $P(b) \vee \neg P(a)$ . Let the interpretation $\mathbf{I}$ of $\mathbf{S}$ be such that $\mathbf{I}(b)=\mathbf{I}(a)=\mathsf{a}$ (for there is nothing else in the domain of $\mathbf{S}$ ). Now since $\mathsf{a} \in \mathbf{I}(P)$ , $\mathbf{S} \models P(b)$ , so $\mathbf{S} \models P(b) \vee \neg P(a)$ , and the sentence is true in $\mathbf{S}$ . So the sentence is satisfied in $\mathbf{S}$ under every assignment. So it is satisfiable.

Remember: for a set of formulas to be satisfiable, it is enough to find a single structure and variable assignment under which each formula is satisfied. Moreover, if a sentence (closed formula) is true in a structure, by definition, it is satisfied under every assignment in that structure, and thus it is satisfied in a structure under an assignment, and thus it is satisfiable full stop. So if a set of sentences are all true in a structure, they are satisfied in that structure, so they are first-order satisfiable.

So some sets of formulas are satisfiable, and for each satisfiable set, it is enough that there is one structure and one assignment under which every formula in the set is satisfied (in the same structure, under the same assignment). Now let’s inspect some examples of sets of formulas that are not satisfiable. In other words, sets of formulas for which no structure and assignment can be found under which they are all satisfied. As noted, we call such sets (first-order) unsatisfiable or semantically inconsistent.

As a simple example, take the sentence $P(a) \wedge \neg P(a)$ . You may already see that this is a sentence of form $X \wedge \neg X$ , which is the most basic form of a contradiction. As we have demonstrated previously, there is no structure $\mathbf{S}$ in which it could be true, simply by the way $\wedge$ and $\neg$ are defined. But this also means that no matter what structure $\mathbf{S}^*$ and assignment $\mathbf{a}^*$ we choose, $\mathbf{S}^* \not\models P(a) \wedge \neg P(a)[\mathbf{a}^*]$ , so indeed, this formula is first-order unsatisfiable.

Note that the same goes for first-order sentences that include quantifiers, like $\exists x P(x) \wedge \neg \exists x P(x)$ . This sentence is unsatisfiable precisely for the same reason as $P(a) \wedge \neg P(a)$ is unsatisfiable. Simply because if $\exists x P(x) \wedge \neg \exists x P(x)$ were satisfied in a structure under an assignment, then $\exists x P(x)$ would have to be satisfied, and $\neg \exists x P(x)$ would also have to be satisfied. But by definition, if $\exists x P(x)$ is satisfied, $\neg \exists x P(x)$ is not, and if $\neg \exists x P(x)$ is satisfied, $\exists x P(x)$ is not. So there is no way to satisfy this formula. In general, any formula of form $X \wedge \neg X$ , where $X$ is any formula, is first-order unsatisfiable for this reason.

Note: just as a set of sentences is first-order satisfiable if all of its members are true in some structure, a set of sentences is first-order unsatisfiable if in every structure at least one of its members is false. So in both cases, variable assignments may be disregarded, as there are no free variables.

It is important to note that some first-order unsatisfiable sets of formulas are unsatisfiable not solely because of how the connectives are defined (which coincides with zeroth-order unsatisfiability), but essentially because of how the variables and quantifiers function. For example, take $P(x) \wedge \forall y \neg P(y)$ . This formula is not of the form $X \wedge \neg X$ , or any zeroth-order contradiction for that matter, yet it is first-order unsatisfiable. It is not hard to see why this is the case.

Suppose $P(x) \wedge \forall y \neg P(y)$ were first-order satisfiable. Then, there would be a structure $\mathbf{S}$ and assignment $\mathbf{a}$ such that $\mathbf{S} \models P(x) \wedge \forall y \neg P(y)[\mathbf{a}]$ . In turn, this would mean that $\mathbf{S} \models P(x)[\mathbf{a}]$ , that is, that $\mathbf{a}(x) \in \mathbf{I}(P)$ . On the other hand, we would also have that $\mathbf{S} \models \forall y \neg P(y)[\mathbf{a}]$ , which would mean that for every $y$ -variant $\mathbf{a}'$ , $\mathbf{S} \models \neg P(y)[\mathbf{a}']$ , so $\mathbf{S} \not\models P(y)[\mathbf{a}']$ for every $y$ -variant $\mathbf{a}'$ , and so $\mathbf{a}'(y) \notin \mathbf{I}(P)$ for every $y$ -variant $\mathbf{a}'$ . But this is only true if $\mathbf{I}(P)$ is empty, otherwise we would be able to find a $y$ -variant assignment $\mathbf{a}'$ such that $\mathbf{a}'(y) \in \mathbf{I}(P)$ . On the other hand, we also showed that $\mathbf{a}(x) \in \mathbf{I}(P)$ , so it should not be empty! In other words, these two conjuncts cannot be satisfied at the same time, since one requires $\mathbf{I}(P)$ to have at least one member, and the other requires it to be empty. So $P(x) \wedge \forall y \neg P(y)$ cannot be satisfied, and so is not first-order satisfiable.
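As a sanity check only, one can let a computer search small finite structures for a model of this formula. The sketch below is a hedged illustration: the tuple encoding and the names `satisfies` and `find_model` are hypothetical, and the search covers only domains of size 1 to 3. Coming up empty on such a search is not a proof of unsatisfiability; the argument above is what does the real work.

```python
from itertools import chain, combinations, product

def satisfies(domain, interp, formula, assignment):
    """Return True iff the structure satisfies `formula` under `assignment`."""
    op = formula[0]
    if op == "atom":                      # ("atom", "P", "x") encodes P(x)
        _, pred, var = formula
        return assignment[var] in interp[pred]
    if op == "not":
        return not satisfies(domain, interp, formula[1], assignment)
    if op == "and":
        return (satisfies(domain, interp, formula[1], assignment)
                and satisfies(domain, interp, formula[2], assignment))
    if op == "all":                       # every var-variant must satisfy the body
        _, var, body = formula
        return all(satisfies(domain, interp, body, {**assignment, var: d})
                   for d in domain)
    raise ValueError(f"unknown operator: {op}")

def find_model(formula, variables, max_size):
    """Search domains of size 1..max_size for a structure and assignment."""
    for n in range(1, max_size + 1):
        domain = list(range(n))
        # Every subset of the domain is a candidate for I(P).
        extensions = chain.from_iterable(
            combinations(domain, k) for k in range(n + 1))
        for ext in extensions:
            interp = {"P": set(ext)}
            for values in product(domain, repeat=len(variables)):
                assignment = dict(zip(variables, values))
                if satisfies(domain, interp, formula, assignment):
                    return domain, interp, assignment
    return None

# P(x) and for all y, not P(y)
unsat = ("and", ("atom", "P", "x"), ("all", "y", ("not", ("atom", "P", "y"))))
print(find_model(unsat, ["x", "y"], 3))   # None

# For contrast, P(x) and not P(y) does have a small model.
sat = ("and", ("atom", "P", "x"), ("not", ("atom", "P", "y")))
witness = find_model(sat, ["x", "y"], 3)
```

The second search succeeds with a two-element domain, in line with the structure $\mathbf{W}$ discussed earlier.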

Exercise 8.4. Show that the first-order sentence $\exists x P(x) \wedge \forall y \neg P(y)$ is unsatisfiable. Hint: it is unsatisfiable for essentially the same reasons as $P(x) \wedge \forall y \neg P(y)$ , with a slight twist.

Validity

Now that we have a grasp on first-order satisfiability, we are in the position to formulate what it means for an argument to be first-order valid, and in connection, what it means for a formula to be a first-order validity. As we did with zeroth-order validity, we will define first-order validity using the notion of first-order satisfiability. In particular, we shall say:

Definition 8.2 (Validity, first-order). Let $A=\{X_1, X_2, …\}$ be any set of formulas and let $Y$ be any formula of $\mathcal{L}_1$ . Then, the argument from premises $A$ to the conclusion $Y$ is first-order valid iff the set $\{X_1, X_2, …, \neg Y\}$ is not first-order satisfiable. If an argument from $A$ to $Y$ is first-order valid, we write $A \models Y$ , or say that $Y$ is a (first-order) semantic consequence of, or (first-order) semantically entailed by $A$ . If the argument is not valid, we say it is invalid, and write $A \not\models Y$ . If $A=\emptyset$ and $A \models Y$ , we say $Y$ is a first-order validity, and simply write $\models Y$ .

Similar to our zeroth-order definition, the reasoning behind this is as follows. An argument is valid if whenever its premises are satisfied, its conclusion is also satisfied. This is equivalent to saying that it is impossible for an argument’s premises to be satisfied but its conclusion to be unsatisfied. The latter means that if an argument is valid, we should not be able to find a structure and a variable assignment which satisfies all its premises but does not satisfy its conclusion. Put another way, we should not be able to find a structure and variable assignment under which the premises and the negation of the conclusion are all satisfied. That is, the set consisting of the premises together with the negation of the conclusion should not be satisfiable. Which is exactly what the definition says.

As far as first-order sentences are concerned, the definition of validity essentially reduces to that of zeroth-order validity. In particular, a set of sentences $A=\{X_1, X_2, …\}$ entails a sentence $Y$ provided in every structure in which $A$ is true, $Y$ is also true, which is once again equivalent to $\{X_1, X_2, …, \neg Y\}$ not being true in any structure. Keep in mind, however, that these sentences may include quantifiers, as long as every variable is bound. The only additional complication when including open formulas is that we need to include into our calculations the assignments. But again, this just means that if an argument is valid, then whenever its premises are satisfied in a structure under an assignment, its conclusion will also always be satisfied in that structure under that assignment.

Note that by definition, an argument is invalid if it is not valid. And validity for an argument with premise set $\{X_1, X_2, …\}$ and conclusion $Y$ is defined by the set $\{X_1, X_2, …, \neg Y\}$ being unsatisfiable. From this, it readily follows that an argument is not valid (or simply, invalid) if $\{X_1, X_2, …, \neg Y\}$ is satisfiable. In other words, if you want to show that an argument is invalid, you need to provide a structure and an assignment under which the premises are satisfied, but the conclusion is not (or equivalently, its negation is).
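To make the recipe concrete, here is a small Python sketch that checks whether a given structure and assignment witness the invalidity of an argument. The example argument, from $P(x)$ to $\forall y P(y)$ , is chosen for illustration and reuses the structures $\mathbf{S}$ and $\mathbf{W}$ from the previous section; the encoding and the names `satisfies` and `is_countermodel` are hypothetical.

```python
def satisfies(domain, interp, formula, assignment):
    """Return True iff the structure satisfies `formula` under `assignment`."""
    op = formula[0]
    if op == "atom":                      # ("atom", "P", "x") encodes P(x)
        _, pred, var = formula
        return assignment[var] in interp[pred]
    if op == "not":
        return not satisfies(domain, interp, formula[1], assignment)
    if op == "all":                       # every var-variant must satisfy the body
        _, var, body = formula
        return all(satisfies(domain, interp, body, {**assignment, var: d})
                   for d in domain)
    raise ValueError(f"unknown operator: {op}")

def is_countermodel(domain, interp, assignment, premises, conclusion):
    """True iff all premises and the negated conclusion are satisfied."""
    premises_hold = all(satisfies(domain, interp, p, assignment)
                        for p in premises)
    negation_holds = satisfies(domain, interp, ("not", conclusion), assignment)
    return premises_hold and negation_holds

premises = [("atom", "P", "x")]                   # P(x)
conclusion = ("all", "y", ("atom", "P", "y"))     # for all y, P(y)

# W under b: the premise holds, the conclusion fails.
W_domain, W_interp = ["a", "b"], {"P": {"a"}}
b = {"x": "a", "y": "b"}
print(is_countermodel(W_domain, W_interp, b, premises, conclusion))  # True

# S under a: the conclusion also holds, so S shows nothing either way.
S_domain, S_interp = ["a"], {"P": {"a"}}
a = {"x": "a", "y": "a"}
print(is_countermodel(S_domain, S_interp, a, premises, conclusion))  # False
```

One countermodel is enough to establish invalidity, whereas a structure that fails to be a countermodel, like $\mathbf{S}$ here, establishes nothing about validity.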

Exercise 8.5. Show that the following facts hold in first-order logic:

$\exists x \neg P(x) \not\models \forall y \neg P(y)$
$\neg \forall x \neg P(x) \not\models \neg \exists x\neg P(x)$
$\neg \forall x P(x) \not\models \neg \exists x P(x)$
$\forall x P(x) \rightarrow \forall x Q(x) \not\models \forall x (P(x) \rightarrow Q(x))$
$\exists x P(x) \wedge \exists x Q(x) \not\models \exists x (P(x) \wedge Q(x))$
$\forall x(P(x) \vee Q(x)) \not\models \forall x P(x) \vee \forall x Q(x)$
$\exists x (P(x) \rightarrow Q(a)) \not \models \exists x P(x) \rightarrow Q(a)$
$\{\exists x (P(x) \rightarrow Q(x)), \exists x P(x)\} \not \models \exists x Q(x)$

In general, every zeroth-order valid argument is first-order valid. In fact, zeroth-order satisfiability, unsatisfiability and validity entail first-order satisfiability, unsatisfiability and validity, respectively (for zeroth-order arguments). But once again, there are first-order valid arguments whose validity essentially depends on the particular use of variables and quantifiers. Most importantly, there are two fundamental properties of quantifiers, connected to negation:

Proposition 8.1 (Negation push-through; $\exists$ ). $\neg \exists x Y \models \forall x \neg Y$ for any formula $Y$ of $\mathcal{L}_1$ .

Proof. Suppose $\mathbf{S} \models \neg \exists x Y[\mathbf{a}]$ . By definition, $\mathbf{S} \not \models \exists x Y[\mathbf{a}]$ . Since $\mathbf{S} \models \exists x Y[\mathbf{a}]$ means that for some $x$ -variant $\mathbf{a}'$ , $\mathbf{S} \models Y[\mathbf{a}']$ , its negation means that there is no $x$ -variant $\mathbf{a}'$ for which we have $\mathbf{S} \models Y[\mathbf{a}']$ . That is, for every $x$ -variant $\mathbf{a}'$ , $\mathbf{S} \not\models Y[\mathbf{a}']$ . That is, for every $x$ -variant $\mathbf{a}'$ , $\mathbf{S} \models \neg Y [\mathbf{a}']$ . But this just means that $\mathbf{S} \models \forall x \neg Y[\mathbf{a}]$ . ◻

It is easy to see intuitively why this holds. Simply, if there is no $x$ such that $Y$ , then obviously, every $x$ must not be $Y$ , for otherwise, there would be an $x$ that is $Y$ . For example, if among a group of students, there is no student who received a C on their exam, then for every student, it is true that they did not receive a C on their exam. That is, if $\neg \exists x C(x)$ , then $\forall x \neg C(x)$ .

Note also that $Y$ may be any formula above, not just an atomic one. So it is also the case, for example, that if there is no person who has a car and has a sister, then for every person, it is not the case that they have a car and a sister. So if $\neg \exists x (H(x) \wedge S(x))$ , then $\forall x \neg (H(x) \wedge S(x))$ . It is important here where the parentheses are. For no one having a car and a sister does not mean that no one has a car or a sister. It only means that they cannot have both. And indeed, this follows, for by DeMorgan’s law, $\forall x \neg (H(x) \wedge S(x))$ entails $\forall x (\neg H(x) \vee \neg S(x))$ . That is, it is true of everyone that they either do not have a car, or they do not have a sister, or they have neither a car nor a sister. The only possible configuration excluded by this sentence is that they have both a car and a sister.

Next, we also have the counterpart of the above for the universal quantifier. Namely, if it is not true for all $x$ that $Y$ , then there must be at least one $x$ for which $Y$ is not the case.

Proposition 8.2 (Negation push-through; $\forall$ ). $\neg \forall x Y \models \exists x \neg Y$ for any formula $Y$ of $\mathcal{L}_1$ .

Proof. Suppose $\mathbf{S} \models \neg \forall x Y[\mathbf{a}]$ . By definition, $\mathbf{S} \models \neg \forall x Y[\mathbf{a}]$ iff $\mathbf{S} \not \models \forall x Y[\mathbf{a}]$ . Now $\mathbf{S}\models \forall x Y[\mathbf{a}]$ provided for every $x$ -variant assignment $\mathbf{a}'$ , we have $\mathbf{S}\models Y[\mathbf{a}']$ . So on the contrary, $\mathbf{S} \not \models \forall x Y[\mathbf{a}]$ provided there is some $x$ -variant assignment $\mathbf{a}'$ for which we have $\mathbf{S} \not \models Y[\mathbf{a}']$ . So there is some $x$ -variant assignment $\mathbf{a}'$ such that $\mathbf{S} \models \neg Y[\mathbf{a}']$ . But then by definition, $\mathbf{S} \models \exists x \neg Y[\mathbf{a}]$ . ◻

Now again, the intuitive content of this is pretty straightforward. For example, if not everyone got an A on their exam, then there is someone who did not get an A on their exam. That is, if $\neg \forall x A(x)$ , then $\exists x \neg A(x)$ . And of course, once again, this holds for any formula whatsoever. So for example, if not everyone has a car and a sister, then there is someone who does not have a car and a sister. That is, if $\neg \forall x (H(x) \wedge S(x))$ , then $\exists x \neg (H(x) \wedge S(x))$ . This then entails, by DeMorgan’s law, $\exists x (\neg H(x) \vee \neg S(x))$ . That is, if not everyone has both a car and a sister, then there must be someone who either does not have a car, or does not have a sister, or has neither a car nor a sister.
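The two push-through propositions can also be spot-checked mechanically for the special case where $Y$ is the atomic formula $P(x)$ . The following Python sketch enumerates every structure with at most three objects and reports whether the premise ever holds without the conclusion. The encoding and the names `satisfies` and `holds_on_small_structures` are assumptions for illustration, and a finite check of one instance is of course no substitute for the proofs above.

```python
from itertools import chain, combinations

def satisfies(domain, interp, formula, assignment):
    """Return True iff the structure satisfies `formula` under `assignment`."""
    op = formula[0]
    if op == "atom":                      # ("atom", "P", "x") encodes P(x)
        _, pred, var = formula
        return assignment[var] in interp[pred]
    if op == "not":
        return not satisfies(domain, interp, formula[1], assignment)
    if op in ("all", "ex"):
        _, var, body = formula
        results = (satisfies(domain, interp, body, {**assignment, var: d})
                   for d in domain)
        # "all": every variant works; "ex": at least one variant works.
        return all(results) if op == "all" else any(results)
    raise ValueError(f"unknown operator: {op}")

def holds_on_small_structures(premise, conclusion, max_size=3):
    """False iff some small structure satisfies premise but not conclusion."""
    for n in range(1, max_size + 1):
        domain = list(range(n))
        extensions = chain.from_iterable(
            combinations(domain, k) for k in range(n + 1))
        for ext in extensions:
            interp = {"P": set(ext)}
            for value in domain:
                assignment = {"x": value}
                if (satisfies(domain, interp, premise, assignment)
                        and not satisfies(domain, interp, conclusion, assignment)):
                    return False
    return True

Y = ("atom", "P", "x")
# Proposition 8.1 with Y = P(x)
print(holds_on_small_structures(("not", ("ex", "x", Y)),
                                ("all", "x", ("not", Y))))   # True
# Proposition 8.2 with Y = P(x)
print(holds_on_small_structures(("not", ("all", "x", Y)),
                                ("ex", "x", ("not", Y))))    # True
# For contrast: "some x is P" does not entail "every x is P".
print(holds_on_small_structures(("ex", "x", Y), ("all", "x", Y)))  # False
```

The last line is included only to show that the check can fail: a two-element domain with exactly one object in $\mathbf{I}(P)$ separates the two formulas.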

The above two propositions also work backwards. In particular:

Proposition 8.3 (Negation pull-through; $\exists$ ). $\forall x \neg Y \models \neg \exists x Y$ for any formula $Y$ of $\mathcal{L}_1$ .

Proposition 8.4 (Negation pull-through; $\forall$ ). $\exists x \neg Y \models \neg \forall x Y$ for any formula $Y$ of $\mathcal{L}_1$ .

Exercise 8.6. Prove the two propositions above using the proofs for their push-through variants as blueprint. Hint: think backwards.

Note that the use of DeMorgan’s law above was not accidental. The connectives are defined in first-order logic exactly the same way as they are in zeroth-order logic. Accordingly, the valid argument forms of zeroth-order logic carry over to first-order logic. For example, here is a clever utilization of the above propositions and some zeroth-order logic.

Proposition 8.5. $\exists x Y \models \neg \forall x \neg Y$ and $\neg \forall x \neg Y \models \exists x Y$

Proof. For the first, suppose $\mathbf{S} \models \exists x Y[\mathbf{a}]$ . Then by zeroth-order reasoning, $\mathbf{S} \models \neg \neg \exists x Y[\mathbf{a}]$ . Then, by pushing the inner negation through $\exists$ (by Propositions 8.1 and 8.3, $\neg \exists x Y$ and $\forall x \neg Y$ are satisfied under exactly the same assignments), we have $\mathbf{S} \models \neg \forall x \neg Y[\mathbf{a}]$ . So $\exists x Y \models \neg \forall x \neg Y$ .

For the second, suppose $\mathbf{S} \models \neg \forall x \neg Y[\mathbf{a}]$ . Then, pulling the negation through $\forall$ , we get $\mathbf{S} \models \neg\neg \exists x Y[\mathbf{a}]$ . Then, by zeroth-order reasoning, we get $\mathbf{S} \models \exists x Y[\mathbf{a}]$ . So $\neg \forall x \neg Y \models \exists x Y$ . ◻

Proposition 8.6. $\forall x Y \models \neg \exists x \neg Y$ and $\neg \exists x \neg Y \models \forall x Y$

Exercise 8.7. Prove the above proposition. Hint: take the proof of its inverted version above as blueprint.

Here is another example of how one can reason about validity in first-order logic:

Proposition 8.7. $\{\forall x (P (x) \rightarrow Q(x)), \exists y P(y)\} \models \exists y Q(y)$

Proof. Suppose $\mathbf{S} \models \forall x (P (x) \rightarrow Q(x))[\mathbf{a}]$ , and $\mathbf{S} \models \exists y P(y)[\mathbf{a}]$ . By the latter, we know that there is a $y$ -variant assignment $\mathbf{a}'$ such that $\mathbf{S} \models P(y)[\mathbf{a}']$ . In other words, $\mathbf{a}'(y) \in \mathbf{I}(P)$ . Let’s denote the object $\mathbf{a}'(y)$ by $\mathsf{c}$ , so that we know $\mathsf{c} \in \mathbf{I}(P)$ . Now because $\mathbf{S} \models \forall x (P (x) \rightarrow Q(x))[\mathbf{a}]$ , we know that for every $x$ -variant assignment $\mathbf{a}''$ , $\mathbf{S} \models P (x) \rightarrow Q(x)[\mathbf{a}'']$ . So we also have that $\mathbf{S} \models P (x) \rightarrow Q(x)[\mathbf{a}^x_\mathsf{c}]$ (the $x$ -variant assignment where $x$ is sent to $\mathsf{c}$ ). Now we already know that $\mathbf{S} \models P(x)[\mathbf{a}^x_\mathsf{c}]$ , since we established that $\mathsf{c} \in \mathbf{I}(P)$ . But then $\mathbf{S} \models Q(x)[\mathbf{a}^x_\mathsf{c}]$ as well, by the conditional. So $\mathsf{c} \in \mathbf{I}(Q)$ . This, in turn, means that $\mathbf{S} \models Q(y)[\mathbf{a}^y_\mathsf{c}]$ . So by definition, $\mathbf{S} \models \exists y Q(y)[\mathbf{a}]$ . ◻
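A brute-force search for a countermodel gives an independent, if limited, check of Proposition 8.7. The Python sketch below looks through all structures with at most three objects and all interpretations of $P$ and $Q$ . The encoding and the names `satisfies`, `subsets` and `find_countermodel` are hypothetical, and the bound on domain size is an assumption of the sketch, so an empty search result only supports the proposition and does not replace the proof.

```python
from itertools import chain, combinations, product

def satisfies(domain, interp, formula, assignment):
    """Return True iff the structure satisfies `formula` under `assignment`."""
    op = formula[0]
    if op == "atom":                      # ("atom", "P", "x") encodes P(x)
        _, pred, var = formula
        return assignment[var] in interp[pred]
    if op == "imp":                       # material conditional
        return (not satisfies(domain, interp, formula[1], assignment)
                or satisfies(domain, interp, formula[2], assignment))
    if op in ("all", "ex"):
        _, var, body = formula
        results = (satisfies(domain, interp, body, {**assignment, var: d})
                   for d in domain)
        return all(results) if op == "all" else any(results)
    raise ValueError(f"unknown operator: {op}")

def subsets(items):
    """All subsets of `items`, as a list of sets."""
    return [set(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]

def find_countermodel(premises, conclusion, max_size=3):
    """Look for a structure and assignment satisfying premises, not conclusion."""
    for n in range(1, max_size + 1):
        domain = list(range(n))
        for p_ext, q_ext in product(subsets(domain), repeat=2):
            interp = {"P": p_ext, "Q": q_ext}
            for vx, vy in product(domain, repeat=2):
                assignment = {"x": vx, "y": vy}
                if (all(satisfies(domain, interp, p, assignment) for p in premises)
                        and not satisfies(domain, interp, conclusion, assignment)):
                    return domain, interp, assignment
    return None

all_P_then_Q = ("all", "x", ("imp", ("atom", "P", "x"), ("atom", "Q", "x")))
some_P = ("ex", "y", ("atom", "P", "y"))
some_Q = ("ex", "y", ("atom", "Q", "y"))

# Proposition 8.7: no countermodel among the small structures.
print(find_countermodel([all_P_then_Q, some_P], some_Q))  # None

# Dropping the universal premise makes the argument invalid.
weaker = find_countermodel([some_P], some_Q)
```

With the universal premise removed, the search finds a countermodel already in a one-element domain, which shows that this premise is doing real work in the proof.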

Now as you can see, reasoning about validity is rather painful when done semantically. So it is time to extend our tableau system with new rules so that it can handle arguments in first-order logic.

License

Icon for the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License
