This is ericpony's blog

Wednesday, November 15, 2017

Liveness to safety reduction with predicate abstraction

An automata-based approach for LTL model checking

Linear-time Temporal Logic (LTL) can be model checked for finite-state systems using symbolic automata-based techniques [6]. Given a transition system $S$ and an LTL formula $φ$, the model checking problem (denoted as $S ⊨ φ$) asks whether $φ$ is true on all infinite execution paths of $S$. A classical automata-based approach [6] to this problem consists of building a Büchi automaton $S_{¬φ}$ with a fairness condition $f_{¬φ}$ such that $S ⊨ φ$ iff $S × S_{¬φ} ⊨ FG\ ¬f_{¬φ}$. Therefore, when we model check an LTL formula, we may assume without loss of generality that the automata-based transformation has been applied and consider the problem in the form $S ⊨ FG\ ¬f_{¬φ}$. In the following, we shall drop the subscript $¬φ$ and simply regard $FG\ ¬f$ as the target property to verify.

Liveness-to-safety reduction for finite-state systems

Liveness-to-safety reduction [4, 5] is a technique to translate an LTL model checking problem to an invariant verification problem. The invariant encodes the absence of counterexamples to $FG\ ¬f$, namely, that the system has no computation visiting some state $s ⊨ f$ infinitely often. The reduction algorithm augments the states of the system with flags seen, triggered, and loop, and maintains a fresh copy of the state variables. It then simulates the system execution and, in the meantime, guesses the entry point of a cycle by non-deterministically flagging a state as "seen" and saving the valuation of the state variables. An execution becomes "triggered" if it encounters a state $s ⊨ f$ after having flagged a state as "seen". Finally, a loop is detected if a triggered execution re-enters the seen state. The safety property thus specifies that no loop is detected on any execution. This reduction is sound and complete for finite-state and parameterised systems, since a liveness property is violated in such systems iff a lasso-shaped counterexample exists.
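To make the construction concrete, here is a minimal explicit-state Python sketch of my own (the encodings in [4, 5] are symbolic; the function name and the state representation below are illustration-only assumptions). An augmented state is a triple (current state, saved "seen" state or None, triggered flag), and the safety check is a plain reachability search.

from collections import deque

def liveness_to_safety(init, trans, fair):
    """Check FG not-f on an explicit finite-state system by the
    seen/triggered/loop reduction.  `init` is a set of initial states,
    `trans` maps a state to its set of successors, and `fair` is the set
    of states satisfying f.  Returns True iff FG not-f holds."""
    start = set()
    for s in init:
        start.add((s, None, False))       # run without flagging a state yet
        start.add((s, s, s in fair))      # or guess s as the loop entry point
    frontier, visited = deque(start), set(start)
    while frontier:
        s, seen, triggered = frontier.popleft()
        for t in trans.get(s, ()):
            trig = triggered or (seen is not None and t in fair)
            if t == seen and trig:
                return False              # closed a loop through a fair state
            successors = [(t, seen, trig)]
            if seen is None:              # or guess t as the loop entry point
                successors.append((t, t, t in fair))
            for aug in successors:
                if aug not in visited:
                    visited.add(aug)
                    frontier.append(aug)
    return True

# A state satisfying f on a reachable cycle violates FG not-f:
print(liveness_to_safety({0}, {0: {1}, 1: {1}}, fair={1}))   # False
print(liveness_to_safety({0}, {0: {1}, 1: {1}}, fair={0}))   # True

Since the state space is finite, every violation of the liveness property shows up as such a lasso, which is exactly the completeness argument above.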

Liveness-to-safety reduction for infinite-state systems

The above liveness-to-safety reduction is unsound for infinite-state systems where an execution can violate a liveness property without visiting the same state twice. To overcome this problem, the authors of [1] propose to incorporate the notion of predicate abstraction in the reduction. The abstraction is supposed to be coarse enough to map a real counterexample (i.e. a path from an initial state that visits the fairness condition infinitely often) to a lasso-shaped abstract path, but at the same time precise enough not to admit spurious abstract cycles. The soundness of this approach is based on the fact that existential abstraction preserves any temporal logic without existential quantification over paths, e.g., $\forall$CTL$^*$ and LTL [7, 8].

Overview of the reduction procedure

Below we introduce the liveness-to-safety reduction proposed in [1] for infinite-state systems. Given a system $S$ and an LTL property $FG\ ¬f$, the procedure generates a sequence of safety verification problems $$S_0 ⊨_{inv} φ_0 ,\quad S_1 ⊨_{inv} φ_1,\quad \dots.$$ We shall denote the encoding of the $i$-th safety property as $$S_i, φ_i := {\rm encode}(S, f, P_i, W_i),$$ where $P_i$ is a set of state predicates and $W_i$ is a set of well-founded relations. The encoding ensures that $S ⊨ FG\ ¬f$ holds if $S_i ⊨_{inv} φ_i$ for some $i$; here $φ_i$ essentially states that no reachable cycle visits a state satisfying $f$. The iteration terminates if some generated safety property is satisfied. Otherwise, suppose we have found $S_i \not⊨_{inv} φ_i$ for some $i$. We analyse a finite counterexample path $π$ of $S_i$ to determine whether it corresponds to an infinite real counterexample for $FG\ ¬f$ in $S$. If so, we conclude that the property does not hold. Otherwise, if we can conclude that $π$ does not correspond to any real counterexample in $S$, we try to extract a set of new predicates $P'$ and/or well-founded relations $W'$ to produce a refined encoding: $$S_{i+1}, φ_{i+1} := {\rm encode}(S, f, P_i ∪ P', W_i ∪ W'),$$ where $P', W' := {\rm refine}(S_i, π, P_i, W_i)$. It is possible that the algorithm can neither confirm nor refute the existence of real counterexamples, in which case it aborts and returns “unknown”. The algorithm might also diverge and/or exhaust resources in various intermediate steps (e.g. in checking $S_i ⊨_{inv} φ_i$ or during refinement).
In the following, we describe the encoding and refinement procedures in more detail. We begin with a simplified version that only uses predicates from $P$, i.e., $W = ∅$. We then describe how to extend the encoding to also exploit well-founded relations.
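The control flow of the whole procedure can be summarised by the following Python sketch. The callbacks encode, check_inv, concretise, and refine are placeholders standing for the corresponding steps of [1] described above; only the refinement loop itself is spelled out.

def verify_fg_not_f(S, f, encode, check_inv, concretise, refine,
                    P0=(), W0=(), max_iters=100):
    """CEGAR-style loop sketched from the description above.  The four
    callbacks are assumed to implement the encoding, invariant checking,
    counterexample concretisation, and refinement steps of [1]."""
    P, W = set(P0), set(W0)
    for _ in range(max_iters):
        Si, phi_i = encode(S, f, P, W)       # build S_i |=_inv phi_i
        holds, pi = check_inv(Si, phi_i)     # e.g. with an IC3-style engine
        if holds:
            return "valid"                   # S |= FG not-f
        if concretise(S, f, pi):             # pi corresponds to a real lasso
            return "violated"
        newP, newW = refine(Si, pi, P, W)    # new predicates / well-founded rels
        if not newP and not newW:
            return "unknown"                 # refinement is stuck
        P, W = P | set(newP), W | set(newW)
    return "unknown"                         # resource budget exhausted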

Counterexamples and refinement

Consider a safety checking problem $S_i ⊨_{inv} φ_i$. Each state of $S_i$ is a state of $S$ augmented with the flags seen, triggered, and loop, as in the finite-state case. However, instead of comparing the valuations of the state variables, the augmented system $S_i$ compares the bit-vectors induced by the predicates $P_i$ to determine whether two states are equivalent. A counterexample $π$ for a safety property $S_i ⊨_{inv} φ_i$ is a finite sequence of tuples where each tuple consists of an abstract state and a concrete state. The projection of the sequence to abstract states forms a lasso-shaped abstract path, and the projection to concrete states recovers the execution from which this counterexample is obtained. When the invariant is violated, we need to make sure that the obtained counterexample does not result from insufficient precision of the abstraction induced by $P_i$. To do this, the algorithm first searches for a concrete lasso witnessing a real violation of the LTL property, using standard bounded model checking. If this fails, it tries to prove that the abstract lasso is infeasible by checking an increasing number of finite unrollings of the loop. Upon infeasibility, new predicates can be extracted from sequence interpolants with standard predicate refinement strategies for invariant properties [3].

Enhancement with well-founded relations

In the refinement operation described above, new predicates are extracted at the point when the unrolling of a spurious abstract counterexample becomes infeasible. However, it is possible that all finite unrollings of the abstract loop are feasible, yet there is no concrete path that executes the loop infinitely many times. If this happens, the refinement step might not terminate even though the abstract path cannot be concretised. The following example illustrates one such situation.

Example. Suppose that we want to verify an LTL property $FG\ ¬(x≤y)$ against a transition system $$S=(ℕ \times ℕ, \{(0,y) \mid y≥0\}, \{(x,y) \to (x+1, y)\}).$$ Let $P_0 = \{(x≤y), (y=0)\}$ be the initial set of predicates. Then $S_0 ⊨_{inv} \varphi_0$ admits an abstract counterexample path through the self loop at $\langle(x≤y), ¬(y=0)\rangle$. Unrolling this abstract path, however, never becomes infeasible: the path formula obtained by unrolling the abstract loop $i$ times is satisfiable, witnessed by the concrete execution starting from the initial state $(0,i).$
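The divergence in this example is easy to reproduce with an SMT solver. The following sketch (assuming the z3-solver Python bindings) unrolls the abstract self-loop $k$ times and checks that the resulting path formula is satisfiable for every $k$ tried, exactly as claimed above.

from z3 import Ints, Solver, And, sat

def unrolling_feasible(k):
    """Is there a concrete path of k steps of the example system
    (x, y) -> (x+1, y) staying in the abstract state <(x<=y), not(y=0)>?"""
    s = Solver()
    xs = [Ints(f"x{i} y{i}") for i in range(k + 1)]
    x0, y0 = xs[0]
    s.add(x0 == 0, y0 >= 0)                      # initial condition (0, y)
    for i in range(k):
        xi, yi = xs[i]
        xj, yj = xs[i + 1]
        s.add(xj == xi + 1, yj == yi)            # one transition step
        s.add(And(xi <= yi, yi != 0))            # stay in the abstract state
    xk, yk = xs[k]
    s.add(And(xk <= yk, yk != 0))
    return s.check() == sat

print([unrolling_feasible(k) for k in (1, 5, 20)])   # [True, True, True]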

To address spurious abstract paths whose finite prefixes are all feasible, we extend the encoding to incorporate well-founded relations over the concrete states. The idea is to block these spurious abstract paths by finding suitable termination arguments in the form of (disjunctive) well-founded transition invariants. More precisely, given a finite set $W$ of well-founded relations, we encode $S, φ$ such that a lasso-shaped abstract path leads to a counterexample only if the loop contains at least two concrete states not covered by any relation in $W$. The resulting abstraction is sound as it preserves all concrete infinite paths. Indeed, if there were an infinite path such that any two states on it are covered by some relation in $W$, then by Ramsey's Theorem some relation in $W$ would admit an infinite descending chain, contradicting well-foundedness. Below we show how spurious counterexamples can be removed by a suitably chosen well-founded relation.

Example. Consider the spurious abstract counterexample in the previous example. Projecting its loop to concrete states yields two states $(a,b)$ and $(a+1,b)$ for some integers with $a+1 ≤ b$ (both states lie in the abstract loop state). Including the relation $R := \{((x,y), (x',y)) \mid x < x' ≤ y  \}$ in $W$ removes this path from the abstraction and allows the procedure to prove the desired LTL property.
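The two checks below (again a sketch with the z3-solver bindings) confirm the claim: every concrete step inside the abstract loop is covered by $R$, and $R$ strictly decreases the non-negative rank $y-x$, which is why $R$ is well-founded on the loop states.

from z3 import Ints, Implies, And, prove

x, y, x1, y1 = Ints("x y x1 y1")

loop_step = And(x <= y, y != 0,          # source state is in the abstract loop
                x1 == x + 1, y1 == y,    # one concrete transition
                x1 <= y1, y1 != 0)       # target state stays in the loop
R = And(x < x1, x1 <= y, y1 == y)        # the candidate well-founded relation

# Every concrete step inside the abstract loop is covered by R ...
prove(Implies(loop_step, R))                                   # prints: proved
# ... and R strictly decreases the non-negative rank y - x.
prove(Implies(R, And(y1 - x1 < y - x, y - x >= 0)))            # prints: proved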

k-liveness as a well-founded relation. There are various heuristics for coming up with new abstraction predicates and well-founded relations during refinement (see [1] for the details). That said, it is still possible that at some point all heuristics get stuck, failing to produce new predicates or relations that remove a spurious counterexample. In this situation, we may exploit the notion of k-liveness in the hope of making further progress. Intuitively, we maintain an auxiliary integer variable $k$ in the transition system that counts the occurrences of $f$. With this counter, the refinement can always discover a new well-founded relation $R := \{ (s, s') ∣ k < k' ≤ n \}$, where $n$ is the number of occurrences of $f$ in the current counterexample. Adding this relation allows the procedure to make progress by blocking all counterexamples in which $f$ occurs at most $n$ times. This heuristic is sound since $f$ occurs infinitely often in any real counterexample.
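To make the counter idea concrete, here is a tiny self-contained sketch (my own illustration, not the encoding used in [1]) that augments an explicit transition relation with the occurrence counter $k$ and builds the well-founded relation extracted from a counterexample whose loop sees $f$ at most $n$ times.

def augment_with_counter(trans, fair):
    """Augmented states are pairs (s, k), where k counts how many times the
    fairness condition f has been seen so far."""
    def step(state):
        s, k = state
        return [(t, k + 1 if t in fair else k) for t in trans.get(s, ())]
    return step

def k_liveness_relation(n):
    """The well-founded relation R = {((s,k),(s',k')) | k < k' <= n} extracted
    from a counterexample in which f occurs n times."""
    return lambda a, b: a[1] < b[1] <= n

step = augment_with_counter({0: {1}, 1: {0}}, fair={1})
print(step((0, 0)))                            # [(1, 1)]: a fair state bumps k
R = k_liveness_relation(3)
print(R((0, 1), (1, 2)), R((0, 3), (1, 4)))    # True False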

References

1. Infinite-state Liveness-to-Safety via Implicit Abstraction and Well-founded Relations
2. Counterexample-guided Abstraction Refinement for Symbolic Model Checking
3. Abstractions from Proofs
4. Liveness Checking as Safety Checking
5. Liveness Checking as Safety Checking for Infinite State Spaces
6. An Automata-Theoretic Approach to Linear Temporal Logic

7. Model Checking and Abstraction
8. Property preserving abstractions for the verification of concurrent systems

Friday, October 20, 2017

Verifying liveness property for array systems

Numerous results in the literature of regular model checking have focused on parameterised concurrent systems, which consist of a parameterised number of identical concurrent processes. However, only a handful of these results address parameterised systems with concurrent infinite-state processes, such as processes operating on variables over an unbounded data domain. Such systems are also known as parameterised array systems, since a parameterised number of processes can easily be encoded via a parameterised array containing the local states of the processes. Existing results for verifying such systems have mostly dealt with safety properties. For liveness properties, it is sometimes possible to apply an ad-hoc finitary abstraction so that the system under consideration can be handled in the classical context of regular model checking. The following example demonstrates how finitary abstraction may be applied to verify an array system.

The Chang-Roberts algorithm is a leader election protocol for a clockwise directed ring of processes where each process has a unique ID. Initially, all processes are active. The algorithm starts from one initiator sending a message to the next process in the ring, tagged with its ID. When an active process with ID $p$ receives a message tagged with $q$, there are three cases: i) $q < p$: then $p$ drops the message. ii) $q > p$: then $p$ becomes passive and passes on the message. iii) $p = q$: then $p$ becomes the leader. A passive process simply forwards every message it receives. If a process receives a message while already holding one, the message with the larger ID supersedes the other.
The idea behind the Chang-Roberts algorithm is that only the message carrying the highest ID completes the round trip and finishes the election via the third case. The parameterised version of the algorithm has two sources of infinity: the number of processes and the number of IDs. However, observe that we do not need the exact IDs to run the election correctly---it suffices to know which ID is the highest. Hence, we can apply an abstraction that maps the highest ID to 1 and the other IDs to 0. It is clear that a process is elected as the leader in the concrete system iff it is so in the abstracted system. Verification of the liveness property "a leader will eventually be elected" then reduces to a fair termination problem amenable to off-the-shelf regular model checking tools.
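For concreteness, here is a small Python simulation of my own of the Chang-Roberts protocol on a ring where, for simplicity, every process acts as an initiator. The run below confirms that the process holding the maximum ID is elected; the outcome depends only on where the maximum sits in the ring, which is precisely the information the 0/1 abstraction retains.

import random

def chang_roberts(ids):
    """Simulate Chang-Roberts leader election on a clockwise ring.  `ids`
    lists the unique process IDs in ring order; every process initiates.
    Returns the index of the elected leader."""
    n = len(ids)
    active = [True] * n
    msgs = {(i + 1) % n: [ids[i]] for i in range(n)}   # each ID sent clockwise
    while True:
        nxt = {i: [] for i in range(n)}
        for i in range(n):
            if not msgs[i]:
                continue
            m = max(msgs[i])                   # the larger ID takes over
            if not active[i]:
                nxt[(i + 1) % n].append(m)     # passive processes just forward
            elif m > ids[i]:
                active[i] = False              # become passive and pass it on
                nxt[(i + 1) % n].append(m)
            elif m == ids[i]:
                return i                       # own ID completed the round trip
            # m < ids[i]: the message is dropped
        msgs = nxt

ids = random.sample(range(10**6), 8)
print(chang_roberts(ids) == ids.index(max(ids)))       # True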
Unfortunately, not every parameterised array system can be simplified successfully by a finitary abstraction. The following example provides evidence of this fact.

Dijkstra's self-stabilising mutual exclusion algorithm assumes a clockwise directed ring of processes $p_0,\dots,p_{N-1}$ with $N\ge 2$. Each process $p_i$ holds an integer variable $x_i$ that can be read by the next process $p_{(i+1)~{\rm mod}~N}$ in the ring. The value of each $x_i$ ranges over $\{0, \dots, K-1\}$, where $K\ge N$. A process is privileged if one of the following two conditions holds:
  • For $i>0$, $p_i$ is privileged if $x_i \neq x_{i-1}$.
  • $p_0$ is privileged if $x_0 = x_{N-1}$.
A privileged process $p_i$ is allowed to update its local variable $x_i$, causing the loss of its privilege (a small simulation sketch follows this list):
  • For $i>0$, $p_i$ can copy the value of $x_{i-1}$ to $x_i$ if $x_{i-1} \neq x_i$.
  • $p_0$ can set $x_0 \leftarrow (x_0+1)~{\rm mod}~K$ if $x_0 = x_{N-1}$.
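The following Python simulation (my own sketch) runs Dijkstra's K-state algorithm from a random configuration, letting a randomly chosen privileged process fire at each step, and reports how many steps it takes until exactly one process is privileged.

import random

def privileged(x):
    """Indices of the privileged processes under Dijkstra's two rules."""
    n = len(x)
    priv = [i for i in range(1, n) if x[i] != x[i - 1]]
    if x[0] == x[n - 1]:
        priv.append(0)
    return priv

def stabilise(n, k, max_steps=100000):
    """Run Dijkstra's K-state algorithm (K >= N >= 2) from a random
    configuration with a random privileged process firing at each step.
    Returns the number of steps until exactly one process is privileged."""
    assert k >= n >= 2
    x = [random.randrange(k) for _ in range(n)]
    for t in range(max_steps):
        priv = privileged(x)
        if len(priv) == 1:
            return t                       # stabilised: a single privilege left
        i = random.choice(priv)            # the scheduler picks a privileged process
        if i == 0:
            x[0] = (x[0] + 1) % k          # p_0 increments x_0 modulo K
        else:
            x[i] = x[i - 1]                # p_i copies x_{i-1}, losing its privilege
    return None

print(stabilise(n=6, k=7))                 # number of steps varies from run to run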
Dijkstra's algorithm is self-stabilising in that the system eventually converges to states where exactly one process is privileged. As before, the parameterised version of Dijkstra's algorithm has two sources of infinity: concurrency and the numeric data type. Nevertheless, the liveness property of Dijkstra's algorithm cannot be verified with a finitary abstraction as easily as in the previous example. To see this, consider an abstraction function $\alpha$ that maps the concrete values of the variables $x_0,\dots,x_{N-1}$ to a finite abstract domain. Then for sufficiently large $K$, there is $k\in\{0,\dots,K-2\}$ such that $\alpha(k)=\alpha(k+1)$. Under this abstraction, however, any concrete state where $x_0,\dots,x_{N-1}\in\{k,k+1\}$ gives rise to a spurious fair path that never stabilises. Clearly, such spurious counterexamples appear in any finitary abstraction of the data domain. What if we keep the data intact but instead apply a finitary abstraction to the underlying topology? Unfortunately, it turns out that two incomparable processes alone suffice to produce a spurious counterexample.

Instead of verifying the desired property as a whole, we may break it into lemmas and prove them separately. More precisely, we can decompose the liveness property of Dijkstra's algorithm into three lemmas:

Lemma 1: The value of $x_0$ is updated infinitely often.

Lemma 2: Eventually $x_0 \neq x_i$ for all $i \neq 0$.

Lemma 3: Eventually $p_0$ is the only privileged process.

Lemma 1 implies Lemma 2 by the assumption that $N \le K$ and the pigeonhole principle. Lemma 2 implies Lemma 3 by a simple propagation argument. With appropriate abstraction, these two implications can be modelled as liveness properties of simple token-passing rings and are amenable to standard liveness checkers (e.g. [1]) for regular model checking. Below we shall concentrate on how to automate the proof of Lemma 1.

We first establish the invariant "at least one process is always privileged". This invariant can be seen by abstracting the value of each $x_i$ to EQ if $x_i = x_0$, and to NE if $x_i \neq x_0$. The invariant then follows from the fact that every abstract state is either of the form [EQ...EQ], where $p_0$ is privileged, or of the form [...EQ NE...], where $p_i$ is privileged for some $i\ge 1$.

With the above invariant, we know that the system never gets stuck. Now we show that $x_0$ is updated infinitely often. Given an infinite run of the system, consider a *maximal* segment $\pi$ of the run on which the status of $p_0$ does not change (i.e. $p_0$ is privileged throughout $\pi$, or unprivileged throughout $\pi$). We argue that the length of $\pi$ is finite. Consider abstract states $w \in (P+N)^*$ such that $w[i] = P$ if $p_i$ is privileged, and $w[i] = N$ if $p_i$ is not privileged. Then the transitions on $\pi$ are abstracted to

$x \cdot P \cdot N \cdot y \;\to\; x \cdot N \cdot P \cdot y$
$x \cdot P \cdot P \cdot y \;\to\; x \cdot N \cdot P \cdot y$
$x \cdot P \cdot P \cdot y \;\to\; x \cdot N \cdot N \cdot y$
$x \cdot P \;\to\; x \cdot N$
where $x \in (P+N)(P+N)^*$ and $y \in (P+N)^*$. Now it is clear that $\pi$ is finite, since each abstract transition either moves a P rightward or eliminates at least one P from the state. Therefore the status of $p_0$ changes infinitely often on any infinite run, which implies that $x_0$ is updated infinitely often.

As a result, we can reduce Lemma 1 to a safety property and a termination property of simple token ring protocols, and verify them separately with safety verifiers (e.g. [2]) and termination verifiers (e.g. [3]).

References

2. Learning to Prove Safety over Parameterised Concurrent Systems
3. Fair Termination for Parameterized Probabilistic Concurrent Systems

Thursday, August 10, 2017

Logic grid puzzles and Beth's Definability Theorem


Logic grid puzzles are popular in magazines and newspapers. In such puzzles, the reader is asked to deduce an answer to a question from a set of clues in some scenario. For example, the scenario may be as follows: five guys are having drinks in a room. The clues are often pieces of information like "They all come from different countries", "The British guy is in blue" or "People in red don't drink alcohol". The question may be "What does the German drink?". The reader usually tries to deduce the answer by filling out a matrix with the clues.
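As a toy illustration (my own, encoded with the z3-solver Python bindings), consider three guys, a Brit, a German and a Frenchman, who drink tea, beer and juice in some order, with the clues "the Brit drinks tea" and "the German does not drink beer". Enumerating the values consistent with the clues turns the implicit characterisation into an explicit answer.

from z3 import Ints, Distinct, Solver, sat

TEA, BEER, JUICE = 0, 1, 2
drinks = ["tea", "beer", "juice"]
brit, german, french = Ints("brit german french")

clues = [
    Distinct(brit, german, french),              # they all drink different things
    *[0 <= p for p in (brit, german, french)],
    *[p <= 2 for p in (brit, german, french)],
    brit == TEA,                                 # clue 1: the Brit drinks tea
    german != BEER,                              # clue 2: the German avoids beer
]

# "What does the German drink?" -- keep only the values consistent with the clues.
answers = []
for d in (TEA, BEER, JUICE):
    s = Solver()
    s.add(clues + [german == d])
    if s.check() == sat:
        answers.append(drinks[d])
print(answers)    # ['juice']: the clues determine the answer implicitly but uniquely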
Puzzles of this kind involve turning an implicit characterisation (i.e. the clues) into an explicit one (i.e. a direct answer to the question). In formal logic, we may define a relation explicitly with a formula. More precisely, given a theory $S$ (i.e. a set of sentences), an n-ary relation $R(x_1,...,x_n)$ is explicitly defined by $\Phi$ w.r.t. $S$ if$$S \models \forall x_1,...,x_n: R(x_1,...,x_n)\equiv\Phi(x_1,...,x_n),$$where $\Phi(x_1,...,x_n)$ is a formula that does not refer to $R$. (If $A,B$ are theories, $A\models B$ means that every model of $A$ is also a model of $B.$) In contrast, a theory referring to a relation may not uniquely determine that relation. Intuitively, a theory characterises a relation $R$ iff we cannot find two models of the theory that differ only in their interpretations of the symbol $R$. More formally, if $S$ is a theory over vocabulary $\tau$, then $S$ implicitly defines a non-logical symbol $R\in\tau$ iff for every structure $A$ of vocabulary $\tau\setminus\{R\}$, there exists at most one relation $R^A$ over the domain of $A$ such that $A$ becomes a model of $S$ by interpreting $R$ as $R^A$. There are several equivalent ways to express this concept, but we shall use the following one for convenience of proof:
Definition. A theory $S$ implicitly defines an n-ary relation $R(x_1,...,x_n)$ if$$S\cup S' \models \forall x_1,...,x_n: R(x_1,...,x_n)\equiv R'(x_1,...,x_n),$$where $S'$ is obtained by replacing every occurrence of $R$ in $S$ with a fresh relation symbol $R'$.
If $R$ is explicitly defined by $\Phi$ w.r.t. $S$, then clearly $R$ is also implicitly defined by $S$. The converse was proved by Beth for first-order logic.
Theorem. (Beth's Definability Theorem for FOL) If a first-order property is implicitly definable then it can also be defined explicitly.
Beth's result is a corollary of Craig's Interpolation Theorem.
Lemma. If $A$ is a theory and $A\models \varphi$, then there is a finite set $A'\subseteq A$ such that $A'\models \varphi.$
Proof of lemma. Note that $A\models \varphi$ if and only if $A\cup \{\neg \varphi\}$ is unsatisfiable. By the compactness theorem, there is a finite subset $E\subseteq A\cup \{\neg \varphi\}$ that is unsatisfiable. If $\neg \varphi\not\in E$ then let $A' = E$. Otherwise, let $A' = E\setminus\{\neg \varphi\}$. Then $A'$ is the desired set.
Proof of Beth's theorem. Suppose we have$$S\cup S' \models \forall x_1,...,x_n: R(x_1,...,x_n)\equiv R'(x_1,...,x_n).$$By the above lemma, we can assume w.l.o.g. that $S,S'$ are finite. Let $X=\bigwedge S$ and $X'=\bigwedge S'$. Our assumption implies that $$X\wedge X' \rightarrow \forall x_1,...,x_n: R(x_1,...,x_n)\equiv R'(x_1,...,x_n)$$is valid. By introducing fresh constant symbols $p_1,...,p_n$, we can rewrite the above sentence to $X\wedge X' \rightarrow (R(p_1,...,p_n)\equiv R'(p_1,...,p_n))$, which is further equivalent to$$(X\wedge R(p_1,...,p_n)) \rightarrow (X' \rightarrow R'(p_1,...,p_n)).$$Since $X$ doesn't contain $R'$ and $X'$ doesn't contain $R$, by Craig's theorem there exists a formula $\Phi(p_1,...,p_n)$ containing neither $R$ nor $R'$, such that both$$(X\wedge R(p_1,...,p_n)) \rightarrow \Phi(p_1,...,p_n) \quad\mbox{and}\quad \Phi(p_1,...,p_n) \rightarrow (X' \rightarrow R'(p_1,...,p_n))$$ are valid. Note that the latter can be rewritten to $\Phi(p_1,...,p_n) \rightarrow (X \rightarrow R(p_1,...,p_n))$ by replacing every occurrence of $R'$ with $R.$ It follows by currying that$$X \rightarrow (R(p_1,...,p_n) \rightarrow \Phi(p_1,...,p_n))\quad\mbox{and}\quad X \rightarrow (\Phi(p_1,...,p_n) \rightarrow R(p_1,...,p_n))$$are valid. Conjoining the two formulas leads to $$X \rightarrow (R(p_1,...,p_n) \equiv \Phi(p_1,...,p_n)),$$ which further implies that $$S \models \forall x_1,...,x_n: R(x_1,...,x_n) \equiv \Phi(x_1,...,x_n).$$ This concludes the proof of the theorem.
Craig interpolants can be computed via syntactic transformations, as shown in Craig's 1957 paper. Hence, we can always construct an explicit definition from an implicit one. Since explicit definition is a syntactic notion (i.e. given by a formula) while implicit definition is a more semantic one (i.e. given by a theory), Beth's theorem indicates that the syntactic definability of first-order logic is "complete" in some sense.


Friday, July 21, 2017

A note on linking ideas in academic writing

Flow of information

Sentences and ideas should be ordered and linked so as to present information in a reasonable manner. Here reasonable means that the reader can easily find what he expects to see as he reads. The following example shows how sentences can be re-ordered to improve readability. It is extracted from the first paragraph of an essay titled "The main sources of diversity in work values":
Original: The importance of education should be pointed out as one of the crucial sources of diversity in attitudes to work. The rising educational level has been perceived as one of the most significant investments in the general quality of life. Tertiary education in particular plays a very special role in this respect.

Revised: One of the crucial sources of diversity in attitudes to work is education. The rising educational level has been perceived as one of the most significant investments in the general quality of life. In this respect, tertiary education in particular plays a very special role.

Comments. It is preferred that the first sentence connects directly to the title, as that is what a reader expects when he starts reading the paragraph. "The importance of education" is redundant here because education is already pointed out as a crucial source. "In this respect" should be put at the beginning of a sentence since it serves as a linking phrase.

Text cohesion

Cohesion refers to the way in which elements of language are linked together to make a text. The sequence of given information followed by new information can contribute to the cohesion of a text because the new information of one sentence often becomes the given information of the next sentence. The following is an extract from a thesis about the construction of stance in academic writing, shown sentence by sentence together with comments on how old information is rephrased or repeated to create smooth transitions to new information.
Passage: For the thesis writer, the construction of stance takes a distinctive form which derives from the nature of the thesis itself.
Comment: Clarify the subject of this passage and identify the expected target readers, i.e. the thesis writers.

Passage: As noted above, writers have to convince the supervisors and examiners to accept their research, which means that the thesis is inherently persuasive.
Comment: Clarify the goal of thesis writers to bring up the key characteristic of a thesis: being persuasive.

Passage: However, acceptance depends upon satisfying two quite distinct requirements.
Comment: The subject acceptance (of persuasion) follows logically from the key old info "being persuasive".

Passage: The first is that the thesis should make an original contribution to knowledge.
Comment: Introduce a claim.

Passage: In creating new knowledge and conveying it to other members of the discipline, the writer’s stance is that of a professional.
Comment: Rephrase the claim and reach a partial conclusion.

Passage: The second requirement is that the thesis must show mastery of the field.
Comment: Make another claim.

Passage: The display of established knowledge is required because the work will be assessed and thus the writer must also take a stance as a candidate.
Comment: Rephrase the claim and make another partial conclusion.

Passage: However, combining the roles of candidate and professional can lead to considerable conflict and uncertainty.
Comment: Summarise the partial conclusions and start a new discussion.
As this example demonstrates, the new information of one sentence often becomes the given information of the next sentence. This sequence of given information followed by new information links the ideas of the text together and creates cohesion.

Devices to link ideas

We present more examples showing how ideas can be linked between sentences grammatically. In the second sentence of each example, the given information is referred to using three kinds of grammatical patterns. New ideas can be introduced by adding extra information to the given information.
Example: Contrast from differently doped regions immediately appears when the in-lens detector is switched on. This huge effect is believed to hold some important implications.
Comment: A comment (i.e. the effect is huge) is implicitly introduced when the old information, which is a neutral description of a fact, is referred to. It is preferable to separate description and evaluation clearly in academic writing.

Example: It will be seen in the sample that the simulation is able to adapt again. The ability of this model to emulate changes in the characteristics of different sample types is cited as strong support for its validity.
Comment: The old information is rephrased by changing the part of speech and inverting subject and object. Such rephrasing avoids using "..., which ..." repeatedly for further explanation or elaboration.

Example: In this analysis, "norm" refers to a prescribed or proscribed standard of behaviour. Such a definition is therefore broader than rules alone.
Comment: Using "such a" instead of "this" means that the author is going to generalise. In this case, the author intends to discuss a class of definitions instead of the definition of "norm".

Example: The fact that the same British forces had to cover various commitments at once did stretch them to the limit, and perhaps beyond. The problem of over-stretch had emerged in the late 1950s.
Comment: The use of "problem" and "over-stretch" indicates the author's position in the following discussion of the aforementioned fact. Description and evaluation are cleanly separated.

Example: For example, it would be interesting to find out if the active region contains more crystalline polymer as a result of operation. Such an investigation would involve using staining techniques.
Comment: The noun "investigation" is used to encapsulate the text "to find out ...". The use of "such an" indicates that the following discussion would not be restricted to that particular example.
Remark. The main device for linking ideas in these examples is called nominalisation, which is the use of a noun to express an idea that was previously expressed in the form of a clause or a sentence. Nominalisation is very useful in the construction of an argument, as we can present what we have already said in a shortened form and then move on to the new point following from that information. Here is an example:
I explained the results as such and such. This explanation is in agreement with earlier work.
We can add more details to the noun using prepositional phrases and clauses:
What I have sought to disprove is that there is an incompatibility between a "bad" nationalism and a "good" patriotism. The assertion that nationalism and patriotism are incompatible causes complete confusion when historians, even great ones, are confronted by past thinkers.

Tuesday, July 18, 2017

A note on useful decidable logics & theories

Note that this post discusses decidable fragments that allow at least some degree of quantification, not the decidable theories of ground formulas used in SMT solvers.

Decidable fragments of first-order logic

Monadic first-order logic. Also known as the relational monadic fragment (RMF) or the Löwenheim class [2], this fragment is a classical example of a decidable fragment of first-order logic without equality. The RMF consists of the first-order formulas in which all relation symbols are unary:$$\phi ::= R(x) \mid \neg \phi \mid \phi \vee \phi \mid \phi \wedge \phi \mid \exists x.\phi \mid \forall x.\phi.$$Note that equality is not allowed in this logic and there are no constant symbols. Satisfiability of the logic was first proved to be NE-complete [2], where NE = $\bigcup_c NTIME(2^{cn})$. Since the complete problems for NE and for NEXPTIME coincide [1], the problem is also NEXPTIME-complete. The Löb-Gurevich class (cf. [8]) is a decidable extension of the RMF in which unary function symbols are also allowed. Both classes enjoy the finite model property, cf. [8].

Effectively propositional logic (EPR). [5] Also known as the Bernays-Schönfinkel class, this fragment consists of first-order formulas such that:
1. The vocabulary is restricted to constant and relation symbols.
2. The quantifier prefix is restricted to $\exists^*\forall^*$ for formulas in prenex normal form.
Condition 2 implies that EPR is not closed under negation. EPR enjoys the finite model property, meaning that a satisfiable formula is guaranteed to have a finite model. The satisfiability problem of EPR is NEXPTIME-complete.
We can extend EPR to allow quantifier alternation and function symbols while maintaining the same properties, as long as the formula in the extended logic is "stratified" [11]. More precisely, we define the quantifier alternation graph $G(\phi)$ of a formula $\phi$ as a directed graph where the set of vertices is the set of sorts, and for each function symbol $f : (s_1,\ldots, s_n) \to s$ in the Skolem normal form of $\phi$, there is an edge $s_i \to s$ for $1 \le i \le n$. $\phi$ is called stratified if $G(\phi)$ does not contain a cycle through the sort of some universally quantified variable. For example, $\phi := \forall x. \exists y. f(x) = y$ is skolemized to $\forall x.\ f(x) = g(x)$ with $f,g : A \to B$. Hence $G(\phi)$ is acyclic and thus $\phi$ is in the decidable fragment.
Stratified formulas are decidable because there is no way to generate an infinite sequence of instantiation terms. By contrast, $\psi := \forall x. \exists y. f(y) = x$ is skolemized to $\forall x. f(g(x)) = x$, where $f: A \to B$ and $g: B \to A$. Hence $G(\psi)$ contains a cycle through the sort of $x$, and thus $\psi$ is not in the decidable fragment.
Formulas in the extended EPR can be effectively translated into propositional formulas through an instantiation process over a finite Herbrand domain; see e.g. [12,13]. Extended EPR is supported by first-order provers such as iProver (widely regarded as the most efficient solver for EPR), and by modern SMT solvers such as Z3.
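As a small illustration (assuming the z3-solver Python bindings), the first check below decides the validity of an EPR sentence over a vocabulary with only constants and a relation symbol, and the second check decides the stratified formula $\forall x.\exists y.\ f(x)=y$ from the example above.

from z3 import (DeclareSort, Const, Consts, Function, BoolSort, ForAll, Exists,
                Implies, And, Not, Solver, unsat)

# EPR: constants and relation symbols only, exists*-forall* prefix.
U = DeclareSort("U")
a, b, x, y = Consts("a b x y", U)
z = Const("z", U)
R = Function("R", U, U, BoolSort())
phi = Implies(And(ForAll([x, y], Implies(R(x, y), R(y, x))),   # R is symmetric
                  R(a, b)),
              Exists([z], R(z, a)))                            # some z with z R a
s = Solver()
s.add(Not(phi))                  # phi is valid iff its negation is unsatisfiable
print(s.check() == unsat)        # True

# The stratified formula from the text: its Skolemisation only introduces
# functions of sort A -> B, so the quantifier alternation graph is acyclic.
A, B = DeclareSort("A"), DeclareSort("B")
f = Function("f", A, B)
xa, yb = Const("xA", A), Const("yB", B)
s2 = Solver()
s2.add(Not(ForAll([xa], Exists([yb], f(xa) == yb))))
print(s2.check() == unsat)       # True: the formula is valid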

Presburger arithmetic. The first-order theory of the structure $(\mathbb N, +)$ can be decided by encoding each $n\in \mathbb N$ in binary (LSB first), so that the ternary addition relation corresponds to a regular relation $+_2$ over this encoding and $\mathbb N$ corresponds to the $\omega$-regular language $(0+1)^*0^\omega$. Büchi showed in 1962 that, given any formula $\phi(x_1,...,x_n)$ of ${\rm FO}(\mathbb N, +)$, one can effectively compute an $\omega$-automaton $A(x_1,...,x_n)$ that defines the same relation modulo convolution $\otimes$, namely, $\phi(x_1,...,x_n)$ holds iff $A$ accepts the $\omega$-word $x_1 \otimes \cdots \otimes x_n.$ The satisfiability problem for ${\rm FO}(\mathbb N, +)$ thus reduces to the emptiness problem for $\omega$-automata and is therefore decidable. See [9] for a review of the algorithmic and computational complexity aspects of Presburger arithmetic.
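The flavour of the construction can already be seen on finite words (naturals padded with trailing zero bits, instead of the $\omega$-word encoding above). The sketch below is my own illustration: a two-state automaton whose state is the carry bit runs over the convolution of the LSB-first encodings and recognises exactly the triples with $x+y=z$.

from itertools import product

def to_bits(n, width):
    """LSB-first binary encoding of n, padded with zeros to `width` bits."""
    return [(n >> i) & 1 for i in range(width)]

def accepts_addition(x, y, z):
    """Run the 2-state carry automaton for the relation x + y = z on the
    convolution of the LSB-first encodings of x, y, z."""
    width = max(x, y, z).bit_length() + 1
    carry = 0                                   # automaton state = carry bit
    for a, b, c in zip(to_bits(x, width), to_bits(y, width), to_bits(z, width)):
        s = a + b + carry
        if s & 1 != c:
            return False                        # no transition on this letter
        carry = s >> 1
    return carry == 0                           # accepting state: carry 0

# The automaton defines exactly the graph of addition on small values:
print(all(accepts_addition(x, y, z) == (x + y == z)
          for x, y, z in product(range(16), repeat=3)))   # True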

A list of decidable theories can be found on Wikipedia.

References

1. Structural properties of complete problems for exponential time. 1997.
2. Complexity results for classes of quantification formulas. 2001.
3. Verification of randomised security protocols.
4. Decidability of second-order theories and automata on infinite trees. 1969.
5. Complexity results for classes of quantificational formulas. 1980.
6. The theory of ends, pushdown automata, and second-order logic. 1985.
7. Bisimulation through Probabilistic Testing. 1991.
8. The Classical Decision Problem. 2001.
9. A Survival Guide to Presburger Arithmetic. 2018.
10. Automata: From Logics to Algorithms. 2007.
11. Complete instantiation for quantified formulas in satisfiability modulo theories. 2009.
12. Deciding effectively propositional logic using DPLL and substitution sets. 2008.
13. Proof systems for effectively propositional logic. 2008.

Tuesday, June 20, 2017

A note on bisimulation and coinduction

A machine $(X,Y,S,\delta,\lambda)$ is a 5-tuple composed of an input alphabet $X$, an output alphabet $Y$, a state space $S$, a transition function $\delta:S\times X \rightarrow S$, and an output function $\lambda: S\rightarrow Y.$ We assume without loss of generality that $\delta$ is total. Its domain can be extended to $X^*$ by defining $\delta^*(s,\epsilon)=s$ and $\delta^*(s,wx)=\delta(\delta^*(s,w),x).$ A bisimulation between two machines $M=(X,Y,S,\delta,\lambda)$ and $M'=(X,Y,S',\delta',\lambda')$ is a relation $R\subseteq S\times S'$ such that $(s,s')\in R$ implies $$\lambda(s)=\lambda'(s') \;\wedge\; \forall x\in X.\ (\delta(s,x),\delta'(s',x))\in R.$$ The union of all bisimulations is itself a bisimulation, called the greatest bisimulation and often denoted by $\sim$. When $\sim$ is defined over a single machine, it forms an equivalence relation called bisimilarity. Bisimilarity can be characterised inductively as follows. We define a sequence of equivalence relations $R_0 \supseteq R_1 \supseteq \cdots$ by letting $R_0 = S\times S$ and, for $n\ge 1$, $(s, s') \in R_n$ iff $\lambda(s)=\lambda(s')$ and $(\delta(s, x), \delta(s', x)) \in R_{n-1}$ for all $x\in X.$ It is easy to verify that the $R_n$ converge to the bisimilarity $\sim$, namely, $\bigcap_{n\ge 0} R_n=\sim$. (To see this, note that $\supseteq$ holds as $R_n \supseteq \sim$ for all $n\ge 0.$ Also, $\bigcap_{n\ge 0} R_n$ is a bisimulation and thus $\subseteq$ holds.) This inductive characterisation gives a simple fixpoint procedure for computing bisimilarity; see below.

Bisimulation and language equivalence

We call a machine a (deterministic) automaton if its output alphabet, denoted by $\mathcal{B}$, consists of only "rejected" and "accepted", denoted by $\bot$ and $\top$, respectively. It is interesting to note that a bisimulation $R$ between two automata $M$ and $M'$ over the same input alphabet $X$ can be itself described by an automaton $(X,\mathcal{B},R,\delta_R,\lambda_R)$, such that
1. $\delta_R((s,s'), x)=(\delta(s,x),\delta'(s',x))$ iff $(\delta(s,x),\delta'(s',x))\in R,$ and
2. $\lambda_R((s,s'))=\lambda(s)=\lambda'(s').$
Given an automaton $M=(X,Y,S,\delta,\lambda)$, the state language of a state $s$ of $M$ is defined as $L(s)=\{w\in X^{*}:\lambda\circ\delta^*(s,w)=\top\}.$ We call $L(s)$ the language of $M$ when $s$ is the initial state of $M$. The following implication can be shown by induction on the length of words:$$s\sim s' \implies L(s)=L(s').$$
Bisimulation equivalence is also called behavioural or observational equivalence, regarding the observable behaviour of a state as its state language. In the world of finite-state systems, two other notions of equivalence are often used in practice to prove language equivalence. The first notion is graph isomorphism: as the minimal automaton recognising a ($\omega$-)regular language is unique up to graph isomorphism, checking language equivalence amounts to checking graph isomorphism between the minimised automata. The second notion is trace equivalence, which relates two states iff they yield the same set of traces. Trace equivalence and bisimulation equivalence coincide on deterministic machines, but the latter is finer in general (e.g. consider $a(b+c)$ vs. $ab+ac$). Here is a comparison among the three relations on finite machines:
Complexity: Bisimulation equivalence is $O(m\log n)$; Graph isomorphism is in NP; Trace equivalence is PSPACE-Complete.
Fineness: Graph isomorphism (w.o. minimisation) $\subseteq$ Bisimulation equivalence $\subseteq$ Trace equivalence $\subseteq$ Language equivalence

The proof principle of coinduction

Coinduction is a mathematical tool to define and reason about bisimulations. To demonstrate its use, we present a technique that proves language equivalence using coinduction [1]. The $w$-derivative of a language $L\subseteq X^{*}$ is defined as $L_{w}=\{u\in X^{*}:wu\in L\}.$ The set $\mathcal{L}$ of languages over $X$ can be turned into an automaton $M_{\mathcal{L}}=(X,\mathcal{B},\mathcal{L},\delta,\lambda)$ such that $\delta(L,x)=L_{x}$ and $\lambda(L)=\top$ iff $\epsilon\in L$. It turns out that $L(l)=l$ for all $l\in\mathcal{L}$, namely the state language of each state in $M_{\mathcal{L}}$ is the state itself. Hence we have$$L\sim L' \iff L=L'.$$This fact provides a uniform way to prove the equivalence of two languages. The more familiar method of proof by induction requires one to start from the equality of the base case (the pair of minimal words in the two languages) and proceed to establish the equivalence of pairs of longer words. Induction, however, does not work when the languages contain words of infinite length. In contrast, a proof by coinduction starts from $\{(L,L')\}$ and keeps adding pairs of derivatives via the transitions until a least fixed point is reached. This fixed point, whenever it exists, is a bisimulation contained in the bisimilarity $\sim$ and thus witnesses $L = L'.$
The principle of coinduction is not always effective: the process of constructing a bisimulation may not terminate even when the target languages are equivalent. This is to be expected, since otherwise the halting problem would be solvable. On the other hand, the coinduction process always terminates when the two languages under consideration are regular: it either constructs a bisimulation when they are equivalent, or finds a counterexample when they are not. When the two languages are the same regular language, the process actually constructs the smallest finite automaton recognising them (more details later). This fact is a corollary of Kleene's Theorem, which states that a language is regular iff it is accepted by a finite automaton; see Section 8 of [1] for details. In fact, the most efficient equivalence checking algorithm for regular languages to date is based on coinduction [4].
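The following Python sketch (my own, not the algorithm of [4]) implements this coinductive check for deterministic finite automata: starting from the pair of initial states, it keeps adding pairs of derivative states; the accumulated relation is a bisimulation when the process finishes, and a distinguishing word is returned otherwise.

from collections import deque

def equivalent(dfa1, dfa2):
    """Coinductive language-equivalence check for two DFAs.  Each DFA is a
    triple (initial, delta, accepting) with a total delta of the form
    (state, letter) -> state over the same alphabet."""
    (q1, d1, acc1), (q2, d2, acc2) = dfa1, dfa2
    alphabet = {a for (_, a) in d1} | {a for (_, a) in d2}
    todo = deque([(q1, q2, "")])
    relation = {(q1, q2)}
    while todo:
        s1, s2, w = todo.popleft()
        if (s1 in acc1) != (s2 in acc2):
            return False, w                    # outputs differ: w distinguishes
        for a in alphabet:
            pair = (d1[(s1, a)], d2[(s2, a)])
            if pair not in relation:
                relation.add(pair)
                todo.append((*pair, w + a))
    return True, relation                      # a bisimulation containing (q1, q2)

# Two DFAs over {a, b} for "even number of a's", with different state names.
A = ("e", {("e", "a"): "o", ("e", "b"): "e", ("o", "a"): "e", ("o", "b"): "o"}, {"e"})
B = (0, {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}, {0})
print(equivalent(A, B))    # (True, {('e', 0), ('o', 1)}) up to set ordering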

Bisimulation and fixed points

Given a machine $M = (X,Y,S,\delta,\lambda)$, one can define a monotone function $$F(A) := \{(s,s')\in S\times S : \lambda(s)=\lambda(s') \wedge \forall x\in X.\ (\delta(s,x),\delta(s',x))\in A\}$$on the lattice $(2^{S\times S},\subseteq).$ In the terminology of fixed-point theory, bisimulations are the post-fixed points of $F$ (relations $A$ with $A \subseteq F(A)$), and bisimilarity is the greatest fixed point $\nu X.~F(X).$ In this regard, a proof by coinduction can be seen as showing that a property is closed under backward computation and contained in a coinductively defined set (i.e. the greatest fixed point). In contrast, induction involves showing that a property is closed under forward computation and contains an inductively defined set (i.e. the least pre-fixed point). Intuitively, backward means that the set is obtained by repeatedly refining larger sets until the first (i.e. the greatest) fixpoint is reached; forward, conversely, means that the set is obtained by repeatedly extending smaller sets until the first (i.e. the least) fixpoint is reached.
The following procedure gives a naive fixpoint computation for the bisimilarity over a finite and non-deterministic machine. The bisimilarity is represented as a functional relation $\rho \subset S\times\mathbb N$ that maps states to labels of the equivalence classes induced by the bisimilarity.
Procedure ComputeBisimilarity
Begin
  Set $\rho := \{(s, 1) : s \in S\}$
  Repeat
    Set $\rho_{\rm old} := \rho$
    Set $\gamma := \{(s, \{ (x, \emptyset) : x \in X \}) : s \in S \}$
    For all $s, x, s'$ such that $s' \in \delta(s,x)$ do
      Set $\gamma(s)(x) := \gamma(s)(x) \cup \{ \rho_{\rm old}(s') \}$
    Set $id := 0$, $key := \bot$, $\rho := \emptyset$
    Let $L$ be a listing of all pairs $(s, (\lambda(s), \rho_{\rm old}(s), \gamma(s)))$ sorted by the second component
    For each $(s, k)$ in $L$ do
      If $k \neq key$ then
        Set $key := k$, $id := id + 1$
      Set $\rho := \rho \cup \{(s, id)\}$
  Until $\rho$ and $\rho_{\rm old}$ have the same number of labels
  Output $\rho$ as the labelling of the equivalence classes induced by the bisimilarity
End
The procedure starts from the trivial partition in which all states are equivalent. In each iteration, the algorithm computes a mapping $\gamma$ from each state to the labels of its successors under the current partition, and then re-labels each state according to its output, its old label, and the labelling of its successors. The refinement continues until the labelling is stable, meaning that the greatest fixpoint (i.e. the largest bisimulation) has been reached. Surveys of more efficient algorithms for computing bisimulations are available in the literature.
The fact that bisimulation relation can be constructed inductively implies that coinductive and inductive reasoning should yield the same results on inductively defined structures such as lists and trees, and make a difference only on structures that contain cycles or infinite paths. Further, while bisimulation coincides with bisimilarity on deterministic machines (such as $M_{\mathcal L}$), coinduction in general does not pinpoint bisimilarity for non-deterministic machines.

Bisimulation and minimisation

A homomorphism between two machines $M$ and $M'$ over the same input and output alphabets is a function $f:S\rightarrow S'$ satisfying $\lambda'(f(s))=\lambda(s)$ and $f(\delta(s,x))=\delta'(f(s),x)$, namely $$ \array{ S\times X &\overset{f\times id_X}{\longrightarrow}& S'\times X \\ \delta \downarrow & & \downarrow \delta' \\ S &\overset{f}{\longrightarrow}& S' }$$commutes (and the outputs are preserved). A homomorphism $f$ is called unique if for any homomorphism $g$ there exists an isomorphism $h$ such that $f=h\circ g$. The notion of unique homomorphisms is closely related to that of bisimulations. First, a mapping $f:S\rightarrow S'$ is a homomorphism iff $\{(s,f(s)):s\in S\}$ is a bisimulation between the machines $M$ and $M'$. Furthermore, given a homomorphism $f:S\rightarrow S'$,
i) If $R$ is a bisimulation on $M$, then $f(R)=\{(f(u),f(v)):(u,v)\in R\}$ is a bisimulation on $M'$. [p.34, 5]
ii) If $R'$ is a bisimulation on $M'$, $f^{-1}(R')=\{(u,v):(f(u),f(v))\in R'\}$ is a bisimulation on $M$. [p.34, 5]
iii) $f$ is unique iff for any homomorphism $g:S\rightarrow S'$, the set $\{(f(s),g(s)):s\in S\}$ is a bisimulation on $M'$.
Unique homomorphisms are a powerful tool for studying languages and automata. For example, fix an automaton $M$ and consider the mapping $\phi: s \mapsto L(s)$ from $M$ to $M_{\mathcal L}.$ It is straightforward to check that $\phi$ is a unique homomorphism. One interesting fact about $\phi$ is that it identifies precisely the bisimilar states in $M$. That is, $$u\sim v \iff \phi(u)=\phi(v),$$which follows from the fact that $\{(u,v)\in S\times S: \phi(u)=\phi(v)\}$ is a bisimulation on $M$ and that $\{(\phi(u),\phi(v))\in \mathcal{L}\times \mathcal{L}:u\sim v\}$ is a bisimulation on $M_{\mathcal L}.$ Another remarkable fact is that $\phi$ effectively minimises automata. To see this, let $\langle s \rangle_N$ (resp. $\langle S \rangle_N$) denote the sub-automaton generated by a state $s$ (resp. a set of states $S$) in automaton $N$, i.e., the automaton induced by the states reachable from $s$ (resp. $S$). Also, we lift $\phi$ to a graph homomorphism such that, given an automaton $N$, $\phi(N)$ is the sub-automaton of $M_{\mathcal L}$ generated by $\{\phi(s):s\in State(N)\}$. It turns out that$$\langle L(s) \rangle_{M_{\mathcal L}} = \langle \phi(s) \rangle_{M_{\mathcal L}} = \phi(\langle s \rangle_M).$$Since, given $L$, we are free to choose any $s$ and $M$ satisfying $L(s)=L$, it follows that
i) If $L$ is regular, then $\langle L \rangle_{M_{\mathcal L}}$ is the canonical minimum automaton accepting $L$, and
ii) $\phi(\langle s \rangle_M)$ is the canonical minimisation of automaton $\langle s \rangle_M$.
In particular, $L$ is accepted by a finite automaton iff $\{L_w: w\in X^*\}$ is finite. This result is equivalent to the well-known characterisation of regular languages by Nerode and Myhill:
Theorem. (Nerode-Myhill) $L$ is accepted by a finite automaton iff $R_L$ has finite index, where $R_L$ is the equivalence relation on $X^*$ defined by $(u,v)\in R_L$ iff $\forall w\in X^*.\ (uw\in L \iff vw\in L).$

References and further reading

1. Automata and Coinduction (An Exercise in Coalgebra)
2. Introduction to Bisimulation and Coinduction (book) (slides)
3. An introduction to (co)algebra and (co)induction
4. Checking NFA Equivalence with Bisimulations up to Congruence
5. Universal Coalgebra: A Theory of Systems
6. Bisimulation and Language Equivalence