Finite trees and their automata.
Apart from graphs, we will also use mso to define properties of trees (and their special case, words). Define a ranked alphabet to be a finite set where every element has an associated arity in . Here is a picture of a ranked alphabet:
A tree over a ranked alphabet is defined to be a (rooted) tree where every node has a label from . Furthermore, for each node the number of children is the same as the rank of the label, and we assume that these children are ordered, i.e. one can speak of the first child, second child, etc. Here is a picture of a tree over the alphabet from the example above:
A tree over an alphabet can be viewed as a logical structure, call it . The universe of is the nodes. For every there is a binary predicate which says that is the -th child of , and for every there is a predicate which says that has label . Using mso, one can define a descendant predicate: a node is a descendant of if and only if every belongs to every set that contains and is closed under children.
We say that a set of trees over a ranked alphabet is mso definable if there is an mso formula such that
Examples of mso definable sets of trees include: “the number of nodes is divisible by 3” or “some root-to-leaf path has even length”.
Tree automata
Apart from mso, another mechanism of defining sets of trees (also known as tree languages) is to use automata. There also exist regular expressions for trees, but we do not discuss those here.
A (nondeterministic) tree automaton consists of:
An automaton is called deterministic if every transition relation is actually a function . A run of the automaton over an input tree is a labelling of tree nodes by states which is consistent with the transition relation in the following sense: if a node has label and its children have states then the state in node satisfies
Note that that this definition covers the special case of leaves; if a leaf has label then the transition relation indicates what are the states allowed in leaves with state . Therefore, the transition relations for labels of rank zero can be viewed as initial states in the automaton. A tree is accepted by the automaton if there is some run which has an accepting state in the root. Therefore, one can view the run of the automaton as a bottom-up pass through the tree, which is considered accepting if the last state (the root) has an accepting state. That is why the automaton model described above is sometimes called a bottom-up automaton; although the distinction between top-down or bottom-up makes most sense when talking about deterministic automata.
The same proof as for automata on words, i.e. the subset construction, shows that tree automata can be determinised. The proof of the following theorem is skipped, it is a straightforward induction on the size of an mso formula. The proof uses the fact that tree automata have all the closure properties that correspond to operators in mso formulas, such as Boolean combinations and projections (guessing).
Theorem. If a tree language is definable in mso, then it is recognised by some tree automaton.
The converse of the above theorem is also true, i.e. every language definable by a tree automaton can be defined in mso.
Leave a Reply