LTCS–Report Approximation in Description Logics: How Weighted Tree Automata Can Help to Define the Required Concept Comparison Measures in FL 0

Abstract. Recently introduced approaches for relaxed query answering, approximately defining concepts, and approximately solving unification problems in Description Logics have in common that they are based on the use of concept comparison measures together with a threshold construction. In this paper, we will briefly review these approaches, and then show how weighted automata working on infinite trees can be used to construct computable concept comparison measures for FL 0 that are equivalence invariant w.r.t. general TBoxes. This is a first step towards employing such measures in the mentioned approximation approaches.


Introduction
Description Logics (DLs) [5] are a well-investigated family of logic-based knowledge representation languages, which are frequently used to formalize ontologies for application domains such as biology and medicine [21]. To define the important notions of such an application domain as formal concepts, DLs state necessary and sufficient conditions for an individual to belong to a concept. These conditions can be atomic properties required for the individual (expressed by concept names) or properties that refer to relationships with other individuals and their properties (expressed as role restrictions). The expressivity of a particular DL L is determined on the one hand by what sort of properties can be required and how they can be combined. On the other hand, DLs provide their users with ways of stating terminological axioms in a so-called TBox. The simplest kind of TBoxes are called acyclic TBoxes, which consist of concept definitions without cyclic dependencies among the defined concepts. Basically, such a TBox introduces abbreviations for complex concept descriptions. General TBoxes use so-called general concept inclusions (GCIs) to state subconcept-superconcept constraints between concepts. Once the relevant concepts of an application domain are formalized in a TBox, they can be employed to state information about specific entities (individuals, objects) and their relationships in a so-called ABox. Given a TBox and an ABox, queries can then be used to retrieve new information from the data formalized this way. We will introduce the basic notions of DLs in Section 2, and define three DLs of different expressive power, namely the DLs ALC, EL, and FL 0.
Since the semantics of traditional DLs is based on classical first-order logic, the interpretation of the properties required for a concept is strict in the sense that all these properties need to be satisfied for an individual to belong to a concept, and the same is true for answers to queries. In applications where exact definitions are hard to come by, it would be useful to relax this strict requirement and allow for approximate definitions of concepts, where most, but not all, of the stated properties are required to hold. Similarly, if a query has no exact answer, approximate answers that satisfy most of the features the query is looking for could be useful. For example, in clinical diagnosis, diseases are often linked to a long list of medical signs and symptoms, but patients that have a certain disease rarely show all these signs and symptoms. Instead, one looks for the occurrence of sufficiently many of them. Similarly, people looking for a flat to rent or a bicycle to buy may have a long list of desired properties, but will also be satisfied if many, but not all, of them are met.
In order to allow for approximate definitions of concepts, we have introduced the notion of a graded membership function in [4]. Instead of a Boolean membership value 0 or 1 such a graded function yields a membership degree from the interval [0,1]. Threshold concepts can then, for example, require that an individual belongs to a concept C with degree at least 0.8. A different approach, which is based on the use of similarity measures on concepts [25], was used by Ecke et al. [18,19] to relax instance queries (i.e., queries that consist of a single concept). Given a query concept C, they are looking for answers to queries D whose similarity to C is higher than a certain threshold. While these two approaches were originally developed independently of each other, it has turned out that there are close connections. Similarity measures can be used to define graded membership functions, and threshold concepts w.r.t. these functions provide a more natural semantics for relaxed instance queries [4,6]. Thus, in both approximation approaches mentioned until now, the availability of appropriate measures for comparing concepts is crucial. The same is true for the approximate unification of concepts introduced in [7]. Basically, unification in DLs tries to make two concepts equivalent by replacing some of the concept names occurring in their descriptions by complex concepts [9,8]. In the approximate case, one requires that concepts are made "almost" equivalent, where the meaning of "almost" is formalized using distance measures between concepts. Strictly speaking, these distance measures are not similarity measures in the sense of [25] since they need not map into [0,1]. In the following, we will call functions that compare pairs of concepts by mapping them into a (usually numerical) domain equipped with a partial order concept comparison measures.
An indispensable requirement for the concept comparison measures used in the three approximation approaches mentioned above is that they respect the semantics of concepts in the sense that they are invariant under equivalence of concepts w.r.t. their definitions in the TBox. For the DL EL, a framework for defining concept similarity measures that are equivalence invariant w.r.t. acyclic TBoxes has been introduced in [25]. This was extended in [19] to general TBoxes. For FL 0 , concept similarity measures that are equivalence invariant for acyclic TBoxes were introduced in [30]. The main technical contribution of this paper is to introduce a framework for defining computable concept comparison measures for FL 0 that are equivalence invariant w.r.t. general TBoxes. Basically, this is achieved by leveraging a new formal language-based characterization of equivalence in FL 0 w.r.t. general TBoxes [28], where the semantics of a concept is characterized using a tuple of (possibly infinite) formal languages. Following the ideas in [9,10,28], such tuples can be represented by (infinite) trees. These trees (or more precisely, appropriate finite representations of them) can in turn be used as inputs for weighted tree automata [31], which then yield the output of the measure. We will show that, under certain conditions on the weighted tree automata, this approach indeed yields computable concept comparison measures.

Description Logics, concept comparison measures, and approximation
We start by recalling basic notions of Description Logics, and in particular the DLs ALC, EL, and FL 0 . Then, we introduce concept comparison measures, which generalize concept similarity measures, and finally we show how such measures can be used to relax query answering, approximately define concepts, and approximately solve unification problems.

Description Logics
In Description Logics, concept constructors are used to build complex concept descriptions out of concept names (unary predicates) and role names (binary predicates). A particular DL $\mathcal{L}$ is determined by the available constructors. Given finite, disjoint sets $N_C$ and $N_R$ of concept names and role names, respectively, we denote the set of all concept descriptions that can be built from $N_C$ and $N_R$ using the constructors of $\mathcal{L}$ with $\mathcal{C}_{\mathcal{L}}(N_C, N_R)$.
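As an example, consider the constructors top concept ($\top$), bottom concept ($\bot$), conjunction ($C \sqcap D$), disjunction ($C \sqcup D$), negation ($\neg C$), value restriction ($\forall r.C$), and existential restriction ($\exists r.C$), which determine the DL ALC. Then, $\mathcal{C}_{\mathcal{ALC}}(N_C, N_R)$ is inductively defined as follows:
- every concept name $A \in N_C$, as well as $\top$ and $\bot$, belongs to $\mathcal{C}_{\mathcal{ALC}}(N_C, N_R)$;
- if $C, D \in \mathcal{C}_{\mathcal{ALC}}(N_C, N_R)$, then $C \sqcap D$, $C \sqcup D$, and $\neg C$ belong to $\mathcal{C}_{\mathcal{ALC}}(N_C, N_R)$;
- if $C \in \mathcal{C}_{\mathcal{ALC}}(N_C, N_R)$ and $r \in N_R$, then $\forall r.C$ and $\exists r.C$ belong to $\mathcal{C}_{\mathcal{ALC}}(N_C, N_R)$.

We will also consider the following two sub-logics EL and FL 0 of ALC:
- EL has the constructors top concept, conjunction, and existential restriction;
- FL 0 has the constructors top concept, conjunction, and value restriction.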
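The semantics of a DL $\mathcal{L}$ is defined using first-order interpretations $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ consisting of a non-empty domain $\Delta^{\mathcal{I}}$ and an interpretation function $\cdot^{\mathcal{I}}$ that assigns a set $A^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}$ to each concept name $A \in N_C$ and a binary relation $r^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$ to each role name $r \in N_R$. This function is extended to complex concept descriptions by assigning a set $C^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}$ to each $C \in \mathcal{C}_{\mathcal{L}}(N_C, N_R)$ according to the semantics of the constructors of $\mathcal{L}$. The semantics of the constructors is defined by equations that enable the inductive definition of $C^{\mathcal{I}}$ for any interpretation $\mathcal{I}$. For the constructors of ALC, these equations are
\[
\begin{aligned}
\top^{\mathcal{I}} &= \Delta^{\mathcal{I}}, \qquad \bot^{\mathcal{I}} = \emptyset, \\
(C \sqcap D)^{\mathcal{I}} &= C^{\mathcal{I}} \cap D^{\mathcal{I}}, \qquad (C \sqcup D)^{\mathcal{I}} = C^{\mathcal{I}} \cup D^{\mathcal{I}}, \qquad (\neg C)^{\mathcal{I}} = \Delta^{\mathcal{I}} \setminus C^{\mathcal{I}}, \\
(\forall r.C)^{\mathcal{I}} &= \{x \in \Delta^{\mathcal{I}} \mid \forall y \in \Delta^{\mathcal{I}} : (x, y) \in r^{\mathcal{I}} \Rightarrow y \in C^{\mathcal{I}}\}, \\
(\exists r.C)^{\mathcal{I}} &= \{x \in \Delta^{\mathcal{I}} \mid \exists y \in \Delta^{\mathcal{I}} : (x, y) \in r^{\mathcal{I}} \wedge y \in C^{\mathcal{I}}\}.
\end{aligned}
\]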
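To make the inductive semantics concrete, the following minimal Python sketch evaluates concept descriptions over a finite interpretation. The tuple encoding of concepts and the names `extension`, `conc_int`, and `role_int` are assumptions made for this illustration; they are not notation from the paper.

```python
def extension(concept, domain, conc_int, role_int):
    """Compute C^I for a finite interpretation I.

    concept  -- a concept name (str) or a tuple such as ("and", C, D),
                ("or", C, D), ("not", C), ("all", r, C), ("some", r, C),
                ("top",), ("bot",)   (hypothetical encoding)
    domain   -- the finite domain Delta^I
    conc_int -- dict: concept name -> set of domain elements
    role_int -- dict: role name -> set of pairs of domain elements
    """
    def ext(c):
        return extension(c, domain, conc_int, role_int)

    if isinstance(concept, str):
        return frozenset(conc_int.get(concept, ()))
    op = concept[0]
    if op == "top":
        return frozenset(domain)
    if op == "bot":
        return frozenset()
    if op == "and":
        return ext(concept[1]) & ext(concept[2])
    if op == "or":
        return ext(concept[1]) | ext(concept[2])
    if op == "not":
        return frozenset(domain) - ext(concept[1])
    if op in ("all", "some"):
        role, filler = concept[1], ext(concept[2])
        pairs = role_int.get(role, ())

        def successors(x):
            return [y for (x2, y) in pairs if x2 == x]

        if op == "all":
            # every role successor must belong to the filler
            return frozenset(x for x in domain
                             if all(y in filler for y in successors(x)))
        # at least one role successor must belong to the filler
        return frozenset(x for x in domain
                         if any(y in filler for y in successors(x)))
    raise ValueError(f"unknown constructor: {op!r}")


domain = {1, 2, 3}
conc_int = {"A": {2}}
role_int = {"r": {(1, 2), (1, 3)}}
print(sorted(extension(("all", "r", "A"), domain, conc_int, role_int)))   # [2, 3]
print(sorted(extension(("some", "r", "A"), domain, conc_int, role_int)))  # [1]
```

Note that elements without any r-successor satisfy every value restriction for r, which is why 2 and 3 belong to the extension of the value restriction in this toy interpretation.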
An $\mathcal{L}$ terminology (TBox) $\mathcal{T}$ is a finite set of general concept inclusions (GCIs), which are expressions of the form $C \sqsubseteq D$ for $C, D \in \mathcal{C}_{\mathcal{L}}(N_C, N_R)$. The interpretation $\mathcal{I}$ is a model of $\mathcal{T}$ if it satisfies all its GCIs, i.e., $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$ holds for all GCIs $C \sqsubseteq D$ in $\mathcal{T}$. An $\mathcal{L}$ ABox $\mathcal{A}$ is a finite set of assertions, which are expressions of the form $C(a)$ or $r(a, b)$, where $C \in \mathcal{C}_{\mathcal{L}}(N_C, N_R)$, $r \in N_R$, and $a, b$ are elements of an additional set $N_I$ of individual names, which is disjoint from $N_C$ and $N_R$. An interpretation then additionally assigns elements $a^{\mathcal{I}} \in \Delta^{\mathcal{I}}$ to individual names $a \in N_I$. The interpretation $\mathcal{I}$ is a model of $\mathcal{A}$ if it satisfies all its assertions, i.e., $a^{\mathcal{I}} \in C^{\mathcal{I}}$ (resp. $(a^{\mathcal{I}}, b^{\mathcal{I}}) \in r^{\mathcal{I}}$) holds for all assertions $C(a)$ (resp. $r(a, b)$) in $\mathcal{A}$.
Given an $\mathcal{L}$ TBox $\mathcal{T}$ and two $\mathcal{L}$ concept descriptions $C, D$, we say that $C$ is subsumed by $D$ w.r.t. $\mathcal{T}$ (denoted as $C \sqsubseteq_{\mathcal{T}} D$) if $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$ for all models $\mathcal{I}$ of $\mathcal{T}$. These two concept descriptions are equivalent (denoted as $C \equiv_{\mathcal{T}} D$) if $C \sqsubseteq_{\mathcal{T}} D$ and $D \sqsubseteq_{\mathcal{T}} C$. Equivalent concept descriptions have the same meaning w.r.t. $\mathcal{T}$ in the sense that they always (i.e., in every model of $\mathcal{T}$) yield the same set. In the presence of an $\mathcal{L}$ ABox $\mathcal{A}$, we can also consider the instance problem: given an individual name $a$ and an $\mathcal{L}$ concept description $C$, we say that $a$ is an instance of $C$ in $\mathcal{A}$ w.r.t. $\mathcal{T}$ (written $\mathcal{A} \models_{\mathcal{T}} C(a)$) if $a^{\mathcal{I}} \in C^{\mathcal{I}}$ for all models $\mathcal{I}$ of $\mathcal{T}$ that are also models of $\mathcal{A}$. For the DL EL, the subsumption, equivalence, and instance problems are polynomial [14], whereas they are ExpTime-complete for FL 0 [3] and for ALC [32].

Concept comparison measures
Subsumption and equivalence can be seen as operations that compare concept descriptions, and yield the comparison value 1 if the relation holds and 0 otherwise, i.e.,
\[
\sqsubseteq_{\mathcal{T}}(C, D) :=
\begin{cases}
1 & \text{if } C \sqsubseteq_{\mathcal{T}} D, \\
0 & \text{otherwise},
\end{cases}
\]
and accordingly for equivalence. Intuitively, concept comparison measures generalize such operations by yielding a degree to which the comparison relation is satisfied. More formally, they return a value in a partially ordered set.
Definition 1. Let $\mathcal{L}$ be a DL. A concept comparison measure (CCM) for $\mathcal{L}$ is a family of functions $c$ that contains, for every $\mathcal{L}$ TBox $\mathcal{T}$, an equivalence invariant function
\[
c_{\mathcal{T}} : \mathcal{C}_{\mathcal{L}}(N_C, N_R) \times \mathcal{C}_{\mathcal{L}}(N_C, N_R) \to S,
\]
where
- $S$ is a non-empty set equipped with a partial order $\le_S$, and
- equivalence invariant means that $C \equiv_{\mathcal{T}} C'$ and $D \equiv_{\mathcal{T}} D'$ imply $c_{\mathcal{T}}(C, D) = c_{\mathcal{T}}(C', D')$.

The reason we require equivalence invariance is that we do not view concept descriptions as syntactic objects, but rather as semantic ones that, for every interpretation, yield a subset of the interpretation domain. Since equivalent concept descriptions always yield the same sets, they are the same objects from a semantic point of view, and thus should also be treated the same way by the comparison function. The partial order on $S$ allows us to compare different comparison degrees. We will later use the natural numbers and the non-negative real numbers, possibly extended with infinity $+\infty$, as well as the closed real interval $[0, 1]$ with the obvious orders as sets $S$.
Well-investigated examples of CCMs are concept similarity measures (CSMs), for which $S = [0, 1]$ (see e.g., [25]). Intuitively, a CSM $\bowtie_{\mathcal{T}}$ is a graded variant of equivalence, where two concept descriptions $C, D$ are equivalent iff $\bowtie_{\mathcal{T}}(C, D) = 1$, and they become less and less similar with decreasing value $\bowtie_{\mathcal{T}}(C, D)$. Usually, one also requires CSMs to be symmetric in the sense that $\bowtie_{\mathcal{T}}(C, D) = \bowtie_{\mathcal{T}}(D, C)$. For the DL EL, a framework for defining CSMs satisfying certain additional properties has been introduced in [25], but equivalence invariance was only achieved for so-called acyclic TBoxes. This was extended in [19] to general TBoxes. For FL 0, CSMs that are equivalence invariant for acyclic TBoxes were introduced in [30]. We will show later how CCMs for FL 0 that are equivalence invariant for general TBoxes can be obtained by using weighted tree automata. CSMs for ALC are, for instance, investigated in [15].
Our definition of CCMs encompasses CSMs, but also covers other measures such as concept distance measures, which are mappings into [0, +∞) for which a larger value indicates that the concept descriptions are less similar (see e.g., [7]). In addition, it covers graded variants of subsumption, which map into [0, 1], but in contrast to CSMs are not supposed to be symmetric. For example, the CSMs for EL and FL 0 in [34] and [30], respectively, are based on asymmetric concept subsumption measures, which are then turned into symmetric CSMs by combining the results of the comparisons in both directions by computing the average.

Approximation
In contrast to approaches that try to speed up reasoning by employing approximate inference techniques [27], we use approximation as a way to extend the range of admissible answers to queries or admissible elements of concept descriptions. In this context, CCMs can be used together with a threshold construction to define which answers or individuals are admissible.
Relaxing instance queries. Ecke et al. [18,19] use CSMs to relax instance queries in EL, i.e., instead of requiring that an individual is an instance of the query concept, they only require that it is an instance of a concept description that is "similar enough" to the query concept.

Definition 2.
Let $\bowtie$ be a CSM for EL, $\mathcal{T}$ an EL TBox, $\mathcal{A}$ an EL ABox, and $t \in [0, 1)$. The individual $a \in N_I$ is a relaxed instance of the EL concept description $Q$ w.r.t. $\mathcal{T}$, $\mathcal{A}$, $\bowtie$, and the threshold $t$ if there exists an EL concept description $X$ such that $\bowtie_{\mathcal{T}}(Q, X) > t$ and $\mathcal{A} \models_{\mathcal{T}} X(a)$.
Ecke et al. [18,19] show that, under certain conditions on the CSMs used, the relaxed instance problem for EL is decidable. They also introduce a class of polynomially computable CSMs on EL concept descriptions for which the relaxed instance problem is in NP.
Adding threshold concepts to EL. In [4], a similar construction is used to relax membership in EL concept descriptions. To be more precise, the authors introduce the notion of a graded membership function m to generalize elementhood in concept descriptions, and then use a threshold construction to obtain new concept constructors.

Definition 3.
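A graded membership function $m$ is a family of functions that contains for every interpretation $\mathcal{I}$ a function $m^{\mathcal{I}} : \Delta^{\mathcal{I}} \times \mathcal{C}_{\mathcal{EL}}(N_C, N_R) \to [0, 1]$ satisfying the following conditions (for $C, D \in \mathcal{C}_{\mathcal{EL}}(N_C, N_R)$):
- M1: $d \in C^{\mathcal{I}} \Leftrightarrow m^{\mathcal{I}}(d, C) = 1$ for all $d \in \Delta^{\mathcal{I}}$;
- M2: $C \equiv D \Leftrightarrow m^{\mathcal{I}}(d, C) = m^{\mathcal{I}}(d, D)$ for all interpretations $\mathcal{I}$ and all $d \in \Delta^{\mathcal{I}}$.

Intuitively, given an interpretation $\mathcal{I}$ and $d \in \Delta^{\mathcal{I}}$, $m^{\mathcal{I}}(d, C) \in [0, 1]$ represents the degree to which $d$ belongs to $C$ in $\mathcal{I}$. The threshold concept $C_{\sim t}$ for $\sim\, \in \{<, \le, >, \ge\}$ then collects all the elements of $\Delta^{\mathcal{I}}$ that belong to $C$ with degree $\sim t$, as measured by $m$. To be more precise, the formal semantics of threshold concepts is defined as follows: $(C_{\sim t})^{\mathcal{I}} := \{d \in \Delta^{\mathcal{I}} \mid m^{\mathcal{I}}(d, C) \sim t\}$. The DL $\tau\mathcal{EL}(m)$ extends EL with such threshold concepts.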
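The threshold semantics can be sketched in a few lines of Python. The membership function below is a deliberately naive toy (the fraction of satisfied conjuncts of a conjunction of concept names); it is an assumption made purely for illustration and is not the function deg of [4]. The names `threshold_extension`, `toy_membership`, and `names_of` are likewise hypothetical.

```python
import operator
from fractions import Fraction

COMPARATORS = {"<": operator.lt, "<=": operator.le,
               ">": operator.gt, ">=": operator.ge}


def threshold_extension(domain, m, concept, comparator, t):
    """Return (C_{~t})^I = {d in domain | m(d, C) ~ t}."""
    compare = COMPARATORS[comparator]
    return {d for d in domain if compare(m(d, concept), t)}


# Toy interpretation: the concept names that hold for each individual.
names_of = {
    "p1": {"Fever", "Cough", "Rash"},
    "p2": {"Fever", "Cough"},
    "p3": {"Rash"},
}


def toy_membership(d, concept):
    """Fraction of the conjuncts of `concept` (a tuple of concept names)
    that the individual d satisfies (illustrative only)."""
    satisfied = sum(1 for name in concept if name in names_of[d])
    return Fraction(satisfied, len(concept))


disease = ("Fever", "Cough", "Rash")
result = threshold_extension(names_of, toy_membership, disease, ">=", Fraction(2, 3))
print(sorted(result))  # ['p1', 'p2']
```

With the threshold 2/3, the individual p2 is admitted although it lacks one of the three required properties, which mirrors the clinical diagnosis example from the introduction.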
In [4] a specific such graded membership function called deg is introduced and the complexity of reasoning in $\tau\mathcal{EL}(deg)$ w.r.t. empty TBoxes (NP-complete or coNP-complete, depending on the reasoning problem) is investigated in detail. In addition, it is shown that, using a construction similar to the one in Definition 2, a CSM $\bowtie$ satisfying certain properties can be used to define a graded membership function $m_{\bowtie}$:
\[
m_{\bowtie}^{\mathcal{I}}(d, C) := \max\{\bowtie_{\emptyset}(C, D) \mid D \in \mathcal{C}_{\mathcal{EL}}(N_C, N_R) \text{ and } d \in D^{\mathcal{I}}\}.
\]
To ensure that this construction yields a well-defined graded membership function, the CSM must be equivalence invariant, role-depth bounded, and equivalence closed (see [25,4] for definitions of the latter two properties). Finally, the authors of [4] prove that answering relaxed instance queries w.r.t. $\bowtie$ is the same as answering instance queries for threshold concepts $C_{>t}$ in $\tau\mathcal{EL}(m_{\bowtie})$.
In [6] it is shown that, for computable CSMs $\bowtie$ satisfying these properties, reasoning in $\tau\mathcal{EL}(m_{\bowtie})$ can effectively be reduced to reasoning in the DL ALC, but the reduction is in general nonelementary. The authors then introduce a class of CSMs such that reasoning in $\tau\mathcal{EL}(m_{\bowtie})$ has the same complexity as reasoning in $\tau\mathcal{EL}(deg)$.

Approximate unification
Unification has been introduced as a novel inference service that can be used to detect redundancies in ontologies. Basically, in unification one views some of the concept names in concept descriptions as variables, which can be replaced by concept descriptions using a substitution. The goal is then to make two concept descriptions equivalent by applying the same substitution to both. For example, consider the FL 0 concept descriptions
\[
C = A \sqcap \forall r.(X \sqcap \forall s.B) \quad\text{and}\quad D = Y \sqcap \forall r.(Z \sqcap A \sqcap \forall r.B).
\]
Obviously, the substitution $\sigma$ that replaces $X$ by $A \sqcap \forall r.B$, $Y$ by $A$, and $Z$ by $\forall s.B$ makes $C, D$ equivalent (w.r.t. the empty TBox):
\[
\sigma(C) = A \sqcap \forall r.(A \sqcap \forall r.B \sqcap \forall s.B) \equiv_{\emptyset} A \sqcap \forall r.(\forall s.B \sqcap A \sqcap \forall r.B) = \sigma(D).
\]
Such substitutions are called unifiers.
Unification was first investigated in detail for the DL FL 0 [9], and later on for the DL EL [8]. The unification problem, i.e., deciding whether two concept descriptions with variables have a unifier or not, is ExpTime-complete in FL 0 and NP-complete in EL. Both works basically restrict their attention to the case of an empty TBox. For EL, some attempts have been made to extend the results to unification w.r.t. GCIs [2], but these approaches can at the moment only deal with TBoxes that satisfy a certain restriction on cyclic dependencies between concept names. For ALC, decidability of the unification problem (even w.r.t. the empty TBox) is a longstanding open problem, though it is known that undecidability holds for extensions of ALC by so-called nominals or the universal role [36].
Approximate unification relaxes the requirement that the two concept descriptions must be made equivalent. Instead, it requires that they are made "almost" equivalent, where the meaning of "almost" is formalized using distance measures between concept descriptions. Such measures are CCMs that map into [0, +∞) and satisfy some additional properties. Basically, given such a measure and a threshold value, one then asks whether one can lower the distance between two concept descriptions below the threshold value by applying a substitution. This is called the decision problem for approximate unification. For the computation problem, one wants to calculate the lowest achievable distance.
In [7], approximate unification is introduced and then investigated for the DL FL 0 and two concept distance measures that are induced by distance measures between formal languages (see the next section). It is shown that (w.r.t. the empty TBox and these two measures), approximate unification has the same complexity (ExpTime) as unification.

Concept comparison measures for FL 0
Until recently, the research on concept comparison measures in DLs was mostly concerned with EL [25,19,34] and more expressive DLs [15]. To achieve equivalence invariance, concept descriptions are usually first translated into an appropriate normal form, and then the structure of the normalized descriptions is compared. For instance, measures that achieve equivalence invariance only for the empty TBox or for acyclic TBoxes in EL [25,34] make use of the reduced form of EL concept descriptions introduced in [24]. Extensions to general TBoxes [19] use the so-called canonical model, which is generated by the polynomial-time subsumption algorithm for EL [3].
Two recent approaches for defining concept comparison measures for FL 0 [30,7] were restricted to the case of the empty TBox. Both approaches employ a formal language-based characterization of equivalence between FL 0 concept descriptions. In the remainder of this paper, we will develop a general approach for defining concept comparison measures for FL 0 concept descriptions that are equivalence invariant also w.r.t. general TBoxes. This is achieved by using a new formal language-based characterization of equivalence in FL 0 w.r.t. general TBoxes [28], where the semantics of a concept description is characterized using a tuple of (possibly infinite) formal languages. Basically, this tuple serves as a normal form for the concept description. In order to define equivalence invariant measures on FL 0 concept descriptions, it is thus sufficient to define measures that compare such tuples. We will show how tuples of languages can be represented by infinite trees, and then use appropriate weighted tree automata to compute the comparison value.

From FL 0 concept descriptions to tuples of formal languages
In FL 0, subsumption and equivalence can be nicely characterized using language inclusion. This characterization relies on transforming FL 0 concept descriptions into an appropriate normal form as follows. First, the semantics given to concept constructors in FL 0 implies that value restrictions distribute over conjunction, i.e., for all $C, D \in \mathcal{C}_{\mathcal{FL}_0}(N_C, N_R)$ and $r \in N_R$ it holds that $\forall r.(C \sqcap D) \equiv_{\emptyset} \forall r.C \sqcap \forall r.D$.
Using this equivalence as a rewrite rule from left to right, each FL 0 concept description can be translated into an equivalent concept description that is either $\top$ or a conjunction of concept descriptions of the form $\forall r_1. \cdots \forall r_n.A$, where $\{r_1, \ldots, r_n\} \subseteq N_R$ and $A \in N_C$. Such concept descriptions can be abbreviated as $\forall w.A$, where $w$ represents the word $r_1 \ldots r_n$. Note that $n = 0$ means that $w$ is the empty word $\varepsilon$, and thus $\forall \varepsilon.A$ corresponds to $A$. Furthermore, a conjunction of the form $\forall w_1.A \sqcap \ldots \sqcap \forall w_m.A$ can be written as $\forall L.A$ where $L \subseteq N_R^*$ is the finite language $\{w_1, \ldots, w_m\}$. We use the convention that $\forall \emptyset.A$ corresponds to the top concept $\top$. Thus, if we fix the set of concept names as $N_C := \{A_1, \ldots, A_\ell\}$, then any two concept descriptions $C, D \in \mathcal{C}_{\mathcal{FL}_0}(N_C, N_R)$ can be represented as
\[
C \equiv_{\emptyset} \forall K_1.A_1 \sqcap \ldots \sqcap \forall K_\ell.A_\ell \quad\text{and}\quad D \equiv_{\emptyset} \forall L_1.A_1 \sqcap \ldots \sqcap \forall L_\ell.A_\ell, \tag{1}
\]
where $K_1, L_1, \ldots, K_\ell, L_\ell$ are finite languages over the alphabet of role names $N_R$, i.e., finite subsets of $N_R^*$. Using this representation, it was shown in [9] that
\[
C \sqsubseteq_{\emptyset} D \quad\text{iff}\quad L_i \subseteq K_i \text{ for all } i, 1 \le i \le \ell.
\]

In the presence of a non-empty TBox $\mathcal{T}$, a similar characterization of subsumption and equivalence can be obtained [28], but now the definition of the languages needs to take the GCIs in $\mathcal{T}$ into account. Given an FL 0 concept description $C$ and a TBox $\mathcal{T}$, we define for all $A \in N_C$ the following language
\[
L_{\mathcal{T}}(C, A) := \{w \in N_R^* \mid C \sqsubseteq_{\mathcal{T}} \forall w.A\}
\]
and call this language the value restriction set of $C$ w.r.t. $\mathcal{T}$ and $A$. It can easily be verified that, for all concept names $A_i \in N_C$, the sets $K_i$ in (1) are actually equal to $L_{\emptyset}(C, A_i)$. However, while in the case of the empty TBox these languages are finite, they may be infinite for non-trivial TBoxes. This is illustrated by the following example.

Example 1. [...] $L_{\mathcal{T}}(C, A) = r^* \cup sr^*$ and $L_{\mathcal{T}}(C, B) = ss^*$, where we have used standard notation for writing regular expressions to describe these infinite languages.
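For the empty TBox, the normal form translation and the language-based subsumption test can be sketched as follows. This is a minimal illustration under stated assumptions: concepts are encoded as nested tuples, words are tuples of role names, and the function names `value_restriction_sets`, `subsumed_by`, and `equivalent` are hypothetical. The subsumption test uses the characterization from [9], namely that C is subsumed by D iff every language of D is contained in the corresponding language of C. The sketch does not handle non-empty TBoxes, where the languages may be infinite.

```python
def value_restriction_sets(concept):
    """Return a dict {A: set of words} such that `concept` is equivalent,
    w.r.t. the empty TBox, to the conjunction of all "forall w.A".

    Hypothetical encoding: a concept name is a str, ("top",) is the top
    concept, ("and", C1, ..., Cn) a conjunction, ("all", r, C) a value
    restriction. Words are tuples of role names.
    """
    languages = {}

    def walk(c, word):
        if isinstance(c, str):            # concept name reached via `word`
            languages.setdefault(c, set()).add(word)
        elif c[0] == "top":               # top contributes no word
            return
        elif c[0] == "and":               # visit every conjunct
            for conjunct in c[1:]:
                walk(conjunct, word)
        elif c[0] == "all":               # forall r.C extends the word by r
            walk(c[2], word + (c[1],))
        else:
            raise ValueError(f"not an FL0 constructor: {c[0]!r}")

    walk(concept, ())
    return languages


def subsumed_by(c, d):
    """C is subsumed by D (empty TBox) iff L_i is a subset of K_i for all i."""
    k, l = value_restriction_sets(c), value_restriction_sets(d)
    return all(words <= k.get(name, set()) for name, words in l.items())


def equivalent(c, d):
    return subsumed_by(c, d) and subsumed_by(d, c)


# sigma(C) and sigma(D) from the unification example of Section 2
sigma_c = ("and", "A",
           ("all", "r", ("and", ("and", "A", ("all", "r", "B")),
                         ("all", "s", "B"))))
sigma_d = ("and", "A",
           ("all", "r", ("and", ("all", "s", "B"), "A", ("all", "r", "B"))))

print(equivalent(sigma_c, sigma_d))  # True
```

Both concept descriptions yield the languages {ε, r} for A and {rr, rs} for B, so the unifier of the earlier example is confirmed by comparing languages rather than syntax.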
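Just as in the case of the empty TBox, the value restriction sets can be used to characterize equivalence and subsumption w.r.t. general TBoxes (see [28]):
\[
C \sqsubseteq_{\mathcal{T}} D \quad\text{iff}\quad L_{\mathcal{T}}(D, A_i) \subseteq L_{\mathcal{T}}(C, A_i) \text{ for all } i, 1 \le i \le \ell,
\]
\[
C \equiv_{\mathcal{T}} D \quad\text{iff}\quad L_{\mathcal{T}}(C, A_i) = L_{\mathcal{T}}(D, A_i) \text{ for all } i, 1 \le i \le \ell. \tag{3}
\]
The equivalence (3) shows that, in FL 0, formal languages can be used to represent the semantic content of concept descriptions: up to equivalence, every FL 0 concept description $C \in \mathcal{C}_{\mathcal{FL}_0}(N_C, N_R)$ is uniquely represented by the tuple of languages
\[
L_{\mathcal{T}}(C) := (L_{\mathcal{T}}(C, A_1), \ldots, L_{\mathcal{T}}(C, A_\ell)).
\]
We will use this fact to reduce the definition of concept comparison measures between FL 0 concept descriptions w.r.t. a TBox to the definition of measures comparing tuples of languages: given two FL 0 concept descriptions $C, D$, we define $c_{\mathcal{T}}(C, D)$ by comparing the tuples $L_{\mathcal{T}}(C)$ and $L_{\mathcal{T}}(D)$. One advantage of this approach is that equivalence invariance comes "for free" since equivalent concept descriptions are indistinguishable from the language point of view.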

Using tuples of languages to define CCMs
The idea of using tuples of languages to compare FL 0 concept descriptions has already been employed in [30,7], but restricted to the empty TBox. In both works, the general approach used to define such measures consists of the following three steps:
1. Translate the FL 0 concept descriptions $C$ and $D$ into their corresponding tuples of languages $L_{\emptyset}(C) = (K_1, \ldots, K_\ell)$ and $L_{\emptyset}(D) = (L_1, \ldots, L_\ell)$. For the sake of readability, we will denote these tuples as $\mathbf{K}$ and $\mathbf{L}$, respectively.
2. To compare the tuples $\mathbf{K}$ and $\mathbf{L}$, their components $K_i$ and $L_i$ are compared pairwise, and the values obtained this way are then appropriately combined into a value $s(\mathbf{K}, \mathbf{L})$.
3. Finally, the value $s(\mathbf{K}, \mathbf{L})$ is used to define $c_{\emptyset}(C, D)$.
In the following, we recall the exact definitions of the measures introduced in [30] and [7].
Example 2. In [30], the authors' goal is to define concept similarity measures. To this end, given $\mathbf{K}$ and $\mathbf{L}$, they first define an asymmetric measure $s$, which they apply to $(\mathbf{K}, \mathbf{L})$ and $(\mathbf{L}, \mathbf{K})$. The obtained values are then combined using average. For the definition of the asymmetric measure, they propose two possible functions $e_1$ and $e_2$ to compare every pair $(K_i, L_i)$:
- the function $e_1$ checks inclusion:
\[
e_1(K_i, L_i) :=
\begin{cases}
1 & \text{if } L_i \subseteq K_i, \\
0 & \text{otherwise};
\end{cases}
\]
- the function $e_2$ returns the fraction of the words in $L_i$ that also belong to $K_i$, and thus yields 1 if $L_i \subseteq K_i$, but 0 only if the two languages are disjoint:
\[
e_2(K_i, L_i) := \frac{|K_i \cap L_i|}{|L_i|},
\]
where the value for $L_i = \emptyset$ is 1, since then $L_i \subseteq K_i$ holds.

The asymmetric measure $s$ is then defined as $s(\mathbf{K}, \mathbf{L}) = f(e_j(K_1, L_1), \ldots, e_j(K_\ell, L_\ell))$, where $f$ is the average operator and $j \in \{1, 2\}$. Finally, the CSM $\bowtie_{\emptyset}$ for FL 0 concept descriptions is defined as
\[
\bowtie_{\emptyset}(C, D) := \frac{s(\mathbf{K}, \mathbf{L}) + s(\mathbf{L}, \mathbf{K})}{2}.
\]

Example 3. In [7], the authors introduce the notion of concept distance measures for FL 0. They obtain such measures by applying a language distance (which is assumed to be a topological metric) to the pairs of languages $(K_i, L_i)$, and then combining these values using a function $f$. In particular, they define two language distances $d_1$ and $d_2$, which we introduce below.
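Let $M_1$ and $M_2$ be two languages over an alphabet $\Sigma$. We denote the symmetric difference of $M_1$ and $M_2$ as $M_1 \,\Delta\, M_2$, i.e.,
\[
M_1 \,\Delta\, M_2 := (M_1 \setminus M_2) \cup (M_2 \setminus M_1).
\]
The language distances $d_1$ and $d_2$ are now defined as
\[
d_1(M_1, M_2) := 2^{-n} \text{ where } n = \min\{|w| \mid w \in M_1 \,\Delta\, M_2\},
\]
\[
d_2(M_1, M_2) := \mu(M_1 \,\Delta\, M_2) \text{ where } \mu(M) := \frac{1}{2} \sum_{w \in M} (2|\Sigma|)^{-|w|},
\]
with $d_1(M_1, M_2) = 0$ if $M_1 = M_2$. Intuitively, the symmetric difference captures all the discrepancies between two concept descriptions $C$ and $D$ with respect to a concept name $A$. More precisely, if for instance $w \in L_{\emptyset}(C, A) \setminus L_{\emptyset}(D, A)$ for some $w \in N_R^*$, then $C \sqsubseteq_{\emptyset} \forall w.A$ and $D \not\sqsubseteq_{\emptyset} \forall w.A$, which amounts to a semantically relevant difference between $C$ and $D$. Based on this intuition, the first distance looks for the shortest such discrepancy, while the second one takes all differences into account, but differences for longer words count less than differences for shorter ones (see [7] for a more detailed explanation).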
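For finite languages, the two distances can be computed directly. The sketch below assumes the concrete forms $d_1 = 2^{-n}$ (with $n$ the length of a shortest word in the symmetric difference) and $d_2 = \frac{1}{2}\sum_{w}(2|\Sigma|)^{-|w|}$ over the symmetric difference; these weights are an assumption of this illustration, and [7] should be consulted for the authoritative definitions. Words are encoded as Python strings with one character per role name, which is also an assumption.

```python
from fractions import Fraction


def d1(m1, m2):
    """Distance determined by a shortest word in the symmetric difference."""
    difference = m1 ^ m2                      # symmetric difference of sets
    if not difference:
        return Fraction(0)                    # identical languages
    n = min(len(word) for word in difference)
    return Fraction(1, 2 ** n)


def d2(m1, m2, alphabet_size):
    """Distance that sums over all differing words; longer words weigh less
    (assumed weight (2 * |Sigma|) ** -len(word), scaled by 1/2)."""
    base = 2 * alphabet_size
    return Fraction(1, 2) * sum(Fraction(1, base ** len(word))
                                for word in m1 ^ m2)


m1 = {"", "r", "rs"}
m2 = {"", "rr"}
print(d1(m1, m2))        # 1/2
print(d2(m1, m2, 2))     # 3/16
```

In this example the symmetric difference is {r, rs, rr}: the shortest discrepancy has length 1, which fixes $d_1$, while $d_2$ accumulates 1/4 + 1/16 + 1/16 before halving.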
As already mentioned, these language distances are then used to define a measure $s$ on tuples by setting $s(\mathbf{K}, \mathbf{L}) := f(d(K_1, L_1), \ldots, d(K_\ell, L_\ell))$ for $d \in \{d_1, d_2\}$. For functions $f$ satisfying certain properties (called combining functions in [7]), this yields a concept distance $m_{d,f}$ for FL 0 concept descriptions:
\[
m_{d,f}(C, D) := f(d(K_1, L_1), \ldots, d(K_\ell, L_\ell)).
\]
In the two examples above, the definition of a CCM for FL 0 was in the end reduced to defining a distance function that compares two languages. Thus, the input for this function is a pair of languages. In general, one may also want to allow for definitions of distance functions on language tuples that do not resort to binary comparisons of the components of the tuples. The inputs for the function are then $2\ell$-tuples of languages. For this reason we now develop means for defining functions that receive tuples of languages as input, which covers both the binary and the general case.
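The three-step scheme, instantiated with the measures of Example 2, can be sketched as follows for finite languages. The sketch reads $e_1$ as the indicator of $L_i \subseteq K_i$ and $e_2$ as $|K_i \cap L_i| / |L_i|$ with value 1 for empty $L_i$; both readings, the string encoding of words, and all function names are assumptions of this illustration rather than definitions taken verbatim from [30].

```python
from fractions import Fraction


def e1(k_i, l_i):
    """1 if L_i is included in K_i, 0 otherwise."""
    return Fraction(1 if l_i <= k_i else 0)


def e2(k_i, l_i):
    """Fraction of the words of L_i that also belong to K_i."""
    if not l_i:
        return Fraction(1)            # empty L_i is trivially included
    return Fraction(len(k_i & l_i), len(l_i))


def average(values):
    values = list(values)
    return sum(values, Fraction(0)) / len(values)


def asymmetric(k, l, e):
    """s(K, L): compare the tuples componentwise and average the results."""
    return average(e(k_i, l_i) for k_i, l_i in zip(k, l))


def similarity(k, l, e):
    """Symmetric measure: average of both comparison directions."""
    return (asymmetric(k, l, e) + asymmetric(l, k, e)) / 2


# Tuples of languages for two concept names (words as strings over {r, s})
K = ({"", "r"}, {"rs"})
L = ({""}, {"rs", "rr"})

print(similarity(K, L, e1))  # 1/2
print(similarity(K, L, e2))  # 3/4
```

The example shows the difference between the two component functions: $e_1$ gives no credit for partial overlap, whereas $e_2$ rewards the shared word rs in the second component.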
Though developed for the case of the empty TBox, and thus with finite languages in mind, the functions $e_1$, $d_1$, $d_2$ of our examples are also well-defined for infinite languages, and thus can also be employed in the more general setting of non-empty TBoxes. However, if we are not only interested in defining, but also in computing the functions, we need to find ways of representing their input (i.e., tuples of possibly infinite languages) in a finite way. In the next section, we show that finite automata working on infinite trees can be used for this purpose.

Finitely representing tuples of languages
Following the ideas in [9,10,28], we will represent tuples of (possibly infinite) languages using infinite trees.
Definition 4. Let $\Sigma = \{\sigma_1, \ldots, \sigma_k\}$ be a non-empty finite set of symbols. Given a set of labels $L$, an $L$-labeled $\Sigma$-tree is a mapping $t : \Sigma^* \to L$ that assigns a label $t(w) \in L$ to every node $w \in \Sigma^*$. The set of all $L$-labeled $\Sigma$-trees is denoted as $T^{\omega}_{\Sigma,L}$.
Intuitively, the nodes of a $\Sigma$-tree $t$ correspond to finite words in $\Sigma^*$, where the empty word $\varepsilon$ represents the root of $t$ and every node $w$ has $k$ children corresponding to the words $w\sigma_1, \ldots, w\sigma_k$. Since for a non-empty alphabet $\Sigma$ the set $\Sigma^*$ of all words over $\Sigma$ is infinite, $\Sigma$-trees are by definition infinite. We use tuples over $\{0, 1\}$ as labels to represent tuples of languages over $\Sigma$.
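Definition 5. Let $\Sigma$ be a finite set of symbols and $\ell \in \mathbb{N}$. We define the mapping
\[
\gamma : (2^{\Sigma^*})^{\ell} \to T^{\omega}_{\Sigma,\{0,1\}^{\ell}}, \qquad \gamma(L_1, \ldots, L_\ell) := t,
\]
where, for all $w \in \Sigma^*$, $t(w) := (b_1, \ldots, b_\ell)$ with $b_i = 1$ if $w \in L_i$ and $b_i = 0$ otherwise.
It is easy to see that $\gamma$ is a bijection between $\ell$-tuples of languages over the alphabet $\Sigma$ and $\{0, 1\}^{\ell}$-labeled $\Sigma$-trees. Given a tree $t \in T^{\omega}_{\Sigma,\{0,1\}^{\ell}}$, the inverse function yields the tuple $\gamma^{-1}(t) = (L_1, \ldots, L_\ell)$ where $L_i$ consists of the words $w$ for which the $i$th component of $t(w)$ is equal to 1.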
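Since the trees are infinite, a program can only materialize a finite top part of $\gamma(L_1, \ldots, L_\ell)$. The following sketch does this up to a given depth for finite languages and shows the inverse direction; the depth bound, the string encoding of words, and the names `gamma` and `gamma_inverse` are assumptions of this illustration.

```python
from itertools import product


def gamma(languages, alphabet, depth):
    """Labels of all nodes w with |w| <= depth in the tree gamma(L_1..L_l).

    languages -- tuple of sets of words (strings over `alphabet`)
    Returns a dict mapping each node (a word) to its 0/1 label tuple.
    """
    tree = {}
    for n in range(depth + 1):
        for letters in product(alphabet, repeat=n):
            word = "".join(letters)
            tree[word] = tuple(1 if word in lang else 0 for lang in languages)
    return tree


def gamma_inverse(tree, ell):
    """Recover the tuple of languages from the (finite part of the) tree."""
    return tuple({word for word, label in tree.items() if label[i] == 1}
                 for i in range(ell))


languages = ({"", "r", "sr"}, {"s", "ss"})
tree = gamma(languages, "rs", depth=2)
print(tree["sr"])   # (1, 0)
print(tree["s"])    # (0, 1)
```

For finite languages whose words are no longer than the chosen depth, applying `gamma_inverse` to the result of `gamma` returns the original tuple, which illustrates the bijection on this restricted fragment.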
Basically, this translation of tuples of languages into trees is used in [28] to represent the tuple of value restriction sets $L_{\mathcal{T}}(C)$ of an FL 0 concept description $C$ as an $N_R$-tree $t_C$. Strictly speaking, the label set employed in [28] is $2^{N_C}$ for $N_C = \{A_1, \ldots, A_\ell\}$ rather than $\{0, 1\}^{\ell}$, but it should be clear that, by fixing a linear order $A_1 < A_2 < \ldots < A_\ell$ on $N_C$, these two representations can be translated into each other. Obviously, a single value restriction set $L_{\mathcal{T}}(C, A_i)$ can be represented as a $\{0, 1\}$-labeled $N_R$-tree $t_{C,A_i}$, where the words belonging to this language receive label 1 and the others label 0.
The following example illustrates this representation of value restriction sets by trees using the concept description C and the TBox T of Example 1.

Example 4.
Recall from Example 1 that $L_{\mathcal{T}}(C, A) = r^* \cup sr^*$ and $L_{\mathcal{T}}(C, B) = ss^*$. To express the tuple of these languages as a tree, we assume that $r$ is the first symbol of the alphabet and $s$ is the second, and that $A < B$. Then $L_{\mathcal{T}}(C) = (r^* \cup sr^*, ss^*)$, and this tuple is represented by the tree sketched on the left-hand side of Figure 1. For better readability, we have labeled the edges with the symbols $r$ and $s$. As an example for the labeling, consider the node corresponding to the word $sr$. It has label $(1, 0)$ since this word belongs to $r^* \cup sr^*$, but not to $ss^*$. The extension of this tree to infinity is obtained as follows. On the one hand, the outgoing dotted edges tell us that all the nodes below are labeled with the tuple $(0, 0)$. Notice, for example, that there are no words starting with $rs$ or $srs$ in any of the two languages. On the other hand, the nodes $rrr$, $srr$ and $sss$ are the roots of infinite trees representing the tuples of languages $(r^*, \emptyset)$, $(r^*, \emptyset)$ and $(\emptyset, s^*)$, respectively.
The tree on the right-hand side of the figure represents the language $L_{\mathcal{T}}(C, A)$, which is obtained from $t_C$ by projecting the label-tuples to the first component.
Using the same approach, given two concept descriptions $C, D$, the pair of tuples $(L_{\mathcal{T}}(C), L_{\mathcal{T}}(D))$ can obviously be represented as an infinite $N_R$-tree $t_{(C,D)}$:
\[
t_{(C,D)} := \gamma(L_{\mathcal{T}}(C, A_1), \ldots, L_{\mathcal{T}}(C, A_\ell), L_{\mathcal{T}}(D, A_1), \ldots, L_{\mathcal{T}}(D, A_\ell)).
\]
As mentioned before, our goal is to represent such input tuples in a finite way. Using infinite trees obviously does not solve this problem. Thus, we need to develop an approach for representing such trees in a finite way. For general tuples of infinite languages and thus arbitrary $\Sigma$-trees this is clearly not possible. However, a closer look at the trees $t_C$ constructed in [28] shows that they are actually regular trees, which admit a finite representation. Therefore, we restrict our attention to the class of regular trees. We start by formally defining the notion of a regular tree, and then show that regular trees can always be represented using certain kinds of tree automata.

Definition 6 (regular tree). Let $t$ be a tree in $T^{\omega}_{\Sigma,L}$. Given a node $w \in \Sigma^*$, the subtree $t_w : \Sigma^* \to L$ of $t$ is defined as $t_w(v) := t(wv)$ for all $v \in \Sigma^*$. We say that $t$ contains the subtree $t_w$. Then, $t$ is a regular tree if it contains finitely many distinct subtrees.
There are different ways to represent regular trees in a finite way [35]. Here, we use looping tree automata for this purpose.
Definition 7 (Looping tree automaton (LTA)). A looping tree automaton is a tuple A = (Σ, Q, L, ∆, I) where Σ = {σ 1 , . . . , σ k } is a finite set of symbols, Q is a finite set of states, L is a finite set of labels, ∆ ⊆ Q × L × Q k is the transition relation and I ⊆ Q is a set of initial states. A run of this automaton on a tree t ∈ T ω Σ,L is a Q-labeled Σ-tree r : Σ * → Q such that r(ε) ∈ I and (r(w), t(w), r(wσ 1 ), . . . , r(wσ k )) ∈ ∆ for all w ∈ Σ * . The tree language L(A) recognized by A is the set of all trees t ∈ T ω Σ,L such that A accepts t, i.e., A has a run on t.
In general, LTAs recognize sets of trees. Therefore, to uniquely represent a tree we only consider those recognizing singleton sets. It is easy to see that trees that can be represented by looping tree automata are indeed regular.
In fact, LTAs are Rabin tree automata [29,35] with trivial acceptance conditions, and it is well-known that non-empty tree languages recognized by Rabin tree automata always contain a regular tree. Thus, if such an automaton recognizes the singleton set {t}, then t must be regular. Conversely, we can show that any regular tree can be represented in this way.
Proposition 1. Let t ∈ T ω Σ,L be an L-labeled Σ-tree. Then, t is regular iff it can be represented by an LTA.
Proof. We have already seen that the if-direction holds. To show the only-if direction, assume that t is a regular tree. By Definition 6 it thus contains only finitely many distinct subtrees, say t 0 , t 1 , . . . , t m where we assume without loss of generality that t 0 = t. For all 0 ≤ i ≤ m, we denote the direct subtrees of t i as t i σ1 , . . . , t i σ k . Note that these are also subtrees of t, and thus belong to the set {t 0 , t 1 , . . . , t m }. We build the looping tree automaton A t = (Σ, Q t , L, ∆ t , {t 0 }) as follows: Q t := {t 0 , t 1 , . . . , t m } and ∆ t := {(t i , t i (ε), t i σ1 , . . . , t i σ k ) | 0 ≤ i ≤ m}.

In the following, we show that t = t 0 is the only tree accepted by A t . First, we prove that A t accepts t, by inductively defining a run r of A t on t. Set r(ε) = t 0 = t ε . Assume that for w ∈ Σ * , the state r(w) has already been defined and it holds that r(w) = t w = t j for some 0 ≤ j ≤ m. Note that t j σi = t wσi for every 1 ≤ i ≤ k. Since t(w) = t w (ε) = t j (ε) and (t j , t j (ε), t j σ1 , . . . , t j σ k ) ∈ ∆ t , we define r(wσ i ) = t j σi = t wσi . In this way, the run r of A t on t is inductively defined, and thus t ∈ L(A t ).
The automaton A t constructed in the above proof actually has a very specific syntactic shape (see Definition 9 below), which ensures that it accepts only one tree.
The following proposition states some obvious consequences of this definition and the proof of Proposition 1.

Proposition 2. Let

Proof. The first claim is immediate after observing that the automaton A t introduced in the proof of Proposition 1 is an rLTA. For the second claim, completely analogously to the proof of Proposition 1, we can prove that A has a run r on some tree t, and for any run r' of A on some tree t' it holds that r' = r and t' = t.
In [28] it is shown that, given an FL 0 concept description C and a TBox T , the tree t C encoding the tuple L T (C) can be represented by an rLTA.
Theorem 1 ( [28]). Let C be an FL 0 concept description and T a TBox. Then, one can construct a representing looping tree automaton that represents t C in time exponential in the size of C and T .
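To make the finite representation concrete, the following minimal sketch encodes an rLTA as a Python dictionary and reads off the label t(w) of an arbitrary node w. The encoding (one transition per state, stored as a label plus a tuple of successors) and the state names are our own assumptions for illustration. The example automaton is our reading of the tree for L T (C) = (r * ∪ sr * , ss * ) described above, with r as the first and s as the second symbol.

```python
# Alphabet order is an assumption taken from the example: r first, s second.
SIGMA = ("r", "s")

# Hypothetical encoding of an rLTA: every state has exactly one transition,
# stored as state -> (label, successor states in alphabet order).
# The state names are illustrative only.
RLTA = {
    "root": ((1, 0), ("qr", "qs")),     # languages (r* ∪ sr*, ss*)
    "qr":   ((1, 0), ("qr", "dead")),   # languages (r*, ∅)
    "qs":   ((1, 1), ("qr", "qss")),    # languages (r*, s*)
    "qss":  ((0, 1), ("dead", "qss")),  # languages (∅, s*)
    "dead": ((0, 0), ("dead", "dead")), # languages (∅, ∅)
}


def label_at(rlta, initial, word, sigma=SIGMA):
    """Return t(word) for the unique tree represented by the rLTA."""
    state = initial
    for symbol in word:
        # Follow the successor that belongs to the current symbol.
        state = rlta[state][1][sigma.index(symbol)]
    return rlta[state][0]


print(label_at(RLTA, "root", "sr"))  # (1, 0): sr is in r* ∪ sr* but not in ss*
```

Since each state has a single transition, following successors along w visits exactly the states of the unique run, so the lookup takes time linear in |w|.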
In case we are given a general LTA A, we would like to know whether it actually represents a tree (i.e., recognizes a singleton set) and, if the answer is affirmative, construct an rLTA that represents the same tree.
Lemma 1. Let A be an LTA. We can decide in polynomial time whether A represents a tree. If A represents a tree t ∈ T ω Σ,L , then we can construct an rLTA representing t in polynomial time.
Proof. Given A, we remove superfluous states by applying the emptiness test for looping automata [12,10] and check whether L(A) = ∅. If this is the case, A does not represent a tree. Otherwise, we check whether the automaton accepts a unique tree. If the answer is affirmative, we obtain an automaton A r by removing all but one transition for every state. Obviously, A r is an rLTA and L(A r ) ⊆ L(A). If A represents a tree, A r is the rLTA we are looking for.
Before providing the exact algorithm below, a definition is due. Claiming that an LTA has no superfluous states is the informal way of saying that the LTA is trim. An LTA A is called trim if every state can be used in some run of A. It is easy to see that every LTA can be transformed into a trim LTA that is equivalent in the sense of having the same runs.
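The trimming step can be sketched as follows. This is a minimal illustration under our own assumptions, not the construction of [10]: transitions are encoded as triples (state, label, tuple of successors), and the function name `trim` is hypothetical. A state survives if it has a transition into surviving states (greatest fixpoint) and is reachable from a surviving initial state.

```python
def trim(states, transitions, initial):
    """Return (states, transitions, initial) of a trim LTA with the same runs.

    transitions: iterable of (state, label, successors) with successors a tuple.
    """
    transitions = list(transitions)

    # Step 1 (greatest fixpoint): repeatedly drop states that have no
    # transition all of whose successors are still alive.
    alive = set(states)
    changed = True
    while changed:
        changed = False
        for q in list(alive):
            usable = any(
                src == q and all(s in alive for s in succ)
                for (src, _label, succ) in transitions
            )
            if not usable:
                alive.discard(q)
                changed = True
    live_transitions = [
        (q, label, succ)
        for (q, label, succ) in transitions
        if q in alive and all(s in alive for s in succ)
    ]

    # Step 2: keep only states reachable from the surviving initial states.
    reachable = {q for q in initial if q in alive}
    frontier = list(reachable)
    while frontier:
        q = frontier.pop()
        for (src, _label, succ) in live_transitions:
            if src == q:
                for s in succ:
                    if s not in reachable:
                        reachable.add(s)
                        frontier.append(s)

    kept = [t for t in live_transitions if t[0] in reachable]
    return reachable, kept, set(initial) & reachable
```

If the returned set of initial states is empty, the automaton accepts no tree at all.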
Algorithm for deciding whether a given LTA represents a tree.
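Given an LTA A = (Σ, Q, L, ∆, I):
- Construct an equivalent trim LTA A' = (Σ, Q', L, ∆', I') [10, Lemma 2]. If the resulting automaton has no initial states, then L(A) = ∅, and thus A does not represent a tree.
- Otherwise, compute the binary relation ∼ on Q' (inspired by automata minimization) as follows:
$$B_0 := \{(q, q') \in Q' \times Q' \mid \exists\, (q, l, \ldots), (q', l', \ldots) \in \Delta' \text{ with } l \neq l'\},$$
$$B_{i+1} := B_i \cup \{(q, q') \mid \exists\, (q, l, q_1, \ldots, q_k), (q', l, q'_1, \ldots, q'_k) \in \Delta' \text{ with } (q_j, q'_j) \in B_i \text{ for some } 1 \le j \le k\}.$$
The iteration becomes stable and thus terminates after m ≤ |Q'|² steps. Define ∼ := Q'² \ B m .
- Check whether q ∼ q' for every q, q' ∈ I'. The answer is positive iff A represents a tree.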
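The second and third step of the algorithm can be sketched as follows, assuming that the input automaton is already trim. The encoding of transitions as triples (state, label, tuple of successors) and the function name `represents_tree` are assumptions made for this illustration. The set `bad` plays the role of the sets B i : it collects pairs of states that can be told apart by some label.

```python
from itertools import product


def represents_tree(transitions, initial):
    """Decide whether a trim LTA accepts exactly one tree.

    transitions: list of (state, label, successors), successors a tuple.
    initial: set of initial states.
    """
    if not initial:
        return False  # a trim LTA without initial states accepts nothing

    transitions = list(transitions)

    # B_0: pairs of states that have transitions with different labels.
    bad = set()
    for (q, l, _), (q2, l2, _) in product(transitions, repeat=2):
        if l != l2:
            bad.add((q, q2))

    # B_{i+1}: propagate distinguishability from successors to predecessors
    # along transitions that carry the same label.
    changed = True
    while changed:
        changed = False
        for (q, l, succ), (q2, l2, succ2) in product(transitions, repeat=2):
            if (q, q2) in bad or l != l2:
                continue
            if any((a, b) in bad for a, b in zip(succ, succ2)):
                bad.add((q, q2))
                changed = True

    # A represents a tree iff all pairs of initial states are equivalent.
    return all((q, q2) not in bad for q in initial for q2 in initial)
```

Note that the check includes the pairs (q, q): an initial state with two transitions carrying different labels is not equivalent to itself in this sense, which correctly signals that more than one tree is accepted.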
The following lemma proves correctness of the algorithm.

Lemma 2. A trim LTA A = (Σ, Q, L, ∆, I) represents a tree iff I ≠ ∅ and q ∼ q' for every q, q' ∈ I.
Proof. Assume that A does not represent a tree. This means that either it does not accept any tree, or it accepts more than one. In the first case, since A is trim, we get that I = ∅. In the second case, there are at least two distinct trees t 1 , t 2 accepted by A. Let w = σ 1 . . . σ n be a minimal word s.t. t 1 (w) ≠ t 2 (w) and r 1 , r 2 be runs of A on t 1 , t 2 respectively. Thus, there are transitions (r 1 (w), t 1 (w), ...), (r 2 (w), t 2 (w), ...) ∈ ∆ with t 1 (w) ≠ t 2 (w). By the construction in the algorithm, (r 1 (w), r 2 (w)) ∈ B 0 ⊆ B m . For every proper prefix v of w, since t 1 (v) = t 2 (v) (by minimality of w) we get that (r 1 (v), r 2 (v)) ∈ B m . In particular, (r 1 (ε), r 2 (ε)) ∈ B m , and since r 1 (ε), r 2 (ε) ∈ I the proof of this direction is complete.
For the other direction, if I = ∅ then obviously A does not accept any trees. Assume that q ≁ q', i.e., (q, q') ∈ B m for some q, q' ∈ I. Then, let l be the least number such that (q, q') ∈ B l . If l = 0, there exist (q, σ, . . . ), (q', σ', . . . ) ∈ ∆ with σ ≠ σ', and since the automaton is trim, we get that A accepts at least two trees, one with root label σ and one with root label σ'. If l ≥ 1, there exist (q, σ, q 1 , . . . , q k ), (q', σ, q' 1 , . . . , q' k ) ∈ ∆ with (q i , q' i ) ∈ B l−1 for some 1 ≤ i ≤ k. Iterating the above argument, we get a word w ∈ Σ * (with length at most l) and a pair (p, p') ∈ B 0 s.t. p is a w-successor of q and p' of q' and, as before, we derive that A accepts at least two trees (with the difference occurring at the node w instead of the root).
The results of this section show that we can restrict the attention to rLTAs when representing regular trees.

Using weighted looping tree automata to assign a value to a tuple of languages
Our goal is now to assign values from a (numerical or other) domain to tuples of (possibly infinite) languages that can be represented by regular trees. Consequently, we need a device that takes as input such a tree and returns a value. Weighted looping tree automata are such devices: they assign values (from a so-called semiring) to infinite trees. In the next subsection, we introduce the special type of weighted tree automata that we will use together with the necessary notions (semirings, discounting, etc.). We show how the language distances d 1 , d 2 and the function e 1 introduced in the previous section can be realized using such automata. Then, we turn to the problem of how to actually compute the value assigned by such an automaton to a regular tree that is represented by an rLTA.

Weighted looping tree automata
In order to assign a value to a tree, weighted tree automata make use of transitions that are equipped with weights. These weights are usually elements of a semiring such that one can add and multiply weights. An extensive survey of weighted tree automata can be found in [20]. In a setting where the automata are required to work on infinite trees, the underlying semiring should admit suitable infinite sums and products [31]. In the context of infinite trees, it is also useful to employ discounting. This has been used for modeling systems with non-terminating behavior [1] in order to assign different degrees of importance to incidents that happen later in time. In our setting, discounting can be used to assign less importance to differences that occur for longer words, i.e., further down in the tree.
Semirings. The weight structures underlying our weighted tree automata are totally complete commutative semirings [31]. A semiring is called commutative if a ⊗ b = b ⊗ a for all a, b ∈ S.
Next, assume that addition can be suitably extended to infinite sums, i.e., the semiring S is equipped with infinitary sum operations Σ I : S I → S, for any index set I, such that for all I and all families (a i | i ∈ I) of elements of S the following hold:

The semiring S together with the operations Σ I is called complete.

A complete semiring is said to be totally complete, if it is endowed with countably infinite product operations satisfying for all sequences (a i | i ≥ 0) of elements of S the following conditions:

where a' 0 = a 0 ⊗ . . . ⊗ a n1 , a' 1 = a n1+1 ⊗ . . . ⊗ a n2 , . . . for an increasing sequence 0 < n 1 < n 2 < . . . of natural numbers, and where I 1 , I 2 , . . . are arbitrary index sets.

A totally commutative complete semiring is a commutative and totally complete semiring that additionally satisfies:

Examples. The following semirings are totally commutative complete:

All of the above examples but Viterbi can be found in [31]. To the best of our knowledge, whether the Viterbi semiring is totally commutative complete has not been investigated in the literature before. To prove that this is indeed the case, it suffices to make the following two observations:
- It is well-known (see for example [22]) that the infinite product ∏ i≥0 a i converges in case Σ i≥0 (1 − a i ) converges, and is equal to 0 if Σ i≥0 (1 − a i ) = +∞. Since 0 ≤ a i ≤ 1, we have that Σ i≥0 (1 − a i ) either converges or is equal to +∞, and thus the infinite product is well-defined.
- Thus, Viterbi being a totally commutative complete semiring is a corollary of R inf being one.
Even though in Definition 1 we required S to be equipped with a partial order ≤ S , it is often enough if a preorder is available. On the one hand, for the relaxed instance and approximate unification problems, strict orders rather than partial ones are utilized. Given a partial order or a preorder ⪯, a strict order ≺ is induced by setting a ≺ b ⇐⇒ a ⪯ b ∧ b ⋠ a. On the other hand, a partial order can always be derived from a preorder by considering the quotient set, identifying elements a and b s.t. a ⪯ b ∧ b ⪯ a. For every semiring S = (S, ⊕, ⊗, O, 1) a natural preorder is induced by setting a ⪯ b if there exists some c ∈ S such that a ⊕ c = b. Note, however, that in all the examples above, the induced preorder is actually a partial order.
Discounting. In the setting of semirings, discounting is defined by using semiring endomorphisms. This approach was originally used for weighted automata on infinite words by Droste and Kuske in [16], and extended to weighted automata on infinite trees by Mandrali and Rahonis [26].
The set End(S) of all endomorphisms of S is a monoid with composition • as binary operation and the identity mapping id as unit.
For R sup , it was proved in [16] that every endomorphism is of the form p̄(a) = p · a for some p ∈ [0, +∞), and conversely, every p ∈ [0, +∞) defines an endomorphism of R sup in this way. The same result can be shown for R inf as well [17]. Finally, it is not difficult to see that a similar result holds for the Viterbi semiring.

Proof. Initially, observe that for every a, b ∈ [0, 1] and for every p ∈ [0, +∞) it holds that
- p̄(sup{a, b}) = (sup{a, b})^p = sup{a^p , b^p } = sup{p̄(a), p̄(b)},
- p̄(a · b) = (a · b)^p = a^p · b^p = p̄(a) · p̄(b),
- p̄(0) = 0^p = 0 and
- p̄(1) = 1^p = 1.
Weighted looping tree automata. In the following, S is assumed to be a totally complete commutative semiring. An infinitary tree series h over L and S is a mapping h : T ω Σ,L → S. The class of all infinitary tree series over L and S is denoted by S T ω Σ,L .
Definition 13 (Weighted looping tree automaton with discounting Φ). A weighted looping tree automaton with discounting Φ (Φ-wLTA) over S is a tuple M = (Σ, Q, L, in, wt) where Q is a finite state set, L is a finite set of labels, Σ = {σ 1 , . . . , σ k } is a finite set of symbols, in : Q → S is the initial distribution, and wt : Q × L × Q k → S is a mapping assigning weights to the transitions of the automaton.
Given a Φ-wLTA M = (Σ, Q, L, in, wt) over S, a run of M on a tree t ∈ T ω Σ,L is a mapping r : Σ * → Q. We denote the set of all runs of M on t by R M (t). Given a run r, we denote the transition (r(w), t(w), r(wσ 1 ), . . . , r(wσ k )) by $\overrightarrow{r}(w)$. The weight of the run r at w ∈ Σ * is defined as wt(r, w) := wt($\overrightarrow{r}(w)$). The Φ-weight (or simply weight) of r is defined as
$$weight(r) := in(r(\varepsilon)) \otimes \prod_{w \in \Sigma^*} \varphi_w(wt(r, w)).$$
Finally, the Φ-behavior (or simply behavior) of M is the infinitary tree series ||M|| ∈ S T ω Σ,L whose coefficients are determined for every t ∈ T ω Σ,L by
$$(\|M\|, t) = \sum_{r \in R_M(t)} weight(r).$$
If we take φ i = id for every i = 1, . . . , k, then we are left with a "normal" wLTA over S in the sense of [31], and thus dispense with the prefix Φ- in the notation.
If |L| = 1, then T ω Σ,L consists of a single tree t ul , which we will call the unlabeled tree since the labels are then irrelevant. In this case, we omit the label from the transitions of a Φ-wLTA M and write R M for its runs, omitting t ul . Also note that then ||M|| is a single element of S rather than a tree series.
Expressing language distance functions. The functions d 1 , d 2 introduced in Example 3 and the function e 1 of Example 2 take a pair of languages over an alphabet Σ as input. Thus, to represent this kind of input in a tree, we use the label set L 2 := {0, 1} 2 . We show that d 2 as well as a vital component of d 1 can be expressed by weighted looping automata with discounting over R inf . The function d 1 itself and the function e 1 can be expressed using the Viterbi semiring.
Example 5. The first language distance described in [7] is d 1 (K, N ) = 2^−n where n = min{|w| | w ∈ K ∆ N }. We introduce a wLTA (without discounting) that, given a tree t representing the tuple of languages (K, N ), computes the minimum n rather than 2^−n itself. Given n, the exponentiation can be done by external computation. Consider the wLTA M 1 = (Σ, Q, L 2 , in 1 , wt 1 ) over R inf = (R ≥0 ∪ {+∞}, inf, +, +∞, 0), where Q = {q 0 , q 1 }, in 1 (q 0 ) = +∞, in 1 (q 1 ) = 0 and wt 1 (q, l, p 1 , . . . , p k ) =

Intuitively, each run using only transitions with non-infinite weights selects one path in the tree, which it labels with q 1 until an element in the symmetric difference is found. The transitions up to this point in the selected path receive weight 1, and all other transitions have weight 0. Thus, adding up the weights (with the multiplication ⊗ = + of R inf ) gives us the distance from the root to the node where the difference was detected, i.e., the length of the word in the symmetric difference (or +∞ in case no difference is found on the chosen path). By taking the infimum over all runs, the length of the shortest word in the symmetric difference is found.
It is easy to see that this automaton works completely analogously to the previous one, computing ( 1 2 ) n instead of n.
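As a sanity check for the value that these automata are meant to compute, d 1 can also be obtained directly from an rLTA representing the pair (K, N ) by a breadth-first search for the shallowest node whose label has two different components. This direct graph search is a substitute for, not an implementation of, the weighted automaton M 1 . The dictionary encoding of the rLTA, the state names and the example languages (K = r * and N = r * ∪ {s}) are assumptions made for this sketch.

```python
from collections import deque

# Hypothetical rLTA for the pair (K, N) with K = r* and N = r* ∪ {s}:
# state -> (label, successors for r and s). Labels are (in K, in N).
PAIR_RLTA = {
    "root": ((1, 1), ("same", "diff")),
    "same": ((1, 1), ("same", "dead")),   # residual pair (r*, r*)
    "diff": ((0, 1), ("dead", "dead")),   # residual pair (∅, {ε})
    "dead": ((0, 0), ("dead", "dead")),
}


def shortest_difference(rlta, initial):
    """Length of a shortest word in K Δ N, or None if K = N."""
    seen = {initial}
    queue = deque([(initial, 0)])
    while queue:
        state, depth = queue.popleft()
        label, successors = rlta[state]
        if label[0] != label[1]:
            return depth  # BFS order guarantees minimal depth
        for nxt in successors:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return None


def d1(rlta, initial):
    """d1(K, N) = 2^-n, with the convention d1 = 0 if K = N."""
    n = shortest_difference(rlta, initial)
    return 0.0 if n is None else 2.0 ** (-n)


print(d1(PAIR_RLTA, "root"))  # 0.5, since s is a shortest word in K Δ N
```

The search visits each state at most once, so it runs in time linear in the size of the rLTA.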
Example 7. Recall the function e 1 from Example 2 where e 1 (K, N ) = 1 if N ⊆ K, and 0 otherwise. As was the case for d 1 , given a tree t representing the tuple of languages (K, N ) we can either consider a wLTA over R inf that helps computing e 1 (K, N ), or make use of a wLTA over the Viterbi semiring to compute the exact value. In this example, we examine the second case. Note that, since |Q| = 1, there is exactly one run r of M e on t. The intuition behind this construction is the following: we have that N ⊈ K iff ∃w ∈ Σ * such that w ∈ N ∧ w ∉ K iff ∃w ∈ Σ * such that t(w) = (0, 1). In other words, if there exists a node of t labeled with (0, 1) we have that N ⊈ K. From the automaton point of view, a node labeled with (0, 1) requires a transition with weight 0, and thus weight(r) = 0. On the other hand, if only the labels (0, 0), (1, 0) and (1, 1) appear, implying that N ⊆ K, every transition has weight 1, and thus weight(r) = 1. Hence, by computing (||M e ||, t) we know e 1 (K, N ).
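On a tree given by an rLTA, the product of the weights along the single run is 0 exactly if some reachable state carries the label (0, 1). The following sketch evaluates e 1 in this way. It is a direct reachability check under an assumed dictionary encoding of the rLTA, not the automaton M e itself, and the example automaton and its state names are hypothetical.

```python
# Hypothetical rLTA for a pair (K, N): state -> (label, successors).
# Labels are (membership in K, membership in N).
PAIR_RLTA = {
    "root": ((1, 1), ("same", "diff")),
    "same": ((1, 1), ("same", "dead")),
    "diff": ((0, 1), ("dead", "dead")),   # a word in N but not in K
    "more": ((1, 0), ("same", "dead")),   # a word in K but not in N
    "dead": ((0, 0), ("dead", "dead")),
}


def e1(rlta, initial):
    """Return 1 if N ⊆ K for the represented pair (K, N), and 0 otherwise."""
    seen = {initial}
    stack = [initial]
    while stack:
        state = stack.pop()
        label, successors = rlta[state]
        if label == (0, 1):
            return 0  # this node contributes weight 0 to the product
        for nxt in successors:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return 1  # every node contributes weight 1


print(e1(PAIR_RLTA, "root"))  # 0
print(e1(PAIR_RLTA, "more"))  # 1
```

Every state reachable from the initial state labels some node of the represented tree, which is why checking reachable states suffices.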

Computing the behavior on regular trees
Given a Φ-wLTA M over a semiring S and an rLTA A representing a regular tree t, we want to compute the behavior of M on t, i.e., (||M||, t). In a first step, we reduce this problem to the problem of computing the behavior of a Φ-wLTA on the unlabeled tree. To be more precise, we combine the two automata M and A into a single Φ-wLTA M A that works on the unlabeled tree t ul such that (||M||, t) = (||M A ||, t ul ).
Theorem 2. Given a Φ-wLTA M = (Σ, Q, L, in, wt) over S and an rLTA A = (Σ, P, L, ∆, {p s }) representing a regular tree t, one can construct in polynomial time a Φ-wLTA M A over S working on the unlabeled tree t ul such that (||M||, t) = (||M A ||, t ul ).
Proof. Let S = (S, ⊕, ⊗, O, 1). By the definition of rLTAs, for every state p ∈ P there exists a unique letter l p ∈ L such that (p, l p , . . .) ∈ ∆. Additionally, by Proposition 2 it holds that A has a unique run, say θ, on t. For simplicity, for every w ∈ Σ * we denote θ(w) ∈ P by p w .
We define the Φ-wLTA M A = (Q × P × L, Σ, in', wt') over S as follows:

To prove that (||M||, t) = (||M A ||, t ul ), it is sufficient to show that there exists an injection τ : R M (t) → R M A such that weight(r) = weight(τ (r)) for every r ∈ R M (t) and weight(r') = O for every r' ∈ R M A \ im(τ ), where im(τ ) stands for the image set of the mapping τ .

Thus, we obtain

In other words, for every r ∈ R M (t), r' ≠ τ (r), and hence ∃z ∈ Σ * , r'(z) ≠ (r(z), p z , l pz ). From r' we define three mappings r 0 : Σ * → Q, p 0 : Σ * → P, l 0 : Σ * → L by setting r'(w) = (r 0 (w), p 0 (w), l 0 (w)) for every w ∈ Σ * . Obviously, r 0 ∈ R M (t) (since any mapping from Σ * to Q is a run of M on t). Then, from (5), we get that ∃z ∈ Σ * such that p 0 (z) ≠ p z or l 0 (z) ≠ l pz (otherwise it would be the case that r' = τ (r 0 )), and assume without loss of generality that z has minimal length. We distinguish two cases.
- z = ε. This implies that p 0 (ε) ≠ p ε or l 0 (ε) ≠ l pε . In both cases, in'(r'(ε)) = O and thus weight(r') = O, since O ⊗ a = O for all a ∈ S.
- z = vσ i . This implies that p z ≠ p 0 (z) or l pz ≠ l 0 (z). In the first case, we have that

Thus, it remains to show how the behavior of a Φ-wLTA working on the unlabeled tree can be computed. For wLTAs (without discounting) over complete distributive lattices this was done in [11]. In the next section, we show how the behavior of a Φ-wLTA over the semiring R inf can be computed.

Computing the behavior on the unlabeled tree in R inf
Concentrating on R inf is motivated, on the one hand, by the fact that our motivating examples (the distance functions d 1 and d 2 and the similarity function e 1 ) can be expressed using wLTA with discounting over this semiring. On the other hand, discounting for this semiring is well-understood [16] and nicely behaved. Note, however, that our algorithms can be extended to the Viterbi semiring. Still, in order to get analogous complexity results, further computability and/or precision considerations have to be taken into account (see comments at the end of each section).
Recall that, for R inf , all endomorphisms are of the form p̄(a) = p · a for p ∈ R ≥0 , and thus the discounting is of the form Φ = (p̄ 1 , . . . , p̄ k ). Given w = σ i1 . . . σ im ∈ Σ * , we set p w = p i1 · . . . · p im where the empty product (case w = ε) is 1. Then φ w (a) = φ i1 • · · · • φ im (a) = p i1 · . . . · p im · a = p̄ w (a), and thus φ w = p̄ w . It is easy to see that, for p > 0, p̄ distributes over inf and Σ. In the following, we assume that p i ≠ 0 for i = 1, . . . , k, and we will write p w · a instead of φ w (a).
A q-run r of M is a run with r(ε) = q. We denote the set of all q-runs of M as R(q). The running weight of a q-run is defined like its weight, but without taking the initial distribution into account, i.e., rweight(r) := Σ w∈Σ * p w · wt(r, w), and thus weight(r) = in(q) + rweight(r). Consequently, if we define µ(q) := inf r∈R(q) rweight(r) (for every q ∈ Q) then (||M||, t ul ) = min q∈Q {in(q) + µ(q)}. Hence, in order to compute the behavior of M on t ul , it suffices to calculate the values µ(q) for all q ∈ Q.
The following lemma provides recursive equations that are useful to achieve this goal.
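Lemma 4. For every state q ∈ Q it holds that
$$\mu(q) = \min_{(q_1, \ldots, q_k) \in Q^k} \Big( wt(q, q_1, \ldots, q_k) + \sum_{j=1}^{k} p_j \cdot \mu(q_j) \Big).$$
Proof.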
Note that the exact same computations can be made for the Viterbi semiring ([0, 1], sup, ·, 0, 1), with the difference that inf, min, +, Σ, ∞, 0 are replaced by sup, max, ·, ∏, 0, 1 respectively, and that endomorphisms are of the form p̄(a) = a^p. Thus, we would obtain equations of the form
$$\mu(q) = \max_{(q_1, \ldots, q_k) \in Q^k} \Big( wt(q, q_1, \ldots, q_k) \cdot \prod_{j=1}^{k} \mu(q_j)^{p_j} \Big). \tag{6}$$

Our approach for computing the values µ(q) depends on the kind of discounting used.

Behavior for nondecreasing discounting
In this section we assume that the discounting is nondecreasing, i.e., p i ≥ 1 for all i = 1, . . . , k.
Note that absence of discounting corresponds to the special case where p i = 1 for all i = 1, . . . , k.
If the discounting is nondecreasing, then we have for every run r ∈ R M that
$$rweight(r) = \sum_{w \in \Sigma^*} p_w \cdot wt(r, w) \ge \sum_{w \in \Sigma^*} wt(r, w),$$
where in the latter infinite sum only finitely many distinct non-negative real numbers occur. Consequently, this sum (and thus the original sum as well) is a finite number iff only 0 is used infinitely often in the sum. Therefore, a run r has finite weight iff, from a certain depth on, it has only zero-weight transitions. Consequently, we can restrict our attention to deciding for each state q whether such a (finite weight) q-run exists, and compute the smallest weight among all of them.
The first step consists of computing the set of states in Q that admit a run with only zero-weight transitions. Clearly, these are exactly the states q for which µ(q) = 0. By keeping only transitions with weight 0 and then applying the emptiness test for LTAs [12] to the resulting automaton, these states can easily be computed.
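More precisely, the computation can be done as follows. Let ∆ 0 ⊆ Q k+1 be the set containing only the transitions in M with zero weight,
$$\Delta_0 := \{(q, q_1, \ldots, q_k) \in Q^{k+1} \mid wt(q, q_1, \ldots, q_k) = 0\},$$
and B 0 the subset of Q containing all the states that have no transition in ∆ 0 , i.e.,
$$B_0 := \{q \in Q \mid \text{there is no } (q, q_1, \ldots, q_k) \in \Delta_0\}.$$
Then, we define the following iteration for i ≥ 0:
$$B_{i+1} := B_i \cup \{q \in Q \mid \text{for every } (q, q_1, \ldots, q_k) \in \Delta_0 \text{ there is some } 1 \le j \le k \text{ with } q_j \in B_i\}.$$
The iteration becomes stable after at most |Q| steps; let B denote the resulting stable set. The set Q' = Q \ B is then the set of states that admit a run with only zero-weight transitions, as the following lemma shows.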
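The iteration can be sketched in a few lines. We assume, for illustration only, that the weights of a wLTA on the unlabeled tree are stored in a dictionary mapping (state, tuple of successors) to a weight, and that missing entries stand for weight +∞. The function name `zero_weight_states` is hypothetical.

```python
def zero_weight_states(states, wt):
    """States that admit a run using only zero-weight transitions.

    wt: dict mapping (state, successors) -> weight; successors is a tuple.
    """
    # Delta_0: the zero-weight transitions.
    delta0 = [(q, succ) for (q, succ), w in wt.items() if w == 0]

    # B_0: states without any zero-weight transition.
    bad = {q for q in states if not any(src == q for src, _ in delta0)}

    # B_{i+1}: add states all of whose zero-weight transitions
    # lead to at least one state that is already bad.
    changed = True
    while changed:
        changed = False
        for q in states:
            if q in bad:
                continue
            if all(any(s in bad for s in succ)
                   for src, succ in delta0 if src == q):
                bad.add(q)
                changed = True

    return set(states) - bad


# Small example with k = 2 (state names are illustrative).
WT = {
    ("a", ("a", "a")): 0,
    ("b", ("a", "a")): 3,
    ("b", ("b", "a")): 1,
    ("c", ("c", "b")): 0,
}
print(zero_weight_states({"a", "b", "c"}, WT))  # {'a'}
```

In the example, c has a zero-weight transition, but it requires a zero-weight run from b, which does not exist, so only a remains.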
Proof. Let q ∈ Q'. This means that there is a transition (q, q 1 , . . . , q k ) ∈ ∆ 0 such that (q 1 , . . . , q k ) ∈ (Q') k . Iterating this argument for the successor states one can build the wanted run r.
For the opposite direction, suppose that q ∉ Q' and thus q ∈ B. Assume that j is the least index such that q ∈ B j . By induction on j we will prove that there is no q-run that has only zero-transitions, i.e., transitions from ∆ 0 . If j = 0, i.e., q ∈ B 0 , then there is no zero-weight transition starting from q, and thus no q-run with only zero-weight transitions. If j > 0, this implies that for every (q, q 1 , . . . , q k ) ∈ ∆ 0 there exists some i such that q i ∈ B j−1 . By the induction hypothesis, there is no q i -run with only zero-weight transitions, and thus the same holds for q.
The following lemma is a straightforward consequence of the previous one and the definition of µ(q).
Summing up, the above construction gives us the next lemma.
A run with finite weight does not use a transition with weight +∞ and below a certain depth in the tree it contains only states that belong to Q µ=0 . Thus, the states used in the run must have access to states in Q µ=0 through transitions with finite weight. To be more precise, define the set Q acc of states that have access to Q µ=0 to be the least subset of Q such that (i) Q µ=0 ⊆ Q acc and (ii) if q i ∈ Q acc for every i = 1, . . . , k and wt(q, q 1 , . . . , q k ) ≠ +∞ then q ∈ Q acc . States q that have access to Q µ=0 have a q-run with finite running weight, and hence µ(q) < +∞. If q does not have access to Q µ=0 , then µ(q) = +∞. By using an approach inspired by Dijkstra's shortest path algorithm, we can compute the states that have access to Q µ=0 together with their µ-value in polynomial time.
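A Dijkstra-style computation of the values µ(q) can be sketched as follows. This is our own minimal illustration of the idea, not necessarily the exact procedure of the paper: it assumes that the set Q µ=0 has already been computed and is passed in as `zero_states`, that weights are stored in a dictionary mapping (state, tuple of successors) to a weight, and that all discount factors satisfy p i ≥ 1. All function and variable names are hypothetical.

```python
import math


def mu_nondecreasing(states, wt, discounts, zero_states):
    """Compute mu(q) for all states, assuming p_i >= 1 for all discounts.

    wt: dict mapping (state, successors) -> weight (missing means +inf).
    discounts: tuple (p_1, ..., p_k).
    zero_states: the states q with mu(q) = 0, assumed precomputed.
    """
    mu = {q: 0.0 for q in zero_states}  # settled states with final values
    while True:
        best_state, best_value = None, math.inf
        # Candidate values: transitions whose successors are all settled.
        for (q, succ), w in wt.items():
            if q in mu or w == math.inf:
                continue
            if all(s in mu for s in succ):
                value = w + sum(p * mu[s] for p, s in zip(discounts, succ))
                if value < best_value:
                    best_state, best_value = q, value
        if best_state is None:
            break  # no further state has access to the settled ones
        mu[best_state] = best_value  # settle the cheapest candidate
    for q in states:
        mu.setdefault(q, math.inf)  # states without access get +inf
    return mu


def behavior(mu, initial_weights):
    """(||M||, t_ul) = min over q of in(q) + mu(q)."""
    return min(initial_weights[q] + mu[q] for q in mu)


WT = {
    ("a", ("a", "a")): 0,
    ("b", ("a", "a")): 3,
    ("b", ("b", "a")): 1,
    ("c", ("b", "b")): 1,
    ("d", ("d", "d")): 1,
}
MU = mu_nondecreasing({"a", "b", "c", "d"}, WT, (2, 1), {"a"})
print(MU["b"], MU["c"], MU["d"])  # 3.0 10.0 inf
```

Settling the cheapest candidate first is sound here because, with non-negative weights and factors p i ≥ 1, the value of a transition is never smaller than the value of any of its successors. In the example, µ(c) = 1 + 2 · 3 + 1 · 3 = 10, and d only loops with positive weight, so µ(d) = +∞.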
Initially, note that in case Q µ=0 = ∅, no state has access to Q µ=0 and thus µ(q) = ∞ for all q ∈ Q.
To compute µ(q) for all the states q we use the following algorithm. Set S 0 = Q µ=0 and consider the function
$$m_0(q) := \begin{cases} 0 & \text{if } q \in S_0, \\ \min_{(q_1, \ldots, q_k) \in (S_0)^k} wt(q, q_1, \ldots, q_k) & \text{otherwise.} \end{cases}$$
Next, for i > 0, iteratively do the following:

Then, having (q 0 1 , . . . , q 0 k ) ∈ (S i−1 ) k implies that µ(q 0 j ) = m i−1 (q 0 j ) for all 1 ≤ j ≤ k (by induction). Recall that m i (s') corresponds to the expression:

Since (S i−1 ) k ⊆ (S i ) k , this means that the value corresponding to the inner minimization in the previous expression is not greater than µ(s'). Therefore, using (7) we obtain:

Moreover, if m i (s') < m i−1 (s') holds, this must have been a direct consequence of introducing s i to form S i . In other words, m i (s') is updated by using the value m i−1 (s i ), which means that m i−1 (s i ) ≤ m i (s'). The latter is obviously not consistent with (9). Therefore, in order to still be consistent with µ(s') < µ(s i ), the equality m i (s') = m i−1 (s') must hold. But then, it follows that m i−1 (s') < m i−1 (s i ), which is a contradiction since s i was selected to obtain S i . Thus, we can conclude that µ(s i ) ≤ µ(q) for all q ∈ S i .

2. Consider µ(s i ) expressed as in (8) (substitute s' by s i ). Since we have just shown that µ(s i ) ≤ µ(q) for all q ∈ S i , similarly as before, it can be proved that (q 0 1 , . . . , q 0 k ) ∈ (S i−1 ) k holds for s i as well. Then, the same argument used in 1. yields that m i (s i ) ≤ µ(s i ). Thus, m i (s i ) = µ(s i ) holds (using (7)).
Summing up, computing Q µ=0 requires quadratic time and computing m f (q) for every q ∈ Q requires cubic time. Since the behavior of M on t ul can easily be computed from the values µ(q) for q ∈ Q, this yields the following theorem.
Theorem 3. The behavior of a Φ-wLTA with nondecreasing discounting Φ over R inf on the unlabeled tree can be computed in polynomial time.
Note that a similar algorithm can be utilized for the behavior over the Viterbi semiring. As was done for obtaining Equation (6), replace inf, min, +, Σ, ∞, 0 by sup, max, ·, ∏, 0, 1 respectively. Also recall that endomorphisms are of the form p̄(a) = a^p. Other than this, the algorithm and the proof of its correctness are the same. However, since we might have to compute numbers as small as x^(p^|Q|), the algorithm might take exponential time because of their computation.

Behavior for contracting discounting
In this section, we assume a contracting discounting, i.e., p < 1/k, where k = |Σ| and p = max i=1,...,k p i .
Recall that it suffices to compute the value µ(q) for every q ∈ Q. To achieve this, we generalize the approach used in [7] for the special case of d 2 . Let Q = {q 1 , . . . , q n }. For each q i ∈ Q, the unknown value µ(q i ) is associated to a variable x i . Additionally, let I = {1, . . . , n}. Then, Lemma 4 states that (µ(q 1 ), . . . , µ(q n )) is a solution of the following system of equations:
$$x_i = \min_{(i_1, \ldots, i_k) \in I^k} \Big( wt(q_i, q_{i_1}, \ldots, q_{i_k}) + \sum_{j=1}^{k} p_j \cdot x_{i_j} \Big) \quad \text{for all } i \in I. \tag{10}$$

Before we continue with solving this system of equations, we recall some notions from Metric Topology. Formal definitions can be found in any book on the subject, for example [13,23].
Without loss of generality, assume that f i (b) ≤ f i (a). Thus, we have:

for every i ∈ I, and thus

Since p < 1/k, implying k · p < 1, we have that f is a contraction.
From Banach's fixed point theorem, we have that f has a unique fixed point, and thus the system of equations (10) has a unique solution. Thus, to compute the values µ(q) for q ∈ Q it is sufficient to compute this unique solution. This can be realized using Linear Programming [33], basically by the same approach used in [7]. where a i,j , b i , c j are rational numbers.
The feasible region of the LP problem consists of all the tuples (x 1 , . . . , x n ) that satisfy the restrictions. The answer to an LP problem is a tuple in the feasible region that maximizes the objective function and "no" if the feasible region is empty.
It is well known that LP problems are solvable in polynomial time in the size of the problem [33].
From the above system of equations (10) we can derive an LP problem. Consider for every i ∈ I, (i 1 , . . . , i k ) ∈ I k the inequation
$$x_i \le wt(q_i, q_{i_1}, \ldots, q_{i_k}) + \sum_{j=1}^{k} p_j \cdot x_{i_j} \tag{11}$$
and the objective
$$z = \max \sum_{i \in I} x_i. \tag{12}$$
Lemma 10. The LP problem consisting of the inequations (11), and the objective (12) has the unique solution {x i → µ(q i ) | i ∈ I}.
Proof. Initially, observe that the above vector is in the feasible region, since it satisfies the restrictions (11). Next, we proceed to show that it is indeed the only point that maximizes the objective function. First, we need the following claim.
Claim. If a is a solution that maximizes the objective function then, for every i ∈ I, at least one of the inequalities (11) holds as an equality.
Proof (Claim). Suppose on the contrary that c is a solution that maximizes z, but for some i ∈ I, the inequalities c i ≤ wt(q i , q i1 , . . . , q i k ) + Σ k j=1 p j · c ij are strict for all (i 1 , . . . , i k ) ∈ I k . This would mean that the value of c i can be increased, and all inequalities would still hold; the increase in c i might increase the right-hand side of some other inequality, but since the left-hand side remains the same, all restrictions are satisfied. Thus, a new point c' has been produced that satisfies all the restrictions of the LP problem and additionally gives a larger value for the objective function. This is a contradiction to our initial assertion about c. This completes the proof of the claim.
As a result, any points that are solutions to the LP problem satisfy the condition x i = min (i1,...,i k )∈I k ( wt(q i , q i1 , . . . , q i k ) + Σ k j=1 p j · x ij ) for all i ∈ I. Thus, they correspond to solutions of the system of equations (10).
Finally, since there is a unique such solution, the solution of the LP problem is this unique solution, the vector (µ(q 1 ), . . . , µ(q n )).
Since solutions of Linear Programming problems can be computed in polynomial time, this is also the case for the values µ(q), and thus for the behavior.
Theorem 5. The behavior of a Φ-wLTA with contracting discounting Φ over R inf on the unlabeled tree can be computed in polynomial time.
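As a numerical complement to the Linear Programming approach, the contraction property also suggests approximating the unique solution of the system (10) by plain fixed-point iteration. The following sketch is only an approximation scheme under our own assumptions, not the exact polynomial-time procedure: weights are stored in a dictionary mapping (state, tuple of successors) to a weight, every state has at least one transition with finite weight, and the names `mu_contracting`, `tol` and `max_iter` are hypothetical.

```python
def mu_contracting(states, wt, discounts, tol=1e-12, max_iter=10_000):
    """Approximate mu(q) by iterating the contraction f from system (10).

    wt: dict mapping (state, successors) -> finite weight.
    discounts: tuple (p_1, ..., p_k) with max(p_i) < 1/k.
    Assumes every state has at least one listed transition.
    """
    k = len(discounts)
    if max(discounts) >= 1.0 / k:
        raise ValueError("discounting is not contracting: need max p_i < 1/k")

    x = {q: 0.0 for q in states}  # any starting point works for a contraction
    for _ in range(max_iter):
        new = {
            q: min(
                w + sum(p * x[s] for p, s in zip(discounts, succ))
                for (src, succ), w in wt.items()
                if src == q
            )
            for q in states
        }
        # Stop once successive iterates are closer than the tolerance.
        if max(abs(new[q] - x[q]) for q in states) <= tol:
            return new
        x = new
    return x


# Example with k = 2 and p_1 = p_2 = 0.25 < 1/2 (names are illustrative).
WT = {
    ("a", ("a", "a")): 1.0,
    ("b", ("a", "a")): 0.0,
    ("b", ("b", "b")): 1.0,
}
MU = mu_contracting({"a", "b"}, WT, (0.25, 0.25))
print(round(MU["a"], 6), round(MU["b"], 6))
```

In the example, the fixed point satisfies x a = 1 + 0.5 · x a and x b = min(0.5 · x a , 1 + 0.5 · x b ), so the iteration approaches µ(a) = 2 and µ(b) = 1. Since the contraction factor is k · p, the error shrinks geometrically, but the result is a floating-point approximation rather than an exact rational solution.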
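Note that for the Viterbi semiring we get analogous equations:
$$x_i = \max_{(i_1, \ldots, i_k) \in I^k} \Big( wt(q_i, q_{i_1}, \ldots, q_{i_k}) \cdot \prod_{j=1}^{k} x_{i_j}^{p_j} \Big) \quad \text{for all } i \in I.$$
By taking the logarithm of these equations, we derive equations of the form (10). However, the numbers might no longer be rational, and thus further computational issues should be taken into consideration.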

Conclusion
We have seen that concept comparison measures are important components of several approaches for approximation in DLs. Given two concepts C, D, such a measure assigns to them a value that expresses how well they compare. In general, these values come from a partially ordered set, but in most approaches to approximation considered so far, measures that map into the real numbers, and often only into the real interval [0, 1], are used. An important requirement for such measures is that they respect the semantics of concepts, i.e., are invariant under equivalence in the sense that, if we replace C, D by equivalent concepts C', D', then the returned comparison value is the same. To be useful in practice, another important requirement on the measures is that they are computable.
The main technical contribution of this paper is the development of a general framework for defining concept comparison measures for the DL FL 0 that are computable and invariant under equivalence w.r.t. general TBoxes. Our framework is based on a characterization of equivalence w.r.t. general FL 0 TBoxes that uses tuples of formal languages. These tuples can be expressed by infinite trees, which in turn are represented by looping tree automata. Assigning a comparison value to a pair of FL 0 concepts in an equivalence invariant way thus boils down to assigning a value to a tree that is represented by a looping tree automaton. We use weighted tree automata with discounting for this purpose, and reduce the problem of computing the comparison value to the problem of computing the behavior of such an automaton on the unlabeled infinite tree. If the weights of the automaton come from the semiring R inf , then this behavior can be computed in polynomial time provided that the employed discounting is nondecreasing or contracting. An obvious topic for future research is thus to extend these results to discounting that is neither contracting nor nondecreasing, or to other semirings as weight structures.
While the use of our framework guarantees that the obtained concept comparison measures are equivalence invariant and computable, the user of the framework needs to ensure (by appropriately defining the weighted automaton) that the obtained values make sense in the intended application. Nevertheless, it might be helpful to provide the user with automated tools for checking whether the defined measure satisfies certain properties, such as the properties often required for concept similarity measures [25]. In our framework, this boils down to deciding certain properties of weighted tree automata with discounting.
Finally, if concept comparison measures defined using our framework are employed within one of the approximation approaches sketched in Section 2.3, one can investigate whether the important inference problems in this approach are guaranteed to be decidable. For example, assume that a concept similarity measure defined using our approach is employed to relax instance queries in FL 0 . Can we extend our computability result for the measure to a decidability result for the relaxed instance problem? If the answer is affirmative, what is the exact complexity of the relaxed instance problem in this setting?