LTCS–Report On Confident GCIs of Finite Interpretations

In the work of Baader and Distel, a method has been proposed to axiomatize all general concept inclusions (GCIs) expressible in the description logic $\mathcal{EL}^{\bot}$ and valid in a given interpretation $\mathcal{I}$. This provides us with an effective method to learn $\mathcal{EL}^{\bot}$-ontologies from interpretations, which themselves can be seen as a different representation of linked data. In another report, we have extended this approach to handle errors in the data. This has been done by considering not only valid GCIs but also those whose confidence is above a certain threshold $c$. In the present work, we shall extend these results by describing another way to compute bases of confident GCIs. We furthermore provide experimental evidence that this approach can be useful for practical applications. We finally show that the technique of unravelling can also be used to effectively turn confident $\mathcal{EL}^{\bot}_{\mathrm{gfp}}$-bases into $\mathcal{EL}^{\bot}$-bases.


Introduction
Description logic ontologies provide a practical yet formally well-defined way of representing large amounts of knowledge. They have been applied especially successfully in the area of medical and biological knowledge, examples being the widely used ontologies SNOMED CT [16], GALEN [17] and the Gene Ontology [2].
A part of description logic ontologies, the so-called TBox, contains the terminological knowledge of the ontology. Terminological knowledge constitutes connections between concept descriptions and is represented by general concept inclusions (GCIs). For example, we could fix in an ontology the fact that everything that has a child is actually a person. Using the description logic $\mathcal{EL}^{\bot}$, this could be written as $\exists \mathsf{child}.\top \sqsubseteq \mathsf{Person}$.
Here, $\exists \mathsf{child}.\top$ and $\mathsf{Person}$ are examples of concept descriptions, and the $\sqsubseteq$ sign can be read as "implies." General concept inclusions are, on this intuitive level, therefore quite similar to implications.
The construction of TBoxes of ontologies, which are supposed to represent the knowledge of a certain domain of interest, is normally conducted by human experts. Although this guarantees a high level of quality of the resulting ontology, the process itself is long and expensive. Automating this process would both decrease the time and cost for creating ontologies and would therefore foster the use of formal ontologies in other applications. However, one cannot expect to entirely replace human experts in the process of creating domain-specific ontologies, as these experts are the original source of this knowledge. Hence constructing ontologies completely automatically does not seem reasonable.
A compromise for this would be to devise a semi-automatic way of constructing ontologies, for example by learning relevant parts of the ontology from a set of typical examples of the domain of interest. The resulting ontologies could be used by ontology engineers as a starting point for further refinement and development.
This approach has been taken by Baader and Distel [5,6,11] for constructing $\mathcal{EL}^{\bot}$-ontologies from finite interpretations. The reasons why this approach is restricted to $\mathcal{EL}^{\bot}$ are manifold. Foremost, the approach exploits a tight connection between the description logic $\mathcal{EL}^{\bot}$ and formal concept analysis [12], and such a connection has not been worked out for other description logics. Moreover, the description logic $\mathcal{EL}^{\bot}$ can be sufficient for practical applications, as, for example, SNOMED CT is formulated in a variant of $\mathcal{EL}^{\bot}$. Lastly, $\mathcal{EL}^{\bot}$ is computationally much less complex than other description logics, say $\mathcal{ALC}$ or even $\mathcal{FL}_0$.
In their approach, Baader and Distel are able to effectively construct a base of all valid GCIs of a given interpretation, where this interpretation can be understood as the collection of typical examples of our domain of interest. This base therefore constitutes the complete terminological knowledge that is valid in this interpretation. Moreover, these interpretations can be seen as a different way to represent linked data [7], the data format used by the semantic web community to store its data. Hence, this approach allows us to construct ontologies from parts of the linked data cloud, providing us with a vast amount of real-world data for experiments and practical applications.
In [10], a sample construction has been conducted on a small part of the DBpedia data set [8], which is part of the linked open data cloud. As it turned out, the approach is effective. However, these experiments also led to a different observation: the data set extracted from DBpedia contained a small number of errors. These errors, although very few, greatly influenced the result of the construction, because they invalidated certain GCIs, which were then no longer extracted by the algorithm. Instead of these general GCIs, more special GCIs were extracted that "circumvent" the errors by being more specific. This not only led to more extracted GCIs, but also to GCIs which may be hard to comprehend.
As the original approach by Baader and Distel considers only valid GCIs, even a single error may invalidate a certain, otherwise valid GCI. Since we cannot assume that real-world data is free of errors, this approach is quite limited for practical applications. Therefore, we want to present in this work a generalization of the approach of Baader and Distel which does not only consider valid GCIs but also those which are "almost valid." The rationale behind this is that these GCIs should be much less sensitive to a small number of errors than valid GCIs. To decide whether a GCI is "almost valid," we shall use its confidence in the given interpretation. We then consider the set of all GCIs of a finite interpretation whose confidence is above a certain threshold $c \in [0, 1]$, and try to find a base for them. This base can then be seen as the terminological part of an ontology learned from the data set.
This report sets out to extend the results found in [9]. In that report, first results have been given on how to construct bases of confident GCIs of finite interpretations. We augment these results by another construction that allows us to directly obtain a confident base from a set of implications of a suitable formal context. Furthermore, we shall provide experimental results using the DBpedia data set. With these results we want to show that our approach of considering confident GCIs may provide useful information in practical applications. Lastly, we answer an open question raised in [9] and show that confident $\mathcal{EL}^{\bot}_{\mathrm{gfp}}$-bases can effectively be turned into confident $\mathcal{EL}^{\bot}$-bases. For this, we shall use the technique of unravelling that has also been used in [11] to show a similar result for bases of valid GCIs.
This report is structured as follows. In the following two sections we shall introduce the notions from the fields of formal concept analysis and description logics needed for this report. We shall then discuss a construction of a confident base from a suitable formal context. Afterwards, we apply our results to the same interpretation that has been used in [10], where we not only consider particular confident GCIs and discuss their validity, but where we also examine the  Then, in the subsequent section, we show that unravelling applied to confident bases of finite interpretations can effectively be used to obtain $\mathcal{EL}^{\bot}$-bases from $\mathcal{EL}^{\bot}_{\mathrm{gfp}}$-bases. We finish this report with some conclusions and an outlook on future work.

Formal Concept Analysis
In this section we want to introduce the necessary definitions from formal concept analysis [12] needed in this work.

Formal Contexts and Contextual Derivation Operators
Formal concept analysis originated as an attempt to unify modern lattice theory with philosophical ideas about concepts as hierarchies [12]. The fundamental definition of formal concept analysis is the one of a formal context.

Definition
Let $G$, $M$ be two sets and let $I \subseteq G \times M$. Then the triple $\mathbb{K} = (G, M, I)$ is called a formal context, where the set $G$ is called the set of objects of $\mathbb{K}$ and the set $M$ is called the set of attributes of $\mathbb{K}$. For $g \in G$, $m \in M$ we read $(g, m) \in I$ as "object $g$ has attribute $m$" and write $g \mathrel{I} m$ in this case. ♢
If a formal context $\mathbb{K} = (G, M, I)$ is finite, i. e. if the sets $G$ and $M$ are finite, it is sometimes convenient to depict $\mathbb{K}$ as a cross table, as shown in the following example.
Then $\mathbb{K} = (G, M, I)$ is a formal context, which is depicted as a cross table in Figure 1. Here, we have a table where the rows are labeled with elements from $G$ and the columns are labeled with elements from $M$. In a cell corresponding to a pair $(g, m) \in G \times M$ we write a cross "×" if and only if $(g, m) \in I$. Otherwise, we leave this cell blank or write a single dot "." in it. ♢
Given a formal context $\mathbb{K} = (G, M, I)$ and some set $A \subseteq G$ of objects, one can ask what the largest set of attributes is that all objects in $A$ share. Likewise, one can ask for a set $B \subseteq M$ of attributes what the largest set of objects is that have all attributes in $B$. To answer these questions we introduce the derivation operators for a formal context $\mathbb{K}$.

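Definition
Let $\mathbb{K} = (G, M, I)$ and $A \subseteq G$, $B \subseteq M$. Then we define the derivations in the formal context $\mathbb{K}$ as
$A' := \{ m \in M \mid \forall g \in A \colon g \mathrel{I} m \}$ and $B' := \{ g \in G \mid \forall m \in B \colon g \mathrel{I} m \}$.
The set $A$ is called an extent of $\mathbb{K}$ if and only if $A = (A')'$. The set $B$ is called an intent of $\mathbb{K}$ if and only if $B = (B')'$. ♢
For convenience, we shall drop the extra parentheses and write $(A')' = A''$ and $(B')' = B''$ for short.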
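The two derivation operators can be illustrated with a minimal Python sketch. The context below (objects 1 to 3, attributes a to c) and the function names are hypothetical and only serve as an illustration; they do not come from the report.

```python
# Hypothetical toy context, not taken from the report.
G = {1, 2, 3}                      # objects
M = {"a", "b", "c"}                # attributes
I = {(1, "a"), (1, "b"), (2, "b"), (2, "c"), (3, "b")}  # incidence relation


def common_attributes(objects):
    """A' for a set A of objects: attributes shared by every object in A."""
    return {m for m in M if all((g, m) in I for g in objects)}


def common_objects(attributes):
    """B' for a set B of attributes: objects having every attribute in B."""
    return {g for g in G if all((g, m) in I for m in attributes)}


A = {1, 2}
A_prime = common_attributes(A)             # attributes shared by objects 1 and 2
A_double_prime = common_objects(A_prime)   # all objects having those attributes
is_extent = (A == A_double_prime)          # A is an extent iff A == A''
```

In this toy context, object 3 also has every attribute shared by objects 1 and 2, so the set {1, 2} is not an extent.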
As a first observation on the derivation operators let us note that the functions $\cdot' \colon \mathfrak{P}(G) \to \mathfrak{P}(M)$ and $\cdot' \colon \mathfrak{P}(M) \to \mathfrak{P}(G)$ form a so-called Galois connection. For this let us recall that for a set $P$ an order relation $\leq$ is just a set $\leq \, \subseteq P \times P$ such that $\leq$ is reflexive, antisymmetric and transitive.

Definition
Let $P$, $Q$ be two sets and let $\leq_P$ and $\leq_Q$ be order relations on $P$ and $Q$, respectively. Then the two mappings $\varphi \colon P \to Q$, $\psi \colon Q \to P$ form an antitone Galois connection between $(P, \leq_P)$ and $(Q, \leq_Q)$ if and only if for all $x \in P$, $y \in Q$ it holds that $x \leq_P \psi(y) \iff y \leq_Q \varphi(x)$. ♢
We can now see that the derivation operators form a Galois connection between the ordered sets $(\mathfrak{P}(G), \subseteq)$ and $(\mathfrak{P}(M), \subseteq)$. We collect this fact, among other immediate consequences, in the following proposition.

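Proposition
Let $\mathbb{K} = (G, M, I)$ be a formal context, $A_1, A_2 \subseteq G$, $B_1, B_2 \subseteq M$. Then the following conditions hold:
i. $A_1 \subseteq A_2 \implies A_2' \subseteq A_1'$ and $B_1 \subseteq B_2 \implies B_2' \subseteq B_1'$,
ii. $A_1 \subseteq A_1''$ and $B_1 \subseteq B_1''$,
iii. $A_1' = A_1'''$ and $B_1' = B_1'''$,
iv. $A_1 \subseteq B_1' \iff B_1 \subseteq A_1'$.
Another easy observation regarding derivation operators is the following: If $A \subseteq G$ and $(A_j \mid j \in J)$ is a family of subsets of $G$ such that $\bigcup_{j \in J} A_j = A$, then $A' = \bigcap_{j \in J} A_j'$. In particular, for $\mathcal{A} \subseteq \mathfrak{P}(G)$ it is true that $(\bigcup \mathcal{A})' = \bigcap_{A \in \mathcal{A}} A'$. We shall make use of these observations in our further discussions.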

Implications
If we are given a formal context $\mathbb{K} = (G, M, I)$, it may very well be that all objects that have certain attributes $A \subseteq M$ always have the attributes $B \subseteq M$ in addition. In this case, we may say that the attributes from $A$ imply the attributes from $B$ in the formal context $\mathbb{K}$.

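Definition
Let $M$ be a set. An implication $A \to B$ on $M$ is a pair $(A, B)$ where $A, B \subseteq M$. In this case, $A$ is called the premise and $B$ is called the conclusion of the implication $A \to B$. We shall denote the set of all implications on $M$ by $\mathrm{Imp}(M)$.
Let $\mathbb{K} = (G, M, I)$ be a formal context. An implication $A \to B$ of $\mathbb{K}$ is an implication on $M$. The set of all implications of $\mathbb{K}$ is denoted by $\mathrm{Imp}(\mathbb{K})$, i. e. $\mathrm{Imp}(\mathbb{K}) := \mathrm{Imp}(M)$.
The implication $A \to B$ holds in $\mathbb{K}$ (or is valid in $\mathbb{K}$) if $B \subseteq A''$. We then write $\mathbb{K} \models (A \to B)$. If $\mathcal{L}$ is a set of implications of $\mathbb{K}$ such that each implication in $\mathcal{L}$ holds in $\mathbb{K}$, then we may denote this by $\mathbb{K} \models \mathcal{L}$. The set of all implications of $\mathbb{K}$ that hold in $\mathbb{K}$ is denoted by $\mathrm{Th}(\mathbb{K})$. ♢
Note that the condition $B \subseteq A''$ is equivalent to $A' \subseteq B'$ by Proposition 2.5, i. e. an implication $A \to B$ holds in $\mathbb{K} = (G, M, I)$ if and only if every object $g \in G$ that has all attributes in $A$ also has all attributes in $B$.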
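As a small illustration, the two equivalent validity conditions (conclusion contained in the double derivation of the premise, and the comparison of the derived object sets) can be checked side by side. The context and all function names in this sketch are hypothetical.

```python
# Hypothetical toy context, not taken from the report.
G = {1, 2, 3}
M = {"a", "b", "c"}
I = {(1, "a"), (1, "b"), (2, "b"), (2, "c"), (3, "b")}


def objects_with(attributes):
    """B': all objects that have every attribute in B."""
    return {g for g in G if all((g, m) in I for m in attributes)}


def attributes_of(objects):
    """A': all attributes shared by every object in A."""
    return {m for m in M if all((g, m) in I for g in objects)}


def holds(premise, conclusion):
    """Validity via the condition: conclusion is a subset of premise''."""
    return set(conclusion) <= attributes_of(objects_with(premise))


def holds_via_objects(premise, conclusion):
    """Equivalent condition: premise' is a subset of conclusion'."""
    return objects_with(premise) <= objects_with(conclusion)
```

For instance, `holds({"a"}, {"b"})` is true in this context because the only object with attribute a also has attribute b, while `holds({"b"}, {"c"})` is false.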

Definition
Let $M$ be a set and let $\mathcal{L} \subseteq \mathrm{Imp}(M)$ be a set of implications. Then an implication $A \to B$ is entailed by $\mathcal{L}$ if for every context $\mathbb{K}$ with attribute set $M$ in which all implications from $\mathcal{L}$ hold, the implication $A \to B$ holds as well. In this case, we write $\mathcal{L} \models (A \to B)$. The set of all implications in $\mathrm{Imp}(M)$ entailed by $\mathcal{L}$ shall be denoted by $\mathrm{Cn}(\mathcal{L})$. ♢
Implications on a set $M$ give rise to a certain class of mappings on the powerset lattice $(\mathfrak{P}(M), \subseteq)$, namely closure operators on $M$. Abstractly, a closure operator $c$ on $M$ is a mapping $c \colon \mathfrak{P}(M) \to \mathfrak{P}(M)$ such that
• $A \subseteq c(A)$, i. e. $c$ is extensive,
• $A \subseteq B \implies c(A) \subseteq c(B)$, i. e. $c$ is monotone, and
• $c(c(A)) = c(A)$, i. e. $c$ is idempotent,
is true for all sets $A, B \subseteq M$. A set $A \subseteq M$ is said to be closed under $c$ if and only if $c(A) = A$. Now, implications give rise to closure operators on $M$, as described in the following definition. Additionally, it is not hard to see that every closure operator on $M$ is equal to a closure operator induced by implications.

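Definition
Let $M$ be a set and $\mathcal{L} \subseteq \mathrm{Imp}(M)$. Then define for $A \subseteq M$
$\mathcal{L}^1(A) := A \cup \bigcup \{ Y \mid (X \to Y) \in \mathcal{L}, X \subseteq A \}$,
$\mathcal{L}^{i+1}(A) := \mathcal{L}^1(\mathcal{L}^i(A))$ $(i \in \mathbb{N}_{>0})$,
$\mathcal{L}(A) := \bigcup_{i \in \mathbb{N}_{>0}} \mathcal{L}^i(A)$.
The mapping $\mathcal{L} \colon \mathfrak{P}(M) \to \mathfrak{P}(M)$ with $A \mapsto \mathcal{L}(A)$ is then called the closure operator induced by $\mathcal{L}$. A set $A \subseteq M$ is said to be closed under $\mathcal{L}$ if and only if $\mathcal{L}(A) = A$. ♢
It is easy to see that every closure operator induced by a set of implications on a set $M$ is indeed a closure operator on $M$ in the sense of the aforementioned definition.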
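For a finite set of implications, the induced closure can be computed by applying the implications repeatedly until nothing changes. The following is a minimal sketch under the assumption that the closure of a set always contains the set itself (as extensivity requires); the implications and names are hypothetical.

```python
def close(implications, attributes):
    """Closure of `attributes` under a list of (premise, conclusion) pairs.

    Each pass adds the conclusions of all implications whose premise is
    already contained in the current set; the loop stops at a fixpoint.
    """
    closed = set(attributes)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in implications:
            if premise <= closed and not conclusion <= closed:
                closed |= conclusion
                changed = True
    return closed


# Hypothetical implications on the attribute set {a, b, c, d}.
implications = [({"a"}, {"b"}), ({"b", "c"}, {"d"})]
closure_of_a = close(implications, {"a"})
closure_of_ac = close(implications, {"a", "c"})
```

A set is closed under the implications exactly when `close` returns it unchanged, which is the case for `{"c"}` in this example.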
An interesting observation now is that entailment for implications can be rephrased in terms of the induced closure operators. See [9,12] for more details on this.

Bases of Implications
Implications can be understood as logical objects for which we can decide validity in formal contexts. This automatically yields the following definition of implicational bases, which results in a way to represent all valid implications of a formal context in a compact way.
2.10 Definition Let $\mathbb{K}$ be a formal context. A set $\mathcal{L}$ of implications of $\mathbb{K}$ is an implicational base (or just a base) of $\mathbb{K}$ if the following conditions hold:

1) $\mathcal{L}$ is sound for $\mathbb{K}$, i. e. every implication in $\mathcal{L}$ holds in $\mathbb{K}$,

2) $\mathcal{L}$ is complete for $\mathbb{K}$, i. e. every implication holding in $\mathbb{K}$ follows from $\mathcal{L}$.
Moreover, a base $\mathcal{L}$ of $\mathbb{K}$ is said to be non-redundant if no proper subset of $\mathcal{L}$ is a base of $\mathbb{K}$. ♢
An obvious base is the following.

Theorem
Let $\mathbb{K} = (G, M, I)$ be a formal context. Then the set $\{ A \to A'' \mid A \subseteq M \}$ is a base of $\mathbb{K}$.
Checking completeness of a set $\mathcal{L}$ of implications may be a tedious task, as, naively, one may have to consider all valid implications of $\mathbb{K}$. However, completeness of $\mathcal{L}$ can also be verified by considering the intents of $\mathbb{K}$, as the following lemma shows.
2.12 Lemma Let $\mathbb{K} = (G, M, I)$ be a formal context and let $\mathcal{L} \subseteq \mathrm{Imp}(M)$. Then $\mathcal{L}$ is complete for $\mathbb{K}$ if and only if $\forall A \subseteq M \colon \mathcal{L}(A) = A \implies A = A''$, i. e. the closed sets of $\mathcal{L}$ are intents of $\mathbb{K}$.
It is easy to see that if we reverse the direction of the implication in the previous lemma, we obtain a characterization of $\mathcal{L}$ being sound for $\mathbb{K}$.
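For small contexts, both characterizations can be checked by brute force over all subsets of the attribute set. The sketch below assumes a hypothetical toy context and hypothetical helper names; it is exponential in the number of attributes and only meant as an illustration.

```python
from itertools import combinations

# Hypothetical toy context, not taken from the report.
G = {1, 2, 3}
M = {"a", "b", "c"}
I = {(1, "a"), (1, "b"), (2, "b"), (2, "c"), (3, "b")}


def double_prime(attributes):
    """A'' for a set A of attributes."""
    objects = {g for g in G if all((g, m) in I for m in attributes)}
    return {m for m in M if all((g, m) in I for g in objects)}


def close(implications, attributes):
    """Closure of `attributes` under (premise, conclusion) pairs."""
    closed = set(attributes)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in implications:
            if premise <= closed and not conclusion <= closed:
                closed |= conclusion
                changed = True
    return closed


def subsets(base):
    """All subsets of `base`, smallest first."""
    items = sorted(base)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)


def is_complete(implications):
    """Every set closed under the implications is an intent."""
    return all(A == double_prime(A)
               for A in subsets(M) if close(implications, A) == A)


def is_sound(implications):
    """Every intent is closed under the implications."""
    return all(close(implications, A) == A
               for A in subsets(M) if A == double_prime(A))
```

In this toy context every object has attribute b, so the single implication from the empty set to `{"b"}` is both sound and complete, whereas the empty set of implications is sound but not complete.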
The base that is described in Theorem 2.11 is not very practical, as it always contains exponentially many implications measured in the size of $M$. Luckily, we can explicitly describe a base that always has minimal cardinality among all bases of a formal context. Unfortunately, even this base may have exponentially many elements in the size of $M$ [13].
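2.13 Definition ($\mathcal{S}$-pseudo-intent) Let $\mathbb{K} = (G, M, I)$ be a finite formal context and let $\mathcal{S} \subseteq \mathrm{Imp}(M)$. A set $P \subseteq M$ is said to be an $\mathcal{S}$-pseudo-intent of $\mathbb{K}$ if and only if
i. $P \neq P''$,
ii. $\mathcal{S}(P) = P$ and
iii. for all $\mathcal{S}$-pseudo-intents $Q \subsetneq P$ it holds that $Q'' \subseteq P$.
If $\mathcal{S} = \emptyset$, then $P$ is also called a pseudo-intent of $\mathbb{K}$. ♢
Let us define for a formal context $\mathbb{K}$ and $\mathcal{S} \subseteq \mathrm{Th}(\mathbb{K})$ the canonical base of $\mathbb{K}$ with background knowledge $\mathcal{S}$ to be the set $\mathrm{Can}(\mathbb{K}, \mathcal{S}) := \{ P \to P'' \mid P \text{ is an } \mathcal{S}\text{-pseudo-intent of } \mathbb{K} \}$. We may write $\mathrm{Can}(\mathbb{K})$ if $\mathcal{S} = \emptyset$ and just call it the canonical base of $\mathbb{K}$.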
We can consider the canonical base of $\mathbb{K}$ with background knowledge $\mathcal{S}$ as a smallest set of valid implications of $\mathbb{K}$ such that $\mathrm{Can}(\mathbb{K}, \mathcal{S}) \cup \mathcal{S}$ is a base of $\mathbb{K}$. Intuitively, if we assume that we already know the implications of $\mathcal{S}$ but want to learn all valid implications of $\mathbb{K}$, then $\mathrm{Can}(\mathbb{K}, \mathcal{S})$ is a smallest set of valid implications that we need to add.
2.14 Theorem (Theorem 3.8 from [11]) Let $\mathbb{K}$ be a finite formal context and $\mathcal{S} \subseteq \mathrm{Th}(\mathbb{K})$.
Then the set $\mathrm{Can}(\mathbb{K}, \mathcal{S}) \cup \mathcal{S}$ is a base of $\mathbb{K}$ having the least number of elements among all bases of $\mathbb{K}$ containing $\mathcal{S}$.
This theorem assumes the background knowledge $\mathcal{S}$ to contain only valid implications of $\mathbb{K}$. However, this is not necessary, as the following theorem shows.
2.15 Theorem (Theorem 2.17 from [9]) Let $\mathbb{K} = (G, M, I)$ be a formal context and let $\mathcal{S} \subseteq \mathrm{Imp}(M)$. Then $\mathrm{Can}(\mathbb{K}, \mathcal{S})$ is the set of valid implications with minimal cardinality such that $\mathrm{Can}(\mathbb{K}, \mathcal{S}) \cup \mathcal{S}$ is complete for $\mathbb{K}$.
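For very small contexts, the canonical base without background knowledge can be obtained directly from the recursive definition of pseudo-intents by enumerating all attribute sets in order of increasing size. The following brute-force sketch uses a hypothetical context and hypothetical names; it is not the algorithm used in the report and does not scale beyond toy examples.

```python
from itertools import combinations

# Hypothetical context, given by the attribute set of each object.
ROWS = {1: {"a"}, 2: {"a", "b"}, 3: {"c", "d"}}
M = {"a", "b", "c", "d"}


def double_prime(attributes):
    """A'': intersection of all object rows containing A (M if there are none)."""
    closure = set(M)
    for row in ROWS.values():
        if attributes <= row:
            closure &= row
    return closure


def canonical_base():
    """Pairs (P, P'') for all pseudo-intents P, with empty background knowledge."""
    pseudo_intents = []
    items = sorted(M)
    # Smaller sets come first, so every pseudo-intent that is a proper
    # subset of P has already been found when P is examined.
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            P = frozenset(combo)
            if P == double_prime(P):
                continue  # P is an intent, hence not a pseudo-intent
            if all(double_prime(Q) <= P for Q in pseudo_intents if Q < P):
                pseudo_intents.append(P)
    return [(P, frozenset(double_prime(P))) for P in pseudo_intents]


base = canonical_base()
```

In this example the pseudo-intents are {b}, {c}, {d} and {a, c, d}, so the resulting base contains four implications.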

Canonical Bases of Sets of Implications
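We have discussed the canonical base $\mathrm{Can}(\mathbb{K})$ of a formal context $\mathbb{K}$. We can understand $\mathrm{Can}(\mathbb{K})$ as a smallest set of implications $\mathcal{L}$ such that $\mathrm{Cn}(\mathcal{L}) = \mathrm{Th}(\mathbb{K})$. Indeed, instead of only considering the set $\mathrm{Th}(\mathbb{K})$, we can consider any set $\mathcal{K}$ of implications and ask for a smallest set $\mathcal{L}$ such that $\mathrm{Cn}(\mathcal{L}) = \mathrm{Cn}(\mathcal{K})$.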
We shall give such sets ℒ a special name.

Definition
Let $M$ be a finite set and let $\mathcal{K} \subseteq \mathrm{Imp}(M)$. A set $\mathcal{L} \subseteq \mathrm{Imp}(M)$ is called a base of $\mathcal{K}$ if and only if $\mathrm{Cn}(\mathcal{L}) = \mathrm{Cn}(\mathcal{K})$. ♢
In [18], Rudolph describes a method to effectively convert the set $\mathcal{K}$ into a base $\mathrm{Can}(\mathcal{K})$ of $\mathcal{K}$ of least cardinality. We shall call this set the canonical base of $\mathcal{K}$, since this construction yields $\mathrm{Can}(\mathrm{Th}(\mathbb{K})) = \mathrm{Can}(\mathbb{K})$. It is the purpose of this section to repeat these results, as we shall make use of them later on.
We shall first introduce the notion of pseudo-closed sets of $\mathcal{K}$.

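Definition
Let $M$ be a finite set and let $\mathcal{K} \subseteq \mathrm{Imp}(M)$. A set $P \subseteq M$ is called a pseudo-closed set of $\mathcal{K}$ if and only if the following conditions hold:
i. $P \neq \mathcal{K}(P)$,
ii. for all pseudo-closed sets $Q \subsetneq P$ of $\mathcal{K}$, it is true that $\mathcal{K}(Q) \subseteq P$.
♢
Now we expect that the set $\mathrm{Can}(\mathcal{K}) := \{ P \to \mathcal{K}(P) \mid P \text{ is a pseudo-closed set of } \mathcal{K} \}$ is a base of $\mathcal{K}$ of minimal cardinality. The correctness of this intuition is guaranteed by the following result. Before we prove it, let us note that if $\mathcal{K}_1$ and $\mathcal{K}_2$ are two sets of implications on a finite set $M$ such that $\mathrm{Cn}(\mathcal{K}_1) = \mathrm{Cn}(\mathcal{K}_2)$, then $\mathrm{Can}(\mathcal{K}_1) = \mathrm{Can}(\mathcal{K}_2)$ is true. This follows immediately from the definition of pseudo-closed sets, as $\mathrm{Cn}(\mathcal{K}_1) = \mathrm{Cn}(\mathcal{K}_2)$ implies $\mathcal{K}_1(A) = \mathcal{K}_2(A)$ for all $A \subseteq M$.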

Theorem
Let $M$ be a finite set and let $\mathcal{K} \subseteq \mathrm{Imp}(M)$. Then the set $\mathrm{Can}(\mathcal{K})$ is a base of $\mathcal{K}$ of minimal cardinality.
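Proof We can find a formal context $\mathbb{K}$ with attribute set $M$ such that $A'' = \mathcal{K}(A)$ is true for each $A \subseteq M$. From this, we can immediately infer that $\mathrm{Cn}(\mathcal{K}) = \mathrm{Cn}(\mathrm{Th}(\mathbb{K})) = \mathrm{Th}(\mathbb{K})$, because for $A, B \subseteq M$ it is true that $\mathcal{K} \models (A \to B) \iff B \subseteq \mathcal{K}(A) = A'' \iff \mathbb{K} \models (A \to B)$ by Lemma 2.9.
It is now easy to see that $\mathrm{Can}(\mathcal{K}) = \mathrm{Can}(\mathrm{Th}(\mathbb{K})) = \mathrm{Can}(\mathbb{K})$. By Theorem 2.15 (with empty background knowledge) it is true that $\mathrm{Can}(\mathbb{K})$ is a base of $\mathrm{Th}(\mathbb{K})$ with minimal cardinality. As $\mathrm{Th}(\mathbb{K}) = \mathrm{Cn}(\mathcal{K})$ and $\mathrm{Can}(\mathbb{K}) = \mathrm{Can}(\mathcal{K})$, it follows that $\mathrm{Can}(\mathcal{K})$ is a base of $\mathcal{K}$ of minimal cardinality.
Obtaining the canonical base of the set $\mathcal{K}$ can be done effectively. As shown in [18], Algorithm 2.19 computes for the set $\mathcal{K}$ of implications on $M$ its canonical base $\mathrm{Can}(\mathcal{K})$. Note that the expression p 1 Y qp q just denotes the application to the set of the closure operator induced by 1 Y .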

The Description Logics $\mathcal{EL}^{\bot}$ and $\mathcal{EL}^{\bot}_{\mathrm{gfp}}$
Description logics are part of the field of knowledge representation, a branch of artificial intelligence. Their main focus lies in the representation of knowledge using well-defined semantics.
For this, description logics provide the notion of ontologies. These ontologies can be understood as a collection of axioms. More specifically, description logic ontologies consist of assertional axioms and terminological axioms. Examples of assertional axioms are "Tom is a cat" and "Jerry is a mouse", written in description logic syntax as $\mathsf{Cat}(\mathsf{Tom})$ and $\mathsf{Mouse}(\mathsf{Jerry})$.
An example of a terminological axiom would be to say that "every cat hunts a mouse", written as $\mathsf{Cat} \sqsubseteq \exists \mathsf{hunts}.\mathsf{Mouse}$. The use of the existential quantifier may be a bit surprising here, but it can be explained as follows. Consider the reformulation of "every cat hunts a mouse" to "whenever there is a cat, there exists a mouse it hunts." The above statement should be read with this reformulation in mind.
Another example would be to say that "nothing is both a cat and a mouse", written as $\mathsf{Cat} \sqcap \mathsf{Mouse} \sqsubseteq \bot$. Again, a reformulation may clarify the syntax used. The phrase "nothing is both a cat and a mouse" can be understood as "whenever there is something that is both a cat and a mouse, we have a contradiction." The bottom sign $\bot$ denotes this contradiction.
These examples are formulated in the description logic $\mathcal{EL}^{\bot}$, the logic we shall mainly use in this work. The constructors used in $\mathcal{EL}^{\bot}$ are conjunction $\sqcap$, existential restriction $\exists$ and the bottom concept $\bot$.
During the course of our considerations, however, it shall turn out that $\mathcal{EL}^{\bot}$ does not suffice for all our purposes. We shall therefore later on introduce another description logic called $\mathcal{EL}^{\bot}_{\mathrm{gfp}}$ that can be understood as an extension of $\mathcal{EL}^{\bot}$ that allows for cyclic concept descriptions. The main motivation to consider this description logic shall become clear when we introduce model-based most-specific concept descriptions, which allow us to reformulate notions from formal concept analysis in the language of description logics.

The Description Logic $\mathcal{EL}^{\bot}$
We are now going to introduce the syntax and semantics of the description logic $\mathcal{EL}^{\bot}$. For this, let us fix three disjoint sets $N_C$, $N_R$ and $N_I$. We think of these sets as the sets of concept names, role names and individual names, respectively. We may sometimes refer to the triple $(N_C, N_R, N_I)$ as the current signature.

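Definition
The set $\mathcal{C}$ of $\mathcal{EL}$-concept descriptions is defined as follows:
i. If $A \in N_C$, then $A \in \mathcal{C}$.
ii. If $C, D \in \mathcal{C}$, then $C \sqcap D \in \mathcal{C}$.
iii. If $C \in \mathcal{C}$ and $r \in N_R$, then $\exists r.C \in \mathcal{C}$.
iv. $\top \in \mathcal{C}$.
v. $\mathcal{C}$ is minimal with these properties.
An $\mathcal{EL}^{\bot}$-concept description is either $\bot$ or an $\mathcal{EL}$-concept description. We may simply talk about concept descriptions if it is clear from the context that we refer to $\mathcal{EL}^{\bot}$-concept descriptions.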
We have already seen some examples of $\mathcal{EL}^{\bot}$-concept descriptions, but let us consider one more example, this time a bit more formally. Intuitively associating a meaning with an $\mathcal{EL}^{\bot}$-concept description is not sufficient for a knowledge representation formalism. Therefore, description logics define the semantics of concept descriptions in terms of interpretations. An interpretation can be understood as a directed graph where the vertices are labeled with concept names from $N_C$ and edges are labeled with role names from $N_R$. Additionally, some of the vertices are explicitly named with elements from $N_I$ and no vertex has more than one name.

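Definition
An interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ consists of a set $\Delta^{\mathcal{I}}$ and an interpretation function $\cdot^{\mathcal{I}}$ such that $A^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}$ for all $A \in N_C$, $r^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$ for all $r \in N_R$, and $a^{\mathcal{I}} \in \Delta^{\mathcal{I}}$ for all $a \in N_I$. In addition, the unique name assumption holds: If $a, b \in N_I$, $a \neq b$, then $a^{\mathcal{I}} \neq b^{\mathcal{I}}$. ♢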

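Example
Let us choose again $N_C = \{ \mathsf{Cat}, \mathsf{Mouse}, \mathsf{Animal} \}$, $N_R = \{ \mathsf{hunts} \}$ and in addition $N_I = \{ \mathsf{Tom}, \mathsf{Jerry} \}$. An interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ would then be given by where we have specified the interpretation function $\cdot^{\mathcal{I}}$ through its graph. Figure 2 shows the interpretation $\mathcal{I}$ as a directed and labeled graph. ♢
Given an interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$, we can extend the interpretation function $\cdot^{\mathcal{I}}$ to the set of all $\mathcal{EL}^{\bot}$-concept descriptions as follows. Let $C$ be an $\mathcal{EL}^{\bot}$-concept description. Then $C^{\mathcal{I}}$ is defined inductively by $\top^{\mathcal{I}} := \Delta^{\mathcal{I}}$, $\bot^{\mathcal{I}} := \emptyset$, $(C_1 \sqcap C_2)^{\mathcal{I}} := C_1^{\mathcal{I}} \cap C_2^{\mathcal{I}}$ and $(\exists r.C_1)^{\mathcal{I}} := \{ x \in \Delta^{\mathcal{I}} \mid \exists y \in C_1^{\mathcal{I}} \colon (x, y) \in r^{\mathcal{I}} \}$.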

Definition
If $C$ is an $\mathcal{EL}^{\bot}$-concept description and $\mathcal{I}$ is an interpretation, then $C^{\mathcal{I}}$ is said to be the extension of $C$ in $\mathcal{I}$. The elements of $C^{\mathcal{I}}$ are said to satisfy the concept description $C$ and the elements of $\Delta^{\mathcal{I}} \setminus C^{\mathcal{I}}$ are said to not satisfy the concept description $C$. ♢
The notion of interpretations also allows us to speak of concept descriptions that are more specific than other concept descriptions.
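For a finite interpretation, extensions can be computed by a simple recursion over the structure of the concept description. The sketch below assumes the standard semantics of the constructors (top, bottom, conjunction, existential restriction); the interpretation, the tuple encoding of concept descriptions, and all names are hypothetical choices made for this illustration.

```python
# Hypothetical finite interpretation, not the one from the report's example.
DOMAIN = {"tom", "jerry"}
CONCEPTS = {"Cat": {"tom"}, "Mouse": {"jerry"}, "Animal": {"tom", "jerry"}}
ROLES = {"hunts": {("tom", "jerry")}}

# Concept descriptions are encoded as nested tuples:
#   ("top",), ("bottom",), ("name", A), ("and", C, D), ("exists", r, C)


def extension(concept):
    """Set of domain elements satisfying the given concept description."""
    kind = concept[0]
    if kind == "top":
        return set(DOMAIN)
    if kind == "bottom":
        return set()
    if kind == "name":
        return set(CONCEPTS[concept[1]])
    if kind == "and":
        return extension(concept[1]) & extension(concept[2])
    if kind == "exists":
        _, role, filler = concept
        targets = extension(filler)
        # x satisfies the restriction if some role successor y satisfies the filler
        return {x for (x, y) in ROLES[role] if y in targets}
    raise ValueError(f"unknown constructor: {kind}")


hunting_cat = ("and", ("name", "Cat"), ("exists", "hunts", ("top",)))
hunting_cats = extension(hunting_cat)
```

In this interpretation only the element tom is a cat that hunts something, so it is the only element satisfying the last concept description.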

Definition
Let $C$, $D$ be two $\mathcal{EL}^{\bot}$-concept descriptions. Then $C$ is said to be more specific than $D$ (or $C$ is subsumed by $D$), written as $C \sqsubseteq D$, if and only if for all interpretations $\mathcal{I}$ it is true that $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$.
Two $\mathcal{EL}^{\bot}$-concept descriptions $C$ and $D$ are equivalent, written as $C \equiv D$, if and only if $C$ is more specific than $D$ and $D$ is more specific than $C$, i. e.
$C \equiv D \iff (C \sqsubseteq D)$ and $(D \sqsubseteq C)$. ♢
We shall now introduce the notions of terminological axioms and TBoxes.

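Definition
A terminological axiom is of the form $C \sqsubseteq D$ or $A \equiv C$, where $A \in N_C$ and $C$, $D$ are $\mathcal{EL}^{\bot}$-concept descriptions. Terminological axioms of the form $C \sqsubseteq D$ are called general concept inclusions (GCIs), axioms of the form $A \equiv C$ are called concept definitions. If $C \sqsubseteq D$ is a GCI, then $C$ is called the subsumee and $D$ is called the subsumer of $C \sqsubseteq D$.
Let $\mathcal{I}$ be an interpretation. Then a general concept inclusion $C \sqsubseteq D$ holds in $\mathcal{I}$ if and only if $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$. A concept definition $A \equiv C$ holds in $\mathcal{I}$ if and only if $A^{\mathcal{I}} = C^{\mathcal{I}}$. An interpretation $\mathcal{I}$ is a model of a set $\mathcal{T}$ of terminological axioms if and only if all axioms in $\mathcal{T}$ hold in $\mathcal{I}$. ♢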
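Checking whether an axiom holds in a finite interpretation then reduces to comparing two extensions. The sketch below assumes the standard semantics of the constructors and uses a hypothetical interpretation, a hypothetical tuple encoding of concept descriptions, and hypothetical function names.

```python
# Hypothetical finite interpretation; HuntingCat is interpreted explicitly here.
DOMAIN = {"tom", "jerry"}
CONCEPTS = {
    "Cat": {"tom"},
    "Mouse": {"jerry"},
    "Animal": {"tom", "jerry"},
    "HuntingCat": {"tom"},
}
ROLES = {"hunts": {("tom", "jerry")}}


def extension(concept):
    """Extension of a concept description encoded as a nested tuple."""
    kind = concept[0]
    if kind == "top":
        return set(DOMAIN)
    if kind == "bottom":
        return set()
    if kind == "name":
        return set(CONCEPTS[concept[1]])
    if kind == "and":
        return extension(concept[1]) & extension(concept[2])
    if kind == "exists":
        targets = extension(concept[2])
        return {x for (x, y) in ROLES[concept[1]] if y in targets}
    raise ValueError(f"unknown constructor: {kind}")


def gci_holds(subsumee, subsumer):
    """A GCI holds iff the subsumee's extension is contained in the subsumer's."""
    return extension(subsumee) <= extension(subsumer)


def definition_holds(name, concept):
    """A concept definition holds iff both sides have the same extension."""
    return set(CONCEPTS[name]) == extension(concept)


CAT, MOUSE, ANIMAL = ("name", "Cat"), ("name", "Mouse"), ("name", "Animal")
```

With these names, `gci_holds(CAT, ANIMAL)` checks the inclusion of cats in animals, and `gci_holds(("and", CAT, MOUSE), ("bottom",))` checks that nothing is both a cat and a mouse in this interpretation.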

Example We can define the notion of a hunting cat by the concept definition
$\mathsf{HuntingCat} \equiv \mathsf{Cat} \sqcap \exists \mathsf{hunts}.\top$.
A general concept inclusion which expresses that every Cat is also an Animal would be $\mathsf{Cat} \sqsubseteq \mathsf{Animal}$. ♢
A word of caution is appropriate here. We have introduced the symbol $\sqsubseteq$ for denoting both subsumption and general concept inclusions. This may cause some confusion, but is an established convention in the field of description logics. It may even happen that both meanings of this sign occur together. In those situations we have to exercise some extra care to clearly distinguish both meanings of $\sqsubseteq$.
Collections of terminological axioms are called TBoxes (for terminological boxes). We shall define two types of TBoxes, namely cyclic TBoxes and general TBoxes. For this, let us fix another set $N_D$, being pairwise disjoint from $N_C$, $N_R$ and $N_I$, which we shall call the set of defined concept names.

Definition
Let $\mathcal{T}$ be a set of concept definitions and define $N_D(\mathcal{T}) := \{ A \mid \exists C \colon (A \equiv C) \in \mathcal{T} \}$.
Then $\mathcal{T}$ is called a cyclic TBox if every concept definition $(A \equiv C) \in \mathcal{T}$ is such that $A$ is a defined concept name, $C$ is an $\mathcal{EL}$-concept description with concept names from $N_C$ and $N_D(\mathcal{T})$, and each $A \in N_D(\mathcal{T})$ appears at most once on the left-hand side of a concept definition of $\mathcal{T}$.
The set $N_D(\mathcal{T})$ is then called the set of defined concept names of the cyclic TBox $\mathcal{T}$. The set $N_P(\mathcal{T})$ of concept names that appear in concept descriptions in $\mathcal{T}$ but are not defined concept names is called the set of primitive concept names. ♢

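Example
In the case of Tom and Jerry, it is often not really clear who hunts whom. We can therefore define
$\mathsf{HuntingCat} \equiv \mathsf{Cat} \sqcap \exists \mathsf{hunts}.\mathsf{HuntingMouse}$,
$\mathsf{HuntingMouse} \equiv \mathsf{Mouse} \sqcap \exists \mathsf{hunts}.\mathsf{HuntingCat}$.
The set containing these two concept definitions is a cyclic TBox. Its defined concept names are $\{ \mathsf{HuntingMouse}, \mathsf{HuntingCat} \}$, its primitive concept names are $\{ \mathsf{Cat}, \mathsf{Mouse} \}$. ♢
Note that a concept definition $A \equiv C$ holds in an interpretation $\mathcal{I}$ if and only if the general concept inclusions $A \sqsubseteq C$ and $C \sqsubseteq A$ both hold in $\mathcal{I}$. Therefore, general concept inclusions can express concept definitions. Thus, if we are given a cyclic TBox $\mathcal{T}_1$ that contains concept definitions, we can always transform it into a set $\mathcal{T}_2$ containing only general concept inclusions such that the models of $\mathcal{T}_1$ are precisely the models of $\mathcal{T}_2$. In this respect, sets containing only general concept inclusions are a generalization of cyclic TBoxes. We shall call such sets general TBoxes.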

Definition
To make our argumentation easier to read, we may simply refer to $\mathcal{T}$ as a TBox whenever $\mathcal{T}$ is a cyclic or general TBox.
We have just defined the semantics of both cyclic and general TBoxes. If $\mathcal{T}$ is such a TBox, an interpretation $\mathcal{I}$ is a model of $\mathcal{T}$ if and only if all axioms in $\mathcal{T}$ hold in $\mathcal{I}$. For this we need that the interpretation mapping $\cdot^{\mathcal{I}}$ of $\mathcal{I}$ has been extended to the set $N_D(\mathcal{T})$ of defined concept names of $\mathcal{T}$. This semantics is called descriptive semantics. As we shall see later, there are also other kinds of semantics for TBoxes. As a particular example, we shall introduce greatest fixpoint semantics when we discuss the description logic $\mathcal{EL}^{\bot}_{\mathrm{gfp}}$.

The Description Logic $\mathcal{EL}^{\bot}_{\mathrm{gfp}}$
In the work of Distel [11], various parallels between the fields of formal concept analysis and description logics are noted. In particular, in both areas certain elements can be described. Let $\mathbb{K} = (G, M, I)$ be a formal context. Then an object $g \in G$ can be described by a set $A \subseteq M$ of attributes if $g \in A'$. The same is true for an interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$. An element $x \in \Delta^{\mathcal{I}}$ is described by a concept description $C$ if $x \in C^{\mathcal{I}}$. Furthermore, in both $\mathbb{K}$ and $\mathcal{I}$ we can obtain for a description $A$ and $C$ the set of objects $A'$ and elements $C^{\mathcal{I}}$ described by it.
Figure 3: An interpretation where $\{ x \}$ has no model-based most-specific concept description in $\mathcal{EL}^{\bot}$.
However, in $\mathbb{K}$ we can associate with $g$ a most-specific description $A := \{ g \}'$. By Proposition 2.5, $A$ describes $g$. If $g \in B'$ for some $B \subseteq M$, then $\{ g \} \subseteq B'$, i. e. $\{ g \}'' \subseteq B''' = B'$. But then $A' \subseteq B'$, and hence $A$ describes the fewest objects of all sets $B \subseteq M$ that describe $g$. In other words, $A$ describes $g$ in the most specific way.
An analogous notion of a most-specific concept description with respect to an interpretation $\mathcal{I}$ has been introduced in [11] as model-based most-specific concept description.

Definition
Let $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ be an interpretation and let $X \subseteq \Delta^{\mathcal{I}}$. Then a model-based most-specific concept description for $X$ over $\mathcal{I}$ is a concept description $C$ such that
i. $X \subseteq C^{\mathcal{I}}$ and
ii. for each concept description $D$ with $X \subseteq D^{\mathcal{I}}$ it is true that $C \sqsubseteq D$. ♢
Intuitively speaking, a model-based most-specific concept description for $X \subseteq \Delta^{\mathcal{I}}$ is a most-specific concept description that describes all elements in $X$.
Model-based most-specific concept descriptions may not exist. We shall see in the next example an interpretation $\mathcal{I}$ where some sets of elements do not have model-based most-specific concept descriptions in $\mathcal{EL}^{\bot}$. To compensate for this we shall consider the description logic $\mathcal{EL}^{\bot}_{\mathrm{gfp}}$ that allows for cyclic concept descriptions. In this logic, model-based most-specific concept descriptions always exist.
The following example is a minor variation of one given in [11].

Example
Let $N_C = \emptyset$ and $N_R = \{ r \}$. We consider the interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ with $\Delta^{\mathcal{I}} = \{ x \}$ and $r^{\mathcal{I}} = \{ (x, x) \}$. The interpretation depicted as a graph is shown in Figure 3.

Now suppose that $C$ is an $\mathcal{EL}^{\bot}$-concept description that is at the same time a model-based most-specific concept description for $X = \{ x \}$ over $\mathcal{I}$. Because $N_C = \emptyset$ and $N_R = \{ r \}$, $C$ is equivalent to one of the concept descriptions $\bot, \top, \exists r.\top, \exists r.\exists r.\top, \ldots$, and $C \not\equiv \bot$ because $x \in C^{\mathcal{I}}$. Let $D := \exists r.C$. Then $D^{\mathcal{I}} = \{ x \}$ and $D \sqsubseteq C$, $D \not\equiv C$, contradicting the fact that $C$ is a model-based most-specific concept description of $X$ over $\mathcal{I}$. ♢
On the other hand, if model-based most-specific concept descriptions exist, they are necessarily unique up to equivalence. Therefore, if $X$ is a set of elements of an interpretation $\mathcal{I}$, we can denote the model-based most-specific concept description of $X$ over $\mathcal{I}$ by the special name $X^{\mathcal{I}}$. This notation has been used to stress the similarity to the derivation operators from formal concept analysis.
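In the remainder of this section, we shall introduce the description logic $\mathcal{EL}^{\bot}_{\mathrm{gfp}}$ to overcome the deficiency of $\mathcal{EL}^{\bot}$ that model-based most-specific concept descriptions may not always exist. We start this introduction by defining the syntax of $\mathcal{EL}^{\bot}_{\mathrm{gfp}}$-concept descriptions.
3.14 Definition Let $\mathcal{T}$ be a cyclic TBox. A concept definition $(A \equiv C) \in \mathcal{T}$ is said to be normalized if $C$ is of the form $C = P_1 \sqcap \dots \sqcap P_m \sqcap \exists r_1.B_1 \sqcap \dots \sqcap \exists r_n.B_n$ where $m, n \in \mathbb{N}$, $P_1, \dots, P_m \in N_P(\mathcal{T})$ and $B_1, \dots, B_n \in N_D(\mathcal{T})$. If $m = n = 0$, then $C = \top$. We call $\mathcal{T}$ normalized if and only if it contains only normalized concept definitions.
An $\mathcal{EL}_{\mathrm{gfp}}$-concept description now is of the form $C = (A, \mathcal{T})$ where $\mathcal{T}$ is a normalized TBox and $A$ is a defined concept name of $\mathcal{T}$. An $\mathcal{EL}^{\bot}_{\mathrm{gfp}}$-concept description is either $\bot$ or an $\mathcal{EL}_{\mathrm{gfp}}$-concept description. ♢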

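Example
Let us reconsider the TBox $\mathcal{T}$ from Example 3.10, i. e.
$\mathcal{T} = \{ \mathsf{HuntingCat} \equiv \mathsf{Cat} \sqcap \exists \mathsf{hunts}.\mathsf{HuntingMouse}, \ \mathsf{HuntingMouse} \equiv \mathsf{Mouse} \sqcap \exists \mathsf{hunts}.\mathsf{HuntingCat} \}$.
Then $\mathcal{T}$ is a normalized cyclic TBox and the pair $(\mathsf{HuntingMouse}, \mathcal{T})$ is a valid $\mathcal{EL}^{\bot}_{\mathrm{gfp}}$-concept description. ♢
We have already defined the notion of $\mathcal{EL}^{\bot}$-GCIs. Of course, this definition can easily be modified to yield the notion of $\mathcal{EL}^{\bot}_{\mathrm{gfp}}$-GCIs: these are just expressions of the form $C \sqsubseteq D$, where $C$ and $D$ are $\mathcal{EL}^{\bot}_{\mathrm{gfp}}$-concept descriptions.
We shall sometimes omit the logic and call an $\mathcal{EL}^{\bot}_{\mathrm{gfp}}$-concept description just a concept description, and likewise call an $\mathcal{EL}^{\bot}_{\mathrm{gfp}}$-GCI just a GCI, if the description logic used is clear from the context.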
As we have defined the syntax of $\mathcal{EL}^{\bot}_{\mathrm{gfp}}$, the natural next step is to define the semantics of $\mathcal{EL}^{\bot}_{\mathrm{gfp}}$. This, however, is not as straightforward as in the case of $\mathcal{EL}^{\bot}$, as we have to deal with circular concept descriptions. As we shall see shortly, the semantics can be defined using fixpoint semantics. This has been done in [3,15].
Let $C$ be an $\mathcal{EL}^{\bot}_{\mathrm{gfp}}$-concept description and let $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ be an interpretation. If $C = \bot$, then certainly $C^{\mathcal{I}} = \emptyset$. Hence let $C = (A, \mathcal{T})$. Then $A \in N_D(\mathcal{T})$. The idea to define the extension of $C$ in $\mathcal{I}$ is now to extend the interpretation mapping $\cdot^{\mathcal{I}}$ such that $B^{\mathcal{I}} = D^{\mathcal{I}}$ is true for all $(B \equiv D) \in \mathcal{T}$. If we are given this, we could simply define $C^{\mathcal{I}} := A^{\mathcal{I}}$. To make this approach into an actual definition, we have to resolve two issues. Firstly, it is not clear whether such an extension of $\cdot^{\mathcal{I}}$ to $N_D(\mathcal{T})$ always exists. Secondly, if such an extension exists, it may not necessarily be unique, so we have to make an explicit choice. As it turns out, we can describe the extensions of $\cdot^{\mathcal{I}}$ we are looking for as fixpoints of a particular mapping and can thus prove the existence of such extensions. Furthermore, it turns out that these fixpoints are naturally ordered, and we can just choose the largest one. See also [4,15] for more details and motivation.
We are now going to work out this approach in more detail. For this, we start by formally defining the notion of an extension of¨ℐ.

Definition
Let ℐ be an interpretation and let 𝒯 be a TBox. Then an interpretation 𝒥 is an extension of the interpretation ℐ with respect to 𝒯 if and only if Δ^ℐ = Δ^𝒥, A^𝒥 is defined for all A ∈ N_D(𝒯), and B^𝒥 = B^ℐ, r^𝒥 = r^ℐ for all concept names B ∈ N_C and all role names r ∈ N_R. We shall denote with Ext_𝒯(ℐ) the set of all extensions of ℐ with respect to 𝒯. ♢

We can define an order relation ⪯ on Ext_𝒯(ℐ) by

ℐ_1 ⪯ ℐ_2 :⟺ A^{ℐ_1} ⊆ A^{ℐ_2} for all A ∈ N_D(𝒯).

It is clear that (Ext_𝒯(ℐ), ⪯) is an ordered set.

Proposition
For each interpretation ℐ and TBox 𝒯, the ordered set (Ext_𝒯(ℐ), ⪯) is a complete lattice.
Indeed, it is easy to see that (Ext_𝒯(ℐ), ⪯) is isomorphic to the product ∏_{A ∈ N_D(𝒯)} (𝔓(Δ^ℐ), ⊆), and the latter is, as a product of complete lattices, again a complete lattice.
As already noted, we are interested only in those extensions 𝒥 of ℐ such that A^𝒥 = C^𝒥 is true for all (A ≡ C) ∈ 𝒯. In other words, we are only interested in extensions of ℐ that are models of 𝒯.
This fact can also be seen from another perspective: let us define a mapping f_𝒯 : Ext_𝒯(ℐ) → Ext_𝒯(ℐ) by A^{f_𝒯(𝒥)} := C^𝒥 for all (A ≡ C) ∈ 𝒯 and 𝒥 ∈ Ext_𝒯(ℐ). Since for each A ∈ N_D(𝒯) there is exactly one concept definition (A ≡ C) ∈ 𝒯, the function f_𝒯 is well-defined. Furthermore, it is sufficient to define f_𝒯(𝒥) only on defined concept names, as the value of f_𝒯(𝒥) is already fixed for concept and role names, since f_𝒯(𝒥) ∈ Ext_𝒯(ℐ). Moreover, this mapping is monotone, i. e. ℐ_1 ⪯ ℐ_2 ⟹ f_𝒯(ℐ_1) ⪯ f_𝒯(ℐ_2) for all ℐ_1, ℐ_2 ∈ Ext_𝒯(ℐ). This is easy to see if one recalls that, 𝒯 being normalized, each concept description C with (A ≡ C) ∈ 𝒯 is of the form C = P_1 ⊓ … ⊓ P_m ⊓ ∃r_1.B_1 ⊓ … ⊓ ∃r_n.B_n, where P_1, …, P_m ∈ N_C and B_1, …, B_n ∈ N_D(𝒯).
We can now see that the extensions of ℐ that are models of 𝒯 are exactly the fixpoints of f_𝒯. This is because 𝒥 ∈ Ext_𝒯(ℐ) is a model of 𝒯 if and only if A^𝒥 = C^𝒥 for all (A ≡ C) ∈ 𝒯.
But this means that A^{f_𝒯(𝒥)} = C^𝒥 = A^𝒥, i. e. f_𝒯(𝒥) = 𝒥. Hence to show that there exist extensions of ℐ that are models of 𝒯 it is sufficient to show that f_𝒯 has fixpoints. To do this, we use the fact that f_𝒯 is monotone and the following well-known theorem by Tarski [19].

Theorem
Let (L, ≤) be a complete lattice and let h : L → L be a monotone mapping on (L, ≤), i. e. x ≤ y ⟹ h(x) ≤ h(y) holds for all x, y ∈ L. Then the set F := { x ∈ L | h(x) = x } of fixpoints of h is such that (F, ≤) is again a complete lattice. In particular, F ≠ ∅ and there exists a least and a greatest fixpoint of h.
As a corollary, we obtain that the mapping f_𝒯 has fixpoints in Ext_𝒯(ℐ) and that there exists a greatest fixpoint of f_𝒯 in Ext_𝒯(ℐ). We call this fixpoint the greatest fixpoint model (gfp-model) of 𝒯 in ℐ. Having this, we are finally able to define the extension of the concept description C.
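For a finite interpretation, the greatest fixpoint can be reached by starting at the top element of the lattice of extensions and applying the monotone mapping until nothing changes. The following is a minimal sketch under stated assumptions: the function name `gfp_model`, the encoding of a normalized definition as a pair (primitive concept names, list of (role, defined name) pairs), and the small example interpretation are all hypothetical and not taken from the report.

```python
from typing import Dict, List, Set, Tuple

# Assumed encoding of A ≡ P1 ⊓ ... ⊓ Pm ⊓ ∃r1.B1 ⊓ ... ⊓ ∃rn.Bn:
# ({P1, ..., Pm}, [(r1, B1), ..., (rn, Bn)])
Definition = Tuple[Set[str], List[Tuple[str, str]]]


def gfp_model(
    domain: Set[int],
    concepts: Dict[str, Set[int]],
    roles: Dict[str, Set[Tuple[int, int]]],
    tbox: Dict[str, Definition],
) -> Dict[str, Set[int]]:
    """Return the extensions of all defined concept names in the gfp-model."""
    # Top element of the lattice: every defined name is interpreted as the whole domain.
    ext = {name: set(domain) for name in tbox}
    while True:
        new_ext = {}
        for name, (primitives, restrictions) in tbox.items():
            members = set(domain)
            for p in primitives:
                members &= concepts.get(p, set())
            for role, target in restrictions:
                # Elements with at least one role successor in the current extension of target.
                members &= {x for (x, y) in roles.get(role, set()) if y in ext[target]}
            new_ext[name] = members
        if new_ext == ext:  # fixpoint reached
            return ext
        ext = new_ext


# Hypothetical cyclic TBox: A1 ≡ ∃r.A2, A2 ≡ B ⊓ ∃s.A1
domain = {1, 2, 3}
concepts = {"B": {2}}
roles = {"r": {(1, 2)}, "s": {(2, 1)}}
tbox = {"A1": (set(), [("r", "A2")]), "A2": ({"B"}, [("s", "A1")])}
model = gfp_model(domain, concepts, roles, tbox)
```

Because the mapping is monotone and the domain is finite, the iterates form a decreasing chain that stabilizes at the greatest fixpoint. Starting from empty extensions instead would yield the least fixpoint, which for this cyclic example assigns the empty set to both defined names.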

Definition
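Let C be an ℰℒ⊥_gfp-concept description and let ℐ be an interpretation. Then the extension of C in ℐ is defined as C^ℐ := ∅ if C = ⊥, and as C^ℐ := A^𝒥 if C = (A, 𝒯), where 𝒥 is the gfp-model of 𝒯 in ℐ. ♢

The main result for our considerations about ℰℒ⊥_gfp is now the following theorem from [5,11].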
Now that we can guarantee the existence of model-based most-specific concept descriptions we can consider some first properties. The following result can also be found in [5].
3.21 Lemma (Lemma 4.1 of [11]) Let ℐ be a finite interpretation. Then for each ℰℒ⊥_gfp-concept description C and every X ⊆ Δ^ℐ, it holds that

X ⊆ C^ℐ ⟺ X^ℐ ⊑ C.

Proof Suppose X ⊆ C^ℐ. Then X^ℐ ⊑ C holds by the definition of model-based most-specific concept descriptions (Definition 3.12). This shows the direction from left to right.
Suppose conversely that X^ℐ ⊑ C. Then X^ℐ is a concept description that is satisfied by all elements of X, therefore X ⊆ (X^ℐ)^ℐ ⊆ C^ℐ. This shows the converse direction.

This lemma may remind one of the definition of a Galois connection; however, the relation ⊑ is not an order relation on the set of all model-based most-specific concept descriptions. This is because model-based most-specific concept descriptions are only unique up to equivalence. Yet, most of the properties of a Galois connection are still valid. More precisely, if ℐ is a finite interpretation, C, D are concept descriptions and X, Y ⊆ Δ^ℐ, then the following statements are true.
i. X ⊆ Y ⟹ X^ℐ ⊑ Y^ℐ;
ii. C ⊑ D ⟹ C^ℐ ⊆ D^ℐ;
iii. X ⊆ X^ℐℐ;
iv. C^ℐℐ ⊑ C;
v. X^ℐ ≡ X^ℐℐℐ;
vi. C^ℐ = C^ℐℐℐ.
They can be proven in the same way as for any Galois connection. We shall write C^ℐℐ instead of (C^ℐ)^ℐ.
Another property that was already claimed is that ℰℒ⊥_gfp can be considered as an extension of the description logic ℰℒ⊥. This may not be obvious at first glance, since the definition of ℰℒ⊥_gfp-concept descriptions is quite different from the one of ℰℒ⊥-concept descriptions. To see that ℰℒ⊥_gfp can nevertheless be understood as an extension of ℰℒ⊥, we shall first define conjunction and existential restriction for ℰℒ⊥_gfp-concept descriptions.
Let , be two ℰℒ K gfp -concept descriptions. If " K, then [ :" K and D . :" K. Likewise for " K. Hence we may assume that both , are not the K concept description. Then " p , q, " p , q and we can assume that the defined concept names of and are disjoint. Then let us define where is a fresh defined concept name. Furthermore, if P , then where again is a fresh defined concept name. These definitions preserve the semantics, i. e. for each interpretation ℐ " p∆ ℐ ,¨ℐq it holds We can use these definitions to see that ℰℒ K gfp can indeed be regarded as an extension of ℰℒ K . For this we assign for the ℰℒ K -concept description J the ℰℒ K gfp -concept description p , t " J uq.
Furthermore, if B is a concept name, then it is equivalent to the ℰℒ⊥_gfp-concept description (A, { A ≡ B }). Using the definitions of conjunction and existential restriction for ℰℒ⊥_gfp-concept descriptions, we can inductively assign to each ℰℒ⊥-concept description an equivalent ℰℒ⊥_gfp-concept description. As these constructors preserve the semantics, ℰℒ⊥_gfp can be seen as an extension of ℰℒ⊥.

Bases for GCIs of Interpretations
In the case of formal contexts, we were able to extract bases of implications from them. As we view GCIs as the description logic analogue of implications, we want to do the same for GCIs and finite interpretations.
In [11], the algorithm for computing the canonical base has been generalized to the description logic ℰℒ⊥_gfp. This generalized algorithm is then able to compute bases of valid GCIs of a finite interpretation ℐ. In this short subsection we want to introduce the notion of a base and some related definitions.

Definition
Let ℐ be a finite interpretation. The set of valid GCIs of ℐ that consist of ℰℒ⊥_gfp-concept descriptions is denoted by Th(ℐ). ♢

One of the main results of [11] was to find a finite set of valid GCIs of ℐ such that every valid GCI of ℐ is already entailed by this finite set. Such finite sets are then called bases of ℐ. But we can also introduce this notion in a more general setting, namely for arbitrary sets of GCIs.

Definition
Let 𝒜 be a set of GCIs. Let ℬ be a set of GCIs.
i. ℬ is said to be sound for 𝒜 if and only if 𝒜 ⊨ ℬ, i. e. every GCI in ℬ is entailed by 𝒜;
ii. ℬ is said to be complete for 𝒜 if and only if ℬ ⊨ 𝒜, i. e. every GCI in 𝒜 is entailed by ℬ;
iii. ℬ is said to be a base for 𝒜 if and only if ℬ is both sound and complete for 𝒜.
If ℬ is a base of 𝒜, then ℬ is said to be a non-redundant base of 𝒜 if and only if no proper subset of ℬ is a base of 𝒜.

One of the main results of Baader and Distel is now to give explicit descriptions of some finite bases for ℐ. We shall discuss their results in detail in Section 4.2.

Unravelling ℰℒ K gfp -concept descriptions
The base of ℐ described by Baader and Distel makes use of model-based most-specific concept descriptions, and therefore in general contains ℰℒ⊥_gfp-concept descriptions. This may be undesired, as ℰℒ⊥_gfp-concept descriptions may be very hard to understand due to their cyclic nature. To overcome this issue, Distel [11] presents a method to convert bases of finite interpretations into equivalent sets of GCIs which only contain ℰℒ⊥-concept descriptions. We shall generalize this technique to special kinds of confident bases in Section 6. For this, it is necessary to introduce the notion of unravelling ℰℒ_gfp-concept descriptions up to a certain depth. This is the purpose of this section.
Of course, the concept description ⊥ is not interesting for this problem, and we therefore restrict our attention to unravelling ℰℒ_gfp-concept descriptions C. The idea is very natural: we can view C as a graph (with cycles allowed), which we then just "unravel" into a possibly infinite tree. To unravel C up to a certain depth d ∈ ℕ then just means to take the concept description that corresponds to the unravelling of C cut at depth d.
To turn this intuition into a formal definition, we shall first define the notion of ℰℒ-description graphs of ℰℒ_gfp-concept descriptions, which goes back to [4]. We then give a formal definition, as in [11], of the unravelling of such a description graph, possibly only up to a certain depth d.

Definition
Let C = (A, 𝒯) be an ℰℒ_gfp-concept description. Then its ℰℒ-description graph G_C := (V, E, L) is defined as follows.
For each defined concept name B ∈ N_D(𝒯) with definition (B ≡ P_1 ⊓ … ⊓ P_m ⊓ ∃r_1.B_1 ⊓ … ⊓ ∃r_n.B_n) ∈ 𝒯, let names(B) := { P_1, …, P_m } and, for each r ∈ N_R, let succ_r(B) := { B_i | r_i = r }. Then define V := N_D(𝒯), L := names and E := { (A_1, r, A_2) | A_2 ∈ succ_r(A_1) }. The vertex A ∈ V is called the root of the ℰℒ-description graph of C.
We shall call V the set of vertices, E the set of edges and L the labeling function of the ℰℒ-description graph of C. ♢

It is easy to see that every description graph can be turned back into an ℰℒ_gfp-concept description and that the concept description of the ℰℒ-description graph of a concept description C is equivalent to C.
In accordance with the definition of unravelling as given in [11], we shall introduce the notion of a directed path in an ℰℒ-description graph G = (V, E, L) as a word w = A_1 r_1 A_2 r_2 … r_n A_{n+1}, where A_1, …, A_{n+1} ∈ V and for each i ∈ { 1, …, n } it is true that (A_i, r_i, A_{i+1}) ∈ E. We shall say that the path w starts at A ∈ V if and only if A = A_1, and that w ends at A ∈ V if and only if A_{n+1} = A. We shall also write A_{n+1} =: δ(w) and call it the destination of w. Finally, we shall say that the length ℓ(w) of w is n.

Definition
Let C = (A, 𝒯) be an ℰℒ_gfp-concept description and let G_C = (V, E, L) be its ℰℒ-description graph.
The unravelling of G_C is defined as the triple G_C^∞ := (V_∞, E_∞, L_∞), where
i. V_∞ is the set of all directed paths of G_C starting at A;
ii. E_∞ := { (w, r, wrB) | w, wrB ∈ V_∞ };
iii. L_∞(w) := L(δ(w)).
Let d ∈ ℕ. The unravelling up to depth d of G_C is defined as the description graph G_C^d = (V_d, E_d, L_d), where
i. V_d := { w ∈ V_∞ | ℓ(w) ≤ d };
ii. E_d := { (w, r, v) ∈ E_∞ | w, v ∈ V_d };
iii. L_d(w) := L_∞(w) for all w ∈ V_d.
We shall denote with C_d the concept description corresponding to G_C^d. Then C_d is called the unravelling up to depth d of C. ♢

It is easy to see that C is equivalent to an ℰℒ-concept description if and only if its ℰℒ-description graph does not contain cycles. Consequently, for each d ∈ ℕ, C_d is equivalent to an ℰℒ-concept description.

Example
As an example to illustrate these definitions, let us consider the concept description where is a concept name. In Figure 4 the description graph of and its unravelling are depicted.
Let us compute the concept description C_3, the unravelling of C up to depth 3. For this, we use the unravelling of the description graph of C as shown in Figure 4, and cut it at depth 3. We obtain C_3 = ∃r.(B ⊓ ∃s.∃r.B). ♢

Now, the result we need for our further considerations is the following.
3.29 Lemma (Lemma 5.5 of [11]) Let ℐ be a finite interpretation. Then there exists a d ∈ ℕ such that (C_d)^ℐ = C^ℐ is true for each ℰℒ⊥_gfp-concept description C.
Lemma 5.5 of [11] also gives a formula to compute the number d. However, we are not interested in this formula and shall not go into further detail here.
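The unravelling up to a fixed depth can be sketched as a simple recursion over the description graph. The sketch below makes several assumptions: the helper names (`unravel`, `_conjuncts`, `_render`) are hypothetical, a normalized definition is encoded as a pair (primitive concept names, list of (role, defined name) pairs), and the cyclic TBox A1 ≡ ∃r.A2, A2 ≡ B ⊓ ∃s.A1 is an assumed instance chosen only because it reproduces the depth-3 result ∃r.(B ⊓ ∃s.∃r.B) stated in the example.

```python
from typing import Dict, List, Set, Tuple

# Assumed encoding of A ≡ P1 ⊓ ... ⊓ Pm ⊓ ∃r1.B1 ⊓ ... ⊓ ∃rn.Bn
Definition = Tuple[Set[str], List[Tuple[str, str]]]


def _conjuncts(tbox: Dict[str, Definition], name: str, depth: int) -> List[str]:
    """Top-level conjuncts of the unravelling of (name, tbox) cut at the given depth."""
    primitives, restrictions = tbox[name]
    parts = sorted(primitives)
    if depth > 0:  # edges are only followed while the path length stays within the bound
        for role, target in restrictions:
            inner = _conjuncts(tbox, target, depth - 1)
            parts.append(f"∃{role}.{_render(inner, nested=True)}")
    return parts


def _render(parts: List[str], nested: bool = False) -> str:
    """Join conjuncts; an empty conjunction is ⊤, nested conjunctions get parentheses."""
    if not parts:
        return "⊤"
    text = " ⊓ ".join(parts)
    return f"({text})" if nested and len(parts) > 1 else text


def unravel(tbox: Dict[str, Definition], root: str, depth: int) -> str:
    """Return the ℰℒ-concept description C_d for C = (root, tbox) as a string."""
    return _render(_conjuncts(tbox, root, depth))


# Assumed example TBox (not taken verbatim from the report)
tbox = {"A1": (set(), [("r", "A2")]), "A2": ({"B"}, [("s", "A1")])}
c3 = unravel(tbox, "A1", 3)
```

Vertices at path length exactly d keep their labels but lose their outgoing edges, which is why the innermost conjunct in `c3` is just B.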

A Base for Confident GCIs
The goal of this section is to present a way to effectively obtain bases of confident GCIs of finite interpretations. For this, we shall briefly introduce the notion of confidence in Section 4.1 and use it to define confident GCIs of a finite interpretation ℐ as those GCIs whose confidence in ℐ is at least a certain user-defined threshold c ∈ [0, 1]. Then, to obtain a base of all those confident GCIs, we shall make use of methods of formal concept analysis. We introduce some necessary machinery in Section 4.2, which allows us to describe a close relationship between formal concept analysis and the description logic ℰℒ⊥_gfp. We then make use of this machinery in Section 4.3 to obtain bases of confident GCIs of ℐ from bases of certain implications of 𝕂_ℐ.

Confident GCIs of Finite Interpretations
The notion of confidence has been introduced in [1] as a measure of "interest" for association rules. Translated into the language of formal concept analysis, one can regard association rules simply as implications. Then the confidence of an implication X → Y is just the empirical probability that an object that has all attributes from X also has all attributes from Y. See also [20].
This idea of considering an empirical probability fits very well with our plan of considering GCIs which are "almost true." Furthermore, the notion of confidence admits a straightforward generalization to our setting.

Definition
Let 𝕂 be a finite formal context with attribute set M and let (X → Y) ∈ Imp(M). Then its confidence conf_𝕂(X → Y) is defined as

conf_𝕂(X → Y) := 1 if X′ = ∅, and conf_𝕂(X → Y) := |(X ∪ Y)′| / |X′| otherwise.

Let ℐ be a finite interpretation and let C, D be ℰℒ⊥_gfp-concept descriptions. Then the confidence conf_ℐ(C ⊑ D) is defined as

conf_ℐ(C ⊑ D) := 1 if C^ℐ = ∅, and conf_ℐ(C ⊑ D) := |(C ⊓ D)^ℐ| / |C^ℐ| otherwise.

Let c ∈ [0, 1]. We shall denote with Th_c(𝕂) the set of all implications of 𝕂 whose confidence is at least c, and with Th_c(ℐ) the set of all GCIs whose confidence is at least c, i. e.

Th_c(ℐ) := { C ⊑ D | C, D some ℰℒ⊥_gfp-concept descriptions, conf_ℐ(C ⊑ D) ≥ c }. ♢

Note that Th(ℐ) ⊆ Th_c(ℐ), and that conf_ℐ(C ⊑ D) = 1 if and only if C ⊑ D holds in ℐ. Also note that, contrary to the case of Th(ℐ), the set Th_c(ℐ) is not necessarily closed under entailment.
The idea is now to consider the set Th_c(ℐ) of GCIs instead of Th(ℐ) for our construction of terminological axioms from ℐ. To make this approach reasonable, we need a finite representation of Th_c(ℐ), i. e. a base. In this particular case, it may also be interesting to look for special bases where all GCIs have confidence at least c. This is because those GCIs may be of most interest to the ontology engineer.
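The confidence of an implication in a finite formal context can be computed directly from the definition. The following is a small sketch under stated assumptions: the context is represented as a dictionary from objects to their attribute sets, and the names `derive` and `confidence` as well as the example context are hypothetical.

```python
from fractions import Fraction
from typing import Dict, Hashable, Set


def derive(context: Dict[Hashable, Set[str]], attributes: Set[str]) -> Set[Hashable]:
    """All objects of the context that have every attribute in the given set."""
    return {g for g, attrs in context.items() if attributes <= attrs}


def confidence(context: Dict[Hashable, Set[str]], premise: Set[str], conclusion: Set[str]) -> Fraction:
    """Confidence of premise → conclusion; defined as 1 if no object satisfies the premise."""
    support = derive(context, premise)
    if not support:
        return Fraction(1)
    return Fraction(len(derive(context, premise | conclusion)), len(support))


# Hypothetical context with four objects
context = {1: {"a", "b"}, 2: {"a"}, 3: {"a", "b"}, 4: {"c"}}
conf_ab = confidence(context, {"a"}, {"b"})  # two of the three objects with "a" also have "b"
```

Exact fractions are used instead of floats so that comparisons against a threshold c such as 2/3 do not suffer from rounding.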

Definition
Let c ∈ [0, 1]. Let 𝕂 be a finite formal context with attribute set M. A set ℒ ⊆ Imp(M) is called a confident base of Th_c(𝕂) if and only if ℒ is a base of Th_c(𝕂) and ℒ ⊆ Th_c(𝕂).
Let ℐ be a finite interpretation. Then a set ℬ of GCIs is called a confident base of Th_c(ℐ) if and only if ℬ is a base of Th_c(ℐ) and ℬ ⊆ Th_c(ℐ). ♢

Note that in the case of c = 1, bases of Th(ℐ) = Th_1(ℐ) are always confident bases of Th(ℐ) as well.

Projections and Induced Contexts
The main purpose of this section is to introduce the notions of projections and induced contexts. These notions are important for our further discussion because they form the basis of the connection between the description logic ℰℒ⊥_gfp and formal concept analysis.
Projections have been introduced in [5,6,11]. The main idea behind their definition is the following: given a finite interpretation ℐ, we are mainly interested in its model-based most-specific concept descriptions. To make methods from formal concept analysis applicable, we shall construct a special formal context 𝕂_ℐ, whose set of attributes M_ℐ will be a set of certain concept descriptions. Then, given another concept description C, we would like to "approximate" C in terms of the attributes of 𝕂_ℐ. By approximation we mean that we want to find a set U ⊆ M_ℐ such that C ⊑ ⨅U is true "as good as possible." Of course, such a set can readily be defined by

U := { D ∈ M_ℐ | C ⊑ D }.

This is exactly the definition of projections.

Definition
Let M be a set of concept descriptions and let C be another concept description. Then the projection pr_M(C) of C onto M is defined as pr_M(C) := { D ∈ M | C ⊑ D }. ♢

As projections allow us to approximate C in terms of M, conjunction U ↦ ⨅U allows us to go the way back, i. e. from sets of concept descriptions to concept descriptions. For brevity, let us define for U ⊆ M

⨅U := ⨅_{D ∈ U} D.

We also want to lift this definition to sets of implications. Let us define for a set ℒ of implications the set of GCIs ⨅ℒ := { ⨅X ⊑ ⨅Y | (X → Y) ∈ ℒ }.
It now turns out that the mappings C ↦ pr_M(C) and U ↦ ⨅U satisfy the main condition of a Galois connection. But note again that since ⊑ does not constitute an order relation on the set of all ℰℒ⊥_gfp-concept descriptions, the aforementioned mappings cannot actually be a Galois connection.

Lemma
Let M be a set of concept descriptions. Then for each U ⊆ M and for each concept description C it is true that

C ⊑ ⨅U ⟺ U ⊆ pr_M(C).

Proof Let us first show the direction from left to right. From C ⊑ ⨅U we can conclude pr_M(⨅U) ⊆ pr_M(C), since every concept description D ∈ M satisfying ⨅U ⊑ D also satisfies C ⊑ D. Furthermore, for each D ∈ U we have ⨅U ⊑ D, therefore U ⊆ pr_M(⨅U) and hence U ⊆ pr_M(⨅U) ⊆ pr_M(C) as desired.
For the other direction let us suppose that U ⊆ pr_M(C). Then ⨅U ⊒ ⨅pr_M(C). Now C ⊑ ⨅pr_M(C) is true as well, since C ⊑ D holds for every D ∈ pr_M(C) by definition. Therefore, C ⊑ ⨅pr_M(C) ⊑ ⨅U as desired.

We have introduced pr_M(C) as an approximation of the concept description C in terms of M. Occasionally, it may happen that this approximation is as good as possible, i. e. that the approximation ⨅pr_M(C) indeed describes C completely. We shall capture this situation in the following definition.

Definition
Let M be a set of concept descriptions and let C be another concept description. We say that C is expressible in terms of M if and only if there exists a subset N ⊆ M such that C ≡ ⨅N. ♢

Unsurprisingly, expressibility in terms of M can be characterized easily using projections, as the following result shows.

Proposition
Let M be a set of ℰℒ⊥_gfp-concept descriptions and let C be an ℰℒ⊥_gfp-concept description. Then C is expressible in terms of M if and only if C ≡ ⨅pr_M(C).
Proof If C ≡ ⨅pr_M(C), then clearly C is expressible in terms of M. Conversely, let N ⊆ M such that C ≡ ⨅N. Then C ⊑ D for each D ∈ N and hence N ⊆ pr_M(C), which implies C ≡ ⨅N ⊒ ⨅pr_M(C). On the other hand, C ⊑ ⨅pr_M(C) by Lemma 4.4 and hence C ≡ ⨅pr_M(C) follows as required.

One of the crucial observations of [11] is that we can explicitly describe a set M_ℐ that is able to express all model-based most-specific concept descriptions of ℐ:

M_ℐ := { ⊥ } ∪ N_C ∪ { ∃r.X^ℐ | r ∈ N_R, X ⊆ Δ^ℐ, X ≠ ∅ }.

4.7 Theorem (Lemma 5.9 from [11]) Let ℐ be a finite interpretation and let C be a concept description. Then C^ℐℐ is expressible in terms of M_ℐ.
The definitions and results given so far allow us to formulate one of the main results of [11], which is an explicit description of a finite base of ℐ.
4.8 Theorem (Theorem 5.10 of [11]) Let ℐ be a finite interpretation. Then the set

ℬ_ℐ := { ⨅U ⊑ (⨅U)^ℐℐ | U ⊆ M_ℐ }

is a finite base for ℐ.
We now turn our attention to the notion of induced contexts, as they are defined in [11]. Using a special induced context 𝕂_ℐ for the interpretation ℐ, we shall be able to derive a close relationship between the model-based most-specific concept descriptions of ℐ and the intents of 𝕂_ℐ.

Definition
Let ℐ be a finite interpretation and let M be a set of concept descriptions. Define the formal context 𝕂_{ℐ,M} := (Δ^ℐ, M, ∇), where x ∇ C ⟺ x ∈ C^ℐ for all x ∈ Δ^ℐ and C ∈ M. The formal context 𝕂_{ℐ,M} is the induced formal context of ℐ and M.
If M = M_ℐ, we write 𝕂_ℐ instead of 𝕂_{ℐ,M_ℐ} and call it the induced formal context of ℐ. ♢

Induced contexts play a crucial role in combining formal concept analysis and ℰℒ⊥_gfp. In particular, they allow us to reduce the size of the base described in Theorem 4.8 as much as possible.
Then the set ℬ Can defined by is a base of ℐ of minimal cardinality.
Projections and induced formal context allow to express very close relationships between operations in ℰℒ K gfp and in K ℐ, for suitable choices of . In particular, we can express the extensions of concept descriptions in ℐ, which are expressible in terms of , as pr p q 1 in K ℐ, for some set Ď . In addition, if we can express for Ď ∆ ℐ its model-based most-specific concept descriptions ℐ in terms of , we are also able to represent these concept description ℐ as 1 .
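To make the construction of an induced context concrete, the following sketch builds the incidence relation "x has attribute C if and only if x is in the extension of C" and implements the two derivation operators. It is only an illustration under assumptions: the extensions of the concept descriptions are taken as given (computing them is not shown), attribute names are plain strings, and all function names and the example data are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Set


def induced_context(domain: Set[Hashable], extensions: Dict[str, Set[Hashable]]) -> Dict[Hashable, Set[str]]:
    """Map each element x to the set of concept descriptions C in M with x in the extension of C."""
    return {x: {c for c, ext in extensions.items() if x in ext} for x in domain}


def attribute_derivation(context: Dict[Hashable, Set[str]], attributes: Iterable[str]) -> Set[Hashable]:
    """U′: all objects that have every attribute in U."""
    wanted = set(attributes)
    return {x for x, attrs in context.items() if wanted <= attrs}


def object_derivation(context: Dict[Hashable, Set[str]], all_attributes: Iterable[str], objects: Iterable[Hashable]) -> Set[str]:
    """X′: all attributes shared by every object in X (all of M if X is empty)."""
    common = set(all_attributes)
    for x in objects:
        common &= context[x]
    return common


# Hypothetical data: M = {⊥, A, ∃r.A} with assumed, precomputed extensions
domain = {1, 2, 3}
extensions = {"⊥": set(), "A": {1, 2}, "∃r.A": {3}}
context = induced_context(domain, extensions)

extent_of_A = attribute_derivation(context, {"A"})
intent_of_12 = object_derivation(context, extensions.keys(), {1, 2})
```

In this toy instance, deriving the attribute set {A} returns exactly the assumed extension of A, which is the kind of correspondence between derivations in the induced context and extensions in the interpretation that the surrounding text describes.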
We formulate these relationships in the following two propositions. They already appear in [11].

4.11 Proposition Let ℐ be a finite interpretation and let M be a set of concept descriptions. Then each concept description C that is expressible in terms of M satisfies

C^ℐ = pr_M(C)′,

where the derivations are computed within the induced context of ℐ and M. Furthermore, every set X ⊆ Δ^ℐ satisfies X′ = pr_M(X^ℐ).
Proof Since C is expressible in terms of M, C ≡ ⨅pr_M(C) by Proposition 4.6. Therefore

x ∈ C^ℐ ⟺ x ∈ (⨅pr_M(C))^ℐ ⟺ ∀D ∈ pr_M(C) : x ∈ D^ℐ ⟺ x ∈ pr_M(C)′,

as pr_M(C)′ = { x ∈ Δ^ℐ | ∀D ∈ pr_M(C) : x ∈ D^ℐ }.
For the second claim we observe that, for each D ∈ M,

D ∈ X′ ⟺ X ⊆ D^ℐ ⟺ X^ℐ ⊑ D ⟺ D ∈ pr_M(X^ℐ),

where X ⊆ D^ℐ ⟺ X^ℐ ⊑ D holds due to Lemma 3.21.

4.12 Proposition (Lemma 4.10 and 4.11 from [11]) Let ℐ be a finite interpretation and let M be a set of concept descriptions. Then each U ⊆ M satisfies

U′ = (⨅U)^ℐ,

where the derivations are computed in 𝕂_{ℐ,M}.
Let X ⊆ Δ^ℐ. If X^ℐ is expressible in terms of M, then

X^ℐ ≡ ⨅X′.

Proof Remember that an object x ∈ Δ^ℐ has an attribute C ∈ M if and only if x ∈ C^ℐ. Hence

U′ = { x ∈ Δ^ℐ | ∀C ∈ U : x ∈ C^ℐ } = (⨅U)^ℐ.

Let X ⊆ Δ^ℐ such that X^ℐ is expressible in terms of M. By Proposition 4.6, X^ℐ ≡ ⨅pr_M(X^ℐ). By Proposition 4.11, pr_M(X^ℐ) = X′ and hence the claim follows.

In Theorem 4.14 we shall precisely formulate a connection between the model-based most-specific concept descriptions of ℐ and the intents of 𝕂_ℐ. To prove this connection, we shall make use of the following proposition.

Proposition
Let ℐ be a finite interpretation and let U ⊆ M_ℐ. Then

U ⊆ pr_{M_ℐ}(⨅U) ⊆ U″,

where the derivation is computed in 𝕂_ℐ.
Proof By Lemma 4.4, U ⊆ pr_{M_ℐ}(⨅U) holds. Now let D ∈ pr_{M_ℐ}(⨅U), i. e. ⨅U ⊑ D. Then U′ = (⨅U)^ℐ ⊆ D^ℐ = { D }′ by Proposition 4.12, and hence D ∈ U″ as required.

Having defined the formal context 𝕂_ℐ, we are now going to show that this formal context indeed allows us to view model-based most-specific concept descriptions as intents of a formal context. We have already seen that all model-based most-specific concept descriptions are expressible in terms of M_ℐ, the set of attributes of 𝕂_ℐ. It is therefore not surprising that the lattice of intents of 𝕂_ℐ and the equivalence classes of model-based most-specific concept descriptions ordered by ⊒ are order-isomorphic.
Before we prove the following theorem, we have to deal with a technical detail. This is because model-based most-specific concept descriptions are only unique up to equivalence. In particular, ⊑ is in general not an order relation on the set of all model-based most-specific concept descriptions. To overcome this we use the standard approach of considering classes of equivalent concept descriptions instead.
Let M be a set of concept descriptions. Then let us define M/≡ := { [C] | C ∈ M } where [C] := { D ∈ M | C ≡ D }.
Furthermore, for C, D ∈ M we set [C] ⊑ [D] ⟺ C ⊑ D.
Note that this is well-defined because if Ĉ ∈ [C], D̂ ∈ [D], then Ĉ ≡ C, D̂ ≡ D and hence C ⊑ D ⟺ Ĉ ⊑ D̂. With this definition it is easy to see that (M/≡, ⊑) is an ordered set.

Theorem
Let ℐ be a finite interpretation and let ℳ be the set of all model-based most-specific concept descriptions of ℐ. Then the mapping φ describes an order-isomorphism between the ordered sets (Int(𝕂_ℐ), ⊆) and (ℳ/≡, ⊒) via φ(U) := [⨅U] and φ⁻¹([C]) = pr_{M_ℐ}(C). More precisely, the following statements hold:
i. ⨅U ∈ ℳ for each U ∈ Int(𝕂_ℐ).
ii. pr_{M_ℐ}(C) ∈ Int(𝕂_ℐ) for each C ∈ ℳ.
iii. U ⊆ V ⟹ ⨅U ⊒ ⨅V for all U, V ∈ Int(𝕂_ℐ).
iv. C ⊑ D ⟹ pr_{M_ℐ}(C) ⊇ pr_{M_ℐ}(D) for all C, D ∈ ℳ.
v. pr_{M_ℐ}(⨅U) = U for each U ∈ Int(𝕂_ℐ).
vi. ⨅pr_{M_ℐ}(C) ≡ C for each C ∈ ℳ.
Additionally, U″ = pr_{M_ℐ}((⨅U)^ℐℐ) and C^ℐℐ ≡ ⨅(pr_{M_ℐ}(C))″ for each set U ⊆ M_ℐ and each concept description C expressible in terms of M_ℐ, where the derivations are computed in 𝕂_ℐ.
Proof We show each claim step by step.
For ii, let C ∈ ℳ, i. e. C ≡ C^ℐℐ. By Theorem 4.7, C is expressible in terms of M_ℐ and hence by Proposition 4.11 pr_{M_ℐ}(C) = pr_{M_ℐ}(C^ℐℐ) = (C^ℐ)′ = (pr_{M_ℐ}(C))″, thus pr_{M_ℐ}(C) ∈ Int(𝕂_ℐ). Claims iii and iv are already contained in Lemma 4.4.
For v we need to show that pr_{M_ℐ}(⨅U) = U for U ∈ Int(𝕂_ℐ). By Proposition 4.13, U ⊆ pr_{M_ℐ}(⨅U) ⊆ U″, and since U = U″, equality follows. Claim vi follows from Proposition 4.6, as C ∈ ℳ is expressible in terms of M_ℐ by Theorem 4.7.
Finally, for U ⊆ M_ℐ we have pr_{M_ℐ}((⨅U)^ℐℐ) = pr_{M_ℐ}((U′)^ℐ) = U″ by Proposition 4.12 and Proposition 4.11, and ⨅(pr_{M_ℐ}(C))″ ≡ (pr_{M_ℐ}(C)′)^ℐ ≡ C^ℐℐ for every ℰℒ⊥_gfp-concept description C expressible in terms of M_ℐ, again by Proposition 4.11 and Proposition 4.12.

An immediate consequence of this theorem is the following.

Corollary
Let ℐ be a finite interpretation. Then for each U ⊆ M_ℐ it is true that

(⨅U)^ℐℐ ≡ ⨅U″,

where the derivations are done in 𝕂_ℐ.
Proof It is true that ⨅U is expressible in terms of M_ℐ, therefore Theorem 4.14 yields (⨅U)^ℐℐ ≡ ⨅(pr_{M_ℐ}(⨅U))″. Now by Proposition 4.11 we have pr_{M_ℐ}(⨅U)′ = (⨅U)^ℐ. Therefore ⨅(pr_{M_ℐ}(⨅U))″ = ⨅((⨅U)^ℐ)′ = ⨅U″, since (⨅U)^ℐ = U′ by Proposition 4.12.

Computing Confident Bases of Finite Interpretations
Building upon the results of the previous sections, we are now able to describe a first confident base of Th_c(ℐ) for arbitrary choices of c ∈ [0, 1]. For this, we shall make use of results of [9], which itself uses ideas from Luxenburger [14]. As [9] already gives a thorough introduction and motivation of Luxenburger's results, we shall not repeat it here. Instead, we shall extend the results obtained in [9] by the result of Theorem 4.20.
Roughly speaking, the ideas by Luxenburger applied to our setting of confident GCIs can be formulated as follows. We consider the partition Th_c(ℐ) = Th(ℐ) ∪ (Th_c(ℐ) \ Th(ℐ)) and try to separately find a base for Th(ℐ) and a confident base for Th_c(ℐ) \ Th(ℐ). Of course, a base ℬ of Th(ℐ) has already been given by Distel [11], so it remains to find a confident base of Th_c(ℐ) \ Th(ℐ).
To achieve this we use the following observation from Luxenburger, translated to the language of description logics: if (C ⊑ D) ∈ Th_c(ℐ) \ Th(ℐ), it is true that

ℬ ∪ { C^ℐℐ ⊑ (C ⊓ D)^ℐℐ } ⊨ (C ⊑ D),

because ℬ ⊨ (C ⊑ C^ℐℐ) and ∅ ⊨ ((C ⊓ D)^ℐℐ ⊑ D) (note that C ⊑ C^ℐℐ always holds in ℐ). Therefore, it suffices to consider only GCIs of the form C^ℐℐ ⊑ (C ⊓ D)^ℐℐ. We collect these in the set

Conf(ℐ, c) := { X^ℐ ⊑ Y^ℐ | Y ⊆ X ⊆ Δ^ℐ, 1 > conf_ℐ(X^ℐ ⊑ Y^ℐ) ≥ c }.

Then we can formulate the following result.

Theorem
Let ℐ be a finite interpretation, let c ∈ [0, 1] and let ℬ be a base of ℐ. Then ℬ ∪ Conf(ℐ, c) is a confident base of Th_c(ℐ).
Proof Clearly ℬ ∪ Conf(ℐ, c) ⊆ Th_c(ℐ) and it only remains to be shown that ℬ ∪ Conf(ℐ, c) entails all GCIs with confidence at least c in ℐ.
Let C ⊑ D be a GCI with conf_ℐ(C ⊑ D) ≥ c. We have to show that ℬ ∪ Conf(ℐ, c) ⊨ C ⊑ D. If C ⊑ D is already valid in ℐ, then ℬ ⊨ C ⊑ D and nothing remains to be shown. We therefore assume that conf_ℐ(C ⊑ D) ≠ 1.
As C ⊑ C^ℐℐ is valid in ℐ, ℬ ⊨ C ⊑ C^ℐℐ. Furthermore, conf_ℐ(C ⊑ D) = conf_ℐ(C^ℐℐ ⊑ (C ⊓ D)^ℐℐ) and hence (C^ℐℐ ⊑ (C ⊓ D)^ℐℐ) ∈ Conf(ℐ, c). Finally, ∅ ⊨ (C ⊓ D)^ℐℐ ⊑ D. We therefore obtain ℬ ∪ Conf(ℐ, c) ⊨ C ⊑ C^ℐℐ, C^ℐℐ ⊑ (C ⊓ D)^ℐℐ, (C ⊓ D)^ℐℐ ⊑ D and hence ℬ ∪ Conf(ℐ, c) ⊨ C ⊑ D as required.

It is not hard to see that the prerequisites of the previous theorem can be weakened in the following way: instead of considering the whole set Conf(ℐ, c), it is sufficient to choose a base 𝒞 ⊆ Conf(ℐ, c) of Conf(ℐ, c), since then ℬ ∪ 𝒞 ⊨ ℬ ∪ Conf(ℐ, c). Furthermore, it is not necessary for ℬ to be a base of ℐ. Instead, one can choose a set ℬ̂ of valid GCIs such that ℬ̂ ∪ 𝒞 is complete for ℐ, because then ℬ̂ ∪ 𝒞 ⊨ ℬ ∪ 𝒞.

Corollary
Let ℐ be a finite interpretation, c ∈ [0, 1]. Let 𝒞 ⊆ Conf(ℐ, c) be a base of Conf(ℐ, c) and let ℬ ⊆ Th(ℐ) such that ℬ ∪ 𝒞 is complete for ℐ. Then ℬ ∪ 𝒞 is a confident base of Th_c(ℐ).
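The idea of restricting attention to implications between closed sets can be illustrated on the level of a formal context, where the role of model-based most-specific concept descriptions is played by intents. The following sketch enumerates all pairs of intents X ⊊ Y whose confidence lies in [c, 1). It is a brute-force illustration under assumptions, not the construction used in the report: all function names are hypothetical, intents are found by closing every attribute subset (only feasible for tiny contexts), and the example context is invented.

```python
from fractions import Fraction
from itertools import combinations
from typing import Dict, FrozenSet, Hashable, Iterable, List, Set, Tuple


def extent(context: Dict[Hashable, Set[str]], attributes: Iterable[str]) -> FrozenSet[Hashable]:
    """All objects that have every given attribute."""
    wanted = frozenset(attributes)
    return frozenset(g for g, attrs in context.items() if wanted <= attrs)


def intent(context: Dict[Hashable, Set[str]], all_attributes: Iterable[str], objects: Iterable[Hashable]) -> FrozenSet[str]:
    """All attributes shared by the given objects (every attribute if there are none)."""
    common = set(all_attributes)
    for g in objects:
        common &= context[g]
    return frozenset(common)


def all_intents(context: Dict[Hashable, Set[str]], all_attributes: Iterable[str]) -> Set[FrozenSet[str]]:
    """Brute force: close every subset of attributes."""
    ordered = sorted(all_attributes)
    found = set()
    for size in range(len(ordered) + 1):
        for subset in combinations(ordered, size):
            found.add(intent(context, ordered, extent(context, subset)))
    return found


def confident_pairs(context, all_attributes, c: Fraction) -> List[Tuple[FrozenSet[str], FrozenSet[str], Fraction]]:
    """Pairs of intents X ⊊ Y with c <= conf(X → Y) < 1."""
    intents = all_intents(context, all_attributes)
    result = []
    for x in intents:
        for y in intents:
            support = extent(context, x)
            if x < y and support:
                # Since x is a subset of y, the objects satisfying both are the extent of y.
                conf = Fraction(len(extent(context, y)), len(support))
                if c <= conf < 1:
                    result.append((x, y, conf))
    return result


context = {1: {"a", "b"}, 2: {"a"}, 3: {"a", "b"}, 4: {"c"}}
pairs = confident_pairs(context, {"a", "b", "c"}, Fraction(2, 3))
```

Pairs with an empty extent on the left-hand side are skipped because their confidence is 1 by definition, so they belong to the valid implications rather than to the strictly confident ones.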
Now, this result allows us to describe confident bases of Th_c(ℐ) in a very simple way: if ℒ is a confident base of Th_c(𝕂_ℐ), then the set ⨅ℒ is a confident base of Th_c(ℐ). This is the content of Theorem 4.20, which we shall prepare with the following lemma.

Lemma
Let M be a set of concept descriptions and let ℒ ⊆ Imp(M) and (X → Y) ∈ Imp(M). Then ℒ ⊨ (X → Y) implies ⨅ℒ ⊨ (⨅X ⊑ ⨅Y).
Proof Let 𝒥 = (Δ^𝒥, ·^𝒥) be an interpretation such that 𝒥 ⊨ ⨅ℒ. Recall that we denote with 𝕂_{𝒥,M} the formal context induced by 𝒥 and M. We shall show that 𝕂_{𝒥,M} ⊨ ℒ. This then implies 𝕂_{𝒥,M} ⊨ (X → Y) and from this we shall infer that 𝒥 ⊨ (⨅X ⊑ ⨅Y), as required.
Let (E → F) ∈ ℒ. Then (⨅E)^𝒥 ⊆ (⨅F)^𝒥 since 𝒥 ⊨ ⨅ℒ. By Proposition 4.12, (⨅E)^𝒥 = E′ and (⨅F)^𝒥 = F′, where the derivations are done in 𝕂_{𝒥,M}. Therefore, E′ ⊆ F′ is true in 𝕂_{𝒥,M} and hence 𝕂_{𝒥,M} ⊨ ℒ. Since ℒ ⊨ (X → Y), we obtain X′ ⊆ Y′. Again by Proposition 4.12 we obtain (⨅X)^𝒥 ⊆ (⨅Y)^𝒥 and therefore 𝒥 ⊨ (⨅X ⊑ ⨅Y).

Note that the converse direction is not true in general, i. e. ⨅ℒ ⊨ (⨅X ⊑ ⨅Y) does in general not imply ℒ ⊨ (X → Y). This fact is illustrated by the following example.

Example
Let N_C := { A, B }, N_R := { r } and M = { A, B, ∃r.A, ∃r.B }. Define X := { ∃r.A }, Y := { ∃r.B } and ℒ := { { A } → { B } }. Then clearly ℒ ⊭ (X → Y), but ⨅ℒ = { A ⊑ B }, (⨅X ⊑ ⨅Y) = (∃r.A ⊑ ∃r.B) and therefore ⨅ℒ ⊨ (⨅X ⊑ ⨅Y). ♢

We now prove the main result of this section.

Theorem
Let ℐ be a finite interpretation and let c ∈ [0, 1]. Let ℒ be a confident base of Th_c(𝕂_ℐ). Then the set ℬ := ⨅ℒ is a confident base of Th_c(ℐ).
Proof We have to show that ℬ is sound and complete for Th_c(ℐ), i. e. ℬ ⊆ Th_c(ℐ) and ℬ ⊨ (C ⊑ D) for each (C ⊑ D) ∈ Th_c(ℐ).
To see that ℬ is sound let (X → Y) ∈ ℒ. We have to show that conf_ℐ(⨅X ⊑ ⨅Y) ≥ c. To do this, we shall verify that conf_ℐ(⨅X ⊑ ⨅Y) = conf_{𝕂_ℐ}(X → Y). Let (⨅X)^ℐ ≠ ∅. Then by Proposition 4.12, X′ ≠ ∅ and hence

conf_ℐ(⨅X ⊑ ⨅Y) = |(⨅X ⊓ ⨅Y)^ℐ| / |(⨅X)^ℐ| = |(X ∪ Y)′| / |X′| = conf_{𝕂_ℐ}(X → Y).

If (⨅X)^ℐ = ∅, then X′ = ∅ and therefore conf_ℐ(⨅X ⊑ ⨅Y) = 1 = conf_{𝕂_ℐ}(X → Y). This shows that ℬ is sound for Th_c(ℐ).
We shall now show that ℬ is complete for Th_c(ℐ). For this we shall show that
i. ℬ ⊨ (⨅U ⊑ (⨅U)^ℐℐ) for each U ⊆ M_ℐ;
ii. ℬ ⊨ Conf(ℐ, c).
Recall from Theorem 4.8 that ℬ_ℐ = { ⨅U ⊑ (⨅U)^ℐℐ | U ⊆ M_ℐ } is a finite base of ℐ. By Theorem 4.16, ℬ_ℐ ∪ Conf(ℐ, c) is a confident base of Th_c(ℐ). If we show the two claims from above, we have shown ℬ ⊨ (ℬ_ℐ ∪ Conf(ℐ, c)), which then implies that ℬ is complete for Th_c(ℐ) as well.
For the first case let U ⊆ M_ℐ. Since ℒ is a confident base for Th_c(𝕂_ℐ), it is complete for 𝕂_ℐ and therefore ℒ ⊨ (U → U″). By Lemma 4.18 this yields ℬ ⊨ (⨅U ⊑ ⨅U″), and ⨅U″ ≡ (⨅U)^ℐℐ by Corollary 4.15.
For the second case let (X^ℐ ⊑ Y^ℐ) ∈ Conf(ℐ, c), i. e. X, Y ⊆ Δ^ℐ and conf_ℐ(X^ℐ ⊑ Y^ℐ) ∈ [c, 1). By Proposition 4.12, X^ℐ ≡ ⨅X′ and Y^ℐ ≡ ⨅Y′, since X^ℐ, Y^ℐ are expressible in terms of M_ℐ by Theorem 4.7. Therefore, conf_ℐ(X^ℐ ⊑ Y^ℐ) = conf_ℐ(⨅X′ ⊑ ⨅Y′). As before we see that conf_ℐ(⨅X′ ⊑ ⨅Y′) = conf_{𝕂_ℐ}(X′ → Y′). Since ℒ is a confident base for Th_c(𝕂_ℐ) it is true that ℒ ⊨ X′ → Y′, hence ℬ ⊨ (⨅X′ ⊑ ⨅Y′) by Lemma 4.18 and therefore ℬ ⊨ (X^ℐ ⊑ Y^ℐ).

Since Th_c(𝕂_ℐ) is a confident base of itself, we immediately obtain that ⨅Th_c(𝕂_ℐ) is a confident base of Th_c(ℐ).
In addition to entailing all confident GCIs, a base of Th_c(𝕂_ℐ) also comprises the knowledge about when two concept descriptions C, D ∈ M_ℐ subsume each other. If C ⊑ D, then the corresponding implication { C } → { D } always holds in 𝕂_ℐ and is therefore entailed by any base ℒ of Th_c(𝕂_ℐ). However, this knowledge is not needed in the base ⨅ℒ of Th_c(ℐ), and therefore may cause some redundancies in the base ⨅ℒ. The following result shows that at least these redundancies can be avoided. For this, let us denote the set of these implications by 𝒮_ℐ := { { C } → { D } | C, D ∈ M_ℐ, C ⊑ D }.

Corollary
Let ℐ be a finite interpretation, c ∈ [0, 1] and let ℒ ⊆ Imp(𝕂_ℐ) be such that ℒ ∪ 𝒮_ℐ is a confident base of Th_c(𝕂_ℐ). Then ⨅ℒ is a confident base of Th_c(ℐ).
Proof By Theorem 4.20, ⨅ℒ ∪ ⨅𝒮_ℐ = ⨅(ℒ ∪ 𝒮_ℐ) is a confident base of Th_c(ℐ). Since every GCI in ⨅𝒮_ℐ is valid in every interpretation, the set ⨅ℒ is already a base of Th_c(ℐ).

More generally, if a base ℒ of Th_c(𝕂_ℐ) is redundant, then there exists an implication (X → Y) ∈ ℒ such that ℒ \ { X → Y } ⊨ X → Y.
By Lemma 4.18 this yields

⨅(ℒ \ { X → Y }) ⊨ (⨅X ⊑ ⨅Y),

i. e. redundancies in a set of implications yield redundancies in the corresponding set of GCIs. Therefore, removing these redundancies is a good starting point for reducing the size of the resulting set of GCIs.
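Redundant implications can be detected with the usual closure test: an implication is entailed by a set of implications exactly when its conclusion is contained in the closure of its premise under that set. The sketch below removes redundant implications one at a time; the function names and the example are hypothetical, and the result depends on the order in which implications are inspected.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Implication = Tuple[FrozenSet[str], FrozenSet[str]]


def closure(implications: Iterable[Implication], attributes: Iterable[str]) -> Set[str]:
    """Smallest superset of the attributes that respects all implications."""
    implications = list(implications)
    closed = set(attributes)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in implications:
            if premise <= closed and not conclusion <= closed:
                closed |= conclusion
                changed = True
    return closed


def entails(implications: Iterable[Implication], premise: FrozenSet[str], conclusion: FrozenSet[str]) -> bool:
    """True if the implication premise → conclusion follows from the given implications."""
    return conclusion <= closure(implications, premise)


def remove_redundant(implications: Iterable[Implication]) -> List[Implication]:
    """Drop every implication that is entailed by the remaining ones."""
    kept = list(implications)
    i = 0
    while i < len(kept):
        premise, conclusion = kept[i]
        rest = kept[:i] + kept[i + 1:]
        if entails(rest, premise, conclusion):
            kept = rest  # redundant: drop it and re-check the element now at position i
        else:
            i += 1
    return kept


a, b, c = frozenset({"a"}), frozenset({"b"}), frozenset({"c"})
reduced = remove_redundant([(a, b), (b, c), (a, c)])
```

Here a → c is dropped because it already follows from a → b and b → c. Note that this only removes redundancy on the level of implications; as the surrounding text explains, the corresponding set of GCIs may still contain redundancies.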
One way to do this is to use the results from Section 2.4 in the following way: if ℒ is a (confident) base of Th_c(𝕂_ℐ), then we can reduce the size of the base by considering Can(ℒ) instead. As Cn(ℒ) = Cn(Can(ℒ)), we know that Can(ℒ) is a base of Th_c(𝕂_ℐ). In this case, too, the set ⨅Can(ℒ) yields a base of Th_c(ℐ).

Corollary
Let ℐ be a finite interpretation, c ∈ [0, 1] and let 𝒩 ⊆ Imp(𝕂_ℐ) be a base of Th_c(𝕂_ℐ). Then ⨅𝒩 is a base of Th_c(ℐ).
Proof By the observation after Theorem 4.20 we know that ⨅Th_c(𝕂_ℐ) is a confident base of Th_c(ℐ). As 𝒩 is a base of Th_c(𝕂_ℐ), from Lemma 4.18 we infer that ⨅𝒩 is also a base of ⨅Th_c(𝕂_ℐ). Hence, ⨅𝒩 entails all GCIs from Th_c(ℐ). Therefore, ⨅𝒩 is complete for Th_c(ℐ).
Conversely, as 𝒩 is a base of Th_c(𝕂_ℐ), Th_c(𝕂_ℐ) is also a base of 𝒩. But then ⨅𝒩 is entailed by ⨅Th_c(𝕂_ℐ) ⊆ Th_c(ℐ). Therefore, ⨅𝒩 is sound for Th_c(ℐ) and hence a base of Th_c(ℐ).

However, the approach of considering Can(ℒ) instead of ℒ has the drawback that we can no longer guarantee that Can(ℒ) ⊆ Th_c(𝕂_ℐ), i. e. that ⨅Can(ℒ) is a confident base of Th_c(ℐ).
Another observation is that in general we cannot transfer non-redundancy of a base ℒ ∪ 𝒮_ℐ of Th_c(𝕂_ℐ) to non-redundancy of ⨅ℒ. This is illustrated by the following example.

Example
We want to find an interpretation ℐ, a number c ∈ [0, 1] and a non-redundant set ℒ of implications of 𝕂_ℐ such that ℒ ∪ 𝒮_ℐ is a confident base of Th_c(𝕂_ℐ) but ⨅ℒ contains redundancies.
The main idea to obtain such an example is to construct the interpretation ℐ in such a way that for two concept names A, B ∈ N_C, both implications { A } → { B } and { ∃r.A } → { ∃r.(A ⊓ B) } have confidence at least c in 𝕂_ℐ. Of course, this immediately implies that both A ⊑ B and ∃r.A ⊑ ∃r.(A ⊓ B) have confidence at least c in ℐ. Then, if we can include the two implications in the set ℒ we are looking for, this will immediately result in ⨅ℒ being redundant. If, in addition, we can make sure that ℒ is a non-redundant set of implications such that ℒ ∪ 𝒮_ℐ is a confident base of Th_c(𝕂_ℐ), then we have obtained our desired example.

Let
" t A, B u, " t r u and consider the interpretation ℐ 1 given in Figure 5, where every edge denotes an r-edge. In this interpretation, it is true that So, let us choose " 1 2 . Then we want to find a non-redundant set ℒ Ď ImppK ℐ1 q such that ℒ Y ℐ1 is a confident base of Th pK ℐ1 q and pt A u Ñ t B uq, pt Dr.A u Ñ t Dr.pA [ Bq uq P ℒ.   We can now compute an irredundant and complete subset ℒ 1 of the confident base CanpK ℐ1 q Y Confpℐ 1 , q, which, after removing redundancies from ℒ 1 with respect to ℐ1 , contains the following implications: However, and therefore ℒ is not irredundant. ♢

Experiments with Confident GCIs
The main motivation to consider confident GCIs is the idea that they may provide helpful information for finding errors in the data. But of course this is only a heuristic idea, and it is not clear a priori whether this approach really is useful. Indeed, it is very hard to give theoretical insight into the usefulness of this approach, as even formalizing the notion of interpretations with errors in accordance with practical observations is far from obvious.
Therefore, in this section we want to show the usefulness of considering confident GCIs by means of a real-world example. The data set we use stems from the DBpedia data set [8] as of March 2010, and is given as an interpretation ℐ_DBpedia that represents the child-relation in this data set. A detailed construction of this interpretation has been given in [9], and we shall not repeat it here. Instead, we can think of ℐ_DBpedia as an interpretation containing all elements that appear in the child-relation of the DBpedia data set as of March 2010. For these elements, we can collect properties such as Artist or Criminal, which then serve as concept names. As role name we just use child. Collecting this information in the interpretation ℐ_DBpedia, we obtain 5626 elements and 60 concept names.
We have to mention a peculiarity of this interpretation here, so as not to cause confusion when we present our experimental results. One would expect that the child-relation only relates persons to persons, i. e. only persons can be children of persons. However, DBpedia suffers from the liberal structure of Wikipedia infoboxes, from which it draws its information. These infoboxes are not standardized in any way, and extracting information from them is a difficult task. If in such an infobox a link to another article appears under the rubric listing children, then this link is collected as a child. However, sometimes links under this rubric point to articles that are merely related to these children. For example, in ℐ_DBpedia, the element Ellen_Harper has as a child the element The_Carol_Burnett_Show, which is an American comedy show of the late 1960s and 1970s. This is because in the infobox of the Wikipedia article on Ellen Harper, one of her children is related to the Carol Burnett Show, with a link to the corresponding article.
Despite this oddity in the child-relation of DBpedia, the data set itself contains a lot of valuable information. Even better, one could argue that because of this peculiar child-relation, the interpretation ℐ_DBpedia is very well suited for our experiments, because it allows us to verify to what extent confident GCIs are able to detect some of these errors.
In the following, we assume that an ontology engineer wants to use ℐ_DBpedia to construct an ontology that represents the properties of the child-relation. This ontology engineer wants to consider confident GCIs to overcome some of the errors present in this interpretation. To do this, she has to extract some confident GCIs from ℐ_DBpedia and check them for usefulness. She has to do this manually, and therefore this can be an expensive task. To show how much extra work this can be for our particular example of ℐ_DBpedia, we propose the following three experiments:
i. We want to explicitly consider the sets Conf(ℐ_DBpedia, 0.95) and Conf(ℐ_DBpedia, 0.90), to see whether the GCIs thus obtained are of any use for our ontology engineer working on ℐ_DBpedia.
ii. We want to consider for all c ∈ {0, 0.01, 0.02, ..., 0.99} the size of the set Conf(K_{ℐ_DBpedia}, c), to see how many GCIs have to be considered when varying the threshold on the minimal confidence. During this, we may also want to consider the canonical base of Conf(K_{ℐ_DBpedia}, c), since this may give rise to a much smaller base of Th_c(ℐ).
iii. Finally, we want to consider for all c ∈ {0, 0.01, 0.02, ..., 0.99} the size of the canonical base of Th_c(K_{ℐ_DBpedia}). The rationale behind this is the following: if we consider confident GCIs, we assume that we can circumvent certain errors and can extract more general patterns which are falsified by erroneous counterexamples. We therefore expect that, if we treat confident GCIs as actually valid GCIs, the resulting theory extracted from a finite interpretation gets more succinct. With this experiment we want to examine to what extent this is true for ℐ_DBpedia for varying values of c.
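The threshold sweep of experiment ii can be sketched on a toy formal context. The following is only an illustration under stated assumptions: the context data are invented, the confidence of an implication X → Y is taken to be |ext(X ∪ Y)| / |ext(X)| (and 1 if ext(X) is empty), and only implications with a single attribute as conclusion are enumerated, which is a simplification of the set Conf(K, c) used in the text. The names `CONTEXT`, `extent`, `confidence` and `confident_implications` are hypothetical.

```python
from itertools import combinations

# Toy formal context (invented data): object -> set of attributes.
CONTEXT = {
    "g1": {"Person", "Artist"},
    "g2": {"Person", "Artist"},
    "g3": {"Person"},
    "g4": {"Artist"},  # plays the role of an erroneous entry
}


def extent(attrs, context=CONTEXT):
    """Return all objects that have every attribute in attrs."""
    return {g for g, has in context.items() if attrs <= has}


def confidence(premise, conclusion, context=CONTEXT):
    """Assumed definition: |ext(X ∪ Y)| / |ext(X)|, and 1 if ext(X) is empty."""
    support = extent(premise, context)
    if not support:
        return 1.0
    return len(extent(premise | conclusion, context)) / len(support)


def confident_implications(c, context=CONTEXT):
    """Enumerate implications X -> {m} that are not valid but have confidence >= c."""
    attributes = sorted(set().union(*context.values()))
    found = []
    for size in range(len(attributes) + 1):
        for premise in combinations(attributes, size):
            for m in attributes:
                if m in premise:
                    continue
                conf = confidence(set(premise), {m}, context)
                if c <= conf < 1:
                    found.append((frozenset(premise), m, conf))
    return found


if __name__ == "__main__":
    # Sweep the threshold as in experiment ii, in steps of 0.1 for brevity.
    for step in range(10):
        c = step / 10
        print(c, len(confident_implications(c)))
```

On real data the enumeration over all premises is exponential in the number of attributes, so this brute-force loop only serves to make the definition concrete.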
5.1 Confident GCIs of ℐ_DBpedia for c = 0.95 and c = 0.90
In this section we want to show how our ontology engineer would examine confident GCIs extracted for two particular choices of c. For this, we shall examine the sets Conf(ℐ_DBpedia, 0.95) and Conf(ℐ_DBpedia, 0.90) and discuss whether the GCIs contained in these sets are "reasonable." We decide whether a GCI is reasonable by considering its counterexamples, for each of which we can decide whether it is a valid counterexample or not. It is quite surprising that the set turns out to have only three elements. Let us now consider every GCI in more detail.
The set contains the GCI ∃child.⊤ ⊑ Person, which indeed looks very natural. However, ℐ_DBpedia contains four counterexamples, namely Teresa_Carpio, Charles_Heung, Adam_Cheng and Lydia_Shum. All these elements name individuals who are artists from Hong Kong, and who therefore certainly are persons. In other words, these counterexamples are erroneous and the corresponding GCI is valid.
The GCI Place ⊑ PopulatedPlace is convincing as well (places named in DBpedia appear because people have been born or have lived there), and the only counterexample to this GCI is Greenwich_Village, denoting a district of New York which certainly is populated.
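The manual check described above starts from the list of counterexamples of a GCI. A minimal sketch of how such a list and the corresponding confidence could be computed for ∃child.⊤ ⊑ Person is given below. The element names and the data are invented and do not come from ℐ_DBpedia; the confidence is assumed to be the fraction of elements in the extension of the left-hand side that also lie in the extension of the right-hand side.

```python
# Invented toy interpretation: extensions of concept names and the child role.
CONCEPTS = {
    "Person": {"anna", "ben", "carl"},
}
CHILD = {("anna", "ben"), ("carl", "anna"), ("dora", "show")}


def exists_child_top(edges):
    """Extension of the concept description ∃child.⊤: everything with a child."""
    return {parent for parent, _ in edges}


def counterexamples(premise_ext, conclusion_ext):
    """Elements satisfying the left-hand side but not the right-hand side."""
    return premise_ext - conclusion_ext


def gci_confidence(premise_ext, conclusion_ext):
    """Assumed definition: share of premise elements that satisfy the conclusion."""
    if not premise_ext:
        return 1.0
    return len(premise_ext & conclusion_ext) / len(premise_ext)


if __name__ == "__main__":
    lhs = exists_child_top(CHILD)
    rhs = CONCEPTS["Person"]
    # Each printed element would have to be inspected by the ontology engineer.
    print(sorted(counterexamples(lhs, rhs)))
    print(gci_confidence(lhs, rhs))
```

In this toy data the only counterexample is an element that has a child but is not recorded as a person, which mirrors the kind of missing type information discussed in the text.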
The last GCI which remains to be considered is

One way to find a complete set of Th(ℐ_DBpedia) such that the mentioned GCIs are valid as well is just to compute the canonical base of K_{ℐ_DBpedia} with the corresponding background knowledge, i. e. we compute

Since Can(K_{ℐ_DBpedia}, ℐ_DBpedia) is complete for Th(ℐ_DBpedia), the set (ℒ ∪ ℱ) is complete for Th(ℐ_DBpedia) as well. Therefore (ℒ ∪ ℱ) is a base of Th(ℐ_DBpedia) ∪ {∃child.⊤ ⊑ Person, Place ⊑ PopulatedPlace}.
If we now compute the set ℒ, we obtain a set of 1245 implications; therefore (ℒ ∪ ℱ) is a base of (5.1) of size 1247. Compared to the 1252 implications needed to axiomatize Th(ℐ), we can indeed observe a decrease in the size of the base, although it is not very large.
Note, however, that another consequence of including the set ℱ in a base is that the concept descriptions in the resulting GCIs become smaller and more readable.

These GCIs are all quite specific, and it is doubtful whether they are of any use for an ontology designer who tries to extract GCIs from ℐ_DBpedia. But let us still have a look at the counterexamples for the given GCIs.
We shall start with the first GCI listed above, i. e.

This GCI seems to be rather complicated, and one may assume a much more general GCI to be true, namely ∃child.⊤ ⊑ ∃child.Person, which is the ℰℒ-approximation of the fact that all children should be persons. However, as already discussed, this more general GCI is not true in ℐ_DBpedia (and has confidence only around 0.53). The GCI under consideration states that if you have generations of instances of Person of at least 5 generations, then the element at the fifth generation can be chosen to be a Person. The only counterexample to this GCI is Mayer_Amschel_Rothschild, naming the founder of the Rothschild dynasty. The only two fifth-generation descendants not being instances of Person in ℐ_DBpedia are Edouard_Etienne_de_Rothschild and David_René_de_Rothschild, which are certainly persons. Therefore, this counterexample is invalid and this GCI is valid.
Let us now consider the remaining GCIs. In the order of appearance above, the following list gives all the counterexamples in ℐ_DBpedia of the corresponding GCIs:

The last counterexample has already been discussed in the previous case, so we shall focus our discussion on the first four only.
i. The individual John_McManners denotes a British clergyman and historian who had a son, Hugh_McManners, a musician and writer, who himself has a son. However, John_McManners, though being a famous writer, was not an artist. Therefore, this GCI is not correct.
ii. The individual Alois_Hitler names the father of Adolf Hitler, who was the only one of the children of Alois Hitler to rule a country. As he had no children of his own, the individual serves as a correct counterexample to the given GCI, which is therefore incorrect.
iii. The individual Dejan_Dragaš denotes a 14th-century Serbian nobleman and despot of Kumanovo. He had two sons, Constantine_Dragaš, who had children and was ruler of parts of Serbia, but not a monarch, and Jovan_Dragaš, who was despot of Kumanovo, but had no children. Again, this counterexample is correct and the GCI invalid.
iv. The individual Marion_Dewar is not a correct counterexample, as Marion Dewar was a member of the Canadian House of Commons from 1987 to 1988.
The other individual, Ranasinghe_Premadasa, denotes a former Prime Minister and later President of Sri Lanka. It is, however, quite hard to tell whether this means that he has ever been a member of the Parliament of Sri Lanka. Hence, from the point of view of DBpedia extracting available knowledge from the Wikipedia pages, this counterexample can be assumed correct, although further investigation by a human expert may be necessary.

Discussion
By considering Conf(ℐ_DBpedia, 0.95) and Conf(ℐ_DBpedia, 0.90) we have illustrated how an ontology engineer can make use of confident GCIs. As a first observation, we have seen that this may involve non-trivial research for the ontology engineer. In particular, deciding whether a counterexample present in the data is correct always involves the question whether the counterexample is relevant for the particular domain the resulting ontology is to represent. It may therefore happen that an otherwise correct counterexample is rejected since it does not appear in the domain of discourse. With respect to this observation, one could also say that confident GCIs may help to model domains from data that does not fully describe these domains, but is merely an approximation of them.

Sizes of Bases of Th_c(ℐ_DBpedia)
In this section we shall conduct the remaining two experiments presented in the introduction, i. e. we shall examine the behavior of the size of Conf(ℐ_DBpedia, c), Can(Conf(ℐ_DBpedia, c)) and of Can(Th_c(K_{ℐ_DBpedia})) for varying values of c. From the first experiment, we shall obtain an impression of how many extra GCIs an ontology engineer has to consider. From the second experiment, we shall obtain an intuition of how many GCIs a resulting TBox will contain. For each such value of c, we compute |Conf(K_{ℐ_DBpedia}, c)|. The result is shown in Figure 7, where the y-axis is scaled logarithmically.

The Size of Confpℐ
The results given in this picture show that the number of confident GCIs the ontology engineer has to check manually declines exponentially as the minimal confidence grows. Even for c = 0.86,

Of course, it is not clear whether this behavior is typical or just particular to our data set. However, it indicates that considering confident GCIs for data of sufficiently good quality (i. e. where only few errors have been made) is not a noteworthy overhead.
A drawback of this experiment is that we ignore the fact that Conf(K_{ℐ_DBpedia}, c) does not need to be irredundant. Of course, if our ontology engineer confirms a certain subset ℒ ⊆ Conf(K_{ℐ_DBpedia}, c), then all implications already entailed by ℒ do not need to be checked on their own. The same is true if we consider the GCIs entailed by an already confirmed subset of Conf(ℐ_DBpedia, c).
Therefore, we show in Figure 8 the size behavior of the canonical base of Conf(K_{ℐ_DBpedia}, c) for all considered values of c. Note that this canonical base is irredundant, hence the aforementioned issue does not arise anymore. And indeed, as we can see from the picture, the number of GCIs decreases significantly, especially for small values of c. But we can also observe that for larger values of c, say c ≥ 0.8, the overall number of GCIs to be considered does not decrease that much. Indeed, Conf(ℐ_DBpedia, 0.8) contains 40 elements, whereas Can(Conf(ℐ_DBpedia, 0.8)) contains 32. One could argue that this does not really help, but one also has to consider that checking every GCI by hand may be such an expensive task that every GCI saved pays off.
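The observation that implications entailed by already confirmed ones need not be checked again can be illustrated with a small sketch. It uses the standard closure of an attribute set under a set of implications and removes implications greedily. Note that this is not the canonical base construction used in the experiment; it merely yields some irredundant subset, and the representation of implications as pairs of frozensets as well as all function names are assumptions made for this illustration.

```python
def closure(attrs, implications):
    """Smallest superset of attrs that respects every implication."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in implications:
            if premise <= closed and not conclusion <= closed:
                closed |= conclusion
                changed = True
    return closed


def entails(implications, premise, conclusion):
    """An implication follows iff its conclusion lies in the closure of its premise."""
    return conclusion <= closure(premise, implications)


def remove_redundant(implications):
    """Greedily drop every implication entailed by the remaining ones."""
    kept = list(implications)
    for imp in list(implications):
        rest = [other for other in kept if other is not imp]
        if entails(rest, imp[0], imp[1]):
            kept = rest
    return kept


if __name__ == "__main__":
    a_to_b = (frozenset({"a"}), frozenset({"b"}))
    b_to_c = (frozenset({"b"}), frozenset({"c"}))
    a_to_c = (frozenset({"a"}), frozenset({"c"}))  # follows from the other two
    for premise, conclusion in remove_redundant([a_to_b, b_to_c, a_to_c]):
        print(sorted(premise), "->", sorted(conclusion))
```

The result of the greedy removal depends on the order in which the implications are visited, which is one reason why the text works with the canonical base instead.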

The Size of Can(Th_c(K_{ℐ_DBpedia}))
We now turn our attention to the last experiment. There, we consider the size of the sets Can(Th_c(K_{ℐ_DBpedia})) for all considered values of c. The results of the experiment are shown in Figure 9.
From this data plot, we can make a couple of observations. For high values of c, i. e. c ∈ [0.8, 1], the overall size of Can(Th_c(K_{ℐ_DBpedia})) does not decrease significantly. We can see this as a sign that the overall theory Th_c(K_{ℐ_DBpedia}) does not change significantly, i. e. that the errors we can handle with the aforementioned values of c do not significantly influence Th_c(K_{ℐ_DBpedia}).
This is actually what we want to achieve with our method, namely correcting small errors while preserving as much of the original theory as possible. Of course, this data plot is only an indication that our method achieves this in the particular example of ℐ_DBpedia. But as ℐ_DBpedia arises from real-world data, we can be certain that this behavior is not accidental.

Figure 9: Size of Can(Th_c(K_{ℐ_DBpedia})) for all considered values of c

Of course, the more we decrease c, the more we depart from the original set Th(K_{ℐ_DBpedia}) of implications. There are three points where this becomes especially apparent, namely around c = 0.66, c = 0.54 and c = 0.18, where the changes in the size of Can(Th_c(K_{ℐ_DBpedia})) are more significant than for other values of c. Incidentally, these are three special values for c, as
i. for c ≤ 0.18 the implication ∅ → ℐ_DBpedia is entailed by Th_c(K_{ℐ_DBpedia}), resulting in a singleton canonical base;
ii. for c ≤ 0.54, the implication {∃child.⊤} → {∃child.Person} is contained in Th_c(K_{ℐ_DBpedia}), eliminating a large number of special cases;
iii. for c ≤ 0.66, the implication ∅ → {Person} is contained in Th_c(K_{ℐ_DBpedia}), also eliminating a variety of special cases.
Indeed, adding the implication ∅ → {Person} to Th(K_{ℐ_DBpedia}) causes the size of the canonical base to drop from 1252 to 1210. If we additionally add the implication {∃child.⊤} → {∃child.Person}, the resulting canonical base contains only 1163 implications.
To understand this phenomenon, we can take the following point of view: intuitively, the lower the threshold c, the simpler the set Th_c(K_{ℐ_DBpedia}) gets, because we neglect more special cases in our data set. If the change in size of the respective canonical base is as significant as observed for the three values given above, we would assume that either a lot of new implications have been accepted, or that new implications have been accepted that render a lot of other implications redundant. Indeed, as we have seen above, the second case occurs (consider also Figure 7 to see that the size of Conf(ℐ_DBpedia, c) does not change significantly at those points). We can therefore consider those significant changes in the size of the canonical base of Th_c(K_{ℐ_DBpedia}) as a sign of the discovery of some general implications, which may be of interest for our ontology engineer.
6 Unravelling Confident ℰℒ⊥_gfp-Bases to Confident ℰℒ⊥-Bases
So far, we have only considered bases of Th_c(ℐ) formulated in the description logic ℰℒ⊥_gfp. Although this description logic allows us to find such bases in an easy way, the resulting bases may not be suitable for practical purposes. This is mainly due to the inherent incomprehensibility of ℰℒ⊥_gfp-concept descriptions: the presence of cycles in these concept descriptions makes it hard, even for logicians, to find out what such a concept description is supposed to mean. On the other hand, ℰℒ⊥-concept descriptions are normally easy to understand and their intention can also be deduced by non-experts. Therefore, we want to discuss in this section a way to obtain confident ℰℒ⊥-bases of Th_c(ℐ) from confident ℰℒ⊥_gfp-bases of Th_c(ℐ). The technique we are going to use for this is based on unravelling ℰℒ⊥_gfp-concept descriptions as introduced in Section 3.4. The argumentation presented here is a generalization of the argumentation of Distel [11], who showed a similar result on obtaining ℰℒ⊥-bases of ℐ from ℰℒ⊥_gfp-bases of ℐ.
Let c ∈ [0, 1] and let us assume that is a base of Th_c(ℐ). We can partition = ℬ ∪ , where ℬ ⊆ Th(ℐ) and ∩ Th(ℐ) = ∅. Without loss of generality, we can also assume that ℬ only contains GCIs of the form ⊑ ℐℐ.
As a first step, we are going to define an auxiliary set ℐ, of ℰℒ K -GCIs that "capture" entailment relations between ℰℒ K gfp -concept descriptions. For this recall that by Lemma 3.29 there exists a P N such that ℐ " p q ℐ (6.1) holds for each ℰℒ K gfp -concept description , where denotes the unravelling of up to depth .
6.1 Definition Let ℐ be a finite interpretation and let P N be as in Lemma 3.29. Then define ℐ, :" t p ℐ q Ď p ℐ q`1 | Ď ∆ ℐ , ‰ H u. ♢ Note that ℐ, is a set of valid GCIs of ℐ, since for each Ď ∆ ℐ , ‰ H, it is true that pp ℐ q q ℐ " ℐℐ " pp ℐ q`1q ℐ .
In the course of Distel's argumentation [11,Theorem 21], the following property of ℐ, had been shown.

Lemma
Let ℐ be a finite interpretation and let P N as in Lemma 3.29. Then for each Ď ∆ ℐ , P N, ě it is true that ℐ, |ù pp ℐ q Ď p ℐ q`1q and ℐ, |ù pp ℐ q Ď ℐ q.
Before we prove this lemma we need to formulate a preliminary result from [11]. As its proof involves notions we have not introduced in this work, we shall not repeat it here.
6.3 Lemma (Lemma 5.19 of [11]) Let , be ℰℒ gfp -concept descriptions. Then for each P N it is true that i. pD . q " D .´1 for each P and ii. p [ q " [ .
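The two identities of Lemma 6.3 say that unravelling distributes over conjunction and passes through an existential restriction with the depth reduced by one. They suggest a simple recursive procedure, sketched below. The encoding of a cyclic concept description as a dictionary of definitions, the names `DEFS` and `unravel`, and the convention that at depth 0 only the concept names of the root are kept (all existential restrictions are dropped) are assumptions made for this illustration and are not taken from Section 3.4.

```python
# Hypothetical encoding of a cyclic (gfp-style) concept description:
# defined name -> (set of concept names, list of (role, defined name)).
DEFS = {
    # X is defined as Person ⊓ ∃child.X, i.e. a cycle of length one.
    "X": ({"Person"}, [("child", "X")]),
}


def unravel(name, depth, defs=DEFS):
    """Unravel the description rooted at `name` into a plain tree-shaped one."""
    names, successors = defs[name]
    parts = sorted(names)
    if depth > 0:
        # (∃r.C) unravelled to depth d is ∃r.(C unravelled to depth d - 1).
        for role, target in successors:
            parts.append(f"∃{role}.({unravel(target, depth - 1, defs)})")
    # Conjunction is unravelled componentwise; an empty conjunction is ⊤.
    return " ⊓ ".join(parts) if parts else "⊤"


if __name__ == "__main__":
    for d in range(3):
        print(d, unravel("X", d))
```

Each call with a fixed depth terminates even though the definition is cyclic, because the depth decreases with every existential restriction that is followed.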
We now prove Lemma 6.2.
Proof (Lemma 6.2) For the first claim observe that if " H, ℐ " K and nothing remains to be shown. We therefore assume ‰ H and shall show the claim using induction on . The case " is clear as pp ℐ q Ď p ℐ q`1q P ℐ, . For the step-case assume that we already know that ℐ, |ù pp ℐ q´1 Ď p ℐ q q is true for all Ď ∆ ℐ , ‰ H.
From Theorem 4.7 we know that ℐ is expressible in terms of ℐ . As ‰ H, it is true that ℐ ‰ K and therefore ℐ " [ p , qPΠ D . ℐ for some Ď and Π ĎˆPp∆ ℐ q. By Lemma 6.3 we therefore obtain using Lemma 6.3. If ℐ " H, then p q ℐ " ℐ " H and conf ℐ p Ď q " conf ℐ p Ď q is true as well. Overall, we obtain pp ℐℐ q Ď p ℐℐ q q P Th pℐq and therefore 0 Ď Th pℐq as required.
We now consider the second claim, i. e. that ℬ 0 Y 0 Y ℐ, |ù ℬ. Let p Ď ℐℐ q P ℬ. Then it is true that H |ù p Ď q ℬ 0 |ù p Ď p ℐℐ q q ℐ, |ù pp ℐℐ q Ď ℐℐ q again using Lemma 6.2 for the last entailment. Therefore, and the claim is proven.

Conclusions and Further Work
This work extended the results obtained in [9] in various ways. Firstly, we have given another construction of a base of Th_c(ℐ), which works by directly transforming bases of Th_c(K_ℐ) into confident bases of Th_c(ℐ). Secondly, we have given experimental evidence that our approach of considering confident GCIs may be helpful during the process of constructing an ontology from example data. Finally, we have shown that certain ℰℒ⊥_gfp-bases of Th_c(ℐ) can effectively be transformed into ℰℒ⊥-bases of Th_c(ℐ) by generalizing the corresponding technique of [11].
From the viewpoint of both theory and practical application of confident GCIs, the most important next step is to generalize the exploration algorithm from [11] to our setting of confident GCIs. This may simplify the exploration process in that certain special GCIs may not have to be considered as soon as a more general GCI, which may have some erroneous counterexamples, has already been confirmed. As the main purpose of exploration is to complete an ontology with missing statements, generalizing the exploration process to confident GCIs may also unify two steps of ontology construction, namely construction from data and completion of the ontology.
Certainly, another direction of research would be to clarify and formalize the vague argumentation we have given in Section 5.2.2. For this, it may also be interesting to conduct experiments directly with GCIs and not only with implications. This, however, would require a way to compute a smallest base of a given set of GCIs, i. e. a method to minimize the cardinality of a given TBox.