Preferential Query Answering in the Semantic Web with Possibilistic Networks (Extended Abstract)

In this paper, we explore how ontological knowledge expressed via existential rules can be combined with possibilistic networks (i) to represent qualitative preferences along with domain knowledge, and (ii) to realize preference-based answering of conjunctive queries (CQs). We call these combinations ontological possibilistic networks (OP-nets). We define skyline and k-rank answers to CQs under preferences and provide complexity (including data tractability) results for deciding consistency and CQ skyline membership for OP-nets. We show that our formalism has a lower complexity than a similar existing formalism.


Introduction
The abundance of information on the Web requires new personalized filtering techniques to retrieve resources that best fit users' interests and preferences. Moreover, the Web is evolving at an increasing pace towards the so-called Social Semantic Web (or Web 3.0), where classical linked information lives together with ontological knowledge and social interactions of users. While the former may allow for more precise and rich results in search and query answering tasks, the latter can be used to enrich the user profile, and it paves the way to more sophisticated personalized access to information. This requires new techniques for ranking search results, fully exploiting ontological and user-centered data, i.e., user preferences.
Conditional preferences are statements of the form "in the context of $c$, $a$ is preferred over $b$", denoted $c : a \succ b$ [1,7,13]. Two preference formalisms that can represent such preferences are possibilistic networks and CP-nets.
Example 1. Bob wants to rent a car and (i) he prefers a new car over an old one, (ii) given he has a new car, he prefers it to be black over not black, and (iii) if he has an old car, he prefers it to be colorful over being black. We have two variables for car type (new ($n$) or old ($o$)) and car color (black ($b$) or colorful ($c$)). In CP-nets [7], we have the following ordering of outcomes: $nb \succ nc \succ oc \succ ob$. That is, a new and colorful car is preferred over an old and colorful one, which is not a realistic representation of the given preferences. A more desirable order of outcomes for Bob would be $nb \succ oc \succ nc \succ ob$, which can be induced in possibilistic networks with an appropriate preference weighting in the possibility distribution.
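To make the "appropriate preference weighting" of Example 1 concrete, the following minimal sketch computes a joint possibility for Bob's two variables. The weights `ALPHA`, `BETA`, and `GAMMA` and the use of the product to combine conditional possibilities are illustrative assumptions, not values taken from the paper; under these assumptions, any choice with 1 > ALPHA > BETA > ALPHA * GAMMA gives the desired order.

```python
from itertools import product

# Hypothetical weights (assumptions for illustration only).
ALPHA = 0.8  # possibility of an old car (a new car gets 1.0)
BETA = 0.5   # possibility of a colorful car, given a new car
GAMMA = 0.5  # possibility of a black car, given an old car

# Possibility distribution for the car type.
PI_TYPE = {"n": 1.0, "o": ALPHA}

# Conditional possibility of the color given the car type, keyed by (color, type).
PI_COLOR = {
    ("b", "n"): 1.0,
    ("c", "n"): BETA,
    ("c", "o"): 1.0,
    ("b", "o"): GAMMA,
}


def possibility(car_type: str, color: str) -> float:
    """Joint possibility of an outcome, combined by product (an assumption)."""
    return PI_TYPE[car_type] * PI_COLOR[(color, car_type)]


def ranking() -> list[str]:
    """Return all outcomes ordered from most to least possible."""
    outcomes = [t + c for t, c in product("no", "bc")]
    return sorted(outcomes, key=lambda o: possibility(o[0], o[1]), reverse=True)


if __name__ == "__main__":
    print(ranking())
```

With these weights, the old and colorful car is ranked above the new and colorful one, which a CP-net over the same statements cannot express.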
We propose a novel language for expressing preferences over the Web 3.0 using possibilistic networks. It has lower complexity compared to a similar existing formalism called OCP-theories [9], which are an integration of Datalog+/− with CP-theories [13]. This is because deciding dominance in possibilistic networks can be done in polynomial time, while it is PSPACE-complete in CP-theories. Every possibilistic network encodes a unique (numerical) ranking on the outcomes, while CP-theories encode a set of (qualitative) total orders on the outcomes. Our framework also allows specifying the relative importance of preferences [1]. Possibilistic networks are also a simple and natural way of representing conditional preferences and obtaining rankings on outcomes, and can be easily learned from data [5]. We choose existential rules in Datalog+/− as the ontology language for their intuitive nature, expressive power for rule-based knowledge bases, and the capability of performing query answering.
All details can be found in the full paper [6].
Example 2. Consider the database $D$ in Fig. 1, modeling the domain of an online car booking system. Moreover, the dependencies say that every offer must have a specification and a vendor and that there cannot be two equivalent offers from the same company with different prices. We denote by $t_1$ the term $\mathit{specs}(s_1, b, f_1, o)$ and by $\mathbf{t}_1$ the tuple $(s_1, b, f_1, o)$.
Let now $X_O$ be a finite set of variables, where each $X \in X_O$ corresponds to a predicate from $O$, denoted $\mathit{pred}(X)$. The domain $\mathit{Dom}(X)$ consists of at least two ground atoms $p(c_1, \ldots, c_k)$ with $p = \mathit{pred}(X)$. An outcome $o \in \mathit{Dom}(X_O)$ assigns to each variable an element of its domain, and can be seen as a conjunction of ground atoms. An OP-net is of the form $(O, \Gamma)$, where $\Gamma$ is a possibilistic network over $X_O$, i.e., a collection of conditional possibility distributions $\pi(X_i \mid \mathit{pa}(X_i))$, where $\mathit{pa}(X_i)$ are the parents of $X_i$. Taken together, they define a joint possibility distribution over $\mathit{Dom}(X_O)$.
represents the conjunction $t_1 \wedge t_{10} \wedge t_7$ and has the possibility 1.
Since outcomes are conjunctions of ground atoms, they may be inconsistent or equivalent w.r.t. $\Sigma$.

Encoding Preferences with OP-Nets
In [9], conditional preferences were generalized to Datalog+/− as follows. Let $\mathit{Dom}^+(X)$ be the set of all (possibly non-ground) atoms $p(t_1, \ldots, t_k)$ with $p = \mathit{pred}(X)$. An ontological conditional preference $\varphi$ is of the form $v : \xi \succ \xi'$, where $v$ is the context and $\xi$ is preferred over $\xi'$. A ground instance $v\theta : \xi\theta \succ \xi'\theta$ of $\varphi$ is obtained via a substitution $\theta$ such that $v\theta \in \mathit{Dom}(U_\varphi)$ and $\xi\theta, \xi'\theta \in \mathit{Dom}(X_\varphi)$. Under suitable acyclicity conditions, one can construct an OP-net $(O, \Gamma)$ that respects all ground instances of some given ontological conditional preferences.
Example 4. Consider the ontological conditional preference $\mathit{specs}(I, C, F, o) : \mathit{vendor}(V_1, p) \succ \mathit{vendor}(V_2, n)$, i.e., for an old car, it is preferable to have a vendor with positive feedback. One ground instance for this preference is $\mathit{specs}(\mathbf{t}_1) : \mathit{vendor}(\mathbf{t}_{10}) \succ \mathit{vendor}(\mathbf{t}_{11})$. We could choose $\pi(\mathit{vendor}(\mathbf{t}_{10}) \mid \mathit{specs}(\mathbf{t}_1)) = 1$ and $\pi(\mathit{vendor}(\mathbf{t}_{11}) \mid \mathit{specs}(\mathbf{t}_1)) = \alpha < 1$ to encode this in an OP-net.
Although possibilistic networks are less expressive than conditional preference theories (CP-theories) [3,13], they allow for a more compact encoding of conditional preferences over ground atoms and have lower complexity.

Query Answering under OP-Nets
The notions of skyline and k-rank answers are defined in the same way as for OCP-theories [9]. In a conjunctive query (CQ), a variable $X$ of the OP-net may be used to annotate an atom over the predicate $\mathit{pred}(X)$. Hence, an answer (tuple) $a$ to a CQ $q$ w.r.t. an outcome $o$ is an assignment of the distinguished variables that can be used to satisfy $q$ in such a way that the marked atoms of $q$ evaluate to the ones given by $o$. A skyline answer is an answer w.r.t. an undominated outcome of the OP-net. CQ skyline membership is the problem of deciding whether a given tuple is a skyline answer. Similarly, one can define k-rank answers as the $k$ "most preferred" answers, i.e., those resulting from the outcomes with the highest possibilities.
Example 5. Consider the consistent OP-net $(O, \Gamma)$ of Example 3 and the CQ $q(C, F, T, N) = \exists I\, \mathit{specs}(I, C, F, T) \wedge \mathit{feature}(F, N)$. Then, $\langle b, f_1, o, ac \rangle$ is the skyline answer under the consistent outcome $t_1 \wedge t_{10} \wedge t_7$. The skyline answer for $q'(C, T) = \exists N\, q(C, f_2, T, N)$ is $\langle c, n \rangle$ with possibility $\pi(t_2 t_{10} t_8) = 0.5 \cdot 1 \cdot 0.7 = 0.35$, while the 2-rank answer is $\langle c, n \rangle, \langle c, o \rangle$. Hence, if feature $f_2$ is mandatory, the offered new and colorful car is preferred over the old and colorful one, mainly due to positive feedback about vendor $v_1$.
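The informal reading of skyline and k-rank answers can be sketched as follows, assuming that the consistent outcomes, their possibilities, and the answers each outcome yields have already been computed. The function names, the outcome label `hypothetical_old_colorful`, and its possibility 0.2 are hypothetical; only the value 0.35 for the outcome $t_2 t_{10} t_8$ comes from Example 5. This is a simplified illustration, not the formal definition used for OCP-theories [9].

```python
from typing import Dict, List, Tuple

Answer = Tuple[str, ...]


def skyline(answers: Dict[str, List[Answer]], pi: Dict[str, float]) -> List[Answer]:
    """Answers obtained from the outcomes with maximal possibility."""
    best = max(pi[o] for o in answers)
    result: List[Answer] = []
    for outcome, tuples in answers.items():
        if pi[outcome] == best:
            for a in tuples:
                if a not in result:
                    result.append(a)
    return result


def k_rank(answers: Dict[str, List[Answer]], pi: Dict[str, float], k: int) -> List[Answer]:
    """First k distinct answers, visiting outcomes by decreasing possibility."""
    result: List[Answer] = []
    for outcome in sorted(answers, key=lambda o: pi[o], reverse=True):
        for a in answers[outcome]:
            if a not in result:
                result.append(a)
                if len(result) == k:
                    return result
    return result


# Possibility of t2 t10 t8 as in Example 5; the second entry is made up.
PI = {"t2 t10 t8": 0.5 * 1 * 0.7, "hypothetical_old_colorful": 0.2}
ANSWERS = {
    "t2 t10 t8": [("c", "n")],
    "hypothetical_old_colorful": [("c", "o")],
}

if __name__ == "__main__":
    print(skyline(ANSWERS, PI))
    print(k_rank(ANSWERS, PI, 2))
```

Ties between outcomes with equal possibility are broken by input order here, which is a design shortcut of the sketch rather than part of the formalism.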

Computational Complexity
We now analyze the computational complexity of the consistency and CQ skyline membership problems for OP-nets. We assume familiarity with the complexity classes $\mathrm{AC}^0$, P, NP, co-NP, $\Delta^P_2$, $\Sigma^P_2$, $\Pi^P_2$, $\Delta^P_3$, PSPACE, EXP, and 2EXP. The class $D^P$ (resp., $D^P_2$) is the class of all problems that are the intersection of a problem in NP (resp., $\Sigma^P_2$) and a problem in co-NP (resp., $\Pi^P_2$). Following Vardi's taxonomy [12], the combined complexity is calculated by considering all the components, i.e., the database, the set of dependencies, and the query, as part of the input. The bounded-arity combined (ba-combined) complexity assumes that the arity of the underlying schema is bounded by a constant. For example, in description logics (DLs) [4], the arity is always bounded by 2. The fixed-program combined (fp-combined) complexity is calculated by considering the set of TGDs and NCs as fixed. Finally, for data complexity, we take only the size of the database into account.
Although CQ answering in Datalog+/− is undecidable in general, there exist many syntactic conditions that guarantee decidability. We refer the reader to [6] for a short overview of the classes of acyclic (A), guarded (G), and sticky (S) sets of TGDs, their "weak" counterparts WA, WG, and WS, linear TGDs (L), full TGDs (F), and the combinations AF, GF, SF, and LF.
Our complexity results for the consistency and the CQ skyline membership problems for OP-nets are compactly summarized in Tables 1 and 2, respectively. Compared to OCP-theories [9], we obtain lower complexities for L, LF, AF, G, S, F, GF, SF, WS, and WA in the fp-combined complexity (completeness for $D^P$ and $\Delta^P_2$, respectively, rather than PSPACE), and for L, LF, AF, S, F, GF, and SF in the ba-combined complexity (completeness for $D^P_2$ and $\Delta^P_3$, respectively, rather than PSPACE).
In particular, for $C$ = PSPACE in Theorem 6, we obtain inclusion in PSPACE for both problems, and the same holds for any deterministic complexity class above PSPACE. For $C$ = NP, we get the classes $D^P_2$ and $\Delta^P_3$. The lower bounds PSPACE and above is a propositional 3-DNF formula, whether the lexicographically maximal satisfying truth assignment for $X = \{x_1, \ldots, x_n\}$ maps $x_n$ to true is $\Delta^P_3$-complete [11].
Finally, we can show that tractability in data complexity for deciding consistency and CQ skyline membership for OP-nets carries over from classical CQ answering. Here, data complexity means that $\Sigma$ and the variables and possibility distributions of $\Gamma$ are both fixed, while $D$ is part of the input.
Theorem 7. Let T be a class of OP-nets $(O, \Gamma)$ for which CQ answering in $O$ is possible in polynomial time (resp., in $\mathrm{AC}^0$) in the data complexity. Then, deciding consistency and CQ skyline membership in T is possible in polynomial time (resp., in $\mathrm{AC}^0$) in the data complexity.
The listed P-hardness results hold due to a standard reduction of propositional logic programming to guarded full TGDs. These results do not apply to WG, where CQ answering is data complete for EXP, and data hardness holds even for ground atomic CQs; however, data completeness for EXP can be proved similarly to the results for combined complexity above.
We want to emphasize that our complexity results are generic, applying also to Datalog+/− languages beyond the ones listed. Even more, they are valid for arbitrary preference formalisms for which dominance between two outcomes can be decided in polynomial time, e.g., combinations of Datalog+/− with rankings computed by information retrieval methods [10].
Interesting topics of ongoing and future research include the implementation and experimental evaluation of the presented approach, as well as a generalization based on possibilistic logic [3] to gain more expressivity and some new features towards non-monotonic reasoning [1]; moreover, an apparent relation between possibilistic logic and quantitative choice logic [2] may be exploited.
An outcome $o$ dominates another outcome $o'$ (written $o \succ o'$) if $\pi(o) > \pi(o')$. This relation can be decided in polynomial time.
Example 3. Consider the OP-net $(O, \Gamma)$ given by the ontology $O$ of Example 2 and the dependency graph and the conditional possibility distribution in Fig. 2. Here, we have $X_O = \{C_O, R_O, F_O\}$ with the domains $\mathit{Dom}(C_O) = \{\mathit{specs}(\mathbf{t}_1), \mathit{specs}(\mathbf{t}_2), \mathit{specs}(\mathbf{t}_3)\}$, $\mathit{Dom}(F_O) = \{\mathit{feature}(\mathbf{t}_7), \mathit{feature}(\mathbf{t}_8), \mathit{feature}(\mathbf{t}_9)\}$, $\mathit{Dom}(R_O) = \{\mathit{vendor}(\mathbf{t}_{10}), \mathit{vendor}(\mathbf{t}_{11})\}$. The parents of $F_O$ are $\{C_O, R_O\}$, which in turn do not depend on other variables. The distribution could either be learned or derived from explicit preferences; see Example 4. The possibilities of outcomes are then computed as $\pi(C_O R_O F_O) = \pi(F_O \mid C_O R_O) \cdot \pi(C_O) \cdot \pi(R_O)$.

Theorem 6. Let T be a class of OP-nets $(O, \Gamma)$. If checking non-emptiness of the answer set of a CQ w.r.t. $O$ is in a complexity class $C$, then consistency in T is in $\mathrm{NP}^{C} \wedge \text{co-NP}^{C}$ and CQ skyline membership in T is in $\mathrm{P}^{\mathrm{NP}^{C}}$. If $C$ = NP and we consider the fp-combined complexity, then consistency in T is in $D^P$ and CQ skyline membership in T is in $\Delta^P_2$.
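As a worked instantiation of the generic bounds of Theorem 6, the two cases $C$ = NP and $C$ = PSPACE can be traced through standard oracle-class identities. The derivation below is a reader's aid under those textbook identities, not part of the original proof.

```latex
\begin{align*}
C = \mathrm{NP}: \quad
  & \mathrm{NP}^{\mathrm{NP}} = \Sigma^{P}_{2}, \qquad
    \text{co-NP}^{\mathrm{NP}} = \Pi^{P}_{2}, \\
  & \mathrm{NP}^{C} \wedge \text{co-NP}^{C}
    = \Sigma^{P}_{2} \wedge \Pi^{P}_{2} = D^{P}_{2}, \qquad
    \mathrm{P}^{\mathrm{NP}^{C}} = \mathrm{P}^{\Sigma^{P}_{2}} = \Delta^{P}_{3}; \\[4pt]
C = \mathrm{PSPACE}: \quad
  & \mathrm{NP}^{\mathrm{PSPACE}} = \text{co-NP}^{\mathrm{PSPACE}} = \mathrm{PSPACE}, \\
  & \mathrm{NP}^{C} \wedge \text{co-NP}^{C} = \mathrm{PSPACE}, \qquad
    \mathrm{P}^{\mathrm{NP}^{C}} = \mathrm{P}^{\mathrm{PSPACE}} = \mathrm{PSPACE}.
\end{align*}
```

Here $A \wedge B$ denotes the class of problems that are the intersection of a problem in $A$ and a problem in $B$, matching the definition of $D^P$ and $D^P_2$ given earlier.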
Two outcomes $o$ and $o'$ are equivalent, denoted $o \sim o'$, if $O \cup o$ and $O \cup o'$ have the same models. An interpretation $I$ for $(O, \Gamma)$ is a total preorder over the consistent outcomes in $\mathit{Dom}(X_O)$. It satisfies (or is a model of) $(O, \Gamma)$ if it is compatible with the dominance and equivalence relations, i.e., for all consistent outcomes $o$ and $o'$, (i) if $o \prec o'$, then $(o, o') \in I$ and $(o', o) \notin I$, and (ii) if $o \sim o'$, then $(o, o'), (o', o) \in I$. An OP-net is consistent if it has at least one consistent outcome and it has a model.
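Conditions (i) and (ii) can be checked directly for a given interpretation. The sketch below assumes that $o \prec o'$ means that $o'$ dominates $o$, i.e., $\pi(o) < \pi(o')$, and that the equivalence test is supplied as a callback (in practice it would require ontological reasoning). The names `satisfies`, `preorder_from_possibility`, and the toy outcomes are hypothetical, and the code does not verify that the given relation is itself a total preorder.

```python
from itertools import permutations
from typing import Callable, Dict, List, Set, Tuple

Pair = Tuple[str, str]


def preorder_from_possibility(outcomes: List[str], pi: Dict[str, float]) -> Set[Pair]:
    """Candidate interpretation: (o, o2) is included iff pi[o] <= pi[o2]."""
    return {(o, o2) for o in outcomes for o2 in outcomes if pi[o] <= pi[o2]}


def satisfies(
    interp: Set[Pair],
    outcomes: List[str],
    pi: Dict[str, float],
    equivalent: Callable[[str, str], bool],
) -> bool:
    """Check conditions (i) and (ii) for all ordered pairs of distinct outcomes."""
    for o, o2 in permutations(outcomes, 2):
        # (i) o is dominated by o2 (assumed reading of o < o2).
        if pi[o] < pi[o2]:
            if (o, o2) not in interp or (o2, o) in interp:
                return False
        # (ii) equivalent outcomes must be related in both directions.
        if equivalent(o, o2):
            if (o, o2) not in interp or (o2, o) not in interp:
                return False
    return True


if __name__ == "__main__":
    outcomes = ["a", "b", "c"]
    same = lambda x, y: {x, y} == {"b", "c"}  # toy equivalence: b ~ c
    pi_ok = {"a": 1.0, "b": 0.5, "c": 0.5}
    pi_bad = {"a": 1.0, "b": 0.5, "c": 0.3}
    print(satisfies(preorder_from_possibility(outcomes, pi_ok), outcomes, pi_ok, same))
    print(satisfies(preorder_from_possibility(outcomes, pi_bad), outcomes, pi_bad, same))
```

In the second toy case, the equivalent outcomes `b` and `c` receive different possibilities, so conditions (i) and (ii) contradict each other for this pair.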

Table 1. Complexity of deciding consistency of OP-nets

Table 2. Complexity of deciding CQ skyline membership for OP-nets

and equivalence of outcomes being as powerful as checking entailment of arbitrary ground CQs. The remaining lower bounds for the (fp-/ba-)combined complexity hold already if only NCs are allowed, and are shown by reductions from variants of the validity problem for QBFs. For example, the problem of deciding, given a valid formula $\exists X \forall Y \varphi(X, Y)$ where $\varphi(X, Y)$