org.apache.lucene.search

Class DisjunctionMaxQuery

  • All Implemented Interfaces:
    Iterable<Query>


    public final class DisjunctionMaxQuery
    extends Query
    implements Iterable<Query>
    A query that generates the union of the documents produced by its subqueries and scores each document with the maximum score that any subquery produces for it, plus a tie-breaking increment for any additional matching subqueries. This is useful when searching for a word in multiple fields with different boost factors (so that the fields cannot be combined equivalently into a single search field). We want the primary score to be the one associated with the highest boost, not the sum of the field scores (as BooleanQuery would give). If the query is "albino elephant", this ensures that "albino" matching one field and "elephant" matching another gets a higher score than "albino" matching both fields. To get this result, use both BooleanQuery and DisjunctionMaxQuery: for each term, a DisjunctionMaxQuery searches for it in each field, and the resulting set of DisjunctionMaxQuery instances is combined into a BooleanQuery. The tie-breaker capability allows results that include the same term in multiple fields to be judged better than results that include this term in only the best of those fields, without confusing this with the better case of two different terms in the multiple fields.
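The "albino elephant" case can be illustrated with plain arithmetic. The following is a minimal sketch, not Lucene code: the class name, the method names, and the per-field scores are hypothetical, and it assumes that the enclosing BooleanQuery simply sums the scores of its matching clauses.

```java
public class DisMaxCombineSketch {

    // Score of one term across several fields: the best field score plus
    // tieBreaker times the sum of the remaining field scores.
    // Assumes scores are non-negative; 0.0 stands for "field did not match".
    public static double disMax(double[] fieldScores, double tieBreaker) {
        double max = 0.0;
        double sum = 0.0;
        for (double s : fieldScores) {
            sum += s;
            max = Math.max(max, s);
        }
        return max + tieBreaker * (sum - max);
    }

    // Combines the per-term scores by summing them, which is the assumed
    // behavior of a BooleanQuery over optional clauses.
    public static double combine(double[][] perTermFieldScores, double tieBreaker) {
        double total = 0.0;
        for (double[] fieldScores : perTermFieldScores) {
            total += disMax(fieldScores, tieBreaker);
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical scores; each row is one term, columns are {title, body}.
        double[][] docA = {{1.0, 0.0}, {0.0, 1.0}}; // "albino" in title, "elephant" in body
        double[][] docB = {{1.0, 1.0}, {0.0, 0.0}}; // "albino" in both fields, no "elephant"

        System.out.println("docA: " + combine(docA, 0.1));
        System.out.println("docB: " + combine(docB, 0.1));
    }
}
```

With these made-up inputs, docA (two different terms) scores roughly 2.0 and docB (one term in two fields) roughly 1.1, whereas a plain sum of all field scores would rate both documents equally at 2.0.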
    • Constructor Detail

      • DisjunctionMaxQuery

        public DisjunctionMaxQuery(Collection<Query> disjuncts,
                                   float tieBreakerMultiplier)
        Creates a new DisjunctionMaxQuery.
        Parameters:
        disjuncts - a Collection<Query> of all the disjuncts to add
        tieBreakerMultiplier - the score of each non-maximum disjunct for a document is multiplied by this weight and added to the final score. If non-zero, the value should be small, on the order of 0.1, which says that 10 occurrences of a word in a lower-scored field that is also in a higher-scored field are just as good as a unique word in the lower-scored field (i.e., one that is not in any higher-scored field).
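The effect of tieBreakerMultiplier on a single document can be sketched as follows. This is an illustration of the formula described above rather than Lucene's implementation; the class name, the method name, and the disjunct scores are hypothetical.

```java
public class TieBreakerSketch {

    // Final score: the maximum disjunct score plus tieBreakerMultiplier
    // times each non-maximum disjunct score.
    // Assumes non-negative scores, so 0.0 is a safe starting maximum.
    public static double score(double[] disjunctScores, double tieBreakerMultiplier) {
        double max = 0.0;
        double sum = 0.0;
        for (double s : disjunctScores) {
            sum += s;
            max = Math.max(max, s);
        }
        return max + tieBreakerMultiplier * (sum - max);
    }

    public static void main(String[] args) {
        // Hypothetical scores of three disjuncts matching the same document.
        double[] scores = {2.0, 1.5, 0.5};

        // 0.0 keeps only the best disjunct, 1.0 degenerates to a plain sum,
        // and a small value such as 0.1 sits in between.
        for (double tieBreaker : new double[] {0.0, 0.1, 1.0}) {
            System.out.println(tieBreaker + " -> " + score(scores, tieBreaker));
        }
    }
}
```

A multiplier of 0.0 yields a pure maximum and 1.0 yields the same total as summing every disjunct, which is why a small value is suggested when the maximum should dominate.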