org.apache.solr.util

Class SolrPluginUtils



  • public class SolrPluginUtils
    extends Object

    Utilities that may be of use to RequestHandlers.

    Many of these functions have code that was stolen/mutated from StandardRequestHandler.

    :TODO: refactor StandardRequestHandler to use these utilities

    :TODO: Many "standard" functionality methods are not cognisant of default parameter settings.

    • Method Detail

      • setDefaults

        public static void setDefaults(SolrQueryRequest req,
                                       SolrParams defaults,
                                       SolrParams appends,
                                       SolrParams invariants)
        Set default-ish params on a SolrQueryRequest. RequestHandlers can use this method to ensure their defaults and overrides are visible to other components such as the response writer.
        Parameters:
        req - The request whose params we are interested in
        defaults - values to be used if no values are specified in the request params
        appends - values to be appended to those from the request (or defaults) when dealing with multi-val params, or treated as another layer of defaults for single-val params.
        invariants - values which will be used instead of any request, or default values, regardless of context.
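
        The layering described by the four parameters can be sketched with plain maps. This is only an illustration of the documented precedence, not the Solr implementation: the class ParamLayeringSketch, the method layer, and the use of Map<String, List<String>> in place of SolrParams are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in: plain maps play the role of SolrParams here.
final class ParamLayeringSketch {

    static Map<String, List<String>> layer(Map<String, List<String>> request,
                                           Map<String, List<String>> defaults,
                                           Map<String, List<String>> appends,
                                           Map<String, List<String>> invariants) {
        // Collect every parameter name mentioned in any layer.
        Set<String> names = new LinkedHashSet<>();
        names.addAll(request.keySet());
        names.addAll(defaults.keySet());
        names.addAll(appends.keySet());
        names.addAll(invariants.keySet());

        Map<String, List<String>> effective = new LinkedHashMap<>();
        for (String name : names) {
            if (invariants.containsKey(name)) {
                // Invariants replace request and default values outright.
                effective.put(name, new ArrayList<>(invariants.get(name)));
                continue;
            }
            // Request values win; defaults apply only when the request is silent.
            List<String> base = request.containsKey(name)
                    ? request.get(name)
                    : defaults.getOrDefault(name, Collections.emptyList());
            List<String> values = new ArrayList<>(base);
            // Appends follow whatever came from the request or the defaults, so a
            // single-valued read (the first element) only sees them as a fallback.
            values.addAll(appends.getOrDefault(name, Collections.emptyList()));
            effective.put(name, values);
        }
        return effective;
    }
}
```

        In this sketch a request "rows" value is discarded when an invariant "rows" exists, while an appended "fq" is added after any "fq" the client sent.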
      • split

        public static String[] split(String value)
        Split a value that may contain a comma, space or bar separated list.
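
        A minimal sketch of this kind of splitting, assuming that any run of commas, whitespace, or bars counts as one separator. The class SplitSketch and its pattern are hypothetical; the pattern Solr actually uses may differ.

```java
import java.util.regex.Pattern;

final class SplitSketch {
    // Assumption: one or more commas, whitespace characters, or bars form a single separator.
    private static final Pattern SEPARATORS = Pattern.compile("[,\\s|]+");

    static String[] split(String value) {
        String trimmed = value.trim();
        // A blank input yields no tokens rather than a single empty string.
        return trimmed.isEmpty() ? new String[0] : SEPARATORS.split(trimmed);
    }
}
```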
      • optimizePreFetchDocs

        public static void optimizePreFetchDocs(ResponseBuilder rb,
                                                DocList docs,
                                                Query query,
                                                SolrQueryRequest req,
                                                SolrQueryResponse res)
                                         throws IOException
        Pre-fetch documents into the index searcher's document cache. This is an entirely optional step which you might want to perform for the following reasons:
        • Locates the document-retrieval costs in one spot, which helps detailed performance measurement
        • Determines a priori which fields will need to be fetched by various subtasks, like response writing and highlighting. This minimizes the chance that many needed fields will be loaded lazily (it is more efficient to load all the fields we require up front).
        If lazy field loading is disabled, this method does nothing.
        Throws:
        IOException
      • doStandardDebug

        public static NamedList doStandardDebug(SolrQueryRequest req,
                                                String userQuery,
                                                Query query,
                                                DocList results,
                                                boolean dbgQuery,
                                                boolean dbgResults)
                                         throws IOException

        Returns a NamedList containing many "standard" pieces of debugging information.

        • rawquerystring - the 'q' param exactly as specified by the client
        • querystring - the 'q' param after any preprocessing done by the plugin
        • parsedquery - the main query executed, formatted by the Solr QueryParsing utils class (which knows about field types)
        • parsedquery_toString - the main query executed formatted by its own toString method (in case it has internal state Solr doesn't know about)
        • explain - the list of score explanations for each document in results against query.
        • otherQuery - the query string specified in 'explainOther' query param.
        • explainOther - the list of score explanations for each document in results against 'otherQuery'
        Parameters:
        req - the request we are dealing with
        userQuery - the user's query as a string, after any basic preprocessing has been done
        query - the query built from the userQuery (and perhaps other clauses) that identifies the main result set of the response.
        results - the main result set of the response
        Returns:
        The debug info
        Throws:
        IOException - if there was an IO error
      • parseFieldBoosts

        public static Map<String,Float> parseFieldBoosts(String in)
        Given a string containing fieldNames and boost info, converts it to a Map from field name to boost info.

        Doesn't care if boost info is negative, you're on your own.

        Doesn't care if boost info is missing, again: you're on your own.

        Parameters:
        in - a String like "fieldOne^2.3 fieldTwo fieldThree^-0.4"
        Returns:
        Map of fieldOne => 2.3, fieldTwo => null, fieldThree => -0.4
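
        The documented input and output can be reproduced with a short standalone sketch. It is not the Solr source: FieldBoostSketch is a hypothetical class, and splitting on whitespace and on the first "^" is an assumption based on the example string above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FieldBoostSketch {

    static Map<String, Float> parseFieldBoosts(String in) {
        Map<String, Float> out = new LinkedHashMap<>();
        if (in == null || in.trim().isEmpty()) {
            return out;
        }
        for (String token : in.trim().split("\\s+")) {
            int caret = token.indexOf('^');
            if (caret < 0) {
                // No boost given: the field maps to null, as in the documented example.
                out.put(token, null);
            } else {
                // Negative boosts are accepted without complaint.
                out.put(token.substring(0, caret), Float.valueOf(token.substring(caret + 1)));
            }
        }
        return out;
    }
}
```

        Callers of such a map need a null check before unboxing, since a field without a boost is present as a key but has no value.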
      • parseFieldBoosts

        public static Map<String,Float> parseFieldBoosts(String[] fieldLists)
        Like parseFieldBoosts(String), but parses all the strings in the provided array (which may be null).
        Parameters:
        fieldLists - an array of Strings, e.g. {"fieldOne^2.3", "fieldTwo", "fieldThree^-0.4"}
        Returns:
        Map of fieldOne => 2.3, fieldTwo => null, fieldThree => -0.4
      • parseFieldBoostsAndSlop

        public static List<FieldParams> parseFieldBoostsAndSlop(String[] fieldLists,
                                                                int wordGrams,
                                                                int defaultSlop)
        Like parseFieldBoosts(java.lang.String), but allows for an optional slop value prefixed by "~".
        Parameters:
        fieldLists - an array of Strings, e.g. {"fieldOne^2.3", "fieldTwo", "fieldThree~5^-0.4"}
        wordGrams - (0=all words, 2,3 = shingle size)
        defaultSlop - the default slop for this param
        Returns:
        FieldParams containing the field name, boost, slop, and shingle size
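
        The "field~slop^boost" syntax can be illustrated with a standalone sketch. Everything here is hypothetical: FieldSlopSketch and its Parsed holder stand in for the real FieldParams class, and the fallback boost of 1.0 for entries without "^" is an assumption, not something stated above.

```java
import java.util.ArrayList;
import java.util.List;

final class FieldSlopSketch {

    // Hypothetical value holder standing in for FieldParams.
    static final class Parsed {
        final String field;
        final int wordGrams;
        final int slop;
        final float boost;

        Parsed(String field, int wordGrams, int slop, float boost) {
            this.field = field;
            this.wordGrams = wordGrams;
            this.slop = slop;
            this.boost = boost;
        }
    }

    static List<Parsed> parse(String[] fieldLists, int wordGrams, int defaultSlop) {
        List<Parsed> out = new ArrayList<>();
        if (fieldLists == null) {
            return out;
        }
        for (String list : fieldLists) {
            if (list == null) {
                continue;
            }
            for (String token : list.trim().split("\\s+")) {
                if (token.isEmpty()) {
                    continue;
                }
                // "field~slop^boost": peel off the boost first, then the slop.
                String[] nameAndBoost = token.split("\\^", 2);
                String[] nameAndSlop = nameAndBoost[0].split("~", 2);
                int slop = nameAndSlop.length == 2
                        ? Integer.parseInt(nameAndSlop[1]) : defaultSlop;
                // Assumption: a missing boost falls back to 1.0.
                float boost = nameAndBoost.length == 2
                        ? Float.parseFloat(nameAndBoost[1]) : 1.0f;
                out.add(new Parsed(nameAndSlop[0], wordGrams, slop, boost));
            }
        }
        return out;
    }
}
```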
      • setMinShouldMatch

        public static void setMinShouldMatch(BooleanQuery.Builder q,
                                             String spec,
                                             boolean mmAutoRelax)
        Checks the number of optional clauses in the query, and compares it with the specification string to determine the proper value to use.

        If mmAutoRelax=true, we'll perform auto relaxation of mm if tokens are removed from some but not all DisMax clauses, as can happen when stopwords or punctuation tokens are removed in analysis.

        Details about the specification format can be found here

        A few important notes...

        • If the calculations based on the specification determine that no optional clauses are needed, BooleanQuery.Builder.setMinimumNumberShouldMatch will never be called, but the usual rules about BooleanQueries still apply at search time (a BooleanQuery containing no required clauses must still match at least one optional clause).
        • No matter what number the calculation arrives at, BooleanQuery.Builder.setMinimumNumberShouldMatch will never be called with a value greater than the number of optional clauses (or less than 1).

        :TODO: should optimize the case where number is same as clauses to just make them all "required"

        Parameters:
        q - The query as a BooleanQuery.Builder
        spec - The mm spec
        mmAutoRelax - whether to perform auto relaxation of mm if tokens are removed from some but not all DisMax clauses
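
        The two notes above amount to a clamp on the calculated value. The sketch below covers only that clamp and leaves out parsing of the mm spec itself. MinShouldMatchClampSketch is a hypothetical helper, and using 0 to mean "leave the builder untouched" is a convention of this example only.

```java
final class MinShouldMatchClampSketch {

    /**
     * Returns the value that would be handed to the query builder, or 0 when the
     * builder should be left untouched (a convention of this sketch only).
     */
    static int clamp(int calculated, int optionalClauses) {
        if (calculated <= 0) {
            // No optional clauses required: the setter is never called.
            return 0;
        }
        // Never ask for more matches than there are optional clauses.
        return Math.min(calculated, optionalClauses);
    }
}
```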
      • flattenBooleanQuery

        public static void flattenBooleanQuery(BooleanQuery.Builder to,
                                               BooleanQuery from)
        Recursively walks the "from" query pulling out sub-queries and adding them to the "to" query.

        Boosts are multiplied as needed. Sub-BooleanQueries which are not optional will not be flattened. From will be mangled during the walk, so do not attempt to reuse it.

      • stripIllegalOperators

        public static CharSequence stripIllegalOperators(CharSequence s)
        Strips operators that are used illegally, otherwise returns its input. Some examples of illegal user queries are: "chocolate +- chip", "chocolate - - chip", and "chocolate chip -".
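
        One way to handle the three example queries is with two regular expressions, one for consecutive operators and one for a dangling trailing operator. The patterns and the class IllegalOperatorSketch are assumptions made for illustration and may not match what Solr does internally.

```java
import java.util.regex.Pattern;

final class IllegalOperatorSketch {
    // Assumption: an operator followed by more operators, e.g. " +-" or " - -".
    private static final Pattern CONSECUTIVE_OPS =
            Pattern.compile("\\s+[+-](?:\\s*[+-]+)+");
    // Assumption: operators (and whitespace) left hanging at the end of the query.
    private static final Pattern DANGLING_OPS =
            Pattern.compile("\\s+[-+\\s]+$");

    static CharSequence stripIllegalOperators(CharSequence s) {
        // Replace operator runs with a single space so the terms stay separated.
        String withoutRuns = CONSECUTIVE_OPS.matcher(s).replaceAll(" ");
        // Drop anything operator-like at the very end.
        return DANGLING_OPS.matcher(withoutRuns).replaceAll("");
    }
}
```

        A well-formed query such as "chocolate -chip" matches neither pattern in this sketch and comes back with its operator intact.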
      • stripUnbalancedQuotes

        public static CharSequence stripUnbalancedQuotes(CharSequence s)
        Returns its input if there is an even (i.e. balanced) number of '"' characters -- otherwise returns a String in which all '"' characters are stripped out.
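
        The rule is simple enough to sketch directly: count the quote characters and strip them all when the count is odd. QuoteSketch is a hypothetical class written for this example, not the Solr code.

```java
final class QuoteSketch {

    static CharSequence stripUnbalancedQuotes(CharSequence s) {
        int quotes = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == '"') {
                quotes++;
            }
        }
        if (quotes % 2 == 0) {
            // Balanced (or no) quotes: hand the input back unchanged.
            return s;
        }
        // Odd number of quotes: remove every one of them.
        return s.toString().replace("\"", "");
    }
}
```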
      • removeNulls

        public static <T> NamedList<T> removeNulls(Map.Entry<String,T>[] entries,
                                                   NamedList<T> dest)
        Adds to dest all the not-null elements of entries that have non-null names
        Parameters:
        entries - The array of entries to be added to the NamedList dest
        dest - The NamedList instance where the not-null elements of entries are added
        Returns:
        The dest input object
      • getSort

        public static Sort getSort(SolrQueryRequest req)
        Determines the correct Sort based on the request parameter "sort"
        Returns:
        null if no sort is specified.
      • getRequestPurpose

        public static String getRequestPurpose(Integer reqPurpose)
        Given the integer purpose of a request, generates a readable value corresponding to the request purposes (there can be more than one on a single request). If a purpose parameter is present that is not known, this method returns "Unknown".
        Parameters:
        reqPurpose - Numeric request purpose
        Returns:
        a comma separated list of purposes or "Unknown"
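
        Because one request can carry several purposes, the integer is naturally read as a set of bit flags. The sketch below shows that decoding with made-up flag values and labels. PurposeSketch, its three flags, and the treatment of null as "Unknown" are all assumptions and do not reflect Solr's actual constants.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PurposeSketch {
    // Hypothetical flag values and labels, for illustration only.
    private static final Map<Integer, String> LABELS = new LinkedHashMap<>();
    static {
        LABELS.put(0x01, "FETCH_IDS");
        LABELS.put(0x02, "FETCH_FIELDS");
        LABELS.put(0x04, "DEBUG");
    }

    static String describe(Integer purpose) {
        if (purpose == null) {
            // Assumption of this sketch: a missing purpose is reported as unknown.
            return "Unknown";
        }
        List<String> matched = new ArrayList<>();
        for (Map.Entry<Integer, String> flag : LABELS.entrySet()) {
            if ((purpose & flag.getKey()) != 0) {
                matched.add(flag.getValue());
            }
        }
        // No recognised bit set: fall back to the catch-all label.
        return matched.isEmpty() ? "Unknown" : String.join(",", matched);
    }
}
```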