Cloud Search’s query interpretation feature automatically interprets the operators and filters in a user’s query, and converts those elements into a structured, operator-based query. Query interpretation uses operators defined in the schema, together with the indexed documents, to deduce what the user's query means. This feature allows a user to search with minimal keywords, yet still obtain precise results.
The results presented to the user depend on the confidence of the query interpretation. Confidence is based on several factors, including where the query strings appear in indexed documents. A string, such as the actor name "Tom Hanks," that appears consistently in a schema field called actors results in higher confidence. The same string ("Tom Hanks") appearing within a paragraph, rather than in a schema field, can result in lower confidence. When confidence is strong, only results from query interpretation are displayed to the user. When confidence is weaker, the results from query interpretation are blended with normal keyword search results.
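The display rule described above can be sketched in a few lines of Python. This is an illustrative model only: the numeric confidence score, the 0.8 threshold, the blending order, and the function name are all assumptions, not Cloud Search's actual values or API.

```python
from typing import List

# Hypothetical cutoff between "strong" and "weaker" confidence (assumption).
STRONG_CONFIDENCE = 0.8


def results_to_display(confidence: float,
                       interpreted: List[str],
                       keyword: List[str]) -> List[str]:
    """Return the result list a user would see for a given confidence."""
    if confidence >= STRONG_CONFIDENCE:
        # Strong confidence: show only the interpreted results.
        return list(interpreted)
    # Weaker confidence: blend interpreted results with keyword results,
    # dropping duplicates while preserving order (ordering is an assumption).
    blended: List[str] = []
    for doc in interpreted + keyword:
        if doc not in blended:
            blended.append(doc)
    return blended
```

With a confidence of 0.9, only the interpreted list is returned; with 0.5, both lists are merged.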
Example query interpretation
Suppose you have a data source, such as a database, containing information about movies. Figure 1 shows a sample search query and resulting interpretation.
Given this example query, query interpretation does the following:
- Parses the schema and determines that the top-level objects in the data source are classified as objecttype:movies. Query interpretation now knows that "movies" in the query is an object type.
- Scans documents in the data source, in conjunction with the schema, to determine where the string "action" occurs. If the string primarily occurs in a specific "genre" data source field, then query interpretation is confident that "action" is a property value for the property "genre" as defined in the schema. If the string primarily occurs in the context of paragraphs of content, then query interpretation's confidence level decreases.
The resulting query interpretation is:
actor:"tom hanks" genre:action objecttype:movies
Query interpretation is automatically enabled for all Cloud Search customers with no additional work. However, for optimal query interpretation you should structure your schema per the instructions in this document.
Structure your schema to support query interpretation
You should structure your schema to ensure that you can benefit from query interpretation.
Enable display name interpretations
Cloud Search’s query interpretation uses the objectDefinitions and propertyDefinitions in a schema to interpret a user’s query and tune the results. To maximize the benefit of these schema elements, create intuitive display names using displayLabel for property names, objectDisplayLabel for object names, and operatorName for operators.
The following schema shows intuitive display names for a movie object:
{
  "objectDefinitions": [
    {
      "name": "movie",
      "options": {
        "displayOptions": {
          "objectDisplayLabel": "Films"
        }
        ...
      },
      "propertyDefinitions": [
        {
          "name": "genre",
          "isReturnable": true,
          "isRepeatable": true,
          "isFacetable": true,
          "textPropertyOptions": {
            "retrievalImportance": { "importance": "HIGHEST" },
            "operatorOptions": {
              "operatorName": "genre"
            }
          },
          "displayOptions": {
            "displayLabel": "Category"
          }
        },
        ...
      ]
    }
  ]
}
In the previous example:
- The movie object definition has a “Films” objectDisplayLabel.
- The genre property definition has a “genre” operatorName and a “Category” displayLabel.
These display names enable Cloud Search to make the following query interpretations:
- “action movies,” “genre action type movies,” or “movies genre action” are interpreted as genre:action objecttype:movies.
- “movies with genre action or thriller” is interpreted as objecttype:movies genre:(action OR thriller).
- “action film” or “action films” is interpreted as genre:action objecttype:movies.
- “comedy category movies” is interpreted as genre:comedy objecttype:movies.
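To make the role of display names concrete, here is a toy Python sketch that maps words in a query to operators. It is not how Cloud Search implements interpretation: the vocabularies, the list of known genre values, and the `interpret` function are hypothetical, and the sketch covers only the simple cases (no OR groups, and unrecognized words are silently dropped).

```python
# Toy vocabularies based on the example schema (all entries are illustrative).
OBJECT_LABELS = {"movie", "movies", "film", "films"}
PROPERTY_WORDS = {"genre", "category"}  # operatorName and displayLabel
# Assumed values that occur in the indexed documents' genre field.
GENRE_VALUES = {"action", "comedy", "thriller"}


def interpret(query: str) -> str:
    """Turn a simple free-text query into an operator-based query string."""
    parts = []
    found_object = False
    for token in query.lower().split():
        if token in OBJECT_LABELS:
            found_object = True
        elif token in GENRE_VALUES:
            parts.append(f"genre:{token}")
        elif token in PROPERTY_WORDS:
            # The property word only names the field; the value carries the filter.
            continue
        # Any other word is ignored in this sketch.
    if found_object:
        parts.append("objecttype:movies")
    return " ".join(parts)


print(interpret("action films"))            # genre:action objecttype:movies
print(interpret("comedy category movies"))  # genre:comedy objecttype:movies
```

The point of the sketch is that "films" and "category" are only recognized because the schema declares them as display names; without those labels, the words would be treated as ordinary keywords.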
Enable date, numerical, and sort interpretations
You should define the lessThanOperatorName and greaterThanOperatorName, specified in IntegerOperatorOptions, for all date and numerical properties. These settings enable automatic date and numerical interpretations. Additionally, to enable sort interpretations, set the isSortable option for date and numerical properties. The following schema shows how to enable these options.
{
  "objectDefinitions": [
    {
      "options": {
        "displayOptions": {
          "objectDisplayLabel": "Films"
        }
      },
      "propertyDefinitions": [
        {
          "name": "runtime",
          "isReturnable": true,
          "isSortable": true,
          "integerPropertyOptions": {
            "orderedRanking": "DESCENDING",
            "minimumValue": {
              "value": 10
            },
            "maximumValue": {
              "value": 500
            },
            "operatorOptions": {
              "operatorName": "runtime",
              "lessThanOperatorName": "runtimelessthan",
              "greaterThanOperatorName": "runtimegreaterthan"
            }
          },
          "displayOptions": {
            "displayLabel": "Length"
          }
        },
        {
          "name": "releasedate",
          "isReturnable": true,
          "isSortable": true,
          "datePropertyOptions": {
            "operatorOptions": {
              "operatorName": "releasedate",
              "lessThanOperatorName": "releasedbefore",
              "greaterThanOperatorName": "releasedafter"
            }
          }
        }
      ]
    }
  ]
}
In the previous example:
- The numeric property runtime refers to the length of a movie. The runtimelessthan and runtimegreaterthan operators are set for this property.
- The date property releasedate refers to when a movie is released in theaters. The releasedbefore and releasedafter operators are set for this property.
These settings enable Cloud Search to make the following query interpretations:
- Assuming the year is 2019, “movies released this year” is interpreted as objecttype:movies releasedafter:2019-1-1 releasedbefore:2019-12-31.
- Assuming the week is the third week in March, “movies released last week” is interpreted as objecttype:movies releasedafter:2019-3-10 releasedbefore:2019-3-16.
- “movies with runtime less than 90” is interpreted as objecttype:movies runtimelessthan:90.
- Assuming the year is 2019, “movies released this year and length more than 120” is interpreted as releasedafter:2019-1-1 releasedbefore:2019-12-31 objecttype:movies runtimegreaterthan:120.
- “sort movies by release date” would filter on objecttype:movies, and the results presented would be sorted on release date, with the default sort order being ascending.
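The date arithmetic behind the “this year” and “last week” interpretations above can be reproduced with a short Python sketch. The function names are hypothetical, and the Sunday-through-Saturday week is an assumption inferred from the example dates; Cloud Search's own logic may differ.

```python
from datetime import date, timedelta


def fmt(d: date) -> str:
    # Match the unpadded date style used in the examples above (2019-1-1).
    return f"{d.year}-{d.month}-{d.day}"


def this_year(today: date) -> str:
    """Operator terms for 'released this year'."""
    start = date(today.year, 1, 1)
    end = date(today.year, 12, 31)
    return f"releasedafter:{fmt(start)} releasedbefore:{fmt(end)}"


def last_week(today: date) -> str:
    """Operator terms for 'released last week' (weeks assumed Sunday to Saturday)."""
    # date.weekday() returns 0 for Monday and 6 for Sunday.
    days_since_sunday = (today.weekday() + 1) % 7
    this_sunday = today - timedelta(days=days_since_sunday)
    start = this_sunday - timedelta(days=7)
    end = this_sunday - timedelta(days=1)
    return f"releasedafter:{fmt(start)} releasedbefore:{fmt(end)}"


# A Wednesday in the third week of March 2019.
print(last_week(date(2019, 3, 20)))  # releasedafter:2019-3-10 releasedbefore:2019-3-16
```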
Enable reserved operator interpretation
You can also use the reserved built-in operators type, before, after, and objecttype to enhance query interpretation. When indexing a document, do the following:
- Populate the updateTime field in the ItemMetadata to use the before and after operators. These settings enable Cloud Search to make the following query interpretations:
  - “movies from last week” would list all the movies that were updated in the index the prior week.
  - “movies before jan 2019” would list all the movies that were indexed before January 2019.
- Populate the mimeType field in the ItemMetadata to use autodetection of type. A query “action videos” would list all action movie documents with a MIME type of application/mp4, application/mpeg4, application/x-shockwave-flash, video/, or application/vnd.google-apps.video.
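As a rough illustration of the two steps above, the following Python sketch builds an item body whose metadata carries updateTime and mimeType. The `build_item` helper, the item name, and the sample values are hypothetical; only the metadata field names come from the text above, and the sketch stops short of calling any indexing API.

```python
import json
from datetime import datetime, timezone


def build_item(item_name: str, object_type: str, mime_type: str,
               updated: datetime) -> dict:
    """Build an item body with the metadata fields discussed above."""
    return {
        "name": item_name,  # hypothetical item resource name
        "metadata": {
            "objectType": object_type,
            # Lets the type operator detect, for example, videos.
            "mimeType": mime_type,
            # Lets the before and after operators filter by update time.
            "updateTime": updated.astimezone(timezone.utc)
                                 .strftime("%Y-%m-%dT%H:%M:%SZ"),
        },
    }


item = build_item(
    "datasources/example-source/items/movie-1",  # placeholder IDs
    "movie",
    "application/mp4",
    datetime(2019, 3, 12, 8, 30, tzinfo=timezone.utc),
)
print(json.dumps(item, indent=2))
```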
Query interpretation limitations
The query interpretation feature has the following limitations.
- Query interpretation only works for these data source ACLs:
  - All documents are domain public (everyone in the domain can access them).
  - All documents are data source public (everyone who has access to the data source ACL).
  - The majority of documents in the data source have the same ACL (all documents inherit their ACL from the same container item) with no additional readers defined.
- If multiple schema operators have the same value, the interpretation of that value to an operator intent for a query depends on the overall confidence factor returned by the query interpretation system. For example, suppose you have the properties priority and severity, with operators of the same names defined in the schema, and both operators can have the values 0, 1, 2, or 3. In this example, "0" in a query can refer to the operator value for either priority or severity. These values are ambiguous, so the confidence level is lower.
- By default, Cloud Search’s query interpretation lowercases field values when interpreting the query, except for those text operators defined with exactMatchWithOperator options.
- The source operator is not supported in queries.
- Queries that combine operator-based terms and free-text terms are not interpreted. For example, the query "p0 priority cases severity:s0" isn't supported because "p0 priority cases" is a free-text term while "severity:s0" is an operator-based term.
- Interpreted results replace ordinary keyword results only when confidence is strong. With weaker confidence, interpreted results are blended with ordinary (non-interpreted, relevance-ranked) results rather than replacing the full page of results.
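Because queries that mix operator-based and free-text terms are not interpreted, a client might want to detect them before relying on interpretation. The following Python sketch is one hypothetical way to do that: the regular expression and function name are assumptions, and quoted multi-word values such as actor:"tom hanks" are not handled.

```python
import re

# An operator-based term looks like name:value (for example severity:s0).
OPERATOR_TERM = re.compile(r"^[A-Za-z_]\w*:\S+$")


def mixes_operators_and_free_text(query: str) -> bool:
    """Return True if the query has both operator terms and free-text terms."""
    tokens = query.split()
    has_operator = any(OPERATOR_TERM.match(t) for t in tokens)
    has_free_text = any(not OPERATOR_TERM.match(t) for t in tokens)
    return has_operator and has_free_text


print(mixes_operators_and_free_text("p0 priority cases severity:s0"))  # True
print(mixes_operators_and_free_text("action movies"))                  # False
```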