Full Text Search Optimization

Search conditions that use only search words combined with the logical operators AND and OR are fully optimized (assuming an FTS index exists on the field being searched). For example, the following query is fully optimized:

SELECT * FROM apdd
WHERE contains( definition, 'science and (history or proof)' )

Some search conditions require low-level searches of the data to fully resolve them. These searches return an initial optimized result set that must then be searched to determine which records fully meet the condition. The specific cases that require this post-processing step include the following:

Note that substring and post-fix searches do not require post-processing. Although they are not nearly as efficient as prefix and exact match searches, they are fully resolved in the initial result set.

Finally, some search conditions cannot be optimized at all. In particular, a search condition that applies the NOT operator to a sub-condition that requires post-processing cannot be optimized. For example, the following query cannot be optimized because it applies the logical NOT operator to a partially optimized sub-condition (here, the NEAR search):

SELECT * FROM apdd
WHERE contains( definition, 'not (species near snake)' )

A CONTAINS() function may also have to be evaluated by searching the physical data when the expression that contains it cannot be optimized as a whole. For details, see Advantage Optimized Filters. The most common cause is the logical OR operator of the SQL expression (not the FTS OR operator used inside a search condition). The following statement is an example:

SELECT * FROM apdd
WHERE contains( definition, 'polite recognition' )
OR id = 36

If the field "id" is not indexed, then the entire expression "contains( … ) or id = 36" cannot be optimized. This means that every record in the table must be evaluated against the entire expression including the CONTAINS() evaluation, which can be expensive. If the field "id" has an index on it and the field "definition" has an FTS index, then the entire statement will be optimized.
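As a minimal sketch of the fully optimized case, the statements below add an index on "id" and then run the combined query. The index name idx_apdd_id is hypothetical, and the sketch assumes that an FTS index already exists on "definition". Check the exact CREATE INDEX options against the Advantage SQL reference before relying on them.

```sql
-- Hypothetical index name; gives the "id = 36" branch an index to use.
CREATE INDEX idx_apdd_id ON apdd ( id );

-- Assumes an FTS index already exists on "definition".
-- With both indexes in place, both sides of the OR can be optimized.
SELECT * FROM apdd
WHERE contains( definition, 'polite recognition' )
   OR id = 36;
```

Without the index on "id", the same SELECT still returns the same rows, but every record is evaluated against the whole expression.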