Using the Advantage Full Text Search Engine
With the release of Advantage version 7.0, a new Full Text Search (FTS) engine has been added. The FTS engine allows fast searches of text (character) data. This includes the following data types: Character, Memo and BLOB. The BLOB field must contain text data; for example, Rich Text document, MS Word document, or other text document.
Full Text Search Operators
To make a search flexible, several operators can be used. The default operator is AND, which means all records containing all the specified words will be returned. The OR operator returns all records that contain any of the specified words. The NOT operator returns all records that do not contain the specified word(s). The NEAR operator returns all records where the word is within 8 words of the first word. The NEAR operator can also accept a proximity if you wish to change the default of within 8 words. To find an exact match, enclose the search condition in double quotes ().
The CONTAINS() scalar function specifies the search field and condition. This function can be used in either SQL statements or table filter expressions. Two additional scalar functions, SCORE() and SCOREDISTINCT(), are discussed below.
Full Text Search (FTS) Indexes
FTS indexes optimize the text search. It is not necessary to have an FTS index on the field you are searching in order to perform a search. However, FTS indexes will improve performance and allow for searching multiple fields with a single call to the CONTAINS() scalar function. Using an asterisk ( * ) as the field name in the CONTAINS() scalar function will search all fields that have an FTS index defined for the search word(s). For example, the following SQL statement will return records from the product table when Advantage is found in any field with an FTS index defined.
SELECT * FROM products WHERE CONTAINS( * , ‘Advantage’ )
You can define an FTS index for Character, Raw, Memo, Binary and Image data types. FTS indexes are always built on a single field; no index expressions are allowed. When creating an FTS index, several options are available. Minimum word length defines the shortest words to be added to the index. Maximum word length defines the longest word that can be physically stored in the FTS index. Words longer than the maximum value will be partially stored and require a post-processing step when searched for. Delimiters define how to separate words. Drop characters are characters that are ignored by the FTS engine. Conditional drop characters are ignored if they are at the end of a word but used if they appear inside a word. For example, a Web address www.advantagedatabase.com contains periods in the text. These periods will be included in the search, but if a period is at the end of a word, it will be dropped. Noise words are words ignored by the FTS engine. The default list includes common English words that often appear in normal conversation.
Additional options include Keep Score, Protect Numbers, Not Maintained, and Case Sensitive. By default, full text searches are case insensitive. Therefore, if you require a case sensitive search, you can build an FTS index with the Case Sensitive option. Not Maintained is used when you do not want Advantage to update the index with every change to the data. This type of index requires an explicit reindex operation to be updated. More information on all FTS Index options is available in the Advantage Help file.
FTS indexes can be created using Advantage Data Architect, SQL, or an ACE API call. An example SQL statement for creating an FTS index is below. You must use the CONTENT keyword to specify the index as FTS.
CREATE INDEX myfts ON table( myfield ) CONTENT
Once the index has been created, searches on the indexed field will be optimized. You can search using your new index with a statement like the following.
SELECT * FROM table WHERE CONTAINS( myfield, ‘words’ )
Using SCORE and SCOREDISTINCT
We will use the following text for the examples below:
The Advantage Full Text Search engine allows developers to provide quick search capability in their applications. The search functionality is available using SQL statements or table filter expressions. The SCORE() scalar function returns the total count of each word in the search condition. It uses the same parameters as a CONTAINS() scalar function, or it can accept the nth instance of CONTAINS(). For example, the two statements below would produce the same result.
SELECT SCORE( field, ‘search or developers’ ) FROM table WHERE CONTAINS( field, ‘search or developers’ )
SELECT SCORE( 1 ) FROM table WHERE CONTAINS( field, ‘search or developers’ )
The example text above would return 4 because ‘search’ appears three times and ‘developers’ appears once. Again, the SCORE() scalar function returns the total count of all occurrences of every word in the search condition.
The SCOREDISTINCT() scalar function returns the number of search words found in the text. It accepts the same parameters as the SCORE() scalar function. An example of SCOREDISTINCT() is shown below.
SELECT SCOREDISTINCT( field, ‘search or developers’ ) FROM table
SELECT SCOREDISTINCT( field, ‘Delphi or developers’ ) FROM table
The first statement would return 2 because both Search and Developers were found in the example text above. The second statement would return 1 when searching the example text because Delphi is not in the text.
Full Text Search is a powerful and useful feature included with Advantage Database Server v7.0. It is easy to implement and gives your users the ability to quickly search through vast amounts of data.