$stop_words
Words to exclude from the inverted index.
Holds a collection of methods to manipulate various types of search indexes.
tokenize(string $source, boolean $capture_offsets = false) : array
Converts a source string into a series of raw tokens.
string | $source | The source string to process. |
boolean | $capture_offsets | Whether to capture & return the character offsets of the tokens detected. If true, then each token returned will be an array in the form [ token, char_offset ]. |
An array of raw tokens extracted from the specified source string.
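As a rough illustration of the two return shapes described above, here is a minimal sketch. It is not the actual implementation: the function name `sketch_tokenize`, the lowercasing step, and the splitting pattern are all assumptions made for the example. Note also that `preg_split` reports byte offsets, which only match character offsets for single-byte text.

```php
<?php
// Hypothetical sketch, NOT the real tokenize() implementation.
// Assumption: tokens are lowercased runs of characters separated by
// whitespace or common punctuation.
function sketch_tokenize(string $source, bool $capture_offsets = false): array {
    $flags = PREG_SPLIT_NO_EMPTY; // drop empty tokens at the edges
    if ($capture_offsets) {
        // Each result becomes [ token, offset ] (offset is in bytes).
        $flags |= PREG_SPLIT_OFFSET_CAPTURE;
    }
    $parts = preg_split('/[\s.,;:!?()\[\]"]+/', strtolower($source), -1, $flags);
    // preg_split() returns false on a regex error; fall back to no tokens.
    return $parts === false ? [] : $parts;
}

$plain = sketch_tokenize("Hello, world!");
// $plain is [ "hello", "world" ]
$with_offsets = sketch_tokenize("Hello, world!", true);
// $with_offsets is [ [ "hello", 0 ], [ "world", 7 ] ]
```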
compare_indexes(array $oldindex, array $newindex, array $changed, array $removed)
Compares two *regular* indexes to find the differences between them.
array | $oldindex | The old index. |
array | $newindex | The new index. |
array | $changed | An array to be filled with the nterms of all the changed entries. |
array | $removed | An array to be filled with the nterms of all the removed entries. |
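The comparison can be pictured with the sketch below. It rests on assumptions not stated in this reference: that a regular index is an associative array keyed by nterm, that an entry counts as changed when it is new or differs from the old one, and that the two output arrays are passed by reference. The name `sketch_compare_indexes` and the `freq` field are hypothetical.

```php
<?php
// Hypothetical sketch, NOT the real compare_indexes() implementation.
// Assumption: each index maps nterm => entry, and $changed / $removed
// are filled by reference.
function sketch_compare_indexes(array $oldindex, array $newindex, array &$changed, array &$removed): void {
    // Anything present in the old index but missing from the new one was removed.
    foreach ($oldindex as $nterm => $entry) {
        if (!array_key_exists($nterm, $newindex)) {
            $removed[] = $nterm;
        }
    }
    // Anything new, or whose entry differs, counts as changed.
    foreach ($newindex as $nterm => $entry) {
        if (!array_key_exists($nterm, $oldindex) || $oldindex[$nterm] != $entry) {
            $changed[] = $nterm;
        }
    }
}

$old = ["cat" => ["freq" => 1], "dog" => ["freq" => 1]];
$new = ["cat" => ["freq" => 2], "bird" => ["freq" => 1]];
$changed = [];
$removed = [];
sketch_compare_indexes($old, $new, $changed, $removed);
// $changed is [ "cat", "bird" ], $removed is [ "dog" ]
```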
merge_into_invindex(array $invindex, integer $pageid, array $index, array $removals = array())
Merge an index into an inverted index.
array | $invindex | The inverted index to merge into. |
integer | $pageid | The id of the page to assign to the index that's being merged. |
array | $index | The regular index to merge. |
array | $removals | An array of index entries to remove from the inverted index. Useful for applying changes to an inverted index instead of deleting and remerging an entire page's index. |
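To show why `$removals` avoids deleting and remerging a whole page, here is a minimal sketch. It assumes an inverted index shaped as nterm => [ pageid => entry ], a `$removals` array that lists nterms, and an inverted index passed by reference; none of these details are confirmed by this reference, and `sketch_merge_into_invindex` is a hypothetical name.

```php
<?php
// Hypothetical sketch, NOT the real merge_into_invindex() implementation.
// Assumption: $invindex maps nterm => [ pageid => entry ] and is modified
// in place; $removals is a list of nterms to drop for this page.
function sketch_merge_into_invindex(array &$invindex, int $pageid, array $index, array $removals = []): void {
    // First drop this page's entries for the removed nterms.
    foreach ($removals as $nterm) {
        if (!isset($invindex[$nterm])) {
            continue;
        }
        unset($invindex[$nterm][$pageid]);
        // Discard nterms that no longer point at any page.
        if (empty($invindex[$nterm])) {
            unset($invindex[$nterm]);
        }
    }
    // Then add or overwrite this page's entries.
    foreach ($index as $nterm => $entry) {
        $invindex[$nterm][$pageid] = $entry;
    }
}

$invindex = [];
// Initial merge of page 1.
sketch_merge_into_invindex($invindex, 1, ["cat" => ["freq" => 1], "dog" => ["freq" => 1]]);
// Later edit: "cat" changed and "dog" was removed, so only the differences are applied.
sketch_merge_into_invindex($invindex, 1, ["cat" => ["freq" => 2]], ["dog"]);
// $invindex is [ "cat" => [ 1 => [ "freq" => 2 ] ] ]
```

Paired with a comparison step that yields the changed and removed nterms, this lets an edit touch only the affected entries.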
extract_context(string $invindex, string $pagename, string $query, string $source) : string
Given a search query, extracts a context string (in HTML) that could be displayed in a list of search results.
string | $invindex | The inverted index to consult. |
string | $pagename | The name of the page that this source belongs to. Used when consulting the inverted index. |
string | $query | The search query to generate the context for. |
string | $source | The page source to extract the context from. |
The generated context string.
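The general idea of a context string can be sketched as follows. This is a deliberately simplified stand-in: it scans the source directly with `stripos` rather than consulting the inverted index, so it omits the `$invindex` and `$pagename` parameters. The function name `sketch_extract_context`, the `$radius` parameter, and the " ... " separator are assumptions for illustration only. It works on bytes, so multi-byte text may be cut mid-character.

```php
<?php
// Hypothetical sketch, NOT the real extract_context() implementation.
// Assumption: a snippet is $radius bytes either side of the first
// case-insensitive match of each query term, HTML-escaped.
function sketch_extract_context(string $query, string $source, int $radius = 20): string {
    $snippets = [];
    foreach (preg_split('/\s+/', $query, -1, PREG_SPLIT_NO_EMPTY) as $term) {
        $pos = stripos($source, $term);
        if ($pos === false) {
            continue; // term does not occur in this page
        }
        $start = max(0, $pos - $radius);
        $length = ($pos - $start) + strlen($term) + $radius;
        // Escape the snippet so page markup cannot leak into the results list.
        $snippets[] = htmlspecialchars(substr($source, $start, $length));
    }
    return implode(' ... ', $snippets);
}

$context = sketch_extract_context("brown", "The quick <b>brown</b> fox", 4);
```

A real implementation would likely reuse the token offsets stored in the index instead of rescanning the source for every term.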