Mirror of https://github.com/sbrl/Pepperminty-Wiki.git (synced 2024-12-22 13:45:02 +00:00)
Bugfix: Increment $i when we hit a stop word when indexing.
There's also another bug here: the generated offsets are indexes into the array of tokens, when we need them to be indexes into the source text!
parent c687a9b029
commit 8403ffd5c3
1 changed files with 3 additions and 1 deletions
@@ -548,14 +548,16 @@ class search
 			{
 				$nterm = $term;
 
 
 				// Skip over stop words (see https://en.wikipedia.org/wiki/Stop_words)
-				if(in_array($nterm, self::$stop_words)) continue;
+				if(in_array($nterm, self::$stop_words)) { $i++; continue; }
 
 				if(!isset($index[$nterm]))
 				{
 					$index[$nterm] = [ "freq" => 0, "offsets" => [] ];
 				}
 
+				// FIXME: Here we use the index of the token in the array, when we want the number of characters into the page!
 				$index[$nterm]["freq"]++;
 				$index[$nterm]["offsets"][] = $i;
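The FIXME in this hunk asks for offsets measured from the start of the page text instead of positions in the token array. One possible way to get there is sketched below; it is an illustration only, not the fix that was eventually applied upstream. The function name `index_with_char_offsets`, the tokenising regex, and the sample text are all assumptions made for this example. It relies on `preg_split()` with `PREG_SPLIT_OFFSET_CAPTURE`, which reports byte offsets, so multibyte text would need extra handling to yield true character positions.

```php
<?php
// Hypothetical helper, NOT part of Pepperminty Wiki: builds an inverted index
// whose offsets are byte positions in the source text, not token numbers.
function index_with_char_offsets(string $source, array $stop_words): array
{
	// With PREG_SPLIT_OFFSET_CAPTURE every entry is [ token, byte offset in $source ].
	// The regex is an assumed tokeniser: split on anything that isn't a-z or 0-9.
	$tokens = preg_split(
		'/[^a-z0-9]+/',
		strtolower($source),
		-1,
		PREG_SPLIT_NO_EMPTY | PREG_SPLIT_OFFSET_CAPTURE
	);

	$index = [];
	foreach($tokens as [ $term, $offset ])
	{
		// Skip over stop words. No counter needs incrementing here, because
		// each token carries its own offset.
		if(in_array($term, $stop_words)) continue;

		if(!isset($index[$term]))
		{
			$index[$term] = [ "freq" => 0, "offsets" => [] ];
		}

		$index[$term]["freq"]++;
		$index[$term]["offsets"][] = $offset;
	}
	return $index;
}

$source = "The cat sat on the mat. The cat slept.";
$index = index_with_char_offsets($source, [ "the", "on" ]);
// $index["cat"]["offsets"] is [ 4, 28 ], and substr($source, 28, 3) is "cat"
```

Because the offset travels with each token, skipping a stop word cannot desynchronise the positions recorded for later terms, which is the class of bug the `$i++` in this commit patches by hand.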