d131666ff5
Squash all the error messages, but it's not working as intended.
...
All that seems to have happened is that searches are taking longer and
not doing anything different.....
2020-03-15 18:10:23 +00:00
f632c0907c
Integrate didyoumean into the main search engine, but it's crashing.
...
We're getting there though!
2020-03-15 17:54:27 +00:00
390eafb7fc
Refactor search engine out into multiple files
2020-03-14 17:18:51 +00:00
269ba583fd
feature-search: add command
2020-03-11 23:51:49 +00:00
2eb4f73c5e
Add dependency system to build system ahead of a feature-search refactor.
...
We should probably refactor the build script into something more
object-oriented too, since it's getting somewhat complicated. I've added
some ASCII art headers as a stop-gap for now, but a proper refactor of
that too (into a class-based system probably) is incoming I think.
2020-03-11 23:32:10 +00:00
6cfd60b765
feature-search: add search rebuild shell command
2020-03-11 23:07:38 +00:00
1abcd96699
Remove stray debug statement
2019-12-23 22:02:41 +00:00
456f749ffe
Bugfix: Squash bug in new array_simple search optimisation
2019-12-23 21:58:23 +00:00
e4eee4e281
Fix comment typo
2019-12-15 22:38:44 +00:00
6d675fc783
Bugfix: Add missing apostrophes in stop words
2019-12-15 20:21:05 +00:00
6f4b1a62e9
Fix + weighted word support on stas-parse action
2019-12-15 20:03:04 +00:00
c80f26962e
Refactor stas_split to be more fasterererer
...
Informal testing shows that it's gone from taking ~18% of the total time
to ~4% of the total time :D
2019-12-15 17:56:56 +00:00
843f0f7ee9
Update comment
2019-12-10 01:13:51 +00:00
d53f0ed85a
Remove search::transliterate, as it has a huge performance overhead.
...
Use search::$literator->transliterate() directly instead.
2019-12-08 21:04:59 +00:00
8156055b5c
Improve search index write & lookup performance by implementing new arr_simple system
...
By serialising and deserialising lists of numbers with implode &
explode, we can further cut down on the json_* calls which are
reeeeeally slow.
2019-12-06 23:40:28 +00:00
aea1255f10
Bugfix: Return correct type in StorageBox::delete()
2019-09-21 21:05:14 +01:00
157c6dabdd
If it's a list of strings, then it should be sorted correctly.
2019-09-03 18:16:01 +01:00
f160a82063
Add note to search page linking to query syntax on help page
2019-08-24 20:47:41 +01:00
da5b3a5df8
Do some documentation work, and add missing help sections
2019-08-24 19:56:14 +01:00
e773e36de5
Tweak stas-parse action output
2019-08-23 01:29:11 +01:00
632375417d
Add apiDoc comment
2019-08-23 01:27:35 +01:00
e6ba31df23
Add debug stas-parse action
2019-08-23 01:24:17 +01:00
276b4c808f
Add STAS parsing to query-searchindex output
2019-08-23 00:51:39 +01:00
4d51ae924e
Fix intitle: & intags: syntax - game, set, match.
2019-08-22 22:23:30 +01:00
9505e0653e
Fix some more odd bugs in the new search system
2019-08-22 22:11:09 +01:00
edf1be5801
Fix a *huge* number of bugs in the new search system, but it's not ready just yet
2019-08-22 21:38:17 +01:00
e08e775d98
Finish refactoring invindex_query
2019-08-22 17:43:14 +01:00
b93dd3d9cc
Start refactoring query_invindex & rename it to invindex_query
...
....but of course it's not finished yet. We're doing well, but there are
a few thorny issues to go.
Mainly: We need to seriously optimise ids::getpagename(), 'cause we'll
need it a *lot* when we get to implementing the size, before, and after
colon : directives.
2019-08-18 21:25:48 +01:00
ce6df06817
Start refactoring the search system to use a new key-value store backend
...
....but it's not finished yet.
It should improve performance significantly when it's done & optimised,
as we won't have to load the entire search index into memory & decode it
just to perform a single query.
2019-08-18 18:52:29 +01:00
38badd3c1f
[search] Add StorageBox.php as an extra data file
...
It's time to refactor the search system to use an SQLite-backed
key-value data store. It's just a shame that something designed for this
like LevelDB / RocksDB doesn't have a PHP package that we can use :-/
We can always switch later, I suppose.
2019-08-17 20:47:51 +01:00
7088990027
Minor code formatting
2019-08-17 01:19:04 +01:00
5609506def
minor formatting
2019-08-16 01:14:38 +01:00
127270ff89
Bugfix: Correct search query performance metrics
2019-08-15 23:46:23 +01:00
ddc36bf48e
Remove commented code
2019-08-15 23:17:33 +01:00
0a5ba3ff59
Improve search invindex alteration performance
...
This will be especially noticeable when using invindex-rebuild
2019-08-15 23:06:06 +01:00
50efd4bb49
Bump versions
2019-05-06 23:48:34 +01:00
c177b66b42
Bugfix: Don't throw a warning if the search index doesn't exist yet
2019-05-06 20:22:36 +01:00
a3330829cb
Bump module versions & go over documentation comments
2019-02-10 23:18:34 +00:00
5b670f5981
Refactor method names in page renderer
2019-01-27 22:56:51 +00:00
c7d7de3d7e
Don't include semicolons in greedy internal links
2018-09-29 23:40:23 +01:00
39098ac0fb
Display an ellipsis at the beginning of a search context if it doesn't start at the beginning of a page
2018-09-29 13:32:17 +01:00
24775724d1
Bugfix: Correctly calculate the end offset of search context snippets
2018-09-29 13:27:17 +01:00
284b404946
Typos in comments
2018-09-12 21:27:51 +01:00
31d555f482
Bump version of search module
2018-07-01 12:14:06 +01:00
1f6f780177
Restyle matching tags in search results
2018-06-30 11:46:07 +01:00
8955d6d131
Save the character offset, not the token offset in the inverted index
2018-06-30 11:19:38 +01:00
cdee30c286
Add $capture_offsets option to tokenize().
...
TODO: Utilise this in the indexer & update the changelog mentioning that
_all_ inverted indexes will need to be rebuilt
2018-06-30 00:08:57 +01:00
8403ffd5c3
Bugfix: Increment $i when we hit a stop word when indexing.
...
There's also another bug here - in that the offsets generated are the
index in the array of tokens, when we need them to be the index in the
source text!
2018-06-29 23:51:10 +01:00
9d7a21e993
Format the index action nicely
2018-06-29 12:08:38 +01:00
19e49777b2
Search System; Don't bother getting a page's id if we don't need to
2018-06-26 14:28:11 +01:00