IIRC, MD moved to an Elasticsearch backend for most of its data maybe 2-3 years ago, which also significantly changed how search itself works. Search was already kind of spotty before the switch, but I tolerated its flakiness on ES for a while, hoping it would be refined into something more reliable and precise. That doesn't seem to have happened (or I'm just searching wrong relative to how MD expects people to search).
My most direct examples are titles composed of a single, relatively common word. "HITS" is a good example: the "must include" syntax in double quotes (if still valid) isn't always respected. Likewise, for titles that aren't made up of common "garbage" terms (e.g. the, or, and, is, be, am), searches turn up either nothing or unrelated entries a surprising amount of the time. A recent example: I searched for "Abandoned Bastard", initially leaving off the rest of the title (the full title is "Abandoned Bastard of the Royal Family"). The first two words alone returned no results whatsoever, and nothing matched until I had typed the entire title.
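To make the partial-title case concrete, here's roughly the kind of query I'd have expected to work. This is only a sketch: `match_phrase_prefix` is a real Elasticsearch query type, but the `title` field name and the helper function are my own guesses, and I have no idea what MD's index or queries actually look like.

```python
def prefix_title_query(user_input: str, field: str = "title") -> dict:
    """Build an Elasticsearch request body using match_phrase_prefix.

    The words must appear in order, and the last word is treated as a prefix.
    The default field name "title" is an assumption, not MD's real mapping.
    """
    return {
        "query": {
            "match_phrase_prefix": {
                field: {"query": user_input.strip()}
            }
        }
    }


# Body for the partial title from the example above.
body = prefix_title_query("Abandoned Bastard")
```

With a standard analyzer on that field, I'd expect a body like this to match "Abandoned Bastard of the Royal Family" from just the first two words.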
The other oddity, and something of an annoyance, is that titles may use either "wo" or "o" for the particle を/ヲ. In several cases where a title uses "o", searching with "wo" is a complete miss.
I've worked with ES a bit in the past, so I'm aware it can score weighted "matches" for a search term, which helps account for things like misspellings and grammar in user input. I can understand if returning the correct results more often is complicated by several things: most modern titles have numerous aliases; Romanization varies stylistically ("desu ga" vs "desuga", "janai" vs "ja nai"); the initially used title may turn out not to be the officially adopted one; and extremely long-winded titles get shortened.
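For the "wo"/"o" and spacing variations specifically, I'd naively imagine something like the following being applied to both the indexed titles and the search input. It's a minimal sketch of the idea, not a claim about how MD or ES handles it; `romaji_key` is a made-up helper, and "Hon wo Yomu" is just an invented phrase for illustration.

```python
import re


def romaji_key(title: str) -> str:
    """Collapse a romanized title into a loose comparison key.

    - lowercases the text
    - keeps only runs of ASCII letters and digits (anything else is dropped)
    - treats a standalone "wo" as the particle "o"
    - joins everything without spaces, so "desu ga" and "desuga" agree
    """
    tokens = re.findall(r"[a-z0-9]+", title.lower())
    tokens = ["o" if token == "wo" else token for token in tokens]
    return "".join(tokens)


# Variants that should land on the same key.
same_particle = romaji_key("Hon wo Yomu") == romaji_key("Hon o Yomu")
same_spacing = romaji_key("Desu ga") == romaji_key("desuga")
same_negation = romaji_key("ja nai") == romaji_key("janai")
```

Dropping the spaces is obviously lossy (two different titles could collapse to the same key), so I'd only expect something like this to work as an extra match field alongside the normal one, not as a replacement for it.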
Finally, I scrounged around a bit for an MD search guide before posting this and found nothing official or along the lines of what I'm asking about. Any info MD can chime in with on this - even just a "we're aware, but the fix is really painful" - is appreciated. I'd rather understand why things are the way they are than keep putting up with it begrudgingly, wondering why it can't be better. 🙇♂️