Regarding the recent site issues (502s) and Bayesian averages and ratings histogram

Not open for further replies.
May 3, 2018
That's what I know, but I don't know what my ISP does with the connection, since they're very strict about porn on the internet lol
This also happens when I try to access reddit, because they block it. I hope the white knights didn't report this site for having adult content
Dex-chan lover
Jan 22, 2018
@gormadoc thank you for answering...

this Bayesian average and its equations remind me of my HS and college days in math class...
Jan 26, 2018
The Bayesian average thing seems like a good idea in general. I know BoardGameGeek (a different hobby, but still) does something similar where they add 100 average ratings to every entry and only allow those with 30+ ratings to appear on ranking lists.

Based on the difference between the Bayesian average and mean of a few popular manga, I've calculated that the m factor (average number of ratings) is around 27 currently. I would argue that this number should be cut down a bit, maybe using m/2 instead of m. If it's too large, it will take too many ratings for the score to converge to the "true" rating, especially for really awful manga that no one even wants to read. I mean, under the current system, even if 20 people gave a manga a 1/10 rating, it'd still have a Bayesian rating of 4.96/10, because more than half of the rating is the 7.9 average of all scores. My change at least lowers that to 3.78/10 after those 20 votes.
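For anyone who wants to check those numbers, here is a minimal sketch. It assumes the usual form of the formula, (m*C + sum of ratings) / (m + n), with C = 7.9 and m = 27 taken from this post; the function and parameter names are made up for illustration, and the site's actual implementation may differ.

```python
def bayesian_average(ratings, prior_mean, prior_weight):
    """Blend a title's own ratings with the site-wide mean.

    prior_mean   -- C, the site-wide average rating (7.9 in this thread)
    prior_weight -- m, how many "phantom" average votes are added (assumed ~27)
    """
    n = len(ratings)
    return (prior_weight * prior_mean + sum(ratings)) / (prior_weight + n)


# 20 people all rate a manga 1/10.
votes = [1] * 20

current = bayesian_average(votes, prior_mean=7.9, prior_weight=27)
halved = bayesian_average(votes, prior_mean=7.9, prior_weight=27 / 2)

print(round(current, 2))  # 4.96
print(round(halved, 2))   # 3.78
```

Halving m simply halves the number of phantom votes, so the title's own ratings outweigh the prior sooner.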
Jan 19, 2018
Not a math expert by any means, so I will gladly leave that aspect to you guys. But I do agree that it wouldn't be a bad idea at all if the ratings were hidden (and not usable for searches, etc.) for x period of time, or until a certain number of ratings has been reached. This could help keep searches from simply bringing up results that have 1-2 ratings that conveniently were high. Also, if possible, I would hide ratings for anything that doesn't have any chapters in the language you are searching for.

To offset this some, you could also include an option in the search for including (or ignoring) the above.
Jan 19, 2018
>Feedback on the usage of this formula is welcome. The average rating on the site around 7.9.
That average seems high to me, to a degree that its usefulness might be impacted. You only have 2.1 points out of 9 to distinguish the very best manga from the just-sorta-better-than-usual manga. Set aside any adjustments to the mean score for a moment; I think there are more fundamental problems with the metrics that tweaks can't solve.
Feb 11, 2018
@chevan said:
Feedback on the usage of this formula is welcome. The average rating on the site around 7.9.
That average seems high to me, to a degree that its usefulness might be impacted. You only have 2.1 points out of 9 to distinguish the very best manga from the just-sorta-better-than-usual manga. Set aside any adjustments to the mean score for a moment; I think there are more fundamental problems with the metrics that tweaks can't solve.
For me at least, my personal rating scheme is highly influenced by the anchoring for school grades. So for me, 60 = sorta bad, 70 = meh average, 80 = good, 90 = great, and 100 = excellent. I tend to give 50/100 to series that are thoroughly unimpressive (i.e. completely neutral), and I only give ratings lower than that to manga that really profoundly annoy me in some way (e.g. 1, 2).

But interestingly enough, grade inflation is actually a thing.

EDIT: I suspect selection bias also plays a role in rating inflation, because I rarely stick around to read anything that's lower than a 6/10. Literally >95% of the ratings that I've given are somewhere between 6 and 10. Here's the distribution for the ratings that I have given (CDF in parentheses).

- 10: 30 (1-30)
- 9: 36 (31-76)
- 8: 202 (77-278)
- 7: 195 (279-473)
- 6: 15 (474-488)
- 5: 9 (489-497)
- 4: 2 (498-499)
- 3: 1 (500)
- 2: 1 (501)
Dec 16, 2018
The updates page is broken: it no longer shows all updates. It doesn't even stop at a previously seen one, it just stops at a random point, and it has made me miss some updates.
is a Reindeer
Jan 24, 2018
@Catpyre The updates page works properly for me, do you have any genre filters set up? It could be blocking some series from appearing on the updates page
Miku best girl
May 29, 2012
Maybe I should just use 5 as the global average in the formula, and not 7.9 or whatever the actual site average is at any one moment.
Dec 16, 2018
@plykiya nope I don't have filters and it does it whether I'm logged in or not. Sometimes it fixes itself but most of the time it's broken.

Yeah, I go and click 2 to go to the next page and it just freakin breaks entirely: it doesn't display anything, and after going back it no longer shows that I can go to any other pages and ends the list at a random point.

Fixed it by manually entering the url?? Weird bug.
Feb 4, 2018
Initially I was of the impression that the Bayesian average wasn't very useful since most manga should tend toward that anyway with enough votes, but looking at some of my lowest (and highest) rated followed manga I can see quite a few that are closer to sane ratings with that in place. I wouldn't change from using the site average to a base of 5 since that negates most of the benefit since the site's rating scale is currently pretty skewed (I currently consider anything with a non-Bayesian average under 7 probably trash). The ratings could probably use some kind of weighting to re-expand the upper ratings since there is so little variance among the ratings right now. Initially I considered suggesting decreasing the weight of ratings outside of 1 standard deviation of the mean for the final rating but that mostly just serves to dilute extreme ratings from manga with a tight rating clump. Doing something with standard deviation might still be wise though since good manga tend to have a small standard deviation in their ratings right now and bad manga have a tendency toward high standard deviations. Maybe giving a malus to a manga's average rating or sort order when sorted by rating when they have a high standard deviation (or just the ability to filter by a max standard deviation) would help though that also comes with the risk of reducing trust in the average rating of manga.

Back to the topic of the Bayesian average, I feel that it might be smarter to just not show a rating for manga with a lack of ratings, since the Bayesian average could skew unrated manga toward looking a lot more average than they are. I also feel like something could be done with manga that don't have a displayed rating yet when someone is sorting or filtering by rating, since they tend to be more of wildcards than manga with established ratings, so the typical filtering or sorting rules don't apply as well. If there is no intention to hide ratings for manga with few ratings, being able to filter them out of searches may help with people's confidence in ratings, because there is currently a lot of noise when filtering by rating due to manga with only a few ratings not being where they probably ought to be (and the tight clump of ratings among rated manga).
Jan 19, 2018
For me at least, my personal rating scheme is highly influenced by the anchoring for school grades. So for me, 60 = sorta bad, 70 = meh average, 80 = good, 90 = great, and 100 = excellent. I tend to give 50/100 to series that are thoroughly unimpressive (i.e. completely neutral), and I only give ratings lower than that to manga that really profoundly annoy me in some way (e.g. 1, 2).
That IS a very commonly used standard for grades, but personally I've always thought it was a poor scale. I realize this is tilting at windmills and I don't quite think what I'm saying here should inform policy for mangadex, but I've always thought that 50 should be the unremarkable point, not 70. With the completely neutral score at 70, you're basically saying it's of more value to you to distinguish the sort-of-ok-but-bad from the truly awful, rather than distinguishing the amazing from the slightly-above-average. And then, as you say, bias comes in and people don't even rate things 5-6/10 or lower, and suddenly your scale is effectively 6-10, not 1-10, for all the manga on the site.

Do we actually care about whether a manga has a score of 1, 2, 3, or 4? Not really. Why, then, reserve a full 6-7 units of the scale for bad manga? We're just shortchanging our ability to describe how GOOD a manga is.
Mar 4, 2018
OMG, thank you very much @Holo, really appreciate this feature. Will test it out and share feedback later on this.

Quick edit 2: Noted that the Bayesian average is currently only applied in select locations. I just read through page 2 of this thread; I was too excited to respond earlier.

Edit 3:
Did a quick check on a few new titles and the most popular manga, and I find the Bayesian average is working as intended. It is great. Can't wait to see it in sorting later.
1 - on popular titles where we have great amount of ratings, the Bayesian average ~= mean
2 - on new titles, the Bayesian pulled the average close to overall mean

I am happy with the current implementation here but after reading through some other user feedback in regards to overall rating skew, I would like to propose normalization of the rating.

1 - Get the Bayesian average rating from all manga - DONE
2 - Get the mean & standard deviation from the Bayesian average ratings
3 - Get the z-factor = (value-mean)/std, which will result in rating range from ~ -6.5 to ~6.5
4 - Then we can use P-value/10 to get the new re-scaled & Bayesian rating
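A rough sketch of steps 2-4, in Python rather than JS. Note the assumptions: I read "P-value" as the standard normal CDF of the z-score (a 0-1 percentile) and multiply by 10 to land on a 0-10 scale, which may not be exactly what was meant by "P-value/10". The sample scores are made up, and `rescale` is a hypothetical helper, not anything from the site.

```python
from statistics import NormalDist, mean, pstdev


def rescale(bayes_scores):
    """Map Bayesian averages onto 0-10 via z-score and the normal CDF."""
    mu = mean(bayes_scores)       # step 2: mean of the Bayesian averages
    sigma = pstdev(bayes_scores)  # step 2: their (population) standard deviation
    if sigma == 0:
        # Every title has the same score; put them all at the midpoint.
        return [5.0 for _ in bayes_scores]
    std_normal = NormalDist()     # mean 0, standard deviation 1
    # step 3: z = (value - mean) / std; step 4: percentile scaled to 0-10
    return [10 * std_normal.cdf((s - mu) / sigma) for s in bayes_scores]


# Made-up Bayesian averages clustered near the top, like the site's ratings.
sample = [6.5, 7.4, 7.9, 8.1, 8.6, 9.2]
rescaled = rescale(sample)
for before, after in zip(sample, rescaled):
    print(f"{before:.1f} -> {after:.2f}")
```

The order of titles stays the same; only the spacing changes, so a title sitting exactly at the site mean lands on 5.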

Not sure which programming language you are using for mangadex, but some JS examples can be found here,

Edit 4: One downside to this method might be that the final re-scaled rating ends up quite a bit further from the simple ratings average, especially on the high end where the ratings are skewed. While it is mathematically sound, it might not feel natural, so it needs further field testing to confirm.
Dec 25, 2018
On the Bayesian average topic, 7.9 does seem a tad too high. One option could be to use a genre- or demographic-specific average as the base average instead of the site-wide average. Another option to think about is the number of ratings used as what some might call the fudge factor. The average number of ratings throughout the site might not be a useful criterion (and it could increase as time goes by, so the Bayesian average would take longer to reach a title's actual rating). A fixed number, say 100 or 200, could be used, or whatever the community feels is the minimum number of votes needed for the user rating to be reliable.
In this context, since this is user driven, as other posters have mentioned, there is no way to give a "correct rating": each person has their own taste or measurement scale for what is good or bad, and ratings can be skewed easily by folks wanting the average to go their way. That's the reason you see a lot of items with a spike at the 1 rating: people most likely don't rate it at the number they think it deserves, but rate it way low so that the average moves closer to their rating.
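To put a number on how the fudge factor affects convergence, here is a small sketch. It assumes the same (m*C + sum of ratings) / (m + n) form discussed earlier in the thread and that a title's votes average out to its "true" rating R. In that case the gap between the Bayesian average and R is m*|C - R| / (m + n), so the votes needed to get within a tolerance eps is n >= m * (|C - R| / eps - 1). The function name and the example values are hypothetical.

```python
import math


def votes_needed(true_mean, prior_mean, prior_weight, tolerance):
    """Votes required before the Bayesian average is within `tolerance` of true_mean.

    Assumes the votes themselves average exactly true_mean.
    """
    gap = abs(prior_mean - true_mean)
    if gap <= tolerance:
        return 0  # the prior alone is already close enough
    return math.ceil(prior_weight * (gap / tolerance - 1))


# A title that "deserves" a 6.0, with the site average of 7.9 as the prior,
# and half a point as the acceptable error.
for m in (27, 100, 200):
    print(m, votes_needed(true_mean=6.0, prior_mean=7.9, prior_weight=m, tolerance=0.5))
```

The requirement grows linearly with m, so a fixed 100 or 200 asks for several times more votes than an m around 27 before a title's score reflects its own ratings.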
Jan 19, 2018
I think the observation about no true correct rating and ratings being dependent on user intention is an important one. Is a corrected average the best approach, in that light? Maybe just showing the distribution of votes alongside the simple mean would be sufficient.
Mar 10, 2018
Glad you will upgrade the servers again today, and glad I donated again this year. Amazing service for scanlators. Thanks Doki. <3
Dec 16, 2018
Updates page still consistently breaking every time I try to use it. It just ends and won't show any more updates, and only randomly fixes itself for a short time before breaking again. This is infuriating at this point and never happened before the recent updates.