Major Milestone Reached!

Status
Not open for further replies.
Supporter
Joined
Jan 20, 2018
Messages
172
I remember the great migration of Batoto. Sad days. But the end result was better than expected.
 
Staff
Admin
Joined
May 29, 2012
Messages
594
@SGR

Totally agree on the Alexa rank stuff being kinda whack. Before we upgraded to version 3 of the site, it looked like Alexa even counted chapter reads into our ranking, or something equally fruity. Once v3 came out with the current API, our rank slowly dropped to where it actually should be. Then actual growth started to raise it again around the new year.

[graph: Alexa rank over time]



There's actually quite a lot of stuff related to the various DDoS attempts; sadly, it wasn't just one thing being targeted. While we have made certain stuff public, I don't believe we have ever covered it all in one place. So I'll try to cover as much as I can remember, cos I'm sure a few people will find it interesting.

Phase 1: The initial DDoS attempts (successful attempts for the most part) hit us like a truck and caused a fair bit of downtime and wtf spam on Discord.

Remedy wise:

1. Path.net helped us filter out loads of bad IPs that were hitting the site.
2. Reported to Amazon scores of AWS IPs that were hitting the site at unreal rates.
3. Fixed an issue with some certain IPs abusing the database in a certain way.
4. Something else I'm forgetting, but will add if I remember what it was.
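The IP filtering in steps 1-2 boils down to a network blocklist check at the edge. A minimal sketch; the networks below are stand-ins from the documentation ranges, since the actual filtered ranges were never published:

```python
import ipaddress

# Hypothetical blocklist; the real filtered ranges are not public.
BANNED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),   # documentation range, stand-in
    ipaddress.ip_network("198.51.100.0/24"),  # documentation range, stand-in
]

def is_banned(ip: str) -> bool:
    """Return True if the client IP falls inside any banned network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BANNED_NETWORKS)
```

In practice this kind of filtering sits upstream (Path.net / firewall rules) rather than in application code, but the logic is the same.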


Phase 2: After that, they (or someone else) switched to targeting the database server via the search function. This resulted in a lot of 502s.

Remedy wise:

1. Disabled search for guests so they had to actually make accounts and put in some effort if they wanted to keep going - which apparently they did, cos they came right back after this change.
2. Banned even more IPs and I reported more - this time to Azure.
3. We upgraded the DB infrastructure to handle much, much higher traffic and cope with literally millions of extra requests per hour. Since that upgrade it has resisted all attempts. We still need to upgrade some other stuff before we can fully mitigate such attacks, but the current round of upgrades is coping with the almost-daily attempts. We've had a few hiccups with slaves falling out of sync and have had to update the code and config as we go, but it is pretty stable right now.
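The master/slave setup in step 3 amounts to classic read/write splitting: writes go to the master, reads fan out across replicas. A minimal sketch with hypothetical connection objects (the site's actual topology and driver aren't public):

```python
import itertools

class RoutedDB:
    """Send writes to the master; spread reads across replicas round-robin."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas)

    def connection_for(self, query: str):
        # Crude verb sniffing for illustration only; a real router would
        # also handle transactions, replication lag, and failover.
        verb = query.lstrip().split(None, 1)[0].upper()
        if verb in ("SELECT", "SHOW"):
            return next(self._replicas)  # read: any replica will do
        return self.master               # write: must hit the master
```

The "slaves falling out of sync" hiccups mentioned above are exactly the replication-lag problem a real router has to account for.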

Phase 3: Flooding the site with requests and users via botnets... and more search spam attempts. Yeah, they went back to this after the DB started to be able to handle the spam.

Remedy wise:

1. Banned more dodgy accounts for their search spam.
2. Upgraded our banning rate limits to stop a lot of abuse on the servers.
3. Banned the use of over 20,000 fake emails (and counting) to make spam accounts more effort to make and slow down search spam attempts.
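The fake-email banning in step 3 comes down to a domain blocklist at signup. A toy sketch with a tiny stand-in list; the real list runs past 20,000 domains and is private:

```python
# Tiny stand-in list; the site's real blocklist (20,000+ domains) is private.
DISPOSABLE_DOMAINS = {"mailinator.com", "10minutemail.com", "guerrillamail.com"}

def allow_signup(email: str) -> bool:
    """Reject registrations whose e-mail domain is on the disposable list."""
    domain = email.rsplit("@", 1)[-1].lower()
    return domain not in DISPOSABLE_DOMAINS
```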


I might be forgetting a few things, but that is as much as I can recall at the moment.

Here are some graphs from various attempts since we upgraded our infrastructure. Wish I'd kept more copies, but I can only find the examples I linked on Discord.

[graph: traffic during an attack attempt]


[graph: traffic during another attack attempt]


The size of the botnet hitting the site has increased with each successive attempt. Spikes started at around 25k extra "users" after our upgrade, growing with each attempt; current attempts peak at over 50k extra "users" for a couple of hours at a time. Definitely hostile, though, as our analytics show no corresponding peaks in chapter hits during those times.
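That "users spike but chapter hits don't" observation can be written down as a simple heuristic. A sketch, with an illustrative threshold; this is not the site's actual detection logic:

```python
def looks_hostile(user_counts, chapter_hits, ratio_threshold=2.0):
    """Flag an interval as hostile when concurrent 'users' spike well above
    their baseline while chapter reads do not move with them.

    ratio_threshold is an assumed tuning knob, not a real site setting."""
    base_users = max(min(user_counts), 1)
    base_hits = max(min(chapter_hits), 1)
    users_up = max(user_counts) / base_users   # how far users spiked
    hits_up = max(chapter_hits) / base_hits    # how far reads spiked
    return users_up >= ratio_threshold and hits_up < ratio_threshold
```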

[graph: "user" spike during an attack]


[graph: "user" spike during another attack]


And here's the other end of the scale: certain new accounts spamming literally millions of extra requests at various times before we catch them and shut them down.

It's a big old game of whack-a-mole, and we are catching up slowly thanks to the upgrades. More improvements are coming, so hopefully there are good things on the horizon. Once we complete the next round of upgrades and implement the new Elasticsearch-based search, we will be able to open search back up to guests and it will be back to business as normal.

Hopefully someone will find this interesting so I don't feel like I wasted my time compiling it!
 
Dex-chan lover
Joined
Dec 2, 2018
Messages
233
Thank you for all your hard work! I'd be lost without this site <3.
 
Instrumentality Instigator
Staff
Super Moderator
Joined
Jan 29, 2018
Messages
1,346
@ObserverofTime
Memes aren't valid criticism. Please do not clutter up the thread with off-topic images. If you have something to say, please say it like a normal poster.

Thanks,

Zeph
 
Dex-chan lover
Joined
Jan 22, 2018
Messages
3,084
All I have to say to this news is:
[reaction image]

Now to see if you can shut them down for good and eliminate a hive of ads and villainy.
 

SGR

Dex-chan lover
Joined
Apr 2, 2019
Messages
456
@ixlone
I, for one, found it extremely interesting. The one thing proven here is that there is a large issue with the resilience of the software running the site, which resulted in throwing more hardware at the problem to mitigate the issue in the short term. There's no hint at a long-term solution, but then again the post you made was not about that. Here's a few random thoughts:
[ul]
[li]A 50k botnet seems like "gate #2" from my random assessment - a few script kiddies.[/li]
[li]The pre-v3 oddities still don't explain the jagged trend - they only explain the peak from two months ago. Look at mangakalot's smooth line (just to aim at a decently sized competitor) and compare.[/li]
[li]Did you start banning temporary e-mail sites yet?[/li]
[li]How about introducing a throttling mechanism with a higher threshold, to reduce the false-positive problem the captcha mechanism had?[/li]
[/ul]
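The throttling idea in the last bullet is typically implemented as a token bucket: a generous burst capacity (the "higher threshold") absorbs legitimate spikes, while only sustained abuse runs the bucket dry. A minimal sketch; the rates here are illustrative, not anything the site actually uses:

```python
import time

class TokenBucket:
    """Throttle with a burst allowance: legit users who open many tabs at
    once drain the bucket but recover; only sustained abuse exhausts it."""

    def __init__(self, rate: float, burst: float):
        self.rate = rate          # tokens refilled per second
        self.capacity = burst     # burst allowance (the "higher threshold")
        self.tokens = burst
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```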
 
is a Reindeer
VIP
Joined
Jan 24, 2018
Messages
3,231
@SGR We have a pretty extensive list of temporary e-mail sites that are banned.

Captcha wasn't based on a rate limit. If we got hit and the site went down as a result, captcha was turned on for everyone on the site as the only preventative measure. But now there are better-tuned limits for banning just the offenders, without needing to resort to captcha.
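The "ban just the offenders" approach contrasts with the old site-wide captcha: track each client in a sliding window and act only on the ones over the limit. A rough sketch; the limit, window, and in-memory storage are illustrative assumptions, not the site's actual configuration:

```python
from collections import defaultdict, deque

class PerClientLimiter:
    """Count each client's requests in a sliding window; ban only clients
    that exceed the limit, instead of punishing everyone with a captcha."""

    def __init__(self, limit: int, window: float):
        self.limit = limit        # max requests allowed per window
        self.window = window      # window length in seconds
        self.hits = defaultdict(deque)
        self.banned = set()

    def check(self, ip: str, now: float) -> bool:
        if ip in self.banned:
            return False
        q = self.hits[ip]
        while q and now - q[0] > self.window:
            q.popleft()           # drop hits that fell out of the window
        q.append(now)
        if len(q) > self.limit:
            self.banned.add(ip)   # only this client is affected
            return False
        return True
```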
 
Staff
Admin
Joined
May 29, 2012
Messages
594
@SGR

Not a coder, so not really gonna comment on the software too much, other than to say that it is constantly being improved and worked on.

What I can say, however, is that we have a lot of data memcached. For example, there are several hundred million database rows just for user chapter tracking - I don't think people realise just how much info is involved in tracking every chapter every user has ever read - and those rows constantly need updating as people read all day. On top of that there are all the static page caches, various other stats and info caches, customised user pages, and so on. Software can only do so much. We do/did actually need the extra hardware, not only to support our growth but also to help keep the site usable for everyone while we fend off various bad actors. The slave servers now mean the database master's CPU is no longer perpetually on fire trying to keep all the info up to date and 502ing the site when it can't keep up.
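The memcached layer described here follows the usual cache-aside pattern: read from the cache, fall back to the database on a miss, and invalidate on write. A toy sketch with a plain dict standing in for memcached; the key format and schema are made up for illustration:

```python
# A dict stands in for memcached; keys and schema are illustrative only.
cache = {}

def get_read_chapters(user_id: int, db_fetch) -> set:
    """Serve a user's read-chapter set from cache, falling back to the DB."""
    key = f"read:{user_id}"
    if key in cache:
        return cache[key]             # cache hit: no DB work
    value = db_fetch(user_id)         # cache miss: hit the tracking table
    cache[key] = value
    return value

def mark_read(user_id: int, chapter_id: int, db_write) -> None:
    """Write through to the DB and invalidate the cached set."""
    db_write(user_id, chapter_id)
    cache.pop(f"read:{user_id}", None)
```

With hundreds of millions of tracking rows, the point of the cache is that repeated reads cost nothing; only writes (and the first read after one) touch the master.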

The new search will reduce the load on our database servers even further once we finish it and move it from the beta site to the live version - which is one of the future software improvements.

1. Script kiddies, probably, but persistent ones - and with ~500k unique users per day, keeping the site reachable is important to us.

2. I think the peak just looks that way cos of the steady increase before we plateaued at our current level, and we have been fluctuating around the same area in rank since - which, as you can see on the graph, is only like 100-150 ranks. We have been working on improving our SEO lately too, taking tips from Kakalot. My inbox is full of DMCA notices though, so I know we are getting delisted on Google a fair bit, which hurts our organic traffic (I imagine the same is true for other crappy sites). I'd be more interested in seeing the same graph in March 2020, when it isn't so skewed by the drop down to 3500.

3. Yeah. Constantly adding more to it too.

4. We have a new rate limit instead and haven't had to enable the captcha since we added it. In its first week it banned around 1,300 IPs (out of roughly 2,300,000 unique users that week). The first day it caught quite a few legit users, and we tweaked it a fair bit over the following weeks. As we stand today, maybe 1-2 people per day get banned and come ask for help (not sure how many just accept the 1-hour ban and look into fixing the issue, or stop doing whatever got them banned).

4.1. One of the issues seems to be 3rd party apps. We've recommended people poke the devs of 3rd party apps to update them to allow for the new limits. I know the Mangadex Tachiyomi extension is now compliant with the limit, and AllMangaReader also has a lazy-request feature that people can now use to slow requests - so both of these avoid false positives now.

4.2. A crazy number of single-series RSS feeds instead of the single follows RSS feed - for anyone interested, the follows feed only counts as 1 hit on the site, updates all your series at once, and greatly reduces your chances of getting banned!

4.3. Opening a vast number of bookmarked tabs at once. We have improved this by increasing the burst limit, so people get banned as false positives far less often - and for interested people, enabling lazy loading of tabs will also stop this from happening.
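To put numbers on the follows-feed point in 4.2, a toy calculation, assuming a hypothetical reader that polls hourly:

```python
def daily_requests(series_followed: int, polls_per_day: int, combined: bool) -> int:
    """Requests an RSS reader generates per day: one hit per series per poll
    with individual feeds, or a single hit per poll with the combined
    follows feed."""
    feeds = 1 if combined else series_followed
    return feeds * polls_per_day

# Following 200 series and polling hourly:
# per-series feeds: 200 * 24 = 4,800 hits/day; combined feed: 24 hits/day.
```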
 
Joined
Apr 4, 2019
Messages
2
Stumbled upon MangaDex by accident, and let me tell you, I'm hella happy I did.
Congratulations, and keep up the amazing work - I can't wait to see future updates from you guys.
 