like mangago?

Joined
Dec 22, 2018
Messages
3
I'm new here and just found out there's a forum. Not sure if my question is out of place, but I've been wondering: why is it so easy for other websites' bots to steal images/chapters from MangaDex? Why not make it like mangago? (I have zero idea about coding, but as far as I can tell mangago is really advanced when it comes to this; the only thing that still works there is saving images manually.) They behave as if they were an official website, and they certainly have quite a few titles that can't be found on any other site.

So, to any web developer or anyone with coding knowledge: wouldn't it be more convenient to make MangaDex hard for bots to steal from, instead of 'begging' readers loyal to aggregators to visit here, or cursing them for being loyal to those sites, especially when they've already been 'brainwashed' into thinking those sites are the ones translating all the chapters?
 
Group Leader
Joined
Jan 27, 2018
Messages
301
My dude, mangago takes from here to begin with, wdym?
Also, bots are automated and grab images via their code (I believe), so what you suggested wouldn't apply in any way.
 
Group Leader
Joined
Jan 27, 2018
Messages
301
@StrawHatter My dude, I don't think you understand how these bots work, but basically you can KINDA compare them to a batch image downloader.

No matter what "restrictions" on "manual download" you add, a bot simply grabs all the images on a page, filters them, and downloads them together, which is what these bots do.

Also, I hope you realize that the people "'begging' readers loyal to aggregators to visit here" aren't MangaDex staff themselves but translators and scanlators in general; MangaDex isn't the one scanlating (even though they do have a small team that does scanlation as well now, but that's beside the point).
 
Joined
Oct 30, 2018
Messages
8
@Filipe I think @StrawHatter means kind of the opposite. First of all, I know mangago is REALLY HARD to download from; I've tried downloading a rare chapter (an official one) from there using many downloaders. The title I wanted couldn't be found on any popular site like kissmanga, and even when another site did host it, it was a site hardly anyone knows about, with no manga downloader for it. So if I had to guess, he means making MangaDex like that.

P.S. I'm not talking about hosting official chapters here, okay.
 
Joined
Dec 22, 2018
Messages
3
@Filipe I really don't get what you're talking about; it feels like we're discussing completely different topics! @AccomplishedList1 It seems you get it. I thought I asked for an opinion/suggestion pretty clearly. Maybe my English really is that bad!? lol
 
Miku best girl
Admin
Joined
May 29, 2012
Messages
1,441
There's no point in trying to protect images, because anyone who puts in enough effort will be able to rip them.
The best we can do is try to break automation, and sometimes we have achieved that by capturing certain aggregators' IP addresses and automatically serving them wrong images.
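Purely as an illustration, here is a minimal sketch of what "serve known aggregator IPs a wrong image" could look like. This is my own assumption of such a setup, not MangaDex's actual implementation; the Flask app, routes, and IP list are all hypothetical.

```python
# Hypothetical sketch: serve a decoy image to requests from known aggregator IPs.
# Not MangaDex's actual code; the blocklist and routes are made up for illustration.
from flask import Flask, request, send_file

app = Flask(__name__)

# Manually curated blocklist of aggregator IPs (documentation-range examples).
AGGREGATOR_IPS = {"203.0.113.7", "198.51.100.22"}

@app.route("/data/<chapter_hash>/<page_file>")
def serve_page(chapter_hash, page_file):
    if request.remote_addr in AGGREGATOR_IPS:
        # Known scraper: hand back a troll/decoy page instead of the real one.
        return send_file("decoy.png", mimetype="image/png")
    return send_file(f"images/{chapter_hash}/{page_file}")
```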

Our time is better spent on improving the site and winning people over that way. If you build a good product, people will naturally come, because you have the superior product.
 
Group Leader
Joined
Jan 27, 2018
Messages
301
@Holo Not hard when the other site has such shit quality.
I just went there for the first time to compare the files

Here, the 2nd page of Solo Leveling chapter 50 is over 2.98 MB.
There, the same page of the same chapter is 455 KB.

How do people read such shit quality and such compressed images?

If they keep this up, people will, as they already have little by little, just convert to better websites (aka MangaDex).
 
Joined
Apr 23, 2018
Messages
1,071
If they did make it harder (e.g. require you to have an account to view manga), it would be pretty hard to stop bots without also making it too much of a hassle for actual users.
Someone who writes a bot can just as easily create a few accounts for the bots to use (it took me about 5 minutes to make my bato.to scraper log in too, back before the original bato.to died; fwiw, my bots work slower than humans and are set to follow ads, so I don't view myself as one of the bot programmers they were trying to block). Even if account creation is a manual process due to a reCAPTCHA, it would only have to be done once per account. Even if there were download limits, bots using multiple accounts would be able to bypass them (and proxies as well, to bypass IP-based limits).
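To put the "bots can just log in" point in concrete terms, here is a rough sketch of how little an account wall adds for a scraper. The site URL and form field names are invented for illustration.

```python
# Hypothetical sketch: an account wall costs a bot roughly one extra POST.
# The URL and form field names below are invented for illustration.
import requests

session = requests.Session()
session.post(
    "https://example-manga-site.test/login",
    data={"username": "bot_account_1", "password": "correct horse battery staple"},
)

# The session now carries the auth cookie, so chapter requests
# look exactly like those of a logged-in human reader.
page = session.get("https://example-manga-site.test/chapter/123/1")
print(page.status_code)
```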
The only method I can think of that could be easily implemented would be requiring a reCAPTCHA on every chapter. But screw that shit. Users would go to the scanlator's site (as would the bots) and read/download it from there instead.

Besides, not having to have an account to read is a blessing as well, not to mention a good way to attract new users from our manga-loving community.

We can stop normal people and crappy programmers from writing bots pretty easily though (the IP-based limits already in place are good enough for that), but dedicated programmers and the existing complex scripts of popular aggregators are not likely to fall so easily.
IP-based limits will fail miserably if the site supports IPv6, for instance. A residential connection will have no less than a /64. In other words, at least 18,446,744,073,709,551,616 (2^64) addresses to block per IPv6 user.
Most users should have a /56, which is 4,722,366,482,869,645,213,696 (2^72) addresses. Beyond ridiculous, right?
But some will have a /48: 1,208,925,819,614,629,174,706,176 (2^80) addresses to block.

What does this mean? Well, for IPv6 you have to decide whether you want to catch and block a bot on a /48 address space 65,536 times, or whether you want to potentially falsely block 65,535 real users to get rid of it (there are decent algorithms for this, but I'm not trying to describe the solution, just the complexity of the problem). And that is assuming the IP filter works on IPv6 addresses at all, and works on address ranges instead of one address at a time (which would be pointless against a moderately well-written IPv6-based crawler).
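The address counts above fall straight out of the prefix sizes; a quick way to check them:

```python
# Sanity check on the figures above: number of IPv6 addresses per prefix size.
for prefix in (64, 56, 48):
    print(f"/{prefix}: {2 ** (128 - prefix):,} addresses")

# A /48 contains 2**(64 - 48) = 65,536 separate /64 networks,
# which is where the "catch and block it 65,536 times" figure comes from.
print(f"{2 ** (64 - 48):,} /64s per /48")
```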

Stopping the actual download method is also an option. We could go for a Google Books-like method: they distribute the images for every page chopped into pieces and scrambled up like a puzzle, and CSS is used to piece the puzzle together and recreate the original image.
That said, I will say now that there is at least one super easy way to download Google Books, and at least one proper way. The easy way? Simple: screenshot each page. It is easy to automate screenshotting a div or a canvas and then moving on to the next page. You can get full quality simply by ensuring the full-size image is displayed (a browser window size of 4096x4096 won't fail you here) and saving in a lossless format like PNG.
I am not recommending that you do this for rented Google ebooks, nor am I suggesting similar methods be used to download online college class ebooks during the 1-2 week free trial period. Not at all...
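For the "just screenshot it" route, a sketch of what the automation could look like, assuming Selenium; the reader URL and element id are completely made up, only the technique matters.

```python
# Rough sketch of automating "screenshot the rendered page element".
# The reader URL and element id are hypothetical placeholders.
from selenium import webdriver
from selenium.webdriver.common.by import By

options = webdriver.ChromeOptions()
options.add_argument("--headless=new")
options.add_argument("--window-size=4096,4096")  # large enough to render the page at full size

driver = webdriver.Chrome(options=options)
driver.get("https://example-reader.test/book/42/page/1")   # hypothetical reader page
viewer = driver.find_element(By.ID, "viewer")              # hypothetical container div/canvas
viewer.screenshot("page-001.png")                          # element screenshot saved as lossless PNG
driver.quit()
```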
 
Joined
Dec 14, 2018
Messages
1
Of course you cannot download from mangago. I'm not sure if it's automatic, but if you try to download using a manga downloader like HakuNeko S etc., your IP immediately gets banned. For now there's no other way to download; moreover, since they serve pages one by one, you can't use a bulk image downloader either. It's a pain in the ass; for now the images can only be downloaded by manually clicking 'save image'.
 
Group Leader
Joined
Jan 27, 2018
Messages
301
Why take from mangago in the first place, though? Nobody uploads there to begin with; they take from elsewhere. Just go to their sources and get the same shit but in better quality lol.
 
Joined
Apr 23, 2018
Messages
1,071
Just took a shot at scraping mangago, and let me say first, that was a terrible experience; I will not be re-visiting that site in the future.
Also, while they have done some things to disable popular addons and the web console (it detects if the debugger is on and pauses all JS; easy to bypass though), they haven't done crap to make it harder to crawl and download. You can even automate it in JS rather easily.
The XPath `//a[@id='pic_container']/img[@style='']` will give you the image on the current page (which you can download using whatever method you wish). The next page can be found by following the link at `//a[@id='pic_container']`.
This may sound like a bit much for a normal user, but it is nothing for an actual website scraper (or even an incompetent programmer like myself). Meaning the preventions on their site only stop normal users from scraping manga, not other aggregator sites. Past that, all you have to do is pretend you're a normal reader by scraping at human speeds to avoid IP bans. Granted, I only tested this on 2 chapters, so if there is a captcha every 3 chapters or something, it would be quite formidable.
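As a sketch of what that crawl looks like in practice: only the two XPaths come from the test above; the starting URL, page count, and delay are my own placeholder assumptions.

```python
# Sketch of the crawl described above: grab the image via the XPath from the post,
# follow the pic_container link to the next page, and wait between requests.
# The starting URL, page count, and delay are placeholder assumptions.
import time
import requests
from lxml import html

session = requests.Session()
session.headers["User-Agent"] = "Mozilla/5.0"  # look like an ordinary browser

url = "https://www.mangago.me/read-manga/some_title/mf/v01/c001/"  # hypothetical chapter start

for page_number in range(1, 31):
    tree = html.fromstring(session.get(url).content)
    img_src = tree.xpath("//a[@id='pic_container']/img[@style='']/@src")[0]
    next_url = tree.xpath("//a[@id='pic_container']/@href")[0]

    with open(f"page_{page_number:03d}.jpg", "wb") as f:
        f.write(session.get(img_src).content)

    url = next_url
    time.sleep(10)  # crawl at human-ish speed to avoid the IP bans mentioned earlier
```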

(the fact that I bothered to test this probably means I have way too much free time right now....)
 
Custom title
Staff
Developer
Joined
Jan 19, 2018
Messages
2,670
There's no way to prevent bots completely from downloading images without preventing everyone from reading them.

There are options to make it more annoying for them, but they only have to get around it once and they're golden until we rewrite the entire system again - in other words, it would be exponentially more work for us than them.

The options also basically boil down to a) requiring user accounts, which would just hurt our viewership and not them, especially since we have the ability to figure out bots regardless, and b) scrambling or otherwise encoding the images, which can easily be worked around and only hurts regular users who can't decode them automatically and want to share them, not to mention it would mean modifying the source images, which would hurt scanlators.
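To make option b) concrete, a toy sketch of what scrambling would mean (my own illustration, not anything MangaDex does): cut each page into tiles and shuffle them with a seed the reader's client knows. Anyone who learns the seed, including a bot author, can reverse it; everyone else is left holding a jigsaw.

```python
# Toy illustration of option b): tile-shuffle an image with a known seed.
# Anyone who knows the seed (bots included) can invert this; casual readers
# who just save the file get a jigsaw. Not anything MangaDex actually does.
import random
from PIL import Image

def scramble(in_path: str, out_path: str, grid: int = 4, seed: int = 1234) -> None:
    img = Image.open(in_path)
    tile_w, tile_h = img.width // grid, img.height // grid
    boxes = [(x * tile_w, y * tile_h, (x + 1) * tile_w, (y + 1) * tile_h)
             for y in range(grid) for x in range(grid)]

    order = list(range(len(boxes)))
    random.Random(seed).shuffle(order)  # deterministic shuffle, reversible if you know the seed

    out = Image.new(img.mode, (tile_w * grid, tile_h * grid))
    for dst_index, src_index in enumerate(order):
        out.paste(img.crop(boxes[src_index]), boxes[dst_index][:2])
    out.save(out_path)

scramble("page.png", "page_scrambled.png")
```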
 
Joined
Dec 7, 2018
Messages
1
@Teasday So at the end of the day, good for the aggregators: they can keep making money off this. And while you're at it, please explain how coding a website works? I don't have any idea what coding a website is for. Is it important? Can normal people create a website without knowing anything?
 
Custom title
Staff
Developer
Joined
Jan 19, 2018
Messages
2,670
@bisala Well... Obviously you would have to learn how to program first if you wanted to make a complex website from scratch, or at least know HTML and CSS for a simpler one. There are generator services like SquareSpace if you want to make one without knowing how, I guess.

It's a pretty involved subject to go into detail here.
 
