New servers and file optimisation request!

Status: Not open for further replies.
Group Leader
Joined
Jul 18, 2019
Messages
483
@Darris I know that, but I only advised him to use friendlier language, because if he isn't careful he might end up making someone angry in the future, especially people who are irritable. But thanks for telling me ^.^ and...
And @Rugid, I'm sorry if what I said earlier offended you >.< I hope you don't take it to heart; it was just my suggestion to make your wording friendlier so that other people don't misunderstand ^.^
 
Aggregator gang
Joined
Jan 25, 2018
Messages
521
@Railander
24 chapters were uploaded in the past hour, so about 600 images. If processing takes about 5 seconds for each on average, that's 50 minutes. And this is just a random Tuesday. If someone decides to upload a batch of heavy chapters, or if the CPU is busy with something else, they could easily create congestion for everyone else. If the number of uploaders increases (and there's no reason not to expect that), congestion will become the norm. I mean, if Holo and the team figured this would be a sustainable solution, they'd have implemented it on their own without bothering to put out a call to uploaders. That suggests it isn't.
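Back-of-envelope of the above, with pages per chapter and per-image time assumed rather than measured:

```go
package main

import "fmt"

func main() {
	// Assumed figures from the estimate above, not measured on MD's servers.
	const chaptersPerHour = 24
	const imagesPerChapter = 25 // ~600 images total
	const secondsPerImage = 5.0 // assumed average processing time

	totalImages := chaptersPerHour * imagesPerChapter
	minutes := float64(totalImages) * secondsPerImage / 60
	fmt.Printf("%d images/hour -> ~%.0f minutes of CPU time per hour\n", totalImages, minutes)
}
```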
 
Dex-chan lover
Joined
Mar 24, 2018
Messages
623
Real funny how the site for pinga/pingo has removed the comment and bug report links.
 
Joined
Dec 15, 2019
Messages
30
@moozooh

A high-performing image processing library usually takes around 100 ms to transform one fat image. Even a VM with a shared CPU and 128 MB of RAM can transform a whole chapter within seconds. It's viable to set up an image processing microservice on a budget of under $10/month.
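As a very rough illustration of per-image cost (not a benchmark of any particular library), here's a minimal Go sketch that just re-encodes one JPEG with the standard library and times it; a libvips-based service would typically be faster. The file name is a placeholder.

```go
package main

import (
	"bytes"
	"fmt"
	"image/jpeg"
	"os"
	"time"
)

func main() {
	// "page.jpg" is a placeholder for whatever test page you have on hand.
	data, err := os.ReadFile("page.jpg")
	if err != nil {
		panic(err)
	}

	start := time.Now()
	img, err := jpeg.Decode(bytes.NewReader(data))
	if err != nil {
		panic(err)
	}
	var out bytes.Buffer
	// Quality 80 is an arbitrary example setting, not a recommendation.
	if err := jpeg.Encode(&out, img, &jpeg.Options{Quality: 80}); err != nil {
		panic(err)
	}
	fmt.Printf("re-encoded %d KB -> %d KB in %v\n", len(data)/1024, out.Len()/1024, time.Since(start))
}
```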
 
Active member
Joined
Mar 26, 2019
Messages
275
Man, nowadays it's really hard to read anything, but I understand that traffic went up. I'm sad
 
Aggregator gang
Joined
Jan 25, 2018
Messages
521
@NodiX
If you consider the idea worthwhile, pitch it to the MD devs. I have no idea about their available resources or the pricing of third-party image processing services. What I do see in the posts on the first page is that Holo expects 16 weeks to crunch through a bit under 10 TB of data, meaning an average pace of roughly 1 MB/sec, which I personally wouldn't consider enough by itself (the CPU time is most likely shared with other tasks, meaning the rate drops further during busy hours).
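Quick sanity check on that pace, using the rounded figures from the post rather than any official numbers:

```go
package main

import "fmt"

func main() {
	const totalBytes = 10e12              // ~10 TB, per the first-page estimate
	const seconds = 16 * 7 * 24 * 60 * 60 // 16 weeks
	fmt.Printf("average pace: ~%.1f MB/sec\n", totalBytes/seconds/1e6)
}
```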
 
Joined
Dec 15, 2019
Messages
30
@moozooh

I did. They just don't like the idea.

A microservice is basically where you buy one or a few VMs or containers to spawn a separate server that only does a certain task. It won't take up the origin server's resources in any significant way.

This is one example of how you can set up an image processing server without much hassle. It only takes about 10 minutes to sign up and create your own server. I did this to create my own image proxy to consume mangadex chapters with on-the-fly optimization.
https://fly.io/launch/github/h2non/imaginary
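To make "a separate server that only does one task" concrete, here's a minimal sketch of that general shape in Go (this is not how imaginary is implemented, just an illustration): it accepts an image via HTTP POST and returns a re-encoded JPEG.

```go
package main

import (
	"bytes"
	"image"
	"image/jpeg"
	_ "image/png" // register PNG decoding
	"log"
	"net/http"
)

// optimizeHandler takes an image in the POST body and returns a re-encoded
// JPEG. A real service would use libvips/pngquant and support more formats
// and options.
func optimizeHandler(w http.ResponseWriter, r *http.Request) {
	img, _, err := image.Decode(r.Body)
	if err != nil {
		http.Error(w, "could not decode image", http.StatusBadRequest)
		return
	}
	var buf bytes.Buffer
	if err := jpeg.Encode(&buf, img, &jpeg.Options{Quality: 80}); err != nil {
		http.Error(w, "encode failed", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "image/jpeg")
	w.Write(buf.Bytes())
}

func main() {
	http.HandleFunc("/optimize", optimizeHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```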
 
MD@Home
Joined
Jan 18, 2018
Messages
244
@moozooh

i don't know where you're getting 5 sec per image from (seems way too high to me), but if you measured that on your PC, remember that server CPUs are much faster at compression than your typical consumer CPU, so it'd take a fraction of whatever your PC can do. but even if it weren't, and even if each image took a whole minute, you could still just stagger the compression and release each chapter as soon as it's done, in a way that doesn't overload the server. this is normal server procedure, and pretty much everything that can overload resources has options to be throttled like that.
unlike @NodiX, i wasn't even thinking of adding a server just for image processing, as i imagine they could very easily do it on the same server that receives the image with no cost and low complexity (not that a different server would've been all that complex, but doing the compression locally is still simpler).
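for illustration, "stagger the compression" can be as simple as a small worker pool, so compression never uses more than a fixed share of the CPU. a sketch in Go, with compressChapter standing in for whatever tool actually does the work:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// compressChapter is a placeholder for the real compression step
// (pngquant, libvips, whatever).
func compressChapter(id int) {
	time.Sleep(2 * time.Second) // simulate work
	fmt.Println("chapter", id, "compressed and released")
}

func main() {
	const maxWorkers = 2 // throttle: at most 2 chapters compressed at a time
	jobs := make(chan int)
	var wg sync.WaitGroup

	for w := 0; w < maxWorkers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range jobs {
				compressChapter(id)
			}
		}()
	}

	// newly uploaded chapters get queued and are released as each one
	// finishes, instead of all being processed at once.
	for id := 1; id <= 10; id++ {
		jobs <- id
	}
	close(jobs)
	wg.Wait()
}
```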
 
Joined
Feb 18, 2020
Messages
2
Don't batch process. Do it on demand. This will skip a ton of older untouched content and prioritize new and popular material.

The first time a user pulls a page/image, check if it's been run through optimization. If it hasn't, check for a meta file and, if there isn't one, drop one in to mark the image as being converted; fork the job off to pngquant or whatever, put the converted version in place, update the meta file to mark it done, and serve it up. Simultaneous first accesses (and even the initiator, if you fork correctly) can sleep until the meta file changes and then serve the result. Write a timeout value into the meta file in case compression fails.
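A rough sketch of that flow in Go, using a bare marker file rather than a full meta file, and with the paths and the pngquant invocation as placeholders only:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"time"
)

// serveOptimized returns the path of the optimized copy of src, converting it
// on first access. Concurrent first accesses sleep until the marker goes away,
// as described above.
func serveOptimized(src string) (string, error) {
	opt := src + ".opt.png"
	marker := src + ".converting"

	if _, err := os.Stat(opt); err == nil {
		return opt, nil // already optimized
	}

	// Try to claim the conversion by creating the marker exclusively.
	f, err := os.OpenFile(marker, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o644)
	if err == nil {
		f.Close()
		defer os.Remove(marker)
		// The pngquant invocation is illustrative only.
		if err := exec.Command("pngquant", "--output", opt, src).Run(); err != nil {
			return src, fmt.Errorf("compression failed: %w", err)
		}
		return opt, nil
	}

	// Someone else is converting: wait for the marker to vanish or time out.
	for deadline := time.Now().Add(30 * time.Second); time.Now().Before(deadline); {
		if _, statErr := os.Stat(marker); os.IsNotExist(statErr) {
			if _, optErr := os.Stat(opt); optErr == nil {
				return opt, nil
			}
			break
		}
		time.Sleep(200 * time.Millisecond)
	}
	return src, nil // fall back to the original if conversion stalls
}

func main() {
	// "somepage.png" is a placeholder path.
	path, err := serveOptimized("somepage.png")
	if err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println("serving:", path)
}
```

In a real setup the timeout would live in the meta file as described, so a crashed conversion can be retried instead of falling back forever.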

This avoids the massive batch-processing job and the far worse idea of burning bandwidth to send anything out for external processing. It also prioritizes brand-new uploads, starting as soon as the uploader themselves looks at them.

Also: I prefer pngquant (most of the alternatives are spinoffs of it), and lossy doesn't matter if you're keeping the original or otherwise aren't responsible for it.
 
Joined
May 14, 2019
Messages
19
Like a few others have suggested here, can we have these credit pages regulated? Fuck's sake, I've seen 6-page credit sections before lol. I'm fine with 1 or maybe even 2, but anything over that is really just unnecessary.
 
Joined
Dec 15, 2019
Messages
30
@Railander

Of course they can. PHP has a libvips wrapper library, which can optimize a myriad of formats and generally uses quite little in the way of resources. But I wouldn't recommend that, since PHP isn't the best language for concurrent tasks.

And no, setting up an image processing server isn't that complex. That microservice image processing library I posted is written in Go. I don't know Go, but deploying it took no brain power, because most of its API is URL-based, like imageprocessing.com/resize?width=1000&type=webp&url=https://myserver.com/originalimage.png. It can also receive an HTTP POST request and send back the optimized image. Nothing too fancy about it.
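For example, calling an API shaped like that is just an HTTP GET; in Go it would look something like this (imageprocessing.com and the parameters are the placeholder values from the example above, not a real endpoint):

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
)

func main() {
	// All values are illustrative; imageprocessing.com is the placeholder
	// host from the example above.
	params := url.Values{}
	params.Set("width", "1000")
	params.Set("type", "webp")
	params.Set("url", "https://myserver.com/originalimage.png")

	resp, err := http.Get("https://imageprocessing.com/resize?" + params.Encode())
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	out, err := os.Create("optimized.webp")
	if err != nil {
		panic(err)
	}
	defer out.Close()

	n, err := io.Copy(out, resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Println("wrote", n, "bytes")
}
```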

Alternatively, another no-headache solution is to use an on-the-fly image optimization service. I found Bunny CDN's offering in this area ridiculously cheap for unlimited image processing tasks.
 
Joined
Apr 20, 2018
Messages
8
Please don't auto-optimize scanlators' files. I'm fine with recommending that scanlators optimize their own files, losslessly of course.

The reason I like mangadex is that it retains the quality of the images. It's an archive of scanlations for many groups, especially dead groups. Other aggregators out there "optimize" their images to the point they're blurry as hell.
 

e97

Joined
Jan 25, 2020
Messages
3
Similarweb is wrong - we're approaching 200 million page views per month, and that's only the traffic Google Analytics can detect. It doesn't include all the users with adblock.

AWS costs an arm and a leg for bandwidth.

@Holo awesome! That's ~7M per day. The previous web project I worked on, I optimized to 345M per day / 10B views per month on an m1.medium instance with complex queries :) I normally measure RPS and latency, but I don't think the site is doing complex queries. I referred to AWS because it's a point of reference most people are familiar with when comparing hardware resources. The service eventually ran on in-house servers. Happy to make recommendations if you can share more about the stack and where the bottlenecks are. Tweaking some parameters could be really helpful!
 
Dex-chan lover
Joined
Feb 28, 2018
Messages
598
For the past week, pages have been taking a long time to load for me; other sites are instant, but mangadex is so slow.
 