New servers and file optimisation request!

Santiasan · May 19, 2020

@Darris I know that, but I only advise him to use more friendly language, because if he is not careful in the future, he might be able to make someone angry, especially to people who are irritable. But thanks for telling me^.^ and...
for @Rugid I'm sorry if the sentence I said earlier offends you>. < I hope you don't take it to heart, and also it's just my suggestion so that your sentence becomes more friendly so that other people do not misunderstand ^. ^

MechaNekoToho · May 19, 2020

Good decision keeping lossless over jpeg.

moozooh · May 19, 2020

@Railander
24 chapters uploaded in the past hour, so about 600 images. If processing takes about 5 sec for each on average, that's 50 minutes. And this is just random Tuesday. If someone decides to upload a batch of heavy chapters, or if the CPU is busy with something else, they could easily create congestion for everyone else. If the number of uploaders increases (no reason not to expect that), congestions will become normal. I mean if Holo and the team figured it would be a sustainable solution, they'd have implemented it on their own without bothering to call to uploaders. That suggests it isn't.

doppler · May 19, 2020

Real funny how the site for pinga/pingo has removed the comment and bug report links.

NodiX · May 19, 2020

@moozooh

High performing image processing library usually takes around 100ms to transform one fat image. Shared CPU and 128mb RAM VM Size even can transform a whole chapter within seconds. It's viable to set up image processing microservices below $10/month budget.

Zirvano · May 19, 2020

Man. Nowadays it's really hard to read anything but I understand that the traffic went up. I'm sad

moozooh · May 19, 2020

@NodiX
If you consider the idea worthwhile, pitch it to MD devs. I have no idea about their available resources or the pricing of third-party image processing services. I just see in the posts on the first page is that Holo expects 16 weeks to crunch through a bit under 10 TB of data, meaning an average pace of roughly 1 MB/sec, which I personally wouldn't consider enough by itself (the CPU time is most likely shared with other tasks, meaning the rate goes further down in busy hours).

NodiX · May 19, 2020

@moozooh

I did. They just don't like the idea.

Microservice is basically you buy a single or few VM or container to spawn separate server that only do a certain task. It won't take origin server's resource in any significant matter.

This is one example how you can set up a image processing server without much hassle. Only 10 minutes required to sign up and create your own server. I did this to create my own image proxy to consume mangadex chapters with on the fly optimization.
https://fly.io/launch/github/h2non/imaginary

Railander · May 20, 2020

@moozooh

i don't know where you're getting 5 sec per image from (seems way too high for me), but if you did that on your PC remember that server CPUs are much faster at compression than your typical consumer CPU so it'd take a fraction of whatever you can do in your PC. but even if it wasn't and even if each image took 1 whole minute, you can still just stagger the compression and release the chapters as soon as they are done in a way that doesn't overload the server. this is normal server procedure and pretty much everything that overloads resources has options to be throttled like that.
compared to @NodiX i wasn't even thinking of adding a server just for image processing as i imagine they could very easily do it on the same server that receives the image with no cost and low complexity (not that a different server would've been all that complex, but still less complex than doing the compression locally).

Ranzear · May 20, 2020

Don't batch process. Do it on demand. This will skip a ton of older untouched content and prioritize new and popular material.

The first time a user pulls a page/image, check if it's been run through optimization. If it hasn't, check for and drop a meta file in to mark it as being in conversion, fork it off to pngquant or whatever, put the converted version in place instead, change the meta file to mark it done, and serve it up. Simultaneous first accesses (and even the initiator if you fork correctly) can sleep for the meta file to change and then serve the result. Write a timeout value in the meta file in case compression fails.

This avoids this massive batch-processing job and the far worse idea of burning bandwidth to send anything to external processing. It will prioritize brand new things, even just as soon as the uploader themselves looks at it.

Also: I prefer pngquant, most are spinoffs of it, and lossy doesn't matter if you're keeping or otherwise not responsible for the original.

KaidaChan · May 20, 2020

Like a few others have suggested here can we have it so that these credit pages are regulated. Fuck sake I’ve seen 6 page credit pages before lol. I’m fine with 1 or maybe even 2 but over that is really just unnecessary

plasma1492 · May 20, 2020

site so lag can't read anything

NodiX · May 20, 2020

@Railander

Of course they can. PHP has libvps wrapper library, which can optimize myriads of formats, and using quite low resources generally. But I wouldn't recommend that since PHP isn't the best language to do concurrent tasks.

And, no, setting up image processing server isn't that complex. That microservice image processing library I posted was written in Golang. I don't know Go, but implementing it I don't use my brain power because most of their API is url based like imageprocessing.com/resize?width=1000&type=webp&url=https://myserver.com/originalimage.png. It can also receive HTTP post request and send back optimized image. Nothing too fancy about it.

Alternatively, other no headache solutions is to use image optimization on the fly services. I found Bunny CDN's service in this area ridiculously cheap for unlimited image processing tasks.

Jela · May 20, 2020

Please don't auto optimize scanlator files. I'm fine with recommending scanlator to optimized their own files, lossless of course.

The reason I like mangadex because mangadex retain the quality of the image. An archive for scanlation for many group especially dead group. Other aggregator out there "optimized" the image to the point it blurry to hell.

MegaXL · May 20, 2020

Is this why the site is running so poorly?

Railander · May 20, 2020

@NodiX

you managed to disagree with me without actually disagreeing lol

e97 · May 20, 2020

Similarweb is wrong - we're approaching 200 million page views per month, and that's traffic that google analytics can detect. It doesn't include all the users with adblock.

AWS costs an arm and leg for bandwidth.

@Holo awesome! Thats ~7M per day, the previous web project I worked on I optimized to 345M per day / 10B views per month on a m1.medium instance with complex queries

I normally measure rps and latency but I dont think the site is doing complex queries. I referred to AWS because its a metric most people are familiar with when comparing hardware resources. The service eventually ran on inhouse servers. Happy to suggest recommendations if you can share more about the stack and where the bottlenecks are. Tweaking some parameters could be really helpful!

Banchou-Sama · May 20, 2020

Good luck, all PNG's I upload are losslessly compressed to the max!

TheGoodFella · May 20, 2020

For the past week pages take along time to load for me, other sites are instant but mangadex is so slow.

Ksetrar · May 20, 2020

Any nice software for jpg files ?

New servers and file optimisation request!

Santiasan

MechaNekoToho

moozooh

Double-page supporter

doppler

Dex-chan lover

NodiX

Zirvano

Fed-Kun's army

moozooh

Double-page supporter

NodiX

Railander

Ranzear

KaidaChan

Member

plasma1492

Member

NodiX

Jela

MegaXL

Dex-chan lover

Railander

e97

Banchou-Sama

TheGoodFella

Dex-chan lover

Ksetrar

Similar threads

Users who are viewing this thread