NeoProject Scanlation [Completed]

Status
Not open for further replies.
୧⍢⃝୨
Staff
Super Moderator
Joined
Jan 7, 2023
Messages
215
Hi again. Yet another big scan group from the Spanish community, with a significant number of translated titles, has closed. Like before, this group only seems to have uploaded to TMO. However, their Facebook page also provides links to GDrive folders for specific series, containing most if not all of the chapters they've worked on. I still recommend cross-checking these GDrive folders against what's available on TMO, in case chapters are missing.

https://visortmo.com/groups/145/neoproject-scanlation
Please upload to this group https://mangadex.org/group/f3414f85-15ef-4017-bff5-3eda6a00fbc6/neoproject-scanlation

Thanks to my last post here, someone came up with a nice tool to download chapters from TMO. However, as you might've seen, TMO changed its URLs (lectortmo.com now redirects to visortmo.com), so I'm not sure this tool will still work:
https://github.com/Ylevo/SpanishScrapper
 
Contributor
Joined
Jul 12, 2020
Messages
258
If they changed their URLs it definitely won't work. I also completely forgot to update it with the proper folder format. How often do these guys change their URLs/domain anyway?
 
Contributor
Joined
Jul 12, 2020
Messages
258
Updated (language picking added) : https://github.com/Ylevo/SpanishScrapper/releases/tag/1.0.6

Their servers are considerably more capricious than last time, at least on my end. Sometimes you don't get the actual chapter page but some bullshit instead. I haven't found a consistent way to get to it, so I've resorted to the very sophisticated technique of "retry until it works", with a slightly random, incrementing waiting time. Be careful, as you may get temporarily or permanently IP banned. I may add a warning later somehow.
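
The "retry until it works" approach with a slightly random, incrementing wait could be sketched roughly as below. This is only an illustration in Python, not the tool's actual code: `fetch_with_retry`, its parameters, and the convention that a failed fetch returns `None` are all assumptions made for the example.

```python
import random
import time
from typing import Callable, Optional


def fetch_with_retry(
    fetch: Callable[[], Optional[str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    step: float = 1.0,
    jitter: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch() until it returns a page, waiting longer after each failure.

    fetch is a hypothetical callable that returns the page HTML, or None when
    the site answered with something other than the chapter page.
    """
    for attempt in range(max_attempts):
        page = fetch()
        if page is not None:
            return page
        if attempt < max_attempts - 1:
            # The wait grows linearly with the attempt count, plus a small
            # random offset so requests are not evenly spaced.
            sleep(base_delay + step * attempt + random.uniform(0, jitter))
    raise RuntimeError(f"gave up after {max_attempts} attempts")
```

Capping the number of attempts matters here, since retrying forever against a site that bans by IP is exactly how one gets banned.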

I've updated the folder name format to follow this one : https://github.com/ArdaxHz/mupl#fil...g---cxxx-vyy-chapter_title-publish_date-group
Maybe I'm blind or unlucky, but I don't see any volume numbering on their site, so the volume is never filled in.
It now also supports joint group handling.
It still skips a chapter if its folder already exists, so if you interrupt the download you may have to delete the last folder.

EDIT : it's late and I don't know how split/bonus chapters (1.1, 1.5, etc.) are supposed to be handled in the above name format, so for now they're not padded.
 
୧⍢⃝୨
Staff
Super Moderator
Joined
Jan 7, 2023
Messages
215
Their servers are considerably more capricious than last time, at least on my end. Sometimes you don't get the actual chapter page but some bullshit.
Not sure what you mean, but there's a weird system where they send you to a different website with their reader embedded in it; the URL scheme stays the same. In this userscript you can see almost all of their "fake servers", and possibly a solution:
https://greasyfork.org/en/scripts/430361-multi-script-para-tmo/code
 
୧⍢⃝୨
Staff
Super Moderator
Joined
Jan 7, 2023
Messages
215
On another note, I think spa is the wrong ISO code for Spanish on MD.
1698299341114.png
If you have to choose one, I'd say es-la is the better choice, since there are far more LATAM scan groups than Spanish ones. But it would be nice if you could choose between es and es-la.
 
Contributor
Joined
Jul 12, 2020
Messages
258
Not sure what you mean, but there's a weird system where they send you to a different website with their reader embedded in it; the URL scheme stays the same. In this userscript you can see almost all of their "fake servers", and possibly a solution:
https://greasyfork.org/en/scripts/430361-multi-script-para-tmo/code
Yeah, that's what I meant by bullshit. The "https://visortmo.com/view_uploads/1279123" links either get you to the viewer (directly or through one or more redirections), or to an empty page with a script that submits a form, redirecting you to one of those domains. The problem on my end is that I use an HTML parser and not a true headless browser, so the script is never executed.

devenv_2023-10-26_16-27-17.png
I guess loading the chapter page and then going back to the HTML parser is doable, just annoying.
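
To make the distinction concrete, here is a rough Python sketch of how a plain HTML parser could at least tell the two kinds of response apart before deciding whether a real browser is needed. The heuristics are assumptions for illustration only (a viewer page contains `img` tags; the redirect page contains a `form` plus a `script` and no images), and `classify_page` is a hypothetical helper, not something from the tool or from TMO.

```python
from html.parser import HTMLParser
from typing import Dict


class _TagCounter(HTMLParser):
    """Count how many times each start tag appears in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.counts: Dict[str, int] = {}

    def handle_starttag(self, tag, attrs) -> None:
        self.counts[tag] = self.counts.get(tag, 0) + 1


def classify_page(html: str) -> str:
    """Guess what kind of page was returned (assumed heuristics, see above)."""
    counter = _TagCounter()
    counter.feed(html)
    counts = counter.counts
    if counts.get("img", 0) > 0:
        return "viewer"
    if counts.get("form", 0) > 0 and counts.get("script", 0) > 0:
        # The form is only submitted by JavaScript, which a parser never runs.
        return "script-redirect"
    return "unknown"
```

A parser can detect the script-driven page this way, but it still cannot follow it, which is why a headless browser ends up being needed for that case.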

On another note, I think spa is the wrong ISO code for Spanish on MD.
If you have to choose one, I'd say es-la is the better choice, since there are far more LATAM scan groups than Spanish ones. But it would be nice if you could choose between es and es-la.
Sure. I merely followed the bulk uploader's readme. I'll add a box to select it.

EDIT : Looks like I was slightly too optimistic, as the HttpClient class ignores JavaScript as well. Puppeteer-sharp seems to be the only way to go if I want to avoid the brute-forcing.
 
Contributor
Joined
Jul 12, 2020
Messages
258
Updated : https://github.com/Ylevo/SpanishScrapper/releases/tag/1.0.8

Partially switched to puppeteer-sharp, binary is slightly heavier (almost 2MB compared to the previous 300KB).
Pro : way faster since it doesn't force-retry like a piece of shit anymore, also way fewer requests.
Con : it now requires Chromium and downloads it on the first start (around 300MB). Download location is in AppData/Local/Puppeteer-Sharp for those who care.

On a side note, if the software crashes (but it never does 🫃), you may be left with a leftover Chrome process that you'll have to kill manually. I keep a single one around for speed, and it is only disposed of on exit.
 
୧⍢⃝୨
Staff
Super Moderator
Joined
Jan 7, 2023
Messages
215
Partially switched to puppeteer-sharp, binary is slightly heavier (almost 2MB compared to the previous 300KB).
Not sure if it has something to do with puppeteer, but I recently downloaded a couple of chapters for a title and had instances where the folders came up empty.
I've got a suggestion as well: we have a couple of titles with only a few Spanish chapters, so it would be nice if we could pick a range of chapters to download instead of downloading all of them. It would also be nice to download a single chapter by, for example, just entering the chapter URL.
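
A range filter of the kind suggested here could look roughly like the following Python sketch. `select_chapters` is a hypothetical helper; it assumes chapter numbers arrive as strings such as "12" or "7.5", and uses `Decimal` so that split chapters compare correctly.

```python
from decimal import Decimal
from typing import Iterable, List


def select_chapters(numbers: Iterable[str], first: str, last: str) -> List[str]:
    """Keep chapter numbers within [first, last], inclusive, preserving order."""
    low, high = Decimal(first), Decimal(last)
    return [n for n in numbers if low <= Decimal(n) <= high]
```

Passing the same value for `first` and `last` selects a single chapter, which covers the second half of the suggestion when the chapter number is known.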
 
Contributor
Joined
Jul 12, 2020
Messages
258
Not sure if it has something to do with puppeteer, but I recently downloaded a couple of chapters for a title and had instances where the folders came up empty.
I've got a suggestion as well: we have a couple of titles with only a few Spanish chapters, so it would be nice if we could pick a range of chapters to download instead of downloading all of them. It would also be nice to download a single chapter by, for example, just entering the chapter URL.
Can you give me an example of a manga and a group where that happened?
 
୧⍢⃝୨
Staff
Super Moderator
Joined
Jan 7, 2023
Messages
215
Can you give me an example of a manga and a group where that happened?
Yes, I downloaded Clockwork Planet, and after finishing, chapters 6 (Free Manga), 12 (asdf uploaded), 13 (asdf uploaded), 18 (Free Manga), 19 (Tails Uploader), 25 (Free Manga), 26 (Nozomi No Fansub) and 34 (Clock No Fansub) folders were empty.
 
Contributor
Joined
Jul 12, 2020
Messages
258
Yes, I downloaded Clockwork Planet, and after finishing, chapters 6 (Free Manga), 12 (asdf uploaded), 13 (asdf uploaded), 18 (Free Manga), 19 (Tails Uploader), 25 (Free Manga), 26 (Nozomi No Fansub) and 34 (Clock No Fansub) folders were empty.
Did you download them all in one shot or did you have to restart the download because it stopped, with the log "probably going too fast"? If it's the latter, it's because the folder was created before you got shut out by the site for too many requests (I'm about to add something to detect that), and was then skipped on the next run because it already existed. I need to add deletion of the last folder when that happens or when you stop the download. Leftover file handles were bothering me last time I quickly tried, so I skipped it.
If that's not it, then I have no idea at the moment, as I couldn't reproduce it.

I'll set the default delay to 3000 ms. I didn't get hit once by their request limit with that delay, but I constantly do at 2000 ms.
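
Enforcing a minimum delay between requests could be sketched as below. This Python example is an illustration, not the tool's implementation: the `Throttle` class and its injectable `clock` and `sleep` parameters are assumptions, and only the 3000 ms default comes from the post.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(
        self,
        delay_ms: int = 3000,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.delay = delay_ms / 1000.0
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> float:
        """Block until the delay since the previous call has passed.

        Returns the number of seconds actually slept.
        """
        now = self._clock()
        remaining = 0.0
        if self._last is not None:
            remaining = max(0.0, self.delay - (now - self._last))
            if remaining > 0:
                self._sleep(remaining)
        # Record the moment the request is allowed to go out.
        self._last = now + remaining
        return remaining
```

Calling `wait()` right before each request means time already spent parsing or saving pages counts toward the delay, so the download is not slowed down more than necessary.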
 
୧⍢⃝୨
Staff
Super Moderator
Joined
Jan 7, 2023
Messages
215
did you have to restart the download because it stopped, with the log "probably going too fast"?
Ah yes, this. I had to manually delete the folders multiple times, which is annoying. Great to hear you know a way to fix it; I'll wait for the fix then.
 
Contributor
Joined
Jul 12, 2020
Messages
258
Ah yes, this. I had to manually delete the folders multiple times, which is annoying. Great to hear you know a way to fix it; I'll wait for the fix then.
Updated : https://github.com/Ylevo/SpanishScrapper/releases/tag/1.0.9
  • Now auto deletes last created folder if download was aborted (to avoid skipping an empty folder later).
  • Added ratelimit page detection and auto retrying.
  • Added several other "safeguards" to prevent the download from crashing instead of retrying. It's still possible to abort it with the stop button if it's stuck in a loop for some reason, like being banned.
I actually need to get banned to scrape their message/status code/whatever to be able to detect it. But getting banned shouldn't happen with the "imposed" delays in place anyway, so that's not a real concern.
I think there are a few weird, rare edge cases that I couldn't really reproduce due to the nature of the task, but they should hit the try/catch and end up in a retry. Let me know if you see any weird log.
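
The "delete the last created folder if the download was aborted" behaviour, combined with the existing skip-if-folder-exists rule, can be sketched as follows. This is a Python illustration under assumptions: `chapter_folder`, `download_chapter`, and the `save_pages` callback are hypothetical names, not taken from the tool.

```python
import shutil
from contextlib import contextmanager
from pathlib import Path
from typing import Callable, Iterator


@contextmanager
def chapter_folder(path: Path) -> Iterator[Path]:
    """Create a chapter folder; remove it again if the download fails."""
    path.mkdir(parents=True, exist_ok=False)
    try:
        yield path
    except BaseException:
        # Covers errors and user interruption alike: drop the partial
        # folder so it is not skipped as "already done" on the next run.
        shutil.rmtree(path, ignore_errors=True)
        raise


def download_chapter(path: Path, save_pages: Callable[[Path], object]) -> bool:
    """Download one chapter unless its folder already exists.

    save_pages is a hypothetical callback that writes the pages into the folder.
    Returns True if the chapter was downloaded, False if it was skipped.
    """
    if path.exists():
        return False
    with chapter_folder(path) as folder:
        save_pages(folder)
    return True
```

With this shape, an existing folder always means a completed chapter, which is what makes the skip rule safe.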

As for your suggestions, they're not particularly difficult to implement; it's more about me obsessing over the UI. I'll probably go with tabs.

Should make a class or two at some point, number of properties on this form getting quite 🫃
 
Contributor
Joined
Jul 12, 2020
Messages
258
Updated : https://github.com/Ylevo/SpanishScrapper/releases/tag/1.0.10

Fixed the chapter number padding for split chapters, according to the file format of the bulk uploader: 7.02 and 7.20, for example, now both become 007.2. Previously the trailing 0 wasn't removed for split chapters. If the split chapter number doesn't contain a zero (such as 7.55), which seems possible on TMO, the folder keeps that number as-is and you'll have to deal with it manually, since the bulk uploader doesn't allow it.
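
The padding rule described above could be expressed roughly like this in Python. `format_chapter_number` is a hypothetical function based only on the examples in this post (7.02 and 7.20 both becoming 007.2); the three-digit padding of the whole part for a number like 7.55 is an assumption, and nothing here is checked against the bulk uploader itself.

```python
def format_chapter_number(raw: str) -> str:
    """Pad the whole part to three digits and strip zeros from the split part."""
    whole, _, frac = raw.strip().partition(".")
    # "02" and "20" both reduce to "2"; "55" is left untouched.
    frac = frac.strip("0")
    padded = whole.zfill(3)
    return f"{padded}.{frac}" if frac else padded
```

For example, `format_chapter_number("7.20")` gives `"007.2"` and `format_chapter_number("7")` gives `"007"`. Note that 7.02 and 7.20 collapse to the same name, so two such chapters in one series would collide.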

Should probably open a thread in community projects instead of spamming this one huh.
 
Contributor
Joined
Jul 12, 2020
Messages
258
Little update to say that I only have the volume numbers to add and it should be good to go.
 