TMO Scrapper

Power Uploader
Joined
Jul 12, 2020
Messages
150
Scraper for the Spanish aggregator TMO - currently visortmo.com - in shitty C# WinForms.

Repo: https://github.com/Ylevo/TMOScrapper

Worth mentioning: it uses Puppeteer Sharp and therefore requires and downloads Chromium (around 300 MB). The download location is AppData/Local/Puppeteer-Sharp. Although this should never happen, if the software crashes unexpectedly it might leave a chrome process or two hanging around. Those can be killed manually or left alone; they'll be gone after the next reboot.

Usage is quite straightforward: enter a URL such as https://visortmo.com/library/manga/43131/soredemo-ayumu-wa-yosetekuru in the mango URL box, scan the scannies (scanlation groups), select which scannies you want to download the chapters of, and click the download button.
The delay is in milliseconds. I recommend a delay of ~2000-3000 ms atm to avoid hitting the request limit. Hitting it will make you wait 3 to 5 seconds, if not more, as the tool keeps retrying until it succeeds or you stop it.
Folder naming follows this format: https://github.com/ArdaxHz/mupl#fil...g---cxxx-vyy-chapter_title-publish_date-group There is never any volume number, however, as I've never seen any indicated on their website. Let me know if I'm blind.

Todo:
  • Single chapter download.
  • Range chapters download.
  • Group chapters download.
  • More options, fewer hardcoded things.
  • Refactoring so I can glance at myself in the mirror.
  • Maybe ditch HtmlAgilityPack and switch to puppeteer completely but I'm lazy.
  • Use an actual logger.
Note that I put it on GitHub solely for the sake of transparency. It makes no pretense of being well coded or structured.
 
Power Uploader
Joined
Jul 12, 2020
Messages
150
Single/Range/Group are mostly done, but I'm missing the volume numbers needed for a proper upload to MD. I can't seem to find any website with an accurate, exhaustive chapter list that includes volume numbers. Any ideas?

@Roler what did you use for Blackhole Scans? What's your secret? 🫃
 
Power Uploader
Joined
Apr 28, 2020
Messages
92
I auto-matched the manga with MD (of course, there were some mismatches that I had to fix) and then matched the chapter numbers too, and used the volume numbers from MD.
But this can still be inaccurate and you'll have to fix some stuff manually.

I suggest you just leave the user to add volume numbers themselves.
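
For anyone who wants to try the matching approach described above, here is a rough sketch in Python (not the tool's C#). It assumes the MD data has already been fetched into a nested dictionary of volumes to chapters; that shape, the "none" key for chapters without a volume, and the function names are all assumptions for illustration. Chapters that can't be matched come back as None so they can be fixed by hand.

```python
from decimal import Decimal, InvalidOperation
from typing import Dict, List, Optional


def normalize(chapter: str) -> str:
    """Normalize a chapter number so '008.50' and '8.5' compare equal."""
    try:
        # normalize() drops trailing zeros; 'f' avoids exponent notation.
        return format(Decimal(chapter.strip()).normalize(), "f")
    except InvalidOperation:
        # Non-numeric labels (e.g. oneshots) are kept as-is.
        return chapter.strip()


def build_volume_lookup(aggregate: dict) -> Dict[str, Optional[str]]:
    """Invert an assumed volume -> chapters mapping into chapter -> volume."""
    lookup: Dict[str, Optional[str]] = {}
    for volume, entry in aggregate.get("volumes", {}).items():
        for chapter in entry.get("chapters", {}):
            # Assumption: "none" marks chapters that have no volume.
            lookup[normalize(chapter)] = None if volume == "none" else volume
    return lookup


def assign_volumes(
    chapters: List[str], lookup: Dict[str, Optional[str]]
) -> Dict[str, Optional[str]]:
    """Map each scraped chapter number to a volume, or None if unmatched."""
    return {chapter: lookup.get(normalize(chapter)) for chapter in chapters}
```

As noted in the post, the result can still be inaccurate, so unmatched or suspicious entries need a manual pass.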
 
Power Uploader
Joined
Jul 12, 2020
Messages
150
> I auto-matched the manga with MD (of course, there were some mismatches that I had to fix) and then matched the chapter numbers too, and used the volume numbers from MD.
> But this can still be inaccurate and you'll have to fix some stuff manually.
That's the only solution I had in mind; I was hoping you were a mad genius. Thanks.

> I suggest you just leave the user to add volume numbers themselves.
Yeah, it's more for personal use and to see if the scraping itself is usable. I wasn't planning on adding the script for it.
 
Power Uploader
Joined
Jul 12, 2020
Messages
150
Updated: https://github.com/Ylevo/SpanishScrapper/releases/tag/1.0.14

Added:
  • Single chapter downloading. Accepted URLs: "https://visortmo.com/view_uploads/...", "https://visortmo.com/viewer/...".
  • Chapter range downloading. Check the box and select the chapter range using the same numbering format as TMO's. Exception: a chapter number such as 8.02 becomes 8.20. The starting and ending chapters are included.
  • Group mangos downloading. Accepted URL: "https://visortmo.com/groups/.../.../proyects". Does not create a subfolder with the group name. Includes joint group releases.
  • Mango skipping for group downloading. Check the "Skip" box and select how many mangos you want to skip. This is mainly to avoid going through dozens of already downloaded mangos after having stopped the process midway.
  • Oneshot chapter handling.
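
To illustrate the inclusive range selection from the list above, here is a minimal sketch in Python (not the tool's C#). The function names are made up for the example, and it deliberately does not model the 8.02 / 8.20 numbering exception, only the "start and end are included" part. Chapter numbers are compared as decimals because plain string comparison would sort "10" before "2".

```python
from decimal import Decimal
from typing import List


def in_range(chapter: str, start: str, end: str) -> bool:
    """True if chapter lies between start and end, both included."""
    return Decimal(start) <= Decimal(chapter) <= Decimal(end)


def select_range(chapters: List[str], start: str, end: str) -> List[str]:
    """Keep only the chapters inside the inclusive range, in original order."""
    return [c for c in chapters if in_range(c, start, end)]
```

For example, selecting from "2" to "3" out of chapters 1, 2, 2.5, 3 and 10 keeps 2, 2.5 and 3.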
Fixed several edge cases and one particular infinite loop (you can tell I'm good at this) when getting the chapter page.
Every page fetch now checks whether the rate limit was hit and waits a few seconds before retrying.
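
The check-wait-retry behaviour could look roughly like the sketch below, written in Python rather than the tool's C#. Everything here is an assumption for illustration: the rate limit is assumed to show up as HTTP status 429, `fetch` is a hypothetical callable returning a (status, body) pair, and the attempt cap is added for the example, whereas the tool itself keeps retrying until it succeeds or is stopped.

```python
import time
from typing import Callable, Tuple

RATE_LIMITED = 429  # assumption: how the site signals the request limit


def fetch_with_retry(
    fetch: Callable[[], Tuple[int, str]],
    wait_seconds: float = 4.0,
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch() until it is not rate limited, pausing between attempts."""
    for attempt in range(1, max_attempts + 1):
        status, body = fetch()
        if status != RATE_LIMITED:
            return body
        if attempt < max_attempts:
            # Back off before the next attempt.
            sleep(wait_seconds)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Passing `sleep` in as a parameter keeps the function easy to test without actually waiting.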

Hopefully I didn't miss any loop-breaking bug/edge case this time.

EDIT: Obviously I did. Fixed mango title cleaning not removing invalid characters like it was supposed to. 🫃

EDIT 2: Fixed trailing periods on mango titles not being removed, which created broken folders.
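
For reference, the two fixes above boil down to something like this sketch (Python, not the tool's actual C# code). It assumes the goal is a Windows-safe folder name: the characters < > : " / \ | ? * and control characters are not allowed in Windows file names, and names ending in a period or space cause trouble. The `clean_title` name and the "untitled" fallback are made up for the example.

```python
import re

# Characters Windows does not allow in file or folder names.
INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def clean_title(title: str) -> str:
    """Return a title that is safe to use as a Windows folder name."""
    cleaned = INVALID_CHARS.sub("", title)
    # Trailing periods and spaces lead to broken folders on Windows.
    cleaned = cleaned.rstrip(". ")
    # Hypothetical fallback so the name is never empty.
    return cleaned or "untitled"
```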
 
Power Uploader
Joined
Jul 12, 2020
Messages
150
Updated: https://github.com/Ylevo/TMOScrapper/releases/tag/1.0.16

Changed name, fixed logic. I shamefully realized today that I had been playing their silly little game and had overengineered the algo when a very simple solution was at hand. I could even ditch puppeteer at this point, but if they change their stupid setup again I might need it for real this time, so I'm not sure.

Still haven't refactored. Adding some real logging would be a great idea too. 🫃
 
୧⍢⃝୨
Staff
Super Moderator
Joined
Jan 7, 2023
Messages
169
I've been wondering if you could add automatic conversion of WebP images to PNG. Maybe using ImageMagick?

Also, for some reason the folders created by the program can't be opened in Windows Terminal through the context menu option on Windows 11, but typing CMD in the address bar opens the folder correctly in the terminal.
 
Power Uploader
Joined
Jul 12, 2020
Messages
150
> I've been wondering if you can add an automatic conversion from WebP images to PNG. Maybe using imagemagick?
Sure. .NET classes should suffice.

> Also, for some reason the folders created by the program can't be opened on the Windows terminal through the context menu option on Windows 11, but typing CMD in the address bar will open the folder correctly in the terminal.
What do you mean by "can't be opened"? The terminal doesn't start at all? I'm on Windows 10 but I could check it out in a VM.
 
୧⍢⃝୨
Staff
Super Moderator
Joined
Jan 7, 2023
Messages
169
Sorry, what I meant is that the terminal doesn't open in the directory.
[Attachment: 1711472229049.png]
 
Power Uploader
Joined
Jul 12, 2020
Messages
150
Windows Terminal doesn't seem to like brackets: https://github.com/microsoft/terminal/issues/6504 https://github.com/lextm/windowsterminal-shell/issues/35 https://github.com/microsoft/terminal/issues/16024

It works fine with the "native" PowerShell accessible through shift+right click. Upgrading to PowerShell 7 fixes the issue (you'll have to change the terminal's default shell).
 