Scrapdex

Joined
Jan 28, 2024
Messages
7
Hi everyone,

Recently I started learning web scraping and I wanted my first project to be something fun.
Being a long-time MangaDex user, I wanted to build it around the website. So I made Scrapdex, a CLI MangaDex navigator and chapter downloader, written in Python 3.10 with Selenium.
The GitHub link is: https://github.com/IBR-41379/Scrapdex

It is very much an alpha and I plan to add more features in the future. It also has some issues, the biggest being that, sadly, it cannot download any webtoon/vertical-format manga. I hope to fix that in the future.

If anyone has ideas on how to improve it, or a useful feature I could add, please comment in this thread.
Note: It's a CLI, not a GUI.
 
Joined
Jan 28, 2024
Messages
7
:thonk: How so? Whether a title is longstrip or not doesn't affect anything with the API. The only real problem would be displaying it.
You know what, you answered all my issues. You are right. It does not affect the API, but my program still doesn't work because it doesn't use the MangaDex API. I used the Selenium Firefox webdriver to make it. It was kinda dumb of me to build it this way, as I forgot the API existed.
 
Head Contributor Wrangler
Staff
Super Moderator
Joined
Jan 18, 2018
Messages
1,746
You know what, you answered all my issues. You are right. It does not affect the API, but my program still doesn't work because it doesn't use the MangaDex API. I used the Selenium Firefox webdriver to make it. It was kinda dumb of me to build it this way, as I forgot the API existed.
Please don't tell me it's screenshotting the reader.
 
Joined
Jan 28, 2024
Messages
7
It's not screenshotting. It gets the blob URL of each MangaDex chapter page, downloads the content, decodes it from base64, and writes it to a PNG file in binary write mode.
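
For anyone following along, the decode-and-write step described above might look roughly like this in Python. This is only a sketch, not Scrapdex's actual code: the function name `save_base64_image` is made up for illustration, and it assumes the page content has already been obtained as a base64 string (optionally wrapped in a `data:` URL).

```python
import base64
from pathlib import Path


def save_base64_image(encoded: str, out_path: str) -> int:
    """Decode a base64 string and write the raw bytes to out_path.

    Hypothetical helper for illustration. Returns the number of bytes written.
    """
    # A data URL looks like "data:image/png;base64,<payload>"; keep only the payload.
    if encoded.startswith("data:"):
        encoded = encoded.split(",", 1)[1]
    raw = base64.b64decode(encoded)
    # write_bytes opens the file in binary mode, so the bytes land on disk unchanged.
    return Path(out_path).write_bytes(raw)
```

Writing in binary mode matters here: a text-mode write would try to encode the data and corrupt the image.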
 
Head Contributor Wrangler
Staff
Super Moderator
Joined
Jan 18, 2018
Messages
1,746
Joined
Jan 28, 2024
Messages
7
You are making it far, far harder for yourself than it needs to be. Webscraping is useful for most sites, but we've got a documented public API for a reason.

https://api.mangadex.org/docs/04-chapter/retrieving-chapter/
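
A rough sketch of the API route, as I understand the linked docs: a GET to `/at-home/server/{chapter_id}` returns a `baseUrl` plus a `chapter` object with a `hash` and lists of page filenames (`data` and `dataSaver`), and each page URL is assembled from those parts. The function names and the User-Agent string below are made up for illustration, so check the field names against the documentation before relying on this.

```python
import json
import urllib.request

API_BASE = "https://api.mangadex.org"


def build_page_urls(at_home: dict, data_saver: bool = False) -> list[str]:
    """Assemble page image URLs from an at-home server response (already parsed JSON)."""
    chapter = at_home["chapter"]
    # "data" holds original-quality filenames, "dataSaver" the compressed ones.
    quality = "data-saver" if data_saver else "data"
    files = chapter["dataSaver"] if data_saver else chapter["data"]
    return [f"{at_home['baseUrl']}/{quality}/{chapter['hash']}/{name}" for name in files]


def fetch_page_urls(chapter_id: str, data_saver: bool = False) -> list[str]:
    """Ask the API where a chapter's pages are served from (performs a network call)."""
    request = urllib.request.Request(
        f"{API_BASE}/at-home/server/{chapter_id}",
        headers={"User-Agent": "scrapdex-example/0.1"},  # hypothetical client name
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return build_page_urls(json.load(response), data_saver)
```

Each returned URL can then be downloaded as an ordinary image file, and a longstrip title is handled exactly like any other: it is just a list of pages.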
1) I now know that I made it harder for myself.
2) I only learned that the API exists a few hours ago 😭
3) I did it as a fun web scraping project.
4) It still works (kinda)

Now that I know about the API, I will rewrite the whole thing from scratch and upload a new scraper in a few days. But I will still try to keep this API-less scraper going.
 
Dex-chan lover
Joined
May 18, 2019
Messages
3,451
Your installation step is too much. Just package the binary together with your repo or add a step that automatically downloads/updates the driver. You don't need to rely on Windows-exclusive commands for the download if you use something like urllib or requests.
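
A cross-platform download step along those lines could be sketched with nothing but the standard library. Assumptions to flag: the pinned version `v0.34.0` is just an example, the asset names follow the pattern used on geckodriver's GitHub releases page as far as I know, and the function names are made up for illustration. It is also worth checking whether your Selenium version already fetches the driver for you; recent Selenium 4 releases ship Selenium Manager for that.

```python
import platform
import urllib.request

GECKODRIVER_VERSION = "v0.34.0"  # assumption: pin whichever release you actually support

# (platform.system(), platform.machine()) -> release asset suffix
_ASSETS = {
    ("Windows", "AMD64"): "win64.zip",
    ("Linux", "x86_64"): "linux64.tar.gz",
    ("Linux", "aarch64"): "linux-aarch64.tar.gz",
    ("Darwin", "x86_64"): "macos.tar.gz",
    ("Darwin", "arm64"): "macos-aarch64.tar.gz",
}


def geckodriver_url(system=None, machine=None, version=GECKODRIVER_VERSION) -> str:
    """Build the release download URL for the given (or current) platform."""
    key = (system or platform.system(), machine or platform.machine())
    if key not in _ASSETS:
        raise RuntimeError(f"no known geckodriver build for {key}")
    return (
        "https://github.com/mozilla/geckodriver/releases/download/"
        f"{version}/geckodriver-{version}-{_ASSETS[key]}"
    )


def download_geckodriver(dest: str) -> str:
    """Download the archive for the current platform to dest (performs a network call)."""
    urllib.request.urlretrieve(geckodriver_url(), dest)
    return dest
```

The archive still needs unpacking afterwards (`zipfile` or `tarfile`), which is left out to keep the sketch short.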
 
Joined
Jan 28, 2024
Messages
2
1) I now know that I made it harder for myself.
2) I only learned that the API exists a few hours ago 😭
3) I did it as a fun web scraping project.
4) It still works (kinda)

Now that I know about the API, I will rewrite the whole thing from scratch and upload a new scraper in a few days. But I will still try to keep this API-less scraper going.
Haha! We all need a silly exercise in programming from time to time. I once made an API user interface for an assets price list completely in bash. Why? God knows.
 
Joined
Jan 28, 2024
Messages
7
Your installation step is too much. Just package the binary together with your repo or add a step that automatically downloads/updates the driver. You don't need to rely on Windows-exclusive commands for the download if you use something like urllib or requests.
OK, I will try to make it shorter.
 
