Scrapdex

Iron-Breaker · Jan 28, 2024

Hi everyone,

Recently I started learning web scraping and I wanted my first project to be something fun.
Being a long-time MangaDex user, I wanted to build something around the website. So I made Scrapdex, a CLI MangaDex navigator and chapter downloader, written in Python 3.10 with Selenium.
The GitHub link is: https://github.com/IBR-41379/Scrapdex

It is still in early alpha and I plan to add more features in the future. It also has some issues, the biggest being that, sadly, it cannot download any webtoon/vertical format manga. However, I hope to fix that in the future.

If anyone has ideas on how to improve it, or a feature I could add that would be useful, please comment in this thread.
#Note: It's a CLI, not a GUI.

Ndtm · Jan 28, 2024

Iron-Breaker said:
it cannot download any webtoon/vertical format manga

How so? Whether a title is longstrip or not doesn't affect anything with the API. The only real problem would be displaying it.

Iron-Breaker · Jan 28, 2024

Ndtm said:
How so? Whether a title is longstrip or not doesn't affect anything with the API. The only real problem would be displaying it.

You know what, you just answered all my issues. You are right. It does not affect the API, but my program still doesn't work because it doesn't use the MangaDex API. I used the Selenium Firefox webdriver to make it. It was kinda dumb of me to build it this way, as I forgot the API existed.

BraveDude8 · Jan 28, 2024

Iron-Breaker said:
You know what, you just answered all my issues. You are right. It does not affect the API, but my program still doesn't work because it doesn't use the MangaDex API. I used the Selenium Firefox webdriver to make it. It was kinda dumb of me to build it this way, as I forgot the API existed.

Please don't tell me it's screenshotting the reader.

Iron-Breaker · Jan 28, 2024

It's not screenshotting. It gets the blob URL of each MangaDex chapter page, downloads the content, decodes it from base64, and writes it to a PNG file in binary mode.
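For anyone curious, the decode-and-write half of that can be sketched roughly like this. It assumes the page contents have already been pulled out of the browser as a base64 string (that Selenium part is not shown), and `save_base64_png` is just a name made up for this example, not a function from the repo.

```python
import base64
from pathlib import Path


def save_base64_png(encoded: str, out_path: str) -> int:
    """Decode a base64 string and write the raw bytes to out_path.

    Accepts bare base64 or a data URL ("data:image/png;base64,....").
    Returns the number of bytes written.
    """
    # Drop a data-URL prefix if one is present.
    if encoded.startswith("data:") and "," in encoded:
        encoded = encoded.split(",", 1)[1]
    raw = base64.b64decode(encoded)
    # write_bytes opens the file in binary mode, so nothing gets re-encoded.
    Path(out_path).write_bytes(raw)
    return len(raw)
```

Calling `save_base64_png(page_b64, "001.png")` would then write one page, where `page_b64` is whatever string the browser handed back.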

Iron-Breaker · Jan 28, 2024

BraveDude8 said:
Please don't tell me it's screenshotting the reader.

To put it short and simple, it's inspect-elementing its way through the reader, hence the need for the Selenium webdriver.

BraveDude8 · Jan 28, 2024

Iron-Breaker said:
It's not screenshotting. It gets the blob URL of each MangaDex chapter page, downloads the content, decodes it from base64, and writes it to a PNG file in binary mode.

You are making it far, far harder for yourself than it needs to be. Webscraping is useful for most sites, but we've got a documented public API for a reason.

https://api.mangadex.org/docs/04-chapter/retrieving-chapter/
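As a rough idea of what that page describes, here is a minimal Python sketch. It assumes the `GET /at-home/server/{chapter_id}` response carries a `baseUrl` plus a `chapter` object with `hash`, `data` and `dataSaver` fields, which is how I read the linked docs, so check them before relying on it. The function names and the User-Agent string are placeholders for this example.

```python
import json
import urllib.request

API_BASE = "https://api.mangadex.org"


def build_page_urls(server_info: dict, data_saver: bool = False) -> list[str]:
    """Turn an at-home server response into a list of page image URLs.

    Assumed shape: {"baseUrl": ..., "chapter": {"hash": ..., "data": [...],
    "dataSaver": [...]}}.
    """
    chapter = server_info["chapter"]
    quality = "data-saver" if data_saver else "data"
    files = chapter["dataSaver"] if data_saver else chapter["data"]
    base = server_info["baseUrl"]
    return [f"{base}/{quality}/{chapter['hash']}/{name}" for name in files]


def fetch_server_info(chapter_id: str) -> dict:
    """Ask the API which server hosts the pages of a chapter."""
    req = urllib.request.Request(
        f"{API_BASE}/at-home/server/{chapter_id}",
        headers={"User-Agent": "scrapdex-example/0.1"},  # placeholder value
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

With a chapter ID in hand, `build_page_urls(fetch_server_info(chapter_id))` should give plain image URLs that can be downloaded directly, with no browser or blob handling involved. Longstrip chapters come back the same way as any other.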

Iron-Breaker · Jan 28, 2024

BraveDude8 said:
You are making it far, far harder for yourself than it needs to be. Webscraping is useful for most sites, but we've got a documented public API for a reason.

https://api.mangadex.org/docs/04-chapter/retrieving-chapter/

1) I now know that I made it harder for myself.
2) I only learned the API existed a few hours ago😭
3) I did it as a fun web scraping project.
4) It still works (kinda)

Now that I know about the API, I will rewrite the whole thing from scratch and upload a new scraper in a few days. But I will still try to maintain this API-less scraper.

ieatass69 · Jan 28, 2024

Your installation steps are too much. Just package the binary with your repo or add a step that automatically downloads/updates the driver. You don't need to rely on Windows-exclusive commands to download it if you just use something like urllib or requests.
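Something along these lines would do the download part on any OS with only the standard library. This is a sketch, not a drop-in installer: `download_file` is a made-up helper name, and the driver URL is deliberately left as a parameter since the right release asset depends on the platform and driver version.

```python
import shutil
import urllib.request
from pathlib import Path


def download_file(url: str, dest: str) -> Path:
    """Stream the contents of url into dest and return the destination path."""
    target = Path(dest)
    # Create the destination folder if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url, timeout=60) as resp, open(target, "wb") as out:
        # Copy in chunks so large archives are not held fully in memory.
        shutil.copyfileobj(resp, out)
    return target
```

If the download is a `.zip` or `.tar.gz`, `shutil.unpack_archive` can extract it afterwards. Also worth knowing: Selenium 4.6 and later ship Selenium Manager, which fetches a matching driver automatically, so a manual download step may not be needed at all.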

techie410 · Jan 28, 2024

Iron-Breaker said:
1) I now know that I made it harder for myself.
2) I only learned the API existed a few hours ago😭
3) I did it as a fun web scraping project.
4) It still works (kinda)

Now that I know about the API, I will rewrite the whole thing from scratch and upload a new scraper in a few days. But I will still try to maintain this API-less scraper.

Haha! We all need a silly exercise in programming from time to time. I once made an API user interface for an assets price list completely in bash. Why? God knows.

Iron-Breaker · Jan 28, 2024

ieatass69 said:
Your installation steps are too much. Just package the binary with your repo or add a step that automatically downloads/updates the driver. You don't need to rely on Windows-exclusive commands to download it if you just use something like urllib or requests.

OK, I will try to shorten it.

Iron-Breaker · Jan 28, 2024

techie410 said:
Haha! We all need a silly exercise in programming from time to time. I once made an API user interface for an assets price list completely in bash. Why? God knows.

Hell yea!!
