Software dev wants to make scanlation tools

Joined
Sep 23, 2018
Messages
3
I've got some programming experience and I'd like to use it to make something that helps with scanlation.
But I don't know what people need -- google turns up a few projects that look useful and a few people saying they just use photoshop for everything.

- I've been thinking of trying neural network cleaning / redrawing. Has a lot of potential, but requires a lot of data.
- Typesetting seems to have tools already, but if there's anything to improve there, I'd be happy to try.
- Full machine translation is hard and produces bad quality, but there might be a way to do good machine-assisted translation.
- Or it could be that the biggest gains come from linking up other programs / tools.

If you have an idea for a program that would help your group, I'd like to help out!
 
Member
Joined
Jan 21, 2018
Messages
90
I'm not sure if AI nowadays is sophisticated enough to do any of those. I mean, waifu2x just enlarges pictures and that apparently is an accomplishment.
 
Group Leader
Joined
Feb 4, 2018
Messages
245
A good OCR. Currently using Capture2Text, but it tends to make dumb mistakes:

- confuses small and large kana
- scans certain dakuten kana as handakuten kana
- tends to recognise 'し' as 'L'
- quality generally suffers depending on the font
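
For the 'し' read as 'L' case, a rough post-processing pass might catch some of it. This is only a sketch under my own assumptions, not anything Capture2Text does: the function name `fix_latin_l` and the rule ("a lone 'L' touching Japanese text is probably 'し'") are made up for illustration. The small/large kana and dakuten/handakuten mix-ups can't be fixed this way, since both readings are valid characters and would need a dictionary or a better recognizer.

```python
def _is_japanese(ch: str) -> bool:
    """True for hiragana, katakana, or common CJK ideographs."""
    return "\u3040" <= ch <= "\u30ff" or "\u4e00" <= ch <= "\u9fff"


def _is_latin_letter(ch: str) -> bool:
    """True for a single ASCII letter (False for the empty string)."""
    return ch.isascii() and ch.isalpha()


def fix_latin_l(text: str) -> str:
    """Replace a lone 'L' with 'し' when it sits next to Japanese text.

    Heuristic (an assumption, not a guarantee): an 'L' that touches a
    Japanese character and is not part of a Latin word is an OCR error.
    """
    out = []
    for i, ch in enumerate(text):
        prev = text[i - 1] if i > 0 else ""
        nxt = text[i + 1] if i + 1 < len(text) else ""
        touches_japanese = _is_japanese(prev) or _is_japanese(nxt)
        inside_latin_word = _is_latin_letter(prev) or _is_latin_letter(nxt)
        if ch == "L" and touches_japanese and not inside_latin_word:
            out.append("し")
        else:
            out.append(ch)
    return "".join(out)


print(fix_latin_l("わたLは"))  # the L is between two kana, so it is replaced
print(fix_latin_l("HTMLを"))   # the L is part of a Latin word, so it is kept
```

It will still get things wrong, for example a genuine single letter "L" used as a size or a label right next to Japanese text.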

Also, by default it's not portable to Linux. The CLI part (which is what I'm interested in, as it integrates with another program) does work on Linux when you pile a dozen hacks on top of it, but it behaves differently from the Windows version (it tends to make fewer mistakes on specific fonts, but on others it completely breaks while the previous version somehow works).

Computer-assisted translation is still a relatively unexplored area, but I'm currently working on a program of my own which satisfies all my use cases there (though it could satisfy others' needs too).

Regarding typesetting, some people really dislike Photoshop. Even disregarding the price tag, it tends to crash every now and then. I also find it annoying to automate. I'm currently using Photoshop, but not willingly.
 
Member
Joined
Apr 18, 2018
Messages
1,274
@milleniumbug
I agree with you, a good OCR.
I have never tried an OCR that directly translates sentences, like Capture2Text or ABBYY FineReader, because I want to avoid the common mistakes you mentioned. It would be more helpful as a shortcut, though.

There is software that I have used for a long time and find very helpful; I highly recommend it to all translators. To use this OCR, you need to already understand the Japanese writing system, since it's not based on Google Translate. This depends on the person, but I prefer to read directly and use it more like a dictionary, because I know how the sentence patterns work, and it helps when there are some kanji that I forgot.

NW1N2pr.jpg
Good, simple but extraordinary. The name suggests it is for kanji specifically, but it can also help with some words in katakana (not reliably, only if you're lucky).
There is an important factor to consider: the images used must be clear. If the image is not clear (well, as long as it's not blurry it's fine), it will be difficult to detect the characters. Usually it misreads a character as another one with the same radical, or misses the rest of the characters. Fortunately, these disadvantages can be overcome by using the "zoom" feature, and it gets easier if the image has been edited beforehand.

Your software also looks good and simpler. I'll try it next time.

Well, I'm not a part of scanlation groups because I only read for my own needs.
But as someone who can edit, I've been using Photoshop since 2010 and I think (for redrawing) it's already good enough, even more than that. The most important thing is the tools. I'm having a hard time imagining a "simpler" one, because a lot of things need to be fixed in the image (you have to clean the image with various techniques, cloning features, and so on)... like you said, it needs a lot of data.

I have no problems with the latest Photoshop CC 2018; whether it runs well or not sometimes depends on the specifications of each computer.
 
Active member
Joined
Jan 22, 2018
Messages
123
anyone who would think this isn't possible probably also thinks that decensoring hentai isn't possible with software scripts
 
Group Leader
Joined
Jan 18, 2018
Messages
494
I also think a good OCR program would make life easier for many TLs, especially if it could somehow be combined with a pad for writing kanji.
 
Instrumentality Instigator
Staff
Super Moderator
Joined
Jan 29, 2018
Messages
1,343
We encourage anything that helps scanlator groups put out more stuff. If a program could be developed that would make TLs' lives easier, we're all for it.
 
Joined
Sep 23, 2018
Messages
3
Looks like OCR is what most people want, which is a bit interesting, since it seems like there are a few that exist already.
Is there a reason why existing stuff isn't useful? @goroyanpai mentions that KanjiTomo doesn't perform well on katakana or blurry images, which probably should be fixable.
I'll get started, see if I can improve upon what's out there.
 
Member
Joined
Apr 18, 2018
Messages
1,274
Well, it doesn't work for katakana because the OCR is not made for that, only for kanji.
But some common words do show up.

For example :
ホーム (Home)
アイス・クリーム (Ice Cream)
アイ・ラブ・ユー (I Love You)
スウィッチ (Switch), etc.

It is useful. But if anyone wants to make it better by looking at the shortcomings of existing OCR, why not? As I already mentioned, there is still something to improve.
The OCR that you make could:
- Include translations for hiragana, katakana and kanji.
Maybe you can try two forms: one that captures text, as milleniumbug mentioned, and a second one that works on images, like the software I use.
Also, katakana is hard in my case, so maybe you can improve on this part, for example how to read English names or places.
For example: スミス (Smith) or エリザベス (Elizabeth)
- Provide many example sentences and explanations.
This one may need help from people who understand Japanese.
- Dictionary service
- etc.
I don't know what kind of dictionary the OCR I use relies on; maybe you can try adding the Kenkyusha dictionary, which is called the "Green Goddess". I have an EPWING version, but I don't know how it works.

And..
If you want to make a better, much simpler version of Photoshop, why don't you try it out?
 
Group Leader
Joined
Feb 19, 2018
Messages
134
I want an OCR that can at least read characters top to bottom, instead of right to left.
 
Joined
Apr 29, 2018
Messages
83
I've tried Capture2Text, KanjiTomo, and online OCRs, but they don't seem to pick up a lot of stuff right, especially vertical text, fonts that are supposed to have a handwritten look, and text on the side of the panels, like when a character is whispering or thinking/monologuing something. Also, sometimes the OCRs pick up characters as shapes or numbers for some reason.

Some of it could be the mix of Hiragana, Katakana and Kanji confusing it I guess.

As for a program that can help in translating stuff, if it actually works and isn't like Google/Bing/Microsoft Translate, then I am all for it. It would be especially cool if it could convert characters into romaji as well.
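
The romaji part at least looks doable for kana. Below is a minimal sketch, assuming a simplified Hepburn-style mapping; the names `to_hiragana` and `to_romaji` are hypothetical, long vowels are written by repeating the vowel ("hoomu") instead of using a macron, and kanji or small vowels like ィ are passed through untouched because they would need a dictionary or extra rules.

```python
VOWELS = "aiueo"
ROWS = {
    "": "あいうえお", "k": "かきくけこ", "s": "さしすせそ", "t": "たちつてと",
    "n": "なにぬねの", "h": "はひふへほ", "m": "まみむめも", "r": "らりるれろ",
    "g": "がぎぐげご", "z": "ざじずぜぞ", "d": "だぢづでど", "b": "ばびぶべぼ",
    "p": "ぱぴぷぺぽ",
}
# Regular consonant + vowel syllables, then the irregular ones.
TABLE = {kana: c + v for c, row in ROWS.items() for kana, v in zip(row, VOWELS)}
TABLE.update({"し": "shi", "ち": "chi", "つ": "tsu", "ふ": "fu", "じ": "ji",
              "ぢ": "ji", "づ": "zu", "や": "ya", "ゆ": "yu", "よ": "yo",
              "わ": "wa", "を": "o", "ん": "n"})
SMALL_Y = {"ゃ": "ya", "ゅ": "yu", "ょ": "yo"}


def to_hiragana(text: str) -> str:
    """Shift katakana to hiragana; the two blocks are 0x60 code points apart."""
    return "".join(chr(ord(c) - 0x60) if "ァ" <= c <= "ヶ" else c for c in text)


def to_romaji(text: str) -> str:
    out = []
    double_next = False
    for ch in to_hiragana(text):
        if ch == "っ":  # small tsu doubles the next consonant
            double_next = True
            continue
        if ch in SMALL_Y and out and len(out[-1]) > 1 and out[-1].endswith("i"):
            base = out[-1][:-1]  # "ki" -> "k", "shi" -> "sh"
            # shi+ya -> sha, chi+yo -> cho, ji+yu -> ju; otherwise ki+ya -> kya
            tail = SMALL_Y[ch][1:] if base.endswith(("sh", "ch", "j")) else SMALL_Y[ch]
            out[-1] = base + tail
            continue
        if ch == "ー" and out and out[-1][-1] in VOWELS:
            out.append(out[-1][-1])  # long vowel mark: repeat the last vowel
            continue
        roma = TABLE.get(ch, ch)  # unknown characters pass through unchanged
        if double_next and ch in TABLE and roma[0] not in VOWELS:
            roma = ("t" if roma.startswith("ch") else roma[0]) + roma
        double_next = False
        out.append(roma)
    return "".join(out)


for word in ["ホーム", "スミス", "エリザベス", "きっぷ", "しゃしん"]:
    print(word, to_romaji(word))
```

This only covers pronunciation of kana as written. It won't handle particles like は read as "wa", and it says nothing about which English name a katakana word stands for.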
 
