Tutorial For Hardsub Ripping

Quickstart

Upload a picture (or a batch of pictures as a .zip file);
Click “Execute”;
Wait for the script execution;
Read & copy the line from the pseudo terminal.

Advanced Settings

This section introduces the advanced settings that may be useful.

Optional OCR Modes

There are three OCR modes available: Full Scale (which is the default and scans the whole picture), Auto Cropping (which scans the bottom part of the picture where most hardsubs appear), and Custom Area (which allows you to scan a specific area).

Auto Cropping is helpful when your picture has logos (like 腾讯视频), which may result in redundant and unwanted characters. The area that is scanned in Auto Cropping mode goes as follows:

If you choose Custom Area, you can type the coordinates of the area you want to scan (which must be rectangular) in the Text Area box in the following format: top left x, top left y, bottom right x, bottom right y (with the top left corner of the original picture being 0,0). For example, if you have a 1280 x 720 picture, the Text Area can be “200,550,1100,700”.

Save the result as a .txt file

If you upload a batch of pictures as a .zip file, you may want to check the “Save the result as a .txt file” box, which will automatically generate a “result.txt” that contains the recognized lines for you when the execution is over, so that you won’t have to copy and paste all the lines on your own.

Notes & Tips

The server is located in Beijing and only has a bandwidth of 1MB and an 8GB RAM, so it may take you a long time to upload and process a massive zip file. If there are a lot of pictures that you want to transcribe, it will be best if you preprocess them first by batch cropping out the text areas before you upload them. (This app might help: https://www.addictivetips.com/windows-tips/batch-crop-images-on-windows-10/)
The maximal file size allowed is 100MB.
The accuracy of the OCR part is generally satisfactory, but there still may be typos (in my test set 偌 was misrecognized as 诺 and 斩 –> 新). It will be best to double check the result after it’s out. Most typos are easy to identify if you are familiar with Chinese.
Please make sure that there are no Chinese characters in the names of the files that you upload.
The pictures in a .zip file will be scanned in a numerical and alphabetical order (number goes first, i.e., “1a.png” will be scanned earlier than “a1.png”, and “a1.png” earlier than “a2.png”), so you should rename the pictures if their names are random.

Best Practice

Take screenshots of the video whenever a new line pops up;
(Optional) Crop the screenshots to reduce their sizes;
Upload the screenshots and check “Save the result as a .txt file”;
Execute the script;
Download the result and check for possible typos.

Credits

bugy/script-server: https://github.com/bugy/script-server

PaddleOCR/PaddleOCR: https://github.com/PaddlePaddle/PaddleOCR

Valar Morghulis

知进退，明得失。尽人事，听天命。