What is EasyOCR?

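EasyOCR is a Python library for optical character recognition (OCR): it detects text regions in an image and recognizes the characters in them.
It is used through a Reader object, created as easyocr.Reader(lang_list, gpu, model_storage_directory, download_enabled, user_network_directory, recog_network, detector, recognizer), where: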

lang_list(list) : List of language codes you want to recognize, for example ['ko', 'en']. You can see the list of language codes here.
gpu(bool, string, default=True) : Enable GPU.
model_storage_directory(string, default=None) : Path to the directory for model data. If not specified, models are read from a directory defined by the environment variable EASYOCR_MODULE_PATH (preferred), MODULE_PATH (if defined), or ~/.EasyOCR/.
download_enabled(bool, default=True) : Enable download if EasyOCR is unable to locate model files.
user_network_directory(string, default=None) : Path to a user-defined recognition network. If not specified, models are read from MODULE_PATH + '/user_network' (~/.EasyOCR/user_network).
recog_network(string, default='standard') : Instead of the standard mode, you can choose your own recognition network.
detector(bool, default=True) : Load the detection model into memory.
recognizer(bool, default=True) : Load the recognition model into memory.
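
As a minimal sketch of the constructor arguments above: the helper names `resolve_module_path` and `make_reader` are hypothetical and introduced only for illustration. The first function mirrors the lookup order described for model_storage_directory; it is not the library's own code. The second assumes the `easyocr` package is installed and imports it lazily.

```python
import os


def resolve_module_path(env=None):
    """Return the base directory following the lookup order described above.

    Illustration only: EASYOCR_MODULE_PATH first, then MODULE_PATH,
    then the ~/.EasyOCR/ fallback.
    """
    env = os.environ if env is None else env
    return (
        env.get("EASYOCR_MODULE_PATH")
        or env.get("MODULE_PATH")
        or os.path.expanduser("~/.EasyOCR/")
    )


def make_reader(languages, use_gpu=False, model_dir=None):
    """Create a Reader (hypothetical wrapper; requires easyocr to be installed)."""
    import easyocr  # imported lazily so resolve_module_path works without it

    return easyocr.Reader(
        languages,
        gpu=use_gpu,
        model_storage_directory=model_dir,  # None means the default location
        download_enabled=True,
    )
```

A typical call would be `reader = make_reader(['ko', 'en'])`. Creating the Reader once and reusing it avoids loading the models into memory repeatedly.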

Methods of EasyOCR

Reader.readtext(image, decoder, beamWidth, batch_size, workers, allowlist, blocklist, detail, paragraph, min_size, rotation_info, contrast_ths, adjust_contrast, text_threshold, low_text, link_threshold, canvas_size, mag_ratio, slope_ths, ycenter_ths, height_ths, width_ths, add_margin, x_ths, y_ths) -> list of results

Main method of the Reader object. It returns a list of results.

image(string, numpy array, byte) : Input image
decoder(string, default='greedy') : Options are 'greedy', 'beamsearch' and 'wordbeamsearch'.
beamWidth(int, default=5) : How many beams to keep when decoder = 'beamsearch' or 'wordbeamsearch'.
batch_size(int, default=1) : batch_size>1 will make EasyOCR faster but use more memory.
workers(int, default=0) : Number of threads used in the dataloader.
allowlist(string, default=None) : Force EasyOCR to recognize only a subset of characters.
blocklist(string, default=None) : Block a subset of characters. This argument is ignored if allowlist is given.
detail(int, default=1) : Set this to 0 for simple output.
paragraph(bool, default=False) : Combine results into paragraphs.
min_size(int, default=10) : Filter out text boxes smaller than this value, in pixels.
rotation_info(list, default=None) : Allow EasyOCR to rotate each text box and return the result with the best confidence score. Eligible values are 90, 180 and 270; for example, it can be set to [90, 180, 270].
contrast_ths(float, default=0.1) : Text boxes with contrast lower than this value are passed into the model twice: first with the original image, then with contrast adjusted to the adjust_contrast value. The result with the higher confidence is returned.
adjust_contrast(float, default=0.5) : Target contrast level for low-contrast text boxes.
text_threshold(float, default=0.7) : Text confidence threshold
low_text(float, default=0.4) : Text low-bound score
link_threshold(float, default=0.4) : Link confidence threshold
canvas_size(int, default=2560) : Maximum image size. Image bigger than this value will be resized down.
mag_ratio(float, default=1.0) : Image magnification ratio
slope_ths(float, default=0.1) : Maximum slope (delta y/delta x) for boxes to be considered for merging. A low value means tilted boxes will not be merged.
ycenter_ths(float, default=0.5) : Maximum shift in the y direction. Boxes at different levels should not be merged.
height_ths(float, default=0.5) : Maximum difference in box height. Boxes with very different text sizes should not be merged.
width_ths(float, default=0.5) : Maximum horizontal distance to merge boxes.
add_margin(float, default=0.1) : Extend bounding boxes in all directions by a certain value. This is important for languages with complex scripts.
x_ths(float, default=1.0) : Maximum horizontal distance to merge text boxes when paragraph=True.
y_ths(float, default=0.5) : Maximum vertical distance to merge text boxes when paragraph=True.
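
The following sketch shows how a few of these arguments might be passed and how the result list could be consumed. It assumes that, with detail=1 and paragraph=False, each result is a (bounding box, text, confidence) triple; that format is not stated in the text above, so check it against your installed version. The helper names `run_ocr` and `keep_confident` are hypothetical.

```python
def run_ocr(reader, image, digits_only=False):
    """Call readtext with explicit values for a handful of its parameters."""
    kwargs = {
        "decoder": "greedy",
        "batch_size": 1,
        "detail": 1,         # full output rather than text only
        "paragraph": False,  # keep one entry per detected text box
    }
    if digits_only:
        # allowlist restricts recognition to these characters
        kwargs["allowlist"] = "0123456789"
    return reader.readtext(image, **kwargs)


def keep_confident(results, min_confidence=0.5):
    """Return the text of results at or above min_confidence.

    Assumes each result is a (bbox, text, confidence) triple.
    """
    return [
        text
        for _bbox, text, confidence in results
        if confidence >= min_confidence
    ]
```

With a Reader instance named `reader`, usage could look like `keep_confident(run_ocr(reader, 'receipt.png'), 0.6)`, where the file name is a placeholder.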

Reader.detect(image, min_size, text_threshold, low_text, link_threshold, canvas_size, mag_ratio, slope_ths, ycenter_ths, height_ths, width_ths, add_margin, optimal_num_chars)

Method for detecting text boxes. It returns two lists: horizontal_list and free_list.

image(string, numpy array, byte) : Input image.
min_size(int, default=10) : Filter out text boxes smaller than this value, in pixels.
text_threshold(float, default=0.7) : Text confidence threshold.
low_text(float, default=0.4) : Text low-bound score.
link_threshold(float, default=0.4) : Link confidence threshold
canvas_size(int, default=2560) : Maximum image size. Image bigger than this value will be resized down.
mag_ratio(float, default=1.0) : Image magnification ratio
slope_ths(float, default=0.1) : Maximum slope for boxes to be considered for merging. A low value means tilted boxes will not be merged.
ycenter_ths(float, default=0.5) : Maximum shift in the y direction. Boxes at different levels should not be merged.
height_ths(float, default=0.5) : Maximum difference in box height. Boxes with very different text sizes should not be merged.
width_ths(float, default=0.5) : Maximum horizontal distance to merge boxes.
add_margin(float, default=0.1) : Extend bounding boxes in all directions by a certain value. This is important for languages with complex scripts.
optimal_num_chars(int, default=None) : If specified, bounding boxes with an estimated number of characters near this value are returned first.
horizontal_list : A list of rectangular text boxes, each in the format [x_min, x_max, y_min, y_max].
free_list : A list of free-form text boxes, each in the format [[x1, y1], [x2, y2], [x3, y3], [x4, y4]].
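
The two box formats differ in layout, so a small conversion can make them easier to handle uniformly. The sketch below turns a horizontal_list entry into four corner points in the free_list layout. The function names are hypothetical, and the clockwise-from-top-left corner order is an assumption made for this example.

```python
def horizontal_to_corners(box):
    """Convert [x_min, x_max, y_min, y_max] into four [x, y] corner points.

    Corners are listed clockwise from the top-left (an assumed ordering).
    """
    x_min, x_max, y_min, y_max = box
    return [
        [x_min, y_min],  # top-left
        [x_max, y_min],  # top-right
        [x_max, y_max],  # bottom-right
        [x_min, y_max],  # bottom-left
    ]


def box_size(box):
    """Return (width, height) of a horizontal_list entry."""
    x_min, x_max, y_min, y_max = box
    return x_max - x_min, y_max - y_min
```

Note that the rectangular format stores both x values before both y values, which is easy to confuse with the more common [x_min, y_min, x_max, y_max] layout.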

Reader.recognize(image, horizontal_list, free_list, decoder, beamWidth, batch_size, workers, allowlist, blocklist, detail, paragraph, contrast_ths, adjust_contrast)

Method for recognizing characters from text boxes. If horizontal_list and free_list are not given, it treats the whole image as one text box. It returns a list of results.

image(string, numpy array, byte) : Input image.
horizontal_list(list, default=None) : Format from output of detect method.
free_list(list, default=None) : Format from output of detect method.
decoder(string, default='greedy') : Options are 'greedy', 'beamsearch' and 'wordbeamsearch'.
beamWidth(int, default=5) : How many beams to keep when decoder = 'beamsearch' or 'wordbeamsearch'.
batch_size(int, default=1) : batch_size>1 will make EasyOCR faster but use more memory.
workers(int, default=0) : Number of threads used in the dataloader.
allowlist(string) : Force EasyOCR to recognize only a subset of characters. Useful for specific problems.
blocklist(string) : Block a subset of characters. This argument is ignored if allowlist is given.
detail(int, default=1) : Set this to 0 for simple output.
paragraph(bool, default=False) : Combine results into paragraphs.
contrast_ths(float, default=0.1) : Text boxes with contrast lower than this value are passed into the model twice: first with the original image, then with contrast adjusted to the adjust_contrast value. The result with the higher confidence is returned.
adjust_contrast(float, default=0.5) : Target contrast level for low-contrast text boxes.
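
Since recognize accepts the lists that detect produces, the two methods can be chained by hand, for example to filter boxes between the two steps. The sketch below shows one way to do that. `detect_then_recognize` and its `per_image_lists` flag are hypothetical: the flag covers the assumption that some EasyOCR versions wrap each returned list in an outer per-image list, so inspect what detect returns in your installation before relying on it.

```python
def detect_then_recognize(reader, image, per_image_lists=False, **recognize_kwargs):
    """Run detection, then pass the detected boxes to recognition.

    reader is expected to expose detect() and recognize() as described above.
    """
    horizontal_list, free_list = reader.detect(image)

    if per_image_lists:
        # Assumed variant: one list of boxes per input image; take the first.
        horizontal_list, free_list = horizontal_list[0], free_list[0]

    return reader.recognize(
        image,
        horizontal_list=horizontal_list,
        free_list=free_list,
        **recognize_kwargs,  # e.g. detail=0, allowlist="0123456789"
    )
```

For the common case where no filtering is needed, calling readtext directly is simpler, since it covers both detection and recognition parameters in one call.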
