Home EasyOCR
Post
Cancel

EasyOCR

What is EasyOCR?

It is used for OCR.
EasyOCR is used as easyocr.Reader(lang_list, gpu, model_storagedirectory, download_enabled, user_network_directory, recog_network, detector, recognizer) where,

  • lang_list(list) : List of language code you want to recognize, for example [‘ko’,’en’]. You can see the list of language codes here.

  • gpu(bool, string, default=True) : enable GPU

  • model_storagedirectory(string, default=None) : Path to directory for model data. If not specified, models will be read from a directory as defined by the environment variable EASYOCR_MODULE_PATH (preferred), MODULE_PATH (if defined), or ~/.EasyOCR/.

  • download_enabled(bool, default=True) : Enable download if EasyOCR is not able locate model files.

  • user_network_directory(bool, default=None) : Path to user-defined recognition network. If not specified, models will be read from MODULE_PATH + ‘/user_network’ (~/.EasyOCR/user_network).

  • recog_network(string, default=’standard’) : Instead of standard mode, you can choose your own recognition network.

  • detector(bool, default=True) : Load detection model into memory. Default = True

  • recognizer(bool, default=True) : Load recognition model into memory. Default = True



Methods of EasyOCR





Main method for Reader object. It returns a list of result.

  • image(string, numpy array, byte) : Input image

  • decoder(string, default=’greedy’) : options are ‘greedy’, ‘beamsearch’ and ‘wordbeamsearch’.

  • beamWidth(int, default=5) : How many beam to keep when decoder = ‘beamsearch’ or ‘wordbeamsearch’. Default = 5

  • bath_size(int, default=1) : batch_size>1 will make EasyOCR faster but use more memory.

  • workers(int, default=0) : Number thread used in of dataloader.

  • allowlist(string, default=None) : Force EasyOCR to recognize only subset of characters.

  • blocklist(string, default=None) : Block subset of character. This argument will be ignored if allowlist is given.

  • detail(int, default=1) : Set this to 0 for simple output.

  • paragraph(bool, default=False) : Combine result into paragraph.

  • min_size(int, default=10) : Filter text box smaller than minimum value in pixel.

  • rotation_info(list, default=None) : Allow EasyOCR to rotate each text box and return the one with the best confident score. Eligible values are 90, 180 and 270. For example, it can be set [90, 180, 270].

  • contrast_ths(float, default=0.1) : Text box with contrast lower than this value will be passed into model 2 times. First is with original image and second with contrast adjusted to ‘adjust_contrast’ value. The one with more confident level will be returned as a result.

  • adjust_contrast(float, default=0.5) : target contrast level for low contrast text box

  • text_threshold(float, default=0.7) : Text confidence threshold

  • low_text(foat, default=0.4) : Text low-bound score

  • link_threshold(float, default=0.4) : Link confidence threshold

  • canvas_size(int, default=2560) : Maximum image size. Image bigger than this value will be resized down.

  • mag_ratio(float, default=1.0) : Image magnification ratio

  • slope_ths(float, default=0.1) : Maximum slope (delta y/delta x) to considered merging. Low value means tiled boxes will not be merged.

  • ycenter_ths(float, default=0.5) : Maximum shift in y direction. Boxes with different level should not be merged.

  • height_ths(float, default=0.5) : Maximum different in box height. Boxes with very different text size should not be merged.

  • width_ths(float, default=0.5) : Maximum horizontal distance to merge boxes.

  • add_margin(float, default=0.1) : Extend bounding boxes in all direction by certain value. This is important for language with complex script.

  • x_ths(float, default=1.0) : Maximum horizontal distance to merge text boxes when paragraph=True.

  • y_ths(float, default=0.5) : Maximum vertical distance to merge text boxes when paragraph=True.





Method for detecting text boxes. It returns two lists which are horizontal_list, free_list.

  • image(string, numpy array, byte) : Input image.

  • min_size(int, default=10) : Filter text box smaller than minimum value in pixel.

  • text_threshold(float, default=0.7) : Text confidence threshold.

  • low_text(float, default=0.4) : Text low-bound score.

  • link_threshold(float, default=0.4) : Link confidence threshold

  • canvas_size(int, default=2560) : Maximum image size. Image bigger than this value will be resized down.

  • mag_ratio(float, default=1.0) : Image magnification ratio

  • slope_ths(float, default=0.1) : Maximum slope to considered merging. Low value means tiled boxes will not be merged.

  • ycenter_ths(float, default=0.5) : Maximum shift in y direction. Boxes with different level should not be merged.

  • height_ths(float, default=0.5) : Maximum different in box height. Boxes with very different text size should not be merged.

  • width_ths(float, default=0.5) : Maximum horizontal distance to merge boxes.

  • add_margin(float, default=0.1) : Extend bounding boxes in all direction by certain value. This is important for language with complex script

  • optimal_num_chars(float, default=0.1) : If specified, bounding boxes with estimated number of characters near this value are returned first.

  • horizontal_list : a list of rectangular text boxes. [x_min, x_max, y_min, y_max]

  • free_list : a list of free-form text boxes. [[x1, y1], [x2, y2], [x3, y3], [x4, y4]]



Reader.recognize(image, horizontal_list, free_list, decoder, beamWidth, batch_size, workers, allowlist, blocklist, detail, paragraph, contrast_ths, adjust_contrast)

Method for recognizing characters from text boxes. If horizontal_list and free_list are not given. It will treat the whole image as one text box. It returns list of results.

  • image(string, numpy array, byte) : Input image.

  • horizontal_list(list, default=None) : Format from output of detect method.

  • free_list(list, default=None) : Format from output of detect method.

  • decoder(string, default=’greedy’) : options are ‘greedy’, ‘beamsearch’ and ‘wordbeamsearch’.

  • beamWidth(int, default=5) : How many beam to keep when decoder = ‘beamsearch’ or ‘wordbeamsearch’.

  • batch_size(int, default=1) : batch_size>1 will make EasyOCR faster but use more memory.

  • workers(int, default=0) : Number thread used in of dataloader.

  • allowlist(string) : Force EasyOCR to recognize only subset of characters. Useful for specific problem.

  • blocklist(string) : Block subset of character. This argument will be ignored if allowlist is given.

  • detail(int, default=1) : Set this to 0 for simple output,

  • paragraph(bool, default=False) : Combine result into paragraph,

  • contrast_ths(float, default=0.1) : Text box with contrast lower than this value will be passed into model 2 times. First is with original image and second with contrast adjusted to ‘adjust_contrast’ value. The one with more confident level will be returned as a result.

  • adjust_contrast(float, default=0.5) : Target contrast level for low contrast text box.

This post is licensed under CC BY 4.0 by the author.