Name		Name	Last commit message	Last commit date
parent directory ..
notebooks		notebooks
README.md		README.md
image.png		image.png

README.md

L0 - 07: Optical Character Recognition (OCR)

Overview

This is an Optical Character Recognition (OCR) projet, a specific type of Image-To-Text technique using traditional computer vision and deep learning approaches, to recognize and extract text from images.

Popular OCR Tools

Name	Description	Pros	Cons
Tesseract-OCR (Tesseract)	Core OCR engine (C++)	Powerful, open-source	Not directly usable in Python
Tesserocr	Python wrapper for Tesseract-OCR C++ API	Less common, allows Python use	Might be less user-friendly than PyTesseract
PyTesseract	Python library for Tesseract-OCR	Simple to use, powerful	Requires separate Tesseract-OCR installation
EasyOCR	Separate Python library for OCR	Lightweight, good for receipts/PDFs, multi-language	Might not be as powerful as Tesseract-OCR for all tasks
Keras-OCR	Python library built on Keras/TensorFlow	User-friendly, deep learning approach	Might require more computing resources
Ocrad/GOCR	Open-source OCR engines	Established, free options	May have lower accuracy compared to newer libraries
ocropus/Calamari	Free, open-source OCR with LSTMs	Potentially high accuracy	May be more complex to set up
Amazon Textract	Cloud-based OCR service (AWS)	High accuracy, scalable, pre-built functionalities	Requires paid AWS account
Microsoft Azure Cognitive Services - Computer Vision	Cloud-based OCR solution (Microsoft)	Various features, text extraction, document processing	Requires paid Azure account
Google Cloud Vision API	Cloud-based OCR with image intelligence functionalities	Offered by Google Cloud Platform	Requires paid GCP account

Requirements

- tesseract-ocr
- libtesseract-dev
- pytesseract
- pillow
- io
- matplotlib.pyplot

Applications

Image-To-Text
Convert scanned documents to text
Handwritten text to machine digital text
Text extraction and Recognition from images

Pipeline

(Src: TheAILearner)

Usage

Open the notebook using Google Colab (link below), Kaggle, Jupyter notebook/lab or a similar tool.

Tip

Create a new cell and add the following command to download project resources from github to your Colab or Kaggle environment if needed:

!wget https://github.com/afondiel/computer-vision-hello-world-challenges/tree/main/06_Zero_Feature_Extraction_Alignment/image_missing_files.png

Notebook	Colab	Kaggle
Go to notebook

Contributing

If you want to contribute to this project, you are welcome to do so. You can either add new projects, improve existing ones, or fix bugs and errors.

Please follow these steps to contribute:

Fork this repository and clone it to your local machine.
Create a new branch with a descriptive name for your contribution.
Add your code and files to the branch and commit your changes.
Push your branch to your forked repository and create a pull request to the main repository.
Wait for your pull request to be reviewed and merged.

References

HuggingFace

https://huggingface.co/tasks/image-to-text

Keras:

Google - tesseract

OCR in age of LMM:

Tradition OCR vs Text Extraction from Image using Google Gemini

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

L0_07_Optical_Character_Recognition_OCR

L0_07_Optical_Character_Recognition_OCR

README.md

L0 - 07: Optical Character Recognition (OCR)

Overview

Requirements

Applications

Pipeline

Usage

Contributing

References

Files

L0_07_Optical_Character_Recognition_OCR

Directory actions

More options

Directory actions

More options

Latest commit

History

L0_07_Optical_Character_Recognition_OCR

Folders and files

parent directory

README.md

L0 - 07: Optical Character Recognition (OCR)

Overview

Requirements

Applications

Pipeline

Usage

Contributing

References