Optical Character Recognition (OCR) With Python Using Tesseract and PIL on BrainyPI
by Pushkar_Shepal in Circuits > Raspberry Pi
This blog provides a step-by-step guide to performing Optical Character Recognition (OCR) on images using Python. We will utilize the Tesseract OCR engine and the Python Imaging Library (PIL) to extract text from images. The goal is to demonstrate how to process an image, convert it to grayscale, and extract text using Tesseract OCR.
OCR has become increasingly important in today's digital age, as it enables the conversion of printed or handwritten text into machine-readable formats. This technology finds applications in diverse fields such as document digitization, automated form processing, text mining, and more. By automating the extraction of text from images, OCR streamlines workflows, enhances data accessibility, and opens up opportunities for further analysis and decision-making.
Python, with its extensive libraries and ease of use, provides a robust platform for implementing OCR solutions. In this guide, we will utilize the Tesseract OCR engine, an open-source OCR library known for its accuracy and versatility, along with the Python Imaging Library (PIL), a powerful library for image processing tasks.
The step-by-step instructions below cover the essential parts of the workflow: setting up the environment, loading an image, converting it to grayscale, extracting the text with Tesseract, and running the script on a BrainyPI device.
Whether you are a beginner looking to dive into the world of OCR or an experienced Python developer seeking to enhance your text extraction capabilities, this guide will equip you with the knowledge and tools needed to leverage OCR using Python and Tesseract.
So, let's get started and extract text from images using Python and Tesseract.
Supplies
Software: Tesseract OCR (the tesseract-ocr package), plus the Python libraries Pillow and pytesseract
Implementation: BrainyPI
Set Up the Environment
To get started, we need to set up our environment and install the necessary dependencies. Follow the instructions below for setting up the environment on Linux:
- Install Tesseract OCR:
sudo apt update
sudo apt install tesseract-ocr
- Install pytesseract:
pip install pytesseract
- Install Pillow (the maintained fork of PIL, the Python Imaging Library):
pip install Pillow
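Before moving on, it can help to confirm that all three pieces are visible from Python. The following is a minimal sketch that uses only the standard library. The function name check_ocr_environment is hypothetical, and the check only confirms that the tesseract binary is on the PATH and that the two packages can be located; it does not prove that OCR itself works.

```python
import importlib.util
import shutil


def check_ocr_environment():
    """Report which OCR dependencies can be found on this machine."""
    return {
        # The apt package installs a command-line binary named "tesseract".
        "tesseract": shutil.which("tesseract") is not None,
        # pytesseract is the Python wrapper that calls that binary.
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        # Pillow is imported under the package name "PIL".
        "Pillow": importlib.util.find_spec("PIL") is not None,
    }


for name, found in check_ocr_environment().items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

If any entry is reported as MISSING, repeat the matching install command above before continuing.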
Code Overview
- Import the necessary libraries:
from PIL import Image, ImageOps
import pytesseract
import argparse
import os
- Parse command line arguments:
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True, help="path to input image to be OCR'd")
args = vars(ap.parse_args())
- Open and preprocess the image:
image = Image.open(args["image"])
gray = ImageOps.grayscale(image)
- Save the grayscale image temporarily:
filename = "{}.png".format(os.getpid())
gray.save(filename)
- Perform OCR on the grayscale image:
text = pytesseract.image_to_string(Image.open(filename))
os.remove(filename)
- Print the extracted text:
print(text)
In the above code snippet, we import the necessary libraries, parse the command-line arguments, open the image and convert it to grayscale, save the grayscale image to a temporary file named after the process ID, run OCR on that file with pytesseract, remove the temporary file, and finally print the extracted text. Together, these steps form the complete image processing and OCR workflow.
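The temporary file is not strictly required, because pytesseract.image_to_string also accepts a PIL image object directly. The sketch below shows one possible variant of the same workflow without the intermediate file. The function names load_grayscale and extract_text are hypothetical, and the default lang="eng" assumes the English language data is installed alongside Tesseract.

```python
from PIL import Image, ImageOps


def load_grayscale(path):
    """Open the image at `path` and return a grayscale copy."""
    with Image.open(path) as image:
        # ImageOps.grayscale returns a new image, so the file can be closed.
        return ImageOps.grayscale(image)


def extract_text(path, lang="eng"):
    """Run Tesseract on a grayscale copy of the image and return the text."""
    # Imported here (an illustrative choice) so load_grayscale stays usable
    # on machines where only Pillow is installed.
    import pytesseract

    # image_to_string accepts a PIL image directly, so no temporary file
    # has to be written and removed.
    return pytesseract.image_to_string(load_grayscale(path), lang=lang)
```

With these helpers, print(extract_text("/path/to/image")) would produce the same kind of output as the step-by-step version, and nothing is left on disk if the script is interrupted.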
Implementing on BrainyPI
To implement the text recognition code on BrainyPI using GitLab, you can follow these steps:
a. Ensure that you have Git installed on your local machine. You can download Git from the official website and follow the installation instructions.
b. Create a Git repository on a GitLab server to host your text recognition code. You can create a new repository or use an existing one.
c. Clone the Git repository to your local machine using the following command:
git clone git@gitlab.com:your-username/your-repository.git
Replace your-username with your GitLab username and your-repository with the name of your GitLab repository.
d. Copy the text recognition Python file (text_recognition.py) to the local Git repository directory.
e. Commit the changes and push them to the remote GitLab repository using the following commands:
git add text_recognition.py
git commit -m "Added text recognition code"
git push origin main
Replace main with the branch name of your GitLab repository if it's different.
f. Open a terminal or command prompt on your local machine. Use the following command to establish an SSH connection with your BrainyPI device.
ssh -X pi@auth.iotiot.in -p 65530
g. In the SSH session, make sure Git is installed on your BrainyPI device. If it is not, install it using the package manager of the operating system.
h. Stay in the same SSH session for the remaining steps; the commands below run on the BrainyPI device.
i. Clone the Git repository from your GitLab server to your BrainyPI device using the following command:
git clone git@gitlab.com:your-username/your-repository.git
Replace your-username with your GitLab username and your-repository with the name of your GitLab repository.
j. Navigate to the cloned repository directory on your BrainyPI device.
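k. Make sure Tesseract, pytesseract, and Pillow are installed on the BrainyPI device (see Set Up the Environment), then run the text recognition code using the following command: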
python3 text_recognition.py --image /path/to/image
The script runs OCR on the given image and prints the extracted text to the terminal of your BrainyPI device.
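If the printed text contains obvious misreads, it can be useful to look at how confident Tesseract was in each word. pytesseract offers image_to_data, which, when called with output_type=pytesseract.Output.DICT, returns parallel lists under keys such as "text" and "conf". The sketch below filters such a dictionary; the helper name confident_words and the threshold of 60 are assumptions chosen for illustration, and the sample data is hand-written rather than real OCR output.

```python
def confident_words(data, min_conf=60.0):
    """Return (word, confidence) pairs whose confidence is at least min_conf."""
    words = []
    for text, conf in zip(data["text"], data["conf"]):
        conf = float(conf)  # conf may arrive as an int, float, or numeric string
        # Rows that describe layout rather than a word have empty text
        # and a confidence of -1, so they are skipped here.
        if text.strip() and conf >= min_conf:
            words.append((text, conf))
    return words


# Hand-written sample in the same shape as the pytesseract dictionary output.
sample = {"text": ["", "Hello", "w0rld"], "conf": [-1, 96.2, 41.0]}
print(confident_words(sample))  # [('Hello', 96.2)]
```

Words that fall below the threshold are good candidates for manual review or for retrying with a cleaner, higher-contrast image.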
Conclusion
In conclusion, this blog has provided a step-by-step guide to perform OCR on images using Python and Tesseract OCR. The code and explanations provided here will help you understand the process and enable you to extract text from images efficiently.