Introduction: Reading Text From Image or Video
In this blog, we will learn how to use `opencv` to read text from photos and videos.
1. This project is a Python application that uses computer vision and optical character recognition (OCR) to capture and read text from an image or video stream.
2. The code consists of two files: `ocr.py` and `ocrUtils.py`.
ocr.py
1. Imports necessary libraries including OpenCV, pytesseract, numpy, ImageGrab, time, and ocrUtils.
2. It defines a main function that does the following (a rough sketch follows this list):
a. Initializes a camera object (using either a Raspberry Pi camera or a connected webcam).
b. Captures an image from the camera.
c. Uses ocrUtils to read text from the captured image using pytesseract, an OCR engine that can recognize text from images.
d. Displays the captured image with text overlaid on it.
e. Waits for user input before capturing another image.
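The capture-and-read loop might look roughly like the sketch below. Note that `readCharacters` is an assumed name for the helper in `ocrUtils`, and the webcam index `0` may need adjusting for your setup; the actual code in the repository may differ (for example, when using a Raspberry Pi camera instead of a webcam).

```python
import cv2
import ocrUtils  # the repository's utility module; readCharacters is an assumed name


def main():
    cap = cv2.VideoCapture(0)  # 0 = default webcam; change the index if needed
    if not cap.isOpened():
        raise RuntimeError("Could not open the camera")

    while True:
        ok, frame = cap.read()
        if not ok:
            break

        # Run OCR on the frame and overlay boxes/labels on it.
        text, annotated = ocrUtils.readCharacters(frame, draw=True)
        print(text)

        cv2.imshow("OCR", annotated)
        # Wait for a key press before grabbing the next frame
        # (use cv2.waitKey(1) instead for a continuous video feed);
        # press 'q' to quit.
        if cv2.waitKey(0) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()


if __name__ == "__main__":
    main()
```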
ocrUtils.py
1. It contains a Python function that reads characters from an input image using the Tesseract OCR (Optical Character Recognition) engine. The function takes an input image as a `cv2.Mat` object and an optional `draw` parameter that, if set to `True`, draws boxes around the detected characters and labels them with the recognized text (see the sketch after this list).
2. The function first sets the path to the Tesseract executable using `pytesseract.pytesseract.tesseract_cmd = '/usr/bin/tesseract'`.
3. It then gets the height, width, and number of channels of the input image using `img.shape`. It uses the `pytesseract.image_to_string()` function to extract the text from the input image, and `pytesseract.image_to_boxes()` to get the coordinates of the bounding boxes around each character.
4. It then iterates over the bounding boxes, draws a rectangle around each character using `cv2.rectangle()`, and optionally labels the character with the recognized text using `cv2.putText()`. Finally, the function returns the recognized text and the modified input image (if `draw` is `True`).
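A minimal sketch of such a helper is shown below, assuming the function is named `readCharacters` (the actual name and details in `ocrUtils.py` may differ). One detail worth noting: `pytesseract.image_to_boxes()` reports coordinates with the origin at the bottom-left of the image, so the y values are flipped before drawing with OpenCV, whose origin is the top-left.

```python
import cv2
import pytesseract

# Path to the Tesseract binary (adjust for your system).
pytesseract.pytesseract.tesseract_cmd = '/usr/bin/tesseract'


def readCharacters(img, draw=True):
    # Height, width, and number of channels of the input image.
    hImg, wImg, _ = img.shape

    # Full recognized text of the image.
    text = pytesseract.image_to_string(img)

    if draw:
        # image_to_boxes() returns one line per character:
        # "<char> <x1> <y1> <x2> <y2> <page>", with the origin at the
        # bottom-left corner, so y coordinates are flipped for OpenCV.
        for box in pytesseract.image_to_boxes(img).splitlines():
            ch, x1, y1, x2, y2, _page = box.split(' ')
            x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)
            cv2.rectangle(img, (x1, hImg - y1), (x2, hImg - y2), (0, 0, 255), 2)
            cv2.putText(img, ch, (x1, hImg - y1 + 25),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (50, 50, 255), 2)

    return text, img
```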
Supplies
- Webcam
- Brainy Pi
- Keyboard
- Mouse
- Internet Connection
Step 1: Setting Up
1. Before we can run the code, we need to get the code and install all of its dependencies.
2. First, clone the repository and change into the text-detection example directory:
git clone https://github.com/brainypi/brainypi-opencv-examples.git
cd text-detection
3. Let us create a virtual environment:
python -m venv venv
4. Now, activate the virtual environment:
source venv/bin/activate
(On Windows, run `venv\Scripts\activate` instead.)
5. Install the dependencies (a sample `requirements.txt` is sketched below):
pip install -r requirements.txt
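If you need to create the requirements file yourself, a minimal `requirements.txt` for this project would look something like the following; the actual file in the repository may pin specific versions:

```
opencv-python
pytesseract
numpy
Pillow
```

Note that `pytesseract` is only a Python wrapper: the Tesseract engine itself must be installed on the system, for example with `sudo apt install tesseract-ocr` on Debian/Ubuntu-based systems, so that `/usr/bin/tesseract` exists.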
Step 2: Running the Code
Now we are ready to run the code.
1. Run the script to start reading text from the camera:
python ocr.py
2. Press `q` to exit the program.