Published on

Level 6

Table of Contents

OPENCV

OpenCV is an open Source package/library which is aimed at real time computer vision.It can be used for a variety of different applications including face detection, objection tracking, landmark detection, and much more.

Applications of OpenCV:

  • 2D and 3D feature toolkits
  • Facial recognition system
  • Gesture recognition
  • Human-computer interaction
  • Motion understanding
  • Object identification

Initially, it might be difficult to explore all the variety of functions that the OpenCV library offers, but as you work on more OpenCV-based projects, you will get the hang of it.

Getting started with images

Importing libraries

import cv2
import numpy as np
import matplotlib.pyplot as plt

Reading images using OpenCV

OpenCV allowa to read different format of images like PNG, JPG, etc. Image can be loaded in grayscal, color or using alpha channel. This uses cv2.imread() function. The syntax of the function is: image = cv2.imread( filename , [flags])

image is the required image if it loads successfully otherwise it's None

  1. filename: This can be the absolute path or the relative path of the image. It is a mandatory argument.
  2. Flags: These flags are use dto read image in a particular format. a. cv2.IMREAD_GRAYSCALE or 0: Loads image in grayscale b. cv2.IMREAD_COLOR or 1: Loads a color image. c. cv2.UNCHANGED or -1: Loads image including alpha channel
# Read image as gray scale.
image = cv2.imread("checkerboard_18x18.png",0)

# Print the image data (pixel values), element of a 2D numpy array.
# Each pixel value is 8-bits [0,255]
print(image)

Display images using Matplotlib

# Display image.
plt.imshow(image)
Data Types

What happened? Even though the image was read in as a gray scale image, it won't necessarily display in gray scale when using imshow(). matplotlib uses different color maps and it's possible that the gray scale color map is not set.

# Set color map to gray scale for proper rendering.
plt.imshow(cb_img, cmap='gray')
Data Types

Working with Color Images

Until now, we have been using gray scale images in our discussion. Let us now discuss color images.

# Read in image
coke_img = cv2.imread("coca-cola-logo.png",1)
plt.imshow(coke_img)
Data Types

Huhh! This isn't what we expected? The color displayed above is different from the actual image. This is because matplotlib expects the image in RGB format whereas OpenCV stores images in BGR format. Thus, for correct display, we need to reverse the channels of the image. We will discuss about the channels in the sections below.

coke_img_channels_reversed = coke_img[:, :, ::-1]
plt.imshow(coke_img_channels_reversed)
Data Types

Splitting Color Channels

cv2.split() Divides a multi-channel array into several single-channel arrays.

# Split the image into the B,G,R components
img_NZ_bgr = cv2.imread("New_Zealand_Lake.jpg",cv2.IMREAD_COLOR)
b,g,r = cv2.split(img_NZ_bgr)

# Show the channels
plt.figure(figsize=[20,5])
plt.subplot(141);plt.imshow(r,cmap='gray');plt.title("Red Channel");
plt.subplot(142);plt.imshow(g,cmap='gray');plt.title("Green Channel");
plt.subplot(143);plt.imshow(b,cmap='gray');plt.title("Blue Channel");

# Merge the individual channels into a BGR image
imgMerged = cv2.merge((b,g,r))
# Show the merged output
plt.subplot(144);plt.imshow(imgMerged[:,:,::-1]);plt.title("Merged Output");
Data Types

Converting to different Color Spaces

cv2.cvtColor() Converts an image from one color space to another. The function converts an input image from one color space to another. In case of a transformation to-from RGB color space, the order of the channels should be specified explicitly (RGB or BGR).

Function syntax: image = cv2.cvtColor( src, code ) The function has 2 required arguments:

  1. src input image
  2. code color space conversion code (see ColorConversionCodes).

OpenCV Documentation cv2.cvtColor(): Documentation

# OpenCV stores color channels in a differnet order than most other applications (BGR vs RGB).
image = cv2.cvtColor(img_NZ_bgr, cv2.COLOR_BGR2RGB)
plt.imshow(image)

Accessing the camera

import cv2

video = cv2.VideoCapture(0)
video.set(3,640)
video.set(4,480)
video.set(10,100)

while(True):
    success, img = video.read()
    cv2.imshow("Video",img)
    if cv2.waitKey(1) & 0xFF ==ord('q'):
        break

source.release()
cv2.destroyWindow(win_name)

Basic Image manipulation

import cv2
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

Original image

# Read image as gray scale.
cb_img = cv2.imread("checkerboard_18x18.png",0)

# Set color map to gray scale for proper rendering.
plt.imshow(cb_img, cmap='gray')
print(cb_img)

Accessing individual pixels

For accessing any pixel in a numpy matrix, you have to use matrix notation such as matrix[r,c], where the r is the row number and c is the column number. Also note that the matrix is 0-indexed.

# print the first pixel of the first black box
print(cb_img[0,0])
# print the first white pixel to the right of the first black box
print(cb_img[0,6])

Cropping images

Cropping an image is simply achieved by selecting a specific (pixel) region of the image.

cropped_region = img_NZ_rgb[200:400, 300:600]
plt.imshow(cropped_region)

Before

Data Types

After

Data Types

Resizing images

The function resize resizes the image src down to or up to the specified size. The size and type are derived from the src,dsize,fx, and fy.

resized_cropped_region_2x = cv2.resize(image,None,fx=2, fy=2)
plt.imshow(resized_cropped_region_2x)

Flipping images

The function flip flips the array in one of three different ways (row and column indices are 0-based):

Function Syntax dst = cv.flip( src, flipCode )

flipCode: a flag to specify how to flip the array; 0 means flipping around the x-axis and positive value (for example, 1) means flipping around y-axis. Negative value (for example, -1) means flipping around both axes.

img_NZ_rgb_flipped_horz = cv2.flip(img_NZ_rgb, 1)
img_NZ_rgb_flipped_vert = cv2.flip(img_NZ_rgb, 0)
img_NZ_rgb_flipped_both = cv2.flip(img_NZ_rgb, -1)

# Show the images
plt.figure(figsize=[18,5])
plt.subplot(141);plt.imshow(img_NZ_rgb_flipped_horz);plt.title("Horizontal Flip");
plt.subplot(142);plt.imshow(img_NZ_rgb_flipped_vert);plt.title("Vertical Flip");
plt.subplot(143);plt.imshow(img_NZ_rgb_flipped_both);plt.title("Both Flipped");
plt.subplot(144);plt.imshow(img_NZ_rgb);plt.title("Original");
Data Types

Image Annotation

Drawing a Line

We will use cv2.line function for this.

Function Syntax img = cv2.line(img, pt1, pt2, color, thickness, lineType, shift) Example

cv2.line(image, (200, 100), (400, 100), (0, 255, 255), thickness=5, lineType=cv2.LINE_AA);
Data Types

Drawing a Circle

We will use cv2.circle function for this.

Function Syntax img = cv2.circle(img, center, radius, color, thickness, lineType, shift) Example

cv2.circle(image, (900,500), 100, (0, 0, 255), thickness=5, lineType=cv2.LINE_AA);

Data Types

Drawing a Rectangle

We will use cv2.rectangle function to draw a rectangle on an image.

Function Syntax img = cv2.rectangle(img, pt1, pt2, color, thickness, lineType, shift) Example

cv2.rectangle(image, (500, 100), (700,600), (255, 0, 255), thickness=5, lineType=cv2.LINE_8);

Data Types

Adding Text

Finally, let's see how we can write some text on an image using cv2.putText function.

Function Syntax img = cv2.putText(img, text, org, fontFace, fontScale, color, thickness, lineType, bottomLeftOrigin) Example

cv2.putText(imageText, text, (200, 700), fontFace, fontScale, fontColor, fontThickness, cv2.LINE_AA);

Data Types

Image Enhancement

Changing brightness

The first operation we discuss is simple addition of images. This results in increasing or decreasing the brightness of the image since we are eventually increasing or decreasing the intensity values of each pixel by the same amount. So, this will result in a global increase/decrease in brightness.

matrix = np.ones(img_rgb.shape, dtype = "uint8") * 50

img_rgb_brighter = cv2.add(img_rgb, matrix)
img_rgb_darker   = cv2.subtract(img_rgb, matrix)
Data Types

Changing contrast

Contrast is the difference in the intensity values of the pixels of an image. Multiplying the intensity values with a constant can make the difference larger or smaller ( if multiplying factor is < 1 ).

matrix1 = np.ones(img_rgb.shape) * .8
matrix2 = np.ones(img_rgb.shape) * 1.2

img_rgb_darker   = np.uint8(cv2.multiply(np.float64(img_rgb), matrix1))
img_rgb_brighter = np.uint8(cv2.multiply(np.float64(img_rgb), matrix2))
Data Types

What happened? Can you see the weird colors in some areas of the image after multiplication?

The issue is that after multiplying, the values which are already high, are becoming greater than 255. Thus, the overflow issue. How do we overcome this?

img_rgb_higher = np.uint8(np.clip(cv2.multiply(np.float64(img_rgb), matrix2),0,255))

Image thresholding

Binary Images have a lot of use cases in Image Processing. One of the most common use cases is that of creating masks. Image Masks allow us to process on specific parts of an image keeping the other parts intact. Image Thresholding is used to create Binary Images from grayscale images. You can use different thresholds to create different binary images from the same original image.

Function Syntax retval, dst = cv2.threshold( src, thresh, maxval, type[, dst] ) dst: The output array of the same size and type and the same number of channels as src.

The function has 4 required arguments:

src: input array image thresh: threshold value. maxval: maximum value to use with the THRESH_BINARY and THRESH_BINARY_INV thresholding types. type: thresholding type (see ThresholdTypes).

img_read = cv2.imread("building-windows.jpg", cv2.IMREAD_GRAYSCALE)
retval, img_thresh = cv2.threshold(img_read, 100, 255, cv2.THRESH_BINARY)

# Show the images
plt.figure(figsize=[18,5])
plt.subplot(121); plt.imshow(img_read, cmap="gray");         plt.title("Original");
plt.subplot(122); plt.imshow(img_thresh, cmap="gray");       plt.title("Thresholded");

print(img_thresh.shape)
Data Types

Thresholding Tutorial

BITWISE operations

Example API for cv2.bitwise_and(). Others include: cv2.bitwise_or(), cv2.bitwise_xor(), cv2.bitwise_not()

img_rec = cv2.imread("rectangle.jpg", cv2.IMREAD_GRAYSCALE)

img_cir = cv2.imread("circle.jpg", cv2.IMREAD_GRAYSCALE)

plt.figure(figsize=[20,5])
plt.subplot(121);plt.imshow(img_rec,cmap='gray')
plt.subplot(122);plt.imshow(img_cir,cmap='gray')
print(img_rec.shape)
Data Types

Example Bitwise AND Operator

result = cv2.bitwise_and(img_rec, img_cir, mask = None)
plt.imshow(result,cmap='gray')
Data Types

Courses

This link here covers pretty much everything from numpy and then moves onto OpenCV - NumPy and OpenCV - Course

Follow these youtube playlists:

This covers all necessary concepts on OpenCV

Using OpenCV is something we will be working with a lot so make sure to go through this -

This will cover everything that is required to get started with OpenCV in python-

For further information on OpenCV refer their official website which has the required documentation,tutorials and github repos for OpenCV Home - OpenCV


Assignment

Assignment 1

Logo Manipulation In this assignment you have to fill in the white lettering of the Coca-Cola logo below with a background image.

Assignment 1