# OpenCV

import cv2
print(cv2.__version__)


OpenCV (Open Source Computer Vision) is a library of programming functions mainly aimed at real-time computer vision.

You can process images with it and, through its dnn module, run models trained in deep learning frameworks such as TensorFlow, Torch/PyTorch and Caffe.

## Images

Reading & displaying an image in OpenCV

import cv2

# Read the image from disk (placeholder file name)
img = cv2.imread("image_name.png")

# Display the image
cv2.imshow("WINDOW_NAME", img)
cv2.waitKey(0)
cv2.destroyAllWindows()


Functions we used here:

• cv2.imread() - to read an image
• cv2.imshow() - to display an image
• cv2.waitKey(5000) - wait 5s for a keyboard event.
• cv2.waitKey(0) waits indefinitely
• cv2.destroyAllWindows() - simply destroys all the windows we created (for resource management).

Write or save an image

# Save the image

cv2.imwrite('image_name.png', img)



Use cv2.imwrite() to save an image. The first argument is the file name, and the second argument is the image you want to save.

The file format is chosen from the extension you give in the file name (.png or .jpg).

## Videos

import cv2

cap = cv2.VideoCapture(0)
while True:
    # Capture frame-by-frame
    ret, frame = cap.read()
    if not ret:
        break
    # Our operations on the frame come here
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Display the resulting frame
    cv2.imshow('frame', gray)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()


To capture a video, you need to create a VideoCapture object. Its argument can be either the device index or the name of a video file.

• cv2.VideoCapture() - creates the object that captures the video. cap.isOpened() returns True if the camera or file was opened successfully.
• cap.read() - returns a bool (True/False) telling whether a frame was read. The second returned value (frame in this case) is the actual frame when the first value is True.
• cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) - this is our OpenCV operation where we are converting our frame from BGR to a grayscale frame.
• cv2.VideoCapture("FILE_NAME.mp4") - instead of capturing from a camera, you can directly read the video file saved on your disk as well.
• make sure you have a proper version of ffmpeg or gstreamer installed.

## Drawing Functions

Drawing geometric shapes with OpenCV

import numpy as np
import cv2

# Create a black image
img = np.zeros((512,512,3), np.uint8)

# Draw a diagonal blue line with thickness of 5 px

img = cv2.line(img,(0,0),(511,511),(255,0,0),5)

# Draw a rectangle
img = cv2.rectangle(img,(384,0),(510,128),(0,255,0),3)

# Draw a circle
img = cv2.circle(img,(447,63), 63, (0,0,255), -1)

# Draw an ellipse
img = cv2.ellipse(img,(256,256),(100,50),0,0,180,255,-1)

# Draw a polygon
pts = np.array([[10,5],[20,30],[70,20],[50,10]], np.int32)
pts = pts.reshape((-1,1,2))
img = cv2.polylines(img,[pts],True,(0,255,255))

# Write some text
font = cv2.FONT_HERSHEY_SIMPLEX
cv2.putText(img,'OpenCV',(10,500), font, 4,(255,255,255),2,cv2.LINE_AA)

cv2.imshow('image',img)
cv2.waitKey(0)
cv2.destroyAllWindows()


OpenCV allows us to perform all kinds of operations like drawing shapes, writing text, positioning them, etc.

Let us see how it is done:

• img - the image where we want to draw the shape.
• color - remember, in OpenCV we use BGR, so (255, 0, 0) represents Blue and not Red.
• cv2.line - draw on img, a line from (0, 0) to (511, 511), in Blue with a thickness of 5 px.
• cv2.rectangle - draw a rectangle on img with (384, 0) as top-left and (510, 128) as bottom-right corner, in GREEN (0, 255, 0) with an edge thickness of 3 px.
• cv2.circle - draw a circle on img, with (447, 63) as center and a radius of 63 px, in RED (0, 0, 255), fully filled in (-1).
• cv2.ellipse - draw an ellipse on img, with (256, 256) as center, (100, 50) as the half-lengths of the (major, minor) axes, rotated by 0 degrees, with (0, 180) as startAngle and endAngle (basically half an ellipse), in Blue (the scalar 255 sets only the first channel), fully filled in (-1).
• np.array - create a list of vertices.
• pts.reshape((-1, 1, 2)) - reshape the array to ROWx1x2 shape
• cv2.polylines - draw polylines on img, between pts points, and True to connect first and last points, with Yellow color (0, 255, 255)
• cv2.FONT_HERSHEY_SIMPLEX - use inbuilt font.
• cv2.putText - write on img, words OpenCV at (10, 500) location, using HERSHEY font, with a size of 4, in White color, with a thickness of 2 using Anti-Aliasing (cv2.LINE_AA).

## Image Operations

Accessing Pixel Values

import cv2
import numpy as np

img = cv2.imread("image_name.png")  # placeholder file name
px = img[100, 100]
print(px)
# [157 166 200]
# accessing only the blue channel of the pixel
blue = img[100, 100, 0]
print(blue)
# 157

# You can modify the pixel values the same way
img[100, 100] = [255, 255, 255]
print(img[100, 100])
# [255 255 255]


Accessing Pixel Values

Almost all of these operations rely on Numpy rather than OpenCV, so a good knowledge of Numpy is required to write better-optimized code with OpenCV.

Loading an image is simple, just use the cv2.imread() function.

You can access a pixel value by its row and column coordinates. For a BGR image, it returns an array of Blue, Green and Red values. For a grayscale image, its corresponding intensity is returned.

You can modify the pixel value by assigning it new (B, G, R) values.
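Because the image is a plain Numpy array, whole channels can be changed with one slice instead of a pixel-by-pixel loop. The sketch below uses a tiny synthetic array as a stand-in for the result of cv2.imread().

```python
import numpy as np

# Synthetic 3x3 BGR image standing in for a loaded image
img = np.zeros((3, 3, 3), np.uint8)
img[1, 1] = [157, 166, 200]  # assign (B, G, R) at row 1, column 1

blue = img[1, 1, 0]  # index 0 is the blue channel
img[:, :, 2] = 0     # zero the red channel of every pixel at once

print(blue, img[1, 1])
```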

Image Properties

# Print image shape
print(img.shape)
# (342, 548, 3) would be different for your image

# Print image size
print(img.size)
# 562248

# Print image datatype
print(img.dtype)
# uint8


Access Image Properties

Image properties include the number of rows, the columns, and the channels, type of image data, number of pixels etc.

The shape of the image is accessed via img.shape. It returns a tuple of the number of rows, columns and channels (if image is color).

If the image is grayscale, tuple returned contains only the number of rows and columns. So it is a good method to check if the loaded image is a grayscale or a colored image.

The total number of elements (rows × columns × channels) is accessed by img.size.

Image datatype is obtained by img.dtype.

img.dtype is very important while debugging because a large number of errors in OpenCV-Python code are caused by an invalid datatype.
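These properties can be collected in one place. The describe helper below is a hypothetical name used only for this sketch, and the two arrays are synthetic stand-ins for a color and a grayscale image.

```python
import numpy as np


def describe(img):
    """Summarize the shape-derived properties of an image array."""
    rows, cols = img.shape[:2]
    # A grayscale image has no third dimension
    channels = img.shape[2] if img.ndim == 3 else 1
    return {"rows": rows, "cols": cols, "channels": channels,
            "elements": img.size, "dtype": str(img.dtype)}


color = np.zeros((342, 548, 3), np.uint8)
gray = np.zeros((342, 548), np.uint8)
print(describe(color))
print(describe(gray))
```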

Image ROI

# Read a region in an image
roi = img[280:340, 330:390]

# Place this region somewhere else
img[273:333, 100:160] = roi


Splitting and Merging Image Channels

# Method 1
b,g,r = cv2.split(img)
img = cv2.merge((b,g,r))

# Or, Method 2
new_img = img[:, :, 0]


Region of Interest

Sometimes, you will have to work with a certain region of an image. For eye detection, first perform face detection over the whole image, then search for eyes only within the face region. This approach improves accuracy (because eyes are always on faces :D ) and performance (because we search a smaller area).

ROI is again obtained using Numpy indexing. Here we are selecting the ball and copying it to another region in the image.

Splitting and Merging Image Channels

The BGR channels of an image can be split into their individual planes when needed. Then, the individual channels can be merged back together to form a BGR image again.

## Arithmetic Operations on Images

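import cv2
# Make sure these images are of the same size.
# If they are not, read the size of the smaller image, select an ROI of that size from the bigger image and use this ROI below.
img1 = cv2.imread('image1.png')  # placeholder file name
img2 = cv2.imread('image2.png')  # placeholder file name
# Blend with weights 0.7 and 0.3 and no scalar offset
dst = cv2.addWeighted(img1, 0.7, img2, 0.3, 0)
cv2.imshow('dst', dst)
cv2.waitKey(0)
cv2.destroyAllWindows()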


Bitwise Operations

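import cv2

# Load two images (placeholder file names)
img1 = cv2.imread('image_name.png')
img2 = cv2.imread('logo_name.png')
# I want to put logo on top-left corner, So I create a ROI
rows, cols, channels = img2.shape
roi = img1[0:rows, 0:cols]
# Now create a mask of logo and create its inverse mask also
img2gray = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
ret, mask = cv2.threshold(img2gray, 10, 255, cv2.THRESH_BINARY)
mask_inv = cv2.bitwise_not(mask)
# Now black-out the area of logo in ROI
img1_bg = cv2.bitwise_and(roi, roi, mask=mask_inv)
# Take only region of logo from logo image.
img2_fg = cv2.bitwise_and(img2, img2, mask=mask)
# Put logo in ROI and modify the main image
dst = cv2.add(img1_bg, img2_fg)
img1[0:rows, 0:cols] = dst
cv2.imshow('res', img1)
cv2.waitKey(0)
cv2.destroyAllWindows()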


You can add two images with the OpenCV function cv2.add() or simply with the Numpy operation res = img1 + img2. Both images should be of the same depth and type, or the second image can just be a scalar value.

There is a difference between OpenCV addition and Numpy addition. OpenCV addition is a saturated operation while Numpy addition is a modulo operation.

For uint8 values x = 250 and y = 10, cv2.add(x, y) gives 255 (260 is clipped to the maximum), while in Numpy x + y gives 4 (260 % 256).

Image Blending

We use cv2.addWeighted to blend two images. The code adds img1 with a weight of 0.7 and img2 with a weight of 0.3. The last 0 means that we are not adding any scalar (bias) value to the result.

Bitwise Operations

This includes bitwise AND, OR, NOT and XOR operations. They will be highly useful while extracting any part of the image (as we will see in coming chapters), defining and working with non-rectangular ROI etc. Below we will see an example of how to change a particular region of an image.

We want to put the OpenCV logo above an image. If we add two images, it will change color. If we blend it, we get a transparent effect. But we want it to be opaque. If it were a rectangular region, we could use ROI as we did in the last chapter. But the OpenCV logo is not a rectangular shape.

See the result below. Left image shows the mask we created. Right image shows the final result. For more understanding, display all the intermediate images in the above code, especially img1_bg and img2_fg.

## Playing with Colors

Original image (a) and its channels with color: hue (b), saturation (c) and value or brightness (d). On the second row, each channel in grayscale (single channel image), respectively.

There are more than 150 color-space conversion methods available in OpenCV. But we will look into only the two most widely used ones, BGR ↔ Gray and BGR ↔ HSV.

For color conversion, we use the function cv2.cvtColor(input_image, flag), where the flag determines the type of conversion.

A few useful flags:

• COLOR_RGB2RGBA
• COLOR_RGB2BGR
• COLOR_RGB2LUV
• COLOR_RGB2GRAY
• COLOR_RGB2HSV
• Full List is here.

Tracking Object using Colors

import cv2
import numpy as np

cap = cv2.VideoCapture(0)
while True:
    # Take each frame
    ret, frame = cap.read()
    if not ret:
        break
    # Convert BGR to HSV
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # define range of blue color in HSV
    lower_blue = np.array([110, 50, 50])
    upper_blue = np.array([130, 255, 255])
    # Threshold the HSV image to get only blue colors
    mask = cv2.inRange(hsv, lower_blue, upper_blue)
    # Bitwise-AND mask and original image
    res = cv2.bitwise_and(frame, frame, mask=mask)
    cv2.imshow('frame', frame)
    cv2.imshow('res', res)
    k = cv2.waitKey(5) & 0xFF
    if k == 27:  # Esc key
        break
cap.release()
cv2.destroyAllWindows()


Object Tracking

Now that we know how to convert a BGR image to HSV, we can use this to extract a colored object. In HSV, it is easier to represent a color than in the BGR color-space. In our application, we will try to extract a blue colored object. So here is the method:

• Take each frame of the video.
• Convert from BGR to HSV color-space.
• Threshold the HSV image for a range of blue color.
• Extract the blue object alone. We can then do whatever we want with that image.

The accompanying code is commented in detail. A few notes on the result of tracking a blue object:

• There are some noises in the image. We will later see how to remove them.
• This is a very simple method. Later we would use advanced methods.

Scaling

import cv2
import numpy as np

img = cv2.imread('image_name.png')  # placeholder file name
res = cv2.resize(img, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)
# OR
height, width = img.shape[:2]
res = cv2.resize(img, (2 * width, 2 * height), interpolation=cv2.INTER_CUBIC)


Translation

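import cv2
import numpy as np

img = cv2.imread('image_name.png', 0)  # placeholder file name, read as grayscale
rows, cols = img.shape
# Shift by 100 px in x and 50 px in y
M = np.float32([[1, 0, 100], [0, 1, 50]])
dst = cv2.warpAffine(img, M, (cols, rows))
cv2.imshow('img', dst)
cv2.waitKey(0)
cv2.destroyAllWindows()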


Perspective Transformation

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('sudoku.png')
rows, cols, ch = img.shape
pts1 = np.float32([[56, 65], [368, 52], [28, 387], [389, 390]])
pts2 = np.float32([[0, 0], [300, 0], [0, 300], [300, 300]])
M = cv2.getPerspectiveTransform(pts1, pts2)
dst = cv2.warpPerspective(img, M, (300, 300))
plt.subplot(121), plt.imshow(img), plt.title('Input')
plt.subplot(122), plt.imshow(dst), plt.title('Output')
plt.show()


Scaling

Scaling is just resizing of the image. OpenCV comes with a function cv2.resize() for this purpose. The size of the image can be specified manually, or you can specify the scaling factor. Different interpolation methods are used. Preferable interpolation methods are cv2.INTER_AREA for shrinking and cv2.INTER_CUBIC (slow) & cv2.INTER_LINEAR for zooming. By default, interpolation method used is cv2.INTER_LINEAR for all resizing purposes.

Translation

Translation is the shifting of an object's location. If you know the shift in the $(x,y)$ direction, let it be $(t_x,t_y)$, you can create the transformation matrix $M$ as follows:

$$M = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{bmatrix}$$

For warping and rotation check the documentation

Perspective Transformation

For perspective transformation, you need a 3x3 transformation matrix. Straight lines will remain straight even after the transformation. To find this transformation matrix, you need 4 points on the input image and the corresponding points on the output image. Among these 4 points, no three should be collinear. The transformation matrix can then be found with the function cv2.getPerspectiveTransform and applied with cv2.warpPerspective.

## Image Thresholding

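import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('image_name.png', 0)  # placeholder file name, read as grayscale
img = cv2.medianBlur(img, 5)
ret, th1 = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
th2 = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                            cv2.THRESH_BINARY, 11, 2)
th3 = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                            cv2.THRESH_BINARY, 11, 2)
titles = ['Original Image', 'Global Thresholding (v = 127)',
          'Adaptive Mean Thresholding', 'Adaptive Gaussian Thresholding']
images = [img, th1, th2, th3]
for i in range(4):
    plt.subplot(2, 2, i + 1), plt.imshow(images[i], 'gray')
    plt.title(titles[i])
    plt.xticks([]), plt.yticks([])
plt.show()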


There are other simpler methods, but we are skipping them.

In Adaptive Thresholding, the algorithm calculates the threshold for a small region of the image. So we get different thresholds for different regions of the same image and it gives us better results for images with varying illumination.

It has three ‘special’ input params and only one output argument.

Adaptive Method - It decides how thresholding value is calculated.

• cv2.ADAPTIVE_THRESH_MEAN_C : threshold value is the mean of neighbourhood area.
• cv2.ADAPTIVE_THRESH_GAUSSIAN_C : threshold value is the weighted sum of neighbourhood values where the weights are a Gaussian window.
• block-size - It decides the size of neighbourhood area.
• C - It is just a constant which is subtracted from the mean or the weighted mean calculated.


## Image Blurring

Averaging

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('image_name.png')  # placeholder file name
blur = cv2.blur(img, (5, 5))
plt.subplot(121), plt.imshow(img), plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(122), plt.imshow(blur), plt.title('Blurred')
plt.xticks([]), plt.yticks([])
plt.show()


Gaussian Blur

blur = cv2.GaussianBlur(img,(5,5),0)


Bilateral Filter

blur = cv2.bilateralFilter(img,9,75,75)


Averaging

Averaging is done by convolving the image with a normalized box filter. It simply takes the average of all the pixels under the kernel area and replaces the central element with it. This is done by the function cv2.blur() or cv2.boxFilter(). Check the docs for more details about the kernel. We should specify the width and height of the kernel. A 3x3 normalized box filter would look like below:

$$K = \frac{1}{9} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$$

Gaussian Blur

Here, a Gaussian kernel is used instead of a box filter. It is done with the function cv2.GaussianBlur(). We should specify the width and height of the kernel, which should be positive and odd. We should also specify the standard deviation in the X and Y directions, sigmaX and sigmaY respectively. If only sigmaX is specified, sigmaY is taken to be the same as sigmaX. If both are given as zeros, they are calculated from the kernel size. Gaussian blurring is highly effective in removing Gaussian noise from the image.

Bilateral Filtering

cv2.bilateralFilter() is highly effective in noise removal while keeping edges sharp.

Remember this operation is slower compared to other filters, so do not use if you are looking for real-time performance.

In future, we will cover faster methods as well.

## Canny Edge Detector

Canny Edge Detection

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('image_name.png', 0)  # placeholder file name, read as grayscale
edges = cv2.Canny(img, 100, 200)
plt.subplot(121), plt.imshow(img, cmap='gray')
plt.title('Original Image'), plt.xticks([]), plt.yticks([])
plt.subplot(122), plt.imshow(edges, cmap='gray')
plt.title('Edge Image'), plt.xticks([]), plt.yticks([])
plt.show()


Canny Edge Detection

It is a popular edge detection algorithm. It is a multi-stage algorithm and we will go through each stage.

1. Noise Reduction. Since edge detection is susceptible to noise in the image, the first step is to remove the noise in the image with a 5x5 Gaussian filter.
2. Finds Intensity Gradient of the Image.
3. Performs Non-maximum Suppression - very important algorithm also used in DNN Object Detection algorithms.
4. Hysteresis Thresholding.

OpenCV puts all the above in a single function, cv2.Canny(). The first argument is our input image. The second and the third arguments are our minVal and maxVal respectively. The fourth argument is aperture_size, the size of the Sobel kernel used to find image gradients (3 by default).

## Background Subtraction

MOG

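import numpy as np
import cv2

cap = cv2.VideoCapture('vtest.avi')
# In OpenCV 3 and later, MOG is part of the bgsegm module (opencv-contrib)
fgbg = cv2.bgsegm.createBackgroundSubtractorMOG()
while True:
    ret, frame = cap.read()
    if not ret:
        break
    # Update the background model and get the foreground mask
    fgmask = fgbg.apply(frame)
    cv2.imshow('frame', fgmask)
    k = cv2.waitKey(30) & 0xff
    if k == 27:
        break
cap.release()
cv2.destroyAllWindows()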


Background subtraction is a major preprocessing step in many vision-based applications. For example, consider a visitor counter where a static camera counts the visitors entering or leaving a room, or a traffic camera extracting information about vehicles. In all these cases, you first need to extract the person or vehicle alone. Technically, you need to extract the moving foreground from the static background.

## Last Note

MOG2 - Background Subtraction

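import numpy as np
import cv2

cap = cv2.VideoCapture('vtest.avi')
fgbg = cv2.createBackgroundSubtractorMOG2()
while True:
    ret, frame = cap.read()
    if not ret:
        break
    # Update the background model and get the foreground mask
    fgmask = fgbg.apply(frame)
    cv2.imshow('frame', fgmask)
    k = cv2.waitKey(30) & 0xff
    if k == 27:
        break
cap.release()
cv2.destroyAllWindows()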


The algorithms available in OpenCV are numerous, and we need to make sure we know the best of them, especially the ones relevant to us.

We have tried to cover the basic ones here.

As we advance in our course for MLBLR, we will add more sections here.

We also suggest you tell us what else you'd like to learn right now.

You can find some awesome tutorials here:

The second link has awesome resources covering many other things along with OpenCV. Do check it out!