Object Detection using OpenCV

OpenCV is an awesome, flexible, extensible platform for building machine learning models in the computer vision space. This tutorial explains how to build a Haar cascade (object detection) from scratch and use it in your application. The cascade I am building here detects Coca-Cola logos in pictures and videos. Demos of this cascade are at https://www.youtube.com/watch?v=erSePe_KtNU and https://www.youtube.com/watch?v=qQywEw6g9rI .

The complete code and the data sets used for the positive and negative samples are at https://github.com/ckarthic/objectdetectionopencv

Here are some awesome tutorials describing how to build cascades. Please read them first to get the theory behind the process.

  1. https://docs.opencv.org/3.2.0/dc/d88/tutorial_traincascade.html
  2. https://docs.opencv.org/2.4/doc/user_guide/ug_traincascade.html

import numpy as np
import cv2
import os

Prepare positive samples

Positive samples have to be generated manually with the opencv_annotation tool. But first we need a collection of pictures that contain the object we want to train the cascade to detect. In our case it is the Coke logo. Coke logos are usually seen in advertisements, so use Google image search for 'coca cola advertisements' and download the positive samples one by one. Or, even better, get the 'Google Images Download' Python package from https://github.com/hardikvasa/google-images-download. This is an awesome package for creating datasets. It was a godsend for me, actually. Here is how I used it to download pictures with Coke logos:

googleimagesdownload.exe -k "coca cola" -sk advertisements -f png -o Pos -s medium

Extract the object from positive samples

Now that we have the positive samples, we need to extract the object (the Coke logo in our case) from them. This is the object that the cascade will be trained to detect. The best way (or at least my way) to do it is to use the opencv_annotation application to iterate through each sample and mark the rectangular region of the object, which creates an annotation file. My PowerShell script to run the app is below:

$datafile = 'info/info_pos_round.data' 
$opencv_annotations = 'C:\Users\rithanya\Documents\Python\opencv-master\Build\opencv\build\x64\vc15\bin\opencv_annotation.exe'
$folderpath = './Source'

& $opencv_annotations --annotations=$datafile --images=$folderpath
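
For reference, each line that opencv_annotation writes holds an image path, the number of marked regions, and then `x y width height` for every region. Here is a minimal sketch of a parser for one such line, assuming whitespace-separated fields and non-negative integer coordinates. `parse_annotation_line` is a hypothetical helper name used only for illustration; it is not part of OpenCV.

```python
def parse_annotation_line(line):
    """Split one annotation line into (image_path, [(x, y, w, h), ...]).

    Returns None when the line does not look like an annotation entry.
    The path may contain spaces, so the line is read from the right.
    """
    tokens = line.split()
    # Try the largest plausible region count first: for n regions the
    # count token sits 4 * n + 1 positions before the end of the line.
    for n in range((len(tokens) - 1) // 4, -1, -1):
        idx = len(tokens) - 4 * n - 1
        if idx >= 1 and tokens[idx] == str(n):
            path = " ".join(tokens[:idx]).replace("\\", "/")
            nums = [int(t) for t in tokens[idx + 1:]]
            boxes = [tuple(nums[i:i + 4]) for i in range(0, len(nums), 4)]
            return path, boxes
    return None


# Hypothetical example line with two marked regions
path, boxes = parse_annotation_line(
    "Source/coke ad 1.png 2 10 20 60 20 100 50 120 40")
```

Reading the line from the right keeps paths with spaces intact, which matters because the downloaded folders are named after the search keywords.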

Once we have the annotation file, run the following Python script to extract the objects (the logos in our case) and resize them to the same size. It seems that the smaller the object, the better it is in terms of training time and accuracy. Also try to get objects from as many image samples as possible. I extracted 58 logo images to train the cascade in my project.



```python
def ExtractObject(datafile = "info_nike_demo.data", # annotation file
                  pathtowrite = "./Train/"):

    # read the annotation file written by opencv_annotation
    with open(datafile) as f:
        content = f.read()
    i = 1
    for l in content.split('\n'):
        words = l.split()
        if(len(words) >= 6): # skip images that have no marked region
            # assumes one marked region per image: path, count, x, y, w, h
            img_path = ' '.join(words[:-5]) # path to the positive sample file
            img_path = img_path.replace('\\','/') # replace backslashes if you are a Windows user
            img = cv2.imread(img_path,0) # read the sample as grayscale
            if img is None: # skip files that OpenCV cannot read
                continue
            x,y,w,h = [int(v) for v in words[-4:]] # region of the logo
            logograb = img[y:y+h, x:x+w]
            # keep the size small to keep the training time short
            img = cv2.resize(logograb, (60,20), interpolation = cv2.INTER_AREA)
            cv2.imwrite(pathtowrite + str(i) + '.jpg', img)
            i = i + 1
```

Create .vec file from extracted objects

The opencv_traincascade application that trains the cascade takes the positive images in the form of a .vec file. We can use the opencv_createsamples application to create the .vec file. But before that, we need to build another annotation file for the resized object (logo) samples, because the createsamples application takes this annotation file as input to create the .vec file. Since the logo fills the entire image in each file extracted in the step above, the annotation file contents will look like this:

Source/logo_orig/1.jpg 1 0 0 60 20
Source/logo_orig/10.jpg 1 0 0 60 20

This annotation file can be created quickly by the following python script

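def create_infodata(imgfolder = 'Source/logo_orig'):
    # open the file once in write mode so re-running does not append duplicate lines
    with open('info_pos_orig.data','w') as f:
        for img in os.listdir(imgfolder):
            # the logo fills the whole 60 x 20 image: one region starting at (0, 0)
            line = imgfolder + "/" + img + " 1 0 0 60 20\n"
            f.write(line)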

Once the annotation file is created, the following PowerShell command will create the .vec file from the positive samples:

$opencv_createsamples = 'C:\Users\rithanya\Documents\Python\opencv-master\Build\opencv\build\x64\vc15\bin\opencv_createsamples.exe'
& $opencv_createsamples -info info_pos_orig.data -num 58 -w 60 -h 20 -vec pos_orig.vec -show
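
The `-num` value in the command above is the number of positive samples (58 in my project), so it is worth checking it against the annotation file before running the command. Here is a small sketch that counts the entries, assuming one region per line as in `info_pos_orig.data`. `count_info_entries` is a hypothetical helper name, not an OpenCV function.

```python
def count_info_entries(info_path):
    """Count the annotation lines of the form: path 1 x y w h."""
    total = 0
    with open(info_path) as f:
        for line in f:
            # a valid entry has a path, a count and four coordinates
            if len(line.split()) >= 6:
                total += 1
    return total
```

Calling `count_info_entries('info_pos_orig.data')` gives the number to pass as `-num`.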

Prepare Negative samples

The accuracy of your cascade depends on the quantity and diversity of the negative samples. Good negative samples are those that appear in the background of the images or videos in which we are going to detect the object. In our project, good negative files are advertisements for another soft drink similar to Coke. So I ran the following script to download 100 Pepsi advertisements from Google image search. One important thing to do here is to preview these negative images and delete those that have Coke logos in them. Negative images must not contain any positive objects by accident.

googleimagesdownload.exe -k "pepsi" -sk advertisements -f png -o Neg -s medium

From what I read, it seems the negative samples should number in the thousands. So I used scikit-learn's PatchExtractor class to create about 6000 patches of size 100 x 100 to act as negative samples to train the cascade:

from sklearn.feature_extraction.image import PatchExtractor
from skimage import transform

def extract_patches(img, N, scale=1.0, patch_size=(100,100)):
    # size of the patches to cut out before any rescaling
    extracted_patch_size = tuple((scale * np.array(patch_size)).astype(int))
    # pick up to N random patches; random_state=0 makes the selection repeatable
    extractor = PatchExtractor(patch_size=extracted_patch_size,
                               max_patches=N, random_state=0)
    # PatchExtractor expects a batch of images, so add a leading axis
    patches = extractor.transform(img[np.newaxis])
    if scale != 1:
        # bring rescaled patches back to patch_size
        patches = np.array([transform.resize(patch, patch_size)
                            for patch in patches])
    return patches

Here is the code that creates 75 patches of size 100 x 100 from every advertisement downloaded above:

images = []
rootfolder = 'Neg'
for imgfolder in os.listdir(rootfolder): # iterate through each download folder
    if(imgfolder == 'pepsi advertisements'):
        for filename in os.listdir(rootfolder + '/' + imgfolder): # iterate through each image in the folder
            filename = rootfolder + '/' + imgfolder + '/' + filename # build the path to the image file
            # the download command above asked for png files, so accept those as well
            if(filename.lower().endswith(('.jpg', '.jpeg', '.png'))):
                img = cv2.imread(filename)
                if(img is not None): # imread returns None for files it cannot read
                    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
                    images.append(img)

negative_patches = np.vstack([extract_patches(im, 75, scale)
                             for im in images for scale in [1.0]])
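
The patches in `negative_patches` still have to be written to disk before they can be listed in bg.txt. Here is a minimal sketch of that step, assuming the patches go to `Neg/Patches` as numbered .jpg files. `save_patches` and its `write` parameter are hypothetical names introduced for illustration; by default the function calls `cv2.imwrite`.

```python
import os
import numpy as np

def save_patches(patches, outfolder="Neg/Patches", write=None):
    """Write each patch to outfolder as 1.jpg, 2.jpg, ... and return the paths."""
    if write is None:
        import cv2  # imported lazily so a stand-in writer can be passed instead
        write = cv2.imwrite
    os.makedirs(outfolder, exist_ok=True)
    paths = []
    for i, patch in enumerate(patches, start=1):
        if patch.dtype != np.uint8:
            # skimage's resize returns floats in [0, 1]; convert back to 8-bit
            patch = (np.clip(patch, 0.0, 1.0) * 255).astype(np.uint8)
        path = outfolder + "/" + str(i) + ".jpg"
        write(path, patch)
        paths.append(path)
    return paths
```

With the variables above, `save_patches(negative_patches)` would write one file per patch. The float conversion only matters when `extract_patches` is called with a scale other than 1.0.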

Create bg.txt

Now we need to create a bg.txt file that lists all the negative samples, one per line. The content of bg.txt looks like this:

NegFromAds/Patches/1.jpg
NegFromAds/Patches/2.jpg

It isn't practical to build this bg.txt by hand for our thousands of negative samples. The following Python script does the trick.

def create_bgtxt(imgfolder = 'Neg/Patches'):
    # open the file once in write mode so re-running does not append duplicate lines
    with open('bg.txt','w') as f:
        for img in os.listdir(imgfolder):
            line = imgfolder + "/" + img + "\n"
            f.write(line)

Train the cascade

Now that we have both positive and negative samples, it is time to train the cascade using the following script. The -vec and -bg arguments take the .vec file and the bg.txt file created above, and -w and -h must be the same values that were passed to opencv_createsamples (60 x 20 here). Set -numPos and -numNeg according to the number of samples you have. For 6500 negative samples of size 100 x 100 and 58 positive samples of size 60 x 20, the cascade trained for about 30 minutes on my laptop.

$opencv_traincascade = 'C:\Users\rithanya\Documents\Python\opencv-master\Build\opencv\build\x64\vc15\bin\opencv_traincascade.exe'
& $opencv_traincascade -data cascade -vec pos_orig.vec -bg bg.txt -numPos 11 -numNeg 12 -numStages 10 -w 60 -h 20
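
A note on choosing -numPos: a rule of thumb often quoted in the OpenCV community is that the .vec file should hold at least numPos + (numStages - 1) * (1 - minHitRate) * numPos samples, plus any positives that get rejected as background, because later stages can consume extra positives. Here is a small sketch of that arithmetic, assuming the opencv_traincascade default of 0.995 for -minHitRate and ignoring the rejected samples. `max_num_pos` is a hypothetical helper name, and the result is an upper bound to start from rather than a guarantee.

```python
import math

def max_num_pos(vec_count, num_stages, min_hit_rate=0.995):
    """Largest -numPos allowed by the rule of thumb for a .vec file of vec_count samples."""
    return math.floor(vec_count / (1 + (num_stages - 1) * (1 - min_hit_rate)))

# 58 positives in the .vec file and 10 stages, as in this project
suggested = max_num_pos(58, 10)
```

If training stops with a complaint about not having enough positive samples, lowering -numPos a little further is the usual remedy.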

Test the cascade

Test the cascade using the following Python script. It turns out that the parameters of detectMultiScale are as important as the cascade itself for optimizing detection accuracy, that is, for finding the right balance between selectivity and sensitivity. Here is a very good explanation of the parameters of the detectMultiScale function: https://stackoverflow.com/questions/20801015/recommended-values-for-opencv-detectmultiscale-parameters

cokelogo_cascade = "C:/Users/rithanya/Documents/Python/Industrial_Safety/coke/cascade.xml"
cokecascade = cv2.CascadeClassifier(cokelogo_cascade)

# utility function to run a cascade on a grayscale image with the given scaleFactor and minNeighbors
def detect(faceCascade, gray_,  scaleFactor_ = 1.1, minNeighbors = 5):
    faces = faceCascade.detectMultiScale(gray_,
                    scaleFactor= scaleFactor_,
                    minNeighbors= minNeighbors, # use the argument instead of a fixed value
                    minSize= (30,30), #(60, 20),
                    flags = cv2.CASCADE_SCALE_IMAGE
                )
    return faces

def DetectAndShow(imgfolder = 'NegFromAds/coca cola advertisements/'):
    cokelogo_cascade = "./cascade4/cokelogoorigfullds.xml"
    cokecascade = cv2.CascadeClassifier(cokelogo_cascade)
    for i in os.listdir(imgfolder):
        filepath = imgfolder + i
        img = cv2.imread(filepath)
        if img is None:     # skip files that OpenCV cannot read
            continue

        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        cokelogos = detect(cokecascade, gray, 1.25, 6)
        # draw a green rectangle around every detected logo
        for (x, y, w, h) in cokelogos:
            cv2.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0), 2)
        cv2.imshow('positive samples',img)
        k = 0xFF & cv2.waitKey(0)
        if k == 27:         # Esc to exit, any other key shows the next image
            break
    cv2.destroyAllWindows()
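
To get a feel for what scaleFactor does, it helps to count roughly how many scales the detector has to search. The sketch below assumes the search starts at the training window size and grows it by scaleFactor until it no longer fits in the image; it ignores minSize and maxSize, so treat it as an approximation of what detectMultiScale does rather than its exact behavior. `count_scales` is a hypothetical helper name.

```python
def count_scales(image_size, window_size, scale_factor):
    """Approximate number of scales searched for a (width, height) image and window."""
    if scale_factor <= 1.0:
        raise ValueError("scale_factor must be greater than 1")
    img_w, img_h = image_size
    win_w, win_h = window_size
    n = 0
    scale = 1.0
    # grow the window until it no longer fits inside the image
    while win_w * scale <= img_w and win_h * scale <= img_h:
        n += 1
        scale *= scale_factor
    return n

# a 640 x 480 image searched with the 60 x 20 training window
fine = count_scales((640, 480), (60, 20), 1.1)
coarse = count_scales((640, 480), (60, 20), 1.25)
```

A smaller scaleFactor means more scales to search, so detection is slower but less likely to miss a logo whose size falls between two steps. minNeighbors then controls how many overlapping hits are needed before a detection is kept.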