
Individual Project LaTeX Documentation

Multi-Blob Tracking and Foreground Extraction using OpenCV

By
Bharat Singhvi 10305912

Computer Science and Engineering

Indian Institute of Technology, Bombay


October 29, 2010

Contents

1 Introduction
  1.1 Reasons for use
  1.2 Computer vision
  1.3 Project Flow

2 Working 1: Foreground Blob Extraction
  2.1 Foreground Extraction
  2.2 Blob Extraction using CCA Algorithm

3 Working 2: Blob Correspondence
  3.1 Blob Characteristics
  3.2 Proposed Reasoning Schemes
    3.2.1 Isolated, Unoccluded and Well Tracked Blobs
    3.2.2 Identifying New Blobs
    3.2.3 Losing Track of Blobs
    3.2.4 Merging
    3.2.5 Partial Occlusions by Background Objects
    3.2.6 Detecting Blob Exit
    3.2.7 Splitting of Blobs
  3.3 Failure of Forward Tracking
    3.3.1 Parameter values
    3.3.2 Understanding the Output Image
    3.3.3 Output Showing Failure of Forward Tracking

4 Solution Design
  4.1 Proposed Solution Architecture
  4.2 Solution Architecture Followed

5 Implementation Aspects
  5.1 API Used
  5.2 Programming Language
  5.3 Data
  5.4 Complexity of Code
  5.5 Resources

6 Illustrative Code
  6.1 Data
  6.2 Code Snippets

7 Results: Screenshots
  7.1 Execution

8 Conclusion
  8.1 Conclusion

List of Figures

1.1 Flow chart describing the project
2.1 Results of foreground extraction
3.1 Images showing failure of forward tracking
4.1 Solution design
5.1 OpenCV logo
6.1 Some sample data images
6.2 CPP Code 1
6.3 CPP Code 2
7.1 Program execution screenshots

Chapter 1 Introduction
Blob tracking is a technique used to track the number and direction of blobs traversing a certain passage or entrance per unit time. The resolution of the measurement depends entirely on the sophistication of the technology employed. Such devices are often used at public places such as railway stations, shopping malls and airports, so that the movement of each individual blob can be analysed. Many different technologies are used in tracking devices, such as infrared beams, computer vision and thermal imaging. Ours is computer vision.

1.1 Reasons for use

There are various reasons for blob tracking. One such usage is people counting. In retail stores, counting is done as a form of intelligence gathering. People counting systems in the retail environment are used to calculate the conversion rate, i.e. the percentage of a store's visitors who make purchases. This is the key performance indicator of a store's performance and is far superior to traditional methods, which only take into account sales data. Traffic counts and conversion rates together tell you how you arrived at your sales: if year-over-year sales are down, did fewer people visit my store, or did fewer people buy? Although traffic counting is widely accepted as essential for retailers, it is estimated that less than 25% of major retailers track traffic in their stores. Since staff requirements are often directly related to the density of visitor traffic, accurate visitor counting is essential in the process of optimizing staff shifts. For many locations such as bars or factories, it is essential to know how many people are inside the building at any given time, so that in the event of an evacuation due to fire they can all be accounted for. This can only be automated with the use of extremely accurate people counting systems. Although no people counting system is 100% accurate, and therefore must not be entirely relied upon for the purposes of health and safety, an electronic people counting system offers a more accurate means of managing occupancy than tally counting by hand. Many public organizations use visitor counts as evidence when making applications for finance. In cases where tickets are not sold, such as in museums and libraries, counting is either automated, or staff keep a log of how many clients use different services. The second and most important use of this technique is to track a particular blob of interest and to maintain a record of the status of that blob, which can be analysed for further information, for example for detecting suspicious left luggage in public places like railway stations, and in video surveillance systems to keep track of suspicious movements.

1.2 Computer vision

Computer vision systems typically use either a closed-circuit television camera or an IP camera to feed a signal into a computer or embedded device. Some computer vision systems have been embedded directly into standard IP network cameras. This allows for distributed, cost-efficient and highly scalable systems where all image processing is done on the camera using the standard built-in CPU. This also dramatically reduces bandwidth requirements, as only the counting data has to be sent over the Ethernet. Accuracy varies between systems and installations, as background information needs to be digitally removed from the scene in order to recognize, track and count people. This means that CCTV-based counters can be vulnerable to light level changes and shadows, which can lead to inaccurate counting. Lately, robust and adaptive algorithms have been developed that can compensate for this behavior, and excellent counting accuracy can today be obtained for both outdoor and indoor counting using computer vision.

1.3 Project Flow

This section gives an insight into the various steps involved in the project.

Step 1: Videos are captured using video surveillance cameras and transmitted to processing centers where they are further processed.

Step 2: The video is converted to frames, and the foreground is extracted and processed to obtain the characteristic features of the blob data, using noise removal techniques and connected component analysis (explained in Chapter 2).

Step 3: The blob data is further used for tracking blobs in the next frame, using reasoning matrices and a motion model with histogram-based prediction (explained in Chapter 3).

Step 4: Graph-like images are generated for simple visual understanding of the status of each blob*.


Figure 1.1: Description of the process

Chapter 2 Working 1: Foreground Blob Extraction


2.1 Foreground Extraction

Efficient intrusion detection is the primary task of a smart surveillance system and is generally achieved by the process of background subtraction. Several techniques have been proposed in the domains of feature computation for background image representation and statistical modeling of the obtained feature vectors. The most commonly used approaches consider the intensity value in (normalized) RGB color space and/or image gradients as the background image features, and model them statistically through a pixel-wise single Gaussian, a mixture of Gaussians, or the non-parametric approach with a Gaussian kernel function. In this section, we briefly describe the single-Gaussian approach to foreground extraction.

Foreground blobs are extracted from every frame so that we get information about individual blobs. First, change detection is performed: the current image is compared with a base image, and the change is estimated on the basis of the difference in pixel values. Then, voting is done to remove noise and disturbances, leaving only blobs in white. A little shadow is also present in each frame; shadow removal eliminates its effect. Small blob removal is then performed to discard small blobs, by keeping a threshold on blob size in terms of the number of pixels. Finally, Connected Component Analysis (CCA) is performed on the output of small blob removal to obtain information about individual blobs.
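As an illustration of the pixel-wise single-Gaussian model described above, the following is a minimal sketch using the OpenCV C++ API. The learning rate alpha and the deviation threshold k are hypothetical values chosen for illustration, not parameters taken from the project code.

#include <opencv2/opencv.hpp>

// Minimal sketch of pixel-wise single-Gaussian background modeling.
// For each pixel we keep a running mean (mu) and variance (var); a pixel
// is declared foreground when its squared deviation from mu exceeds
// k^2 * var. alpha and k are illustrative values.
void updateAndClassify(const cv::Mat& frameGray, cv::Mat& mu, cv::Mat& var,
                       cv::Mat& fgMask, double alpha = 0.01, double k = 2.5)
{
    cv::Mat f;
    frameGray.convertTo(f, CV_32F);

    cv::Mat diff = f - mu;                  // deviation from the model
    cv::Mat dist2 = diff.mul(diff);         // squared deviation

    cv::Mat thresh = (k * k) * var;         // per-pixel decision threshold
    fgMask = dist2 > thresh;                // 8-bit 0/255 foreground mask

    // Update the Gaussian parameters with a running average
    mu  = (1.0 - alpha) * mu  + alpha * f;
    var = (1.0 - alpha) * var + alpha * dist2;
}

Here mu would be initialized from the first frame (converted to CV_32F) and var to a small constant; the voting, shadow removal and small blob removal stages would then operate on fgMask.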

2.2 Blob Extraction using CCA Algorithm

The algorithms discussed can be generalized to arbitrary dimensions, albeit with increased time and space complexity.

Two-pass Algorithm

Relatively simple to implement and understand, the two-pass algorithm [6] iterates through 2-dimensional binary data. The algorithm makes two passes over the image: one pass to record equivalences and assign temporary labels, and a second to replace each temporary label by the label of its equivalence class. The input data can be modified in situ (which carries the risk of data corruption), or labeling information can be maintained in an additional data structure. Connectivity checks are carried out by checking the labels of pixels that are North-East, North, North-West and West of the current pixel (assuming 8-connectivity); 4-connectivity uses only the North and West neighbors of the current pixel. The following conditions are checked to determine the value of the label to be assigned to the current pixel (4-connectivity is assumed):

1. Does the pixel to the left (West) have the same value?
   (a) Yes - we are in the same region; assign the same label to the current pixel.
   (b) No - check the next condition.

2. Does the pixel to the North of the current pixel have the same value but not the same label?
   (a) Yes - the North and West pixels belong to the same region and must be merged; assign the current pixel the minimum of the North and West labels, and record their equivalence relationship.
   (b) No - check the next condition.

3. Do the pixel's North and West neighbors have different pixel values?
   (a) Yes - create a new label id and assign it to the current pixel.

The algorithm continues this way, creating new region labels whenever necessary. The key to a fast algorithm, however, is how the merging is done. This algorithm uses the union-find data structure, which provides excellent performance for keeping track of equivalence relationships. Union-find essentially stores the labels that correspond to the same blob in a disjoint-set data structure, making it easy to record the equivalence of two labels through an interface method. Once the initial labeling and equivalence recording is completed, the second pass merely replaces each pixel label with its equivalent disjoint-set representative element.
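The two-pass scheme with union-find can be sketched compactly as below. This is a generic illustration over a raw binary buffer with 4-connectivity, not the project's actual implementation; the names and the image representation are chosen for the example.

#include <algorithm>
#include <vector>

// Union-find over temporary labels, with path halving.
static int findRoot(std::vector<int>& parent, int x) {
    while (parent[x] != x) { parent[x] = parent[parent[x]]; x = parent[x]; }
    return x;
}
static void unite(std::vector<int>& parent, int a, int b) {
    a = findRoot(parent, a); b = findRoot(parent, b);
    if (a != b) parent[std::max(a, b)] = std::min(a, b);
}

// Two-pass connected component labeling, 4-connectivity.
// img: binary image (0 = background, nonzero = foreground), row-major h x w.
// labels: output label per pixel (0 = background).
void twoPassCCA(const unsigned char* img, int w, int h, std::vector<int>& labels)
{
    labels.assign(w * h, 0);
    std::vector<int> parent(1, 0);          // slot 0 reserved for background
    int next = 1;

    // Pass 1: assign provisional labels and record equivalences.
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            if (!img[y * w + x]) continue;
            int west  = (x > 0) ? labels[y * w + x - 1] : 0;
            int north = (y > 0) ? labels[(y - 1) * w + x] : 0;
            if (west && north) {            // both neighbors labeled: merge
                labels[y * w + x] = std::min(west, north);
                unite(parent, west, north);
            } else if (west || north) {     // copy the single neighbor's label
                labels[y * w + x] = west ? west : north;
            } else {                        // neither labeled: new region
                parent.push_back(next);
                labels[y * w + x] = next++;
            }
        }

    // Pass 2: replace each label by its equivalence-class representative.
    for (int i = 0; i < w * h; ++i)
        if (labels[i]) labels[i] = findRoot(parent, labels[i]);
}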

Figure 2.1: Results of foreground extraction after the post-processing stages of voting, shadow removal and small blob removal. Original frames are shown in pictures (a)-(d) and the corresponding results of foreground blob extraction are shown in pictures (e)-(h).

Chapter 3 Working 2: Blob Correspondence


3.1 Blob Characteristics

The characteristic features of a blob include the set of pixels B_i(p) which the current blob B_i(t) occupies, its weighted color distribution h_i(t), and the starting x coordinate, starting y coordinate, ending x coordinate and ending y coordinate used to calculate the bounding box dimensions. The pixel set B_i(p) and the weighted color distribution h_i(t) are initially learned from the foreground blob extracted at the first appearance of the blob, and are updated throughout the sequence whenever it is isolated, unoccluded and well tracked. The color distribution h_i(t) is computed from the b-bin color histogram of the region B_i(p). Using the motion model, we then estimate the possible position of B_i(t) in the t-th frame, and using the Bhattacharyya coefficient we calculate the likelihood of the blob occurring there.
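For reference, for normalized b-bin histograms the Bhattacharyya coefficient takes the standard form

ρ(h_i, h) = Σ_{k=1}^{b} √( h_i[k] · h[k] )

which lies in [0, 1]; values close to 1 indicate that the color distribution of a candidate region closely matches the blob's learned appearance, so the candidate maximizing ρ is taken as the most likely position of the blob.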

3.2 Proposed Reasoning Schemes

Many attempts have been made at counting and tracking people in a given area. Recently, Gabriel J. Brostow and Roberto Cipolla [?] proposed a Bayesian approach for tracking multiple persons under occlusions by computing MCMC-based MAP estimates with prior information about the camera model and human appearance, along with a ground plane assumption. Jae-Won Kim, Kang-Sun Choi, Byeong-Doo Choi and Sung-Jea Ko [?] implemented a people counting system which detects and tracks moving people using a fixed single camera; their system counts the number of moving objects (people) entering a security door, but it is limited in the number of people present in the scene: if this number exceeds 6, the system fails. Osama Masoud and Nikolaos P. Papanikolopoulos [?] present a real-time system for pedestrian tracking in sequences of grayscale images acquired by a stationary camera. The system outputs the spatio-temporal coordinates of each pedestrian while the pedestrian is in the scene; processing is done in three stages: raw images, blob extraction and pedestrians. Arthur E. C. Pece of the Institute of Computer Science, University of Copenhagen, Denmark [?] developed a cluster tracker to detect, track, split, merge and remove clusters of pixels significantly different from the corresponding pixels in the reference image; clusters with the same features are grouped together, and the number of people is counted from the size of the super-cluster and its pattern of merging and splitting. The same merge-split algorithm is followed by Yunyoung Nam, Junghun Ryu, Yoo-Joo Choi and We-Duke Cho of the Computer Vision Laboratory, University of Maryland, College Park [?]. We have modified this merge-split algorithm for a few more cases and implemented it to track multiple blobs in the system.

In this section, we discuss the proposed reasoning schemes for tracking multiple blobs in a surveillance scenario. A monocular surveillance system is typically challenged by the problem of occlusion, due to the unavailability of adequate depth information from a single view.

In addition to tracking isolated blobs and logging entries and exits, the system should also be able to estimate the approximate positions of blobs when they are occluded by another blob or by a background object. Occlusions caused by blobs occur in cases of crowding, where the system should approximately locate the individual blobs while differentiating the occluded ones from the occluding ones. Furthermore, a blob can also get occluded completely by a bigger object, thereby disappearing from the set of detected foreground regions. Moreover, thinner background elements, such as a vehicle standing far away, can give rise to multiple foreground regions, which might confuse the system into seeing an appearance of multiple blobs. The proposed algorithm addresses each of these issues to keep track of individual blobs, by processing low-level image features guided by higher-level intelligent reasoning. Overall, we are proposing an algorithm which makes use of the reasoning scheme below. It is worth noting that maintaining an exact track of all the blobs is not always possible: for example, when a person hides behind another person or a bigger object, the system can only provide an approximate estimate of where he has disappeared.

The process of reasoning is performed over two sets, viz. the current and the previous set of blobs. The current set B_j(t) consists of the blobs that are well tracked up to the t-th instant. On the other hand, the previous set B_i(t−1) contains the blobs from the previous frame. The system typically initializes itself with empty sets, and blobs are added or removed as they enter or leave the field of view. During the process of reasoning, blobs are often swapped between the current and previous sets as track is lost or restored. We start the process of symbolic reasoning at the t-th frame based on the blob sets available from the (t−1)-th instant. The image region occupied by the j-th blob B_j(t−1) in the t-th frame is predicted from its motion history information and is used for initializing a mean-shift algorithm to localize it further to B_j(t). The process of association is established by computing the overlaps between the set of predicted blob regions and the extracted foreground regions. Equation 3.1 defines the fractional overlap measure Θ(a, b) between two regions a and b, given by the fraction of a overlapped with b. It is worth noting that Θ(a, b) is an asymmetric measure and lies in the interval [0, 1].

Θ(a, b) = |a ∩ b| / |a|    (3.1)
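The mean-shift localization step mentioned above can be sketched with histogram back-projection followed by OpenCV's meanShift, as below. The hue-only histogram and the termination criteria are illustrative simplifications, not the project's exact settings.

#include <opencv2/opencv.hpp>

// Sketch: localize a blob in the current frame with mean shift, starting
// from the window predicted by the motion model. 'hist' is the blob's hue
// histogram learned while it was isolated and unoccluded.
cv::Rect localizeBlob(const cv::Mat& frameBGR, const cv::Mat& hist,
                      cv::Rect predictedWindow)
{
    cv::Mat hsv, backproj;
    cv::cvtColor(frameBGR, hsv, cv::COLOR_BGR2HSV);

    const int channels[] = {0};              // hue channel only
    float hueRange[] = {0, 180};
    const float* ranges[] = {hueRange};
    cv::calcBackProject(&hsv, 1, channels, hist, backproj, ranges);

    // Iteratively shift the window toward the mode of the back-projection.
    cv::meanShift(backproj, predictedWindow,
                  cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT,
                                   10, 1.0));
    return predictedWindow;                  // refined location of B_j(t)
}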

We now define the Localization Confidence Matrix L, whose element L_ji(t) is the fraction of the i-th previous blob that overlaps with the j-th current blob:

L_ji(t) = Θ(B_i(t−1), B_j(t)) = |B_i(t−1) ∩ B_j(t)| / |B_i(t−1)|    (3.2)

From this we define the thresholded matrix L̄_ji(t), whose (j, i)-th element (j = 1, …, m_t; i = 1, …, n_{t−1}, where m_t and n_{t−1} are the numbers of blobs in the present and previous frames respectively) is set to 1 if the localization confidence L_ji(t) exceeds a certain threshold τ_L, and to 0 otherwise:

L̄_ji(t) = 1 if L_ji(t) > τ_L, and 0 otherwise    (3.3)

Similarly, we define the Attribution Matrix A, whose element A_ji(t) = Θ(B_j(t), B_i(t−1)) measures the fraction of the j-th current blob attributed to B_i(t−1). Its thresholded version Ā_ji(t) is set to 1 if the attribution confidence exceeds a certain threshold τ_A, and to 0 otherwise:

Ā_ji(t) = 1 if Θ(B_j(t), B_i(t−1)) > τ_A, and 0 otherwise    (3.4)

A number of measures can now be defined from the thresholded localization and attribution confidence matrices:

P_L[i](t) = Σ_{j=1}^{m_t} L̄_ji(t)        Q_L[j](t) = Σ_{i=1}^{n_{t−1}} L̄_ji(t)

P_A[i](t) = Σ_{j=1}^{m_t} Ā_ji(t)        Q_A[j](t) = Σ_{i=1}^{n_{t−1}} Ā_ji(t)    (3.5)

The proposed reasoning scheme is performed by extensively using the quantities introduced in equation 3.5 for handling the situations introduced above. The following subsections address each of these issues and define the predicates for identifying them.
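As a concrete sketch of how these quantities could be computed, the following illustration represents each blob by its set of pixel indices and evaluates the thresholded matrices, the sums of equation 3.5, and the predicates of the subsections that follow. The representation and the threshold values are illustrative, not the project's.

#include <algorithm>
#include <iterator>
#include <set>
#include <vector>

using PixelSet = std::set<int>;   // flattened pixel indices of one blob

double overlapFrac(const PixelSet& a, const PixelSet& b)  // Theta(a, b)
{
    if (a.empty()) return 0.0;
    std::vector<int> inter;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(inter));
    return static_cast<double>(inter.size()) / a.size();  // |a ∩ b| / |a|
}

void reasonOverBlobs(const std::vector<PixelSet>& prev,   // B_i(t-1), i < n
                     const std::vector<PixelSet>& curr,   // B_j(t),   j < m
                     double tauL, double tauA)
{
    const int n = static_cast<int>(prev.size());
    const int m = static_cast<int>(curr.size());

    // Thresholded localization/attribution matrices (equations 3.3, 3.4).
    std::vector<std::vector<int>> L(m, std::vector<int>(n));
    std::vector<std::vector<int>> A(m, std::vector<int>(n));
    for (int j = 0; j < m; ++j)
        for (int i = 0; i < n; ++i) {
            L[j][i] = overlapFrac(prev[i], curr[j]) > tauL;  // previous onto current
            A[j][i] = overlapFrac(curr[j], prev[i]) > tauA;  // current onto previous
        }

    // Row/column sums of equation (3.5).
    std::vector<int> PL(n, 0), PA(n, 0), QL(m, 0), QA(m, 0);
    for (int j = 0; j < m; ++j)
        for (int i = 0; i < n; ++i) {
            PL[i] += L[j][i]; PA[i] += A[j][i];
            QL[j] += L[j][i]; QA[j] += A[j][i];
        }

    // Predicates of sections 3.2.1-3.2.5 (sketch only).
    for (int i = 0; i < n; ++i) {
        bool isoUnoccTrk = (PL[i] == 1) && (PA[i] == 1);   // eq. (3.6)
        bool lostTrack   = (PL[i] == 0) && (PA[i] == 0);   // eq. (3.8)
        bool partialOcc  = (PL[i] == 0) && (PA[i] == 1);   // eq. (3.9)
        (void)isoUnoccTrk; (void)lostTrack; (void)partialOcc;
    }
    for (int j = 0; j < m; ++j) {
        bool newBlob = (QL[j] == 0) && (QA[j] == 0);       // eq. (3.7)
        bool merged  = (QL[j] > 1) && (QA[j] == 0);        // merging condition
        (void)newBlob; (void)merged;
    }
}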

3.2.1 Isolated, Unoccluded and Well Tracked Blobs


The i-th blob from the previous frame is completely visible if it is isolated from the others and not occluded by any background object. If this blob is properly tracked, then the localization confidence should be significantly high. Thus, the Boolean predicate IsoUnoccTrk_i(t), signifying the isolated, unoccluded and well-tracked state of the i-th blob, can be expressed as

IsoUnoccTrk_i(t) = [P_L[i](t) = 1] ∧ [P_A[i](t) = 1]    (3.6)

where the thresholds τ_L and τ_A behind P_L and P_A are introduced in equations 3.3 and 3.4. Once the isolated and unoccluded status of the blob is ensured, its color distribution, pixel set and current position are updated so as to achieve the best tracking performance in subsequent frames.


3.2.2 Identifying New Blobs


The occurrence of a new blob is caused either by the entry of a blob or by the reappearance of one from a merged set. Thus, a new blob does not have any prior association (here, overlap) with the blobs in the present set. Now, in the case of the entry of a blob, a certain number of frames are required for it to appear completely, and hence it is not a wise choice to learn its features from the partial information available in the initial frames. Thus, a blob detected as the new region B_j(t) is only added to the present set of blobs if its fractional overlap with an inner region (typically chosen by leaving a few border pixels of the image on each side) exceeds a certain threshold τ_E. Combining these conditions, the Boolean predicate NewBlob_j(t), signifying the identification of a new region, can be expressed as

NewBlob_j(t) = [Q_L[j](t) = 0] ∧ [Q_A[j](t) = 0]    (3.7)

Once it is ensured that B_j(t) is a new region, we match its features against the blobs in the already-occurred blob set. If a match is found, the track of the corresponding blob is restored by re-initializing its features computed from B_j(t), and it is transferred to the active set. However, if no match is found, the occurrence of B_j(t) is accounted to the appearance of a new blob, and it is added to the blobs of the current frame.
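The feature match against previously lost blobs can be illustrated with OpenCV's histogram comparison, which directly supports the Bhattacharyya measure used above. This is a sketch written against the OpenCV 2.x API of the report's era; the acceptance threshold is a hypothetical value.

#include <opencv2/opencv.hpp>
#include <vector>

// Sketch: decide whether a newly detected region restores a previously
// lost blob by comparing color histograms.
int matchAgainstLostBlobs(const cv::Mat& newHist,
                          const std::vector<cv::Mat>& lostHists,
                          double maxDist = 0.3)   // illustrative threshold
{
    int best = -1;
    double bestDist = maxDist;
    for (size_t i = 0; i < lostHists.size(); ++i) {
        // Bhattacharyya distance in [0,1]; 0 means identical histograms.
        double d = cv::compareHist(newHist, lostHists[i],
                                   CV_COMP_BHATTACHARYYA);
        if (d < bestDist) { bestDist = d; best = static_cast<int>(i); }
    }
    return best;   // index of the restored blob, or -1 for a genuinely new one
}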

3.2.3 Losing Track of Blobs


The system might lose track of a blob for two main reasons. Firstly, it may fail to associate the predicted blob with any of the detected foreground blobs, if the blob is occluded by a larger background object, like a vehicle. Secondly, the mean-shift algorithm might fail to reach the detected foreground region corresponding to the blob due to inadequate motion information. In this case, the i-th blob under consideration will have no significant overlap with any of the detected foreground regions. Thus, the Boolean predicate LostTrack_i(t), signifying the track loss of the i-th blob, can be expressed as

LostTrack_i(t) = [P_L[i](t) = 0] ∧ [P_A[i](t) = 0]    (3.8)

The i-th blob is transferred from the current blob data set to the previous blob data set as the system loses track of it, and neither its current position nor its color distribution data are updated.

3.2.4 Merging
The process of associating blobs with the corresponding (detected) foreground regions is very often challenged by the phenomenon of merging. In this case, multiple blobs group together, giving rise to a single foreground blob. Moreover, one blob might very often occlude others while merging. Such merging of blob regions is very common when blobs cross each other, stand together, etc. A crowding condition, where a few blobs merge to form a single blob, is a typical example. In such cases, multiple blobs have significant overlaps with a single foreground region. Thus, the j-th foreground region is detected to be congested by multiple blobs if [Q_L[j](t) > 1] ∧ [Q_A[j](t) = 0] is true.

3.2.5 Partial Occlusions by Background Objects


In most practical cases, the field of view is not just a collection of objects at infinity. There may be objects like vehicles, walls, etc. at finite distances from the camera which can occlude the blobs in the scene. Such occlusions account for partial visibility or disappearance of blobs in the scene. Under complete occlusion, the system loses track of a blob, as it is not detected by background subtraction, and the blob is transferred to the putative set. However, a partially occluded blob is detected as either a single foreground blob or a collection of disjoint foreground blobs, which are respectively the distorted or fragmented form of the unoccluded case. The detected foreground blob(s) corresponding to the partially occluded blob exhibit overlap(s) with the localized image region B_i(t):

PartialOcclusion_i(t) = [P_L[i](t) = 0] ∧ [P_A[i](t) = 1]    (3.9)

3.2.6 Detecting Blob Exit

The process of detecting a blob's exit from the field of view is of prime importance, as the system needs to identify the blobs to be removed from the active set. The exit of the j-th blob across the boundary of the image is detected when the fractional overlap between the region B_j(t) and the inner region I_t falls below 1 − τ_E. Thus, the Boolean predicate Exit_j(t), signifying the exit of the j-th blob, is given by

Exit_j(t) = [Θ(B_j(t), I_t) < 1 − τ_E]

Once a blob is detected to exit the field of view, it is removed from the current set of blobs.

3.2.7 Splitting of Blobs


A surveillance system often encounters the case of splitting, where two or more blobs enter the scene as a group and separate later within the field of view. In such a case, they are initially learned as a single blob. A split occurs when the relative velocity between the separating blobs is considerably high. It is worth noting that a split may at first be detected as a fragmentation for a few frames, depending on the relative velocity between the separating blobs. Afterwards, the tracker converges on one of the blobs and loses the others, which eventually emerge as new regions and are added as new blobs. In some cases, it might also lose track of both, which are then learned as two new blobs.

3.3 Failure of Forward Tracking

3.3.1 Parameter values


The thresholds τ_A and τ_L, which are used to compute the attribution and localization matrices, are ideally set to 0.51 in the forward-direction tracking scenario. In that scenario, only some cases of splitting and merging are detected, i.e. there are cases where merges are not detected and others where splits are not detected (see fig. 3.1). Increasing τ_A and decreasing τ_L leads to the detection of fewer split cases but provides the advantage of detecting many more of the merging cases, and vice versa; either way, we trade off the detection of some of the isolation cases. We therefore conclude that detecting splits in one direction alone (forward or backward) is difficult in certain cases, and so we opt for forward-backward analysis.

3.3.2 Understanding the Output Image


The output image consists of five horizontal strips (the number can be varied by changing a parameter). Each strip consists of template images, of size 61 x 61 pixels (also variable through a parameter), of each blob in a particular frame. The maximum number of blobs in any frame is assumed to be 10 (also variable). The current frame is the central strip and is surrounded by a blue box to distinguish it easily. The blobs in the preceding and succeeding frames are drawn in the strips above and below it respectively. Lines are drawn from the blobs in one horizontal strip to the blobs of the previous frame, i.e. in the strip above it, purely based on the formation status of each blob.

3.3.3 Output Showing Failure of Forward Tracking


Outputs for the cases in which τ_A and τ_L are equal to 0.51 are shown below. For convenience, some cases in which a merge is not detected are aligned to the left, and some cases in which a split is not detected are aligned to the right.

(a) (b)

In fig. (a), the 3rd blob in the current frame is formed by a merge of the 3rd and 5th blobs of the previous frame above it; this merge is not detected. In fig. (b), the 3rd blob is formed by a split of the 2nd blob of the previous frame shown above it; this split is not detected.

(c) (d)

In fig. (c), the 4th blob in the current frame is formed by a merge of the 3rd and 5th blobs of the previous frame above it; this merge is not detected. In fig. (d), the 3rd blob is formed by a split of the 2nd blob of the previous frame shown above it; this split is not detected.

(e) (f)

Figure 3.1: In fig. (e), the 3rd blob in the current frame is formed by a merge of the 3rd and 4th blobs of the previous frame above it, i.e. a merge is not detected. In fig. (f), the 1st and 2nd blobs are formed by a split of the 2nd blob of the previous frame shown above it; this split is not detected.

Chapter 4 Solution Design


4.1 Proposed Solution Architecture

Let us refer back to the flowchart discussed in Chapter 1 while describing the project flow. The flowchart steps are reproduced here:

Step 1: Videos are captured using video surveillance cameras and transmitted to processing centers where they are further processed.

Step 2: The video is converted to frames, and the foreground is extracted and processed to obtain the characteristic features of the blob data, using noise removal techniques and connected component analysis (explained in Chapter 2).

Step 3: The blob data is further used for tracking blobs in the next frame, using reasoning matrices and a motion model with histogram-based prediction (explained in Chapter 3).

Step 4: Graph-like images are generated for simple visual understanding of the status of each blob*.

The corresponding flow diagram was:

Figure 4.1: Flow chart showing the various sequential steps involved in the system

4.2 Solution Architecture Followed

1: Data is pre-processed and disintegrated into the frames of a video.

2: A foreground extraction algorithm, as discussed in Chapter 2, is implemented for extracting the blob foreground.

3: The output is then fed into blob characterisation. The process discussed in Chapter 3 is implemented to achieve this.

4: The final output detects the multiple blobs present in the current frame.

The data and code snippets corresponding to each step discussed above are provided in Chapter 6. A sketch of how these steps could be wired together is given below.
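The following minimal end-to-end sketch follows the pipeline of Figure 4.1. Background differencing against the first frame and contour-based blob extraction stand in for the project's actual routines; the thresholds and the frame filename pattern are illustrative assumptions.

#include <opencv2/opencv.hpp>
#include <vector>

int main()
{
    cv::VideoCapture cap("frames/%04d.jpg");        // Step 1: frame sequence
    cv::Mat frame, base, gray, diff, mask;
    while (cap.read(frame)) {
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        if (base.empty()) { gray.copyTo(base); continue; }

        // Step 2: change detection against the base image + noise cleanup
        cv::absdiff(gray, base, diff);
        cv::threshold(diff, mask, 30, 255, cv::THRESH_BINARY);
        cv::medianBlur(mask, mask, 5);              // crude "voting" stand-in

        // Connected components via contours; small blobs are discarded
        cv::Mat tmp = mask.clone();
        std::vector<std::vector<cv::Point>> contours;
        cv::findContours(tmp, contours, cv::RETR_EXTERNAL,
                         cv::CHAIN_APPROX_SIMPLE);
        std::vector<cv::Rect> blobs;
        for (size_t i = 0; i < contours.size(); ++i)
            if (cv::contourArea(contours[i]) > 200) // size threshold in pixels
                blobs.push_back(cv::boundingRect(contours[i]));

        // Step 3 would run the correspondence reasoning of Chapter 3 here.
        // Step 4: draw bounding boxes as a minimal visual status output.
        for (size_t i = 0; i < blobs.size(); ++i)
            cv::rectangle(frame, blobs[i], cv::Scalar(0, 255, 0), 2);
        cv::imshow("tracking", frame);
        if (cv::waitKey(30) == 27) break;           // Esc to quit
    }
    return 0;
}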

Chapter 5 Implementation Aspects


5.1 API Used

Since the project involves a lot of computation and complex image processing, for which many ready-made functions have been developed, I have used the computer vision library OpenCV.

About OpenCV: OpenCV is a computer vision library originally developed by Intel and now supported by Willow Garage. It is free for use under the open-source BSD license. The library is cross-platform and focuses mainly on real-time image processing.

OpenCV's application areas include:

- Egomotion estimation
- Facial recognition
- Gesture recognition
- Motion tracking
- Segmentation and recognition
- Structure from motion
- Human-computer interfaces

The library is mainly written in C, which makes it portable to some specific platforms such as digital signal processors. Wrappers for languages such as C#, Python, Ruby and Java have been developed to encourage adoption by a wider audience. However, since version 2.0, OpenCV includes both its traditional C interface and a new C++ interface that seeks to reduce common programming errors when using OpenCV in C. Much of the new development and many of the new algorithms in OpenCV are in the C++ interface. Unfortunately, it is much more difficult to provide wrappers in other languages for C++ code than for C code; therefore the other language wrappers generally lack some of the newer OpenCV 2.0 features.
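To make the contrast concrete, here is a tiny example of the same operation through both interfaces, written against the OpenCV 2.x era this report targets; the file name is illustrative.

#include <opencv2/opencv.hpp>

int main()
{
    // Traditional C interface: manual allocation and release.
    IplImage* imgC = cvLoadImage("sample.jpg", CV_LOAD_IMAGE_GRAYSCALE);
    if (imgC) {
        cvSmooth(imgC, imgC, CV_GAUSSIAN, 5, 5);
        cvReleaseImage(&imgC);          // easy to forget; leaks otherwise
    }

    // C++ interface since 2.0: cv::Mat manages its own memory.
    cv::Mat imgCpp = cv::imread("sample.jpg", 0);
    if (!imgCpp.empty())
        cv::GaussianBlur(imgCpp, imgCpp, cv::Size(5, 5), 0);
    return 0;
}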

Figure 5.1: Logo: OpenCV

5.2 Programming Language

The programming language used in the project implementation is C++. The written code amounts to around 600 lines (excluding comments). Extensive use of header files and dynamic compilation has been practised.


5.3 Data

The dataset used comprises a total of 1107 images. Some of the images are displayed in Chapter 6.

5.4 Complexity of Code

Considering the complexity aspect, I found the implementation quite difficult and entirely worthy of an individual project. The implementation was highly mathematical, especially the blob characterisation part, where relations among different frames have to be processed for defining object/blob boundaries.

5.5 Resources

Since the project uses OpenCV as a dependency, the configuration is really important. Here are some useful links for setting up OpenCV on Ubuntu. Other operating systems differ slightly in the settings, but the process is almost the same.

1: Download - http://bit.ly/a9e3cT
2: Setup OpenCV - http://bit.ly/1j2pjL

Chapter 6 Illustrative Code


6.1 Data

The next page shows some of the images from the input data to the program. The dataset used has a number of people moving in and out of the scene. The program understands the motion and detects multiple objects moving in a frame to the best extent possible.

6.2 Code Snippets

The project includes a large range of code pertaining to the specific functions required for proper motion detection. Some of the processes expressed as functions in the programs include:

- Average filtering
- Background modelling
- Blob tracking
- Template matching
- Multiple object tracking
- Graph generation*

Figure 6.1: Out of a total of 1107 frames, these are a few. The human beings in the frames are the objects/blobs of concern for the code, as they represent the moving objects.

Next are some snippets. Their explanation is too wide to be included in the report and is best left for the time of the demo.

Figure 6.2: Snippet: Part of multiobjecttracking.cpp [Total LOC: 193]


Figure 6.3: Snippet: Part of blobTracking.cpp [Total LOC: 229]

Chapter 7 Results: Screenshots


7.1 Execution

The following figures show the output of the program upon rendering:


Figure 7.1: Beauty !!

Chapter 8 Conclusion
8.1 Conclusion

I have implemented a forward-backward analysed tracking algorithm, and have also performed analysis of multiple objects for blob correspondence. The objects are represented by their appearance model (pixel coordinates and color values), color distribution and a motion model, and are localized in consecutive frames using histogram matching. The forward-backward analysed tracking algorithm proves to be better than simple forward tracking: the former detects all of the cases without much error, whereas the latter, even when experimented with various parameter values for τ_A and τ_L, yielded widely varying and erroneous results. The occlusion states detected in the process of tracking are used to identify several problem situations and to decide on selective updates of the object appearance features and/or motion model.

