1. Homepage
  2. Programming
  3. [2022] Fundamentals of Computer Vision - Project 3 Tracking Objects in Videos

[2022] Fundamentals of Computer Vision - Project 3 Tracking Objects in Videos

Engage in a Conversation
UMDCMSC 828DFundamentals of Computer VisionMatlabTracking Objects in Videos

Fundamentals of Computer Vision CourseNana.COM

Project 3 Tracking Objects in Videos CourseNana.COM

Instructions CourseNana.COM

  1. Integrity and collaboration: Students are encouraged to work in groups but each student must submit their own work. If you work as a group, include the names of your collaborators in your write-up. Code should NOT be shared or copied. Please DO NOT use external code unless permitted. Plagiarism is strongly prohibited and may lead to failure of this course.
  2. Start early! Running the code on the videos can take a lot of time, making debugging very slow.
  3. Verify your implmenentation as you proceed! Otherwise you risk having a huge mess of malfunctioning code that can go wrong anywhere.
  4. Questions: If you have any question, please look at Discussion on canvas first. Other students may have encountered the same problem, and might have been solved already. If not, post your question on the discussion board. TA will respond as soon as possible.
  5. Write-up: Your write-up should mainly consist of two parts, your answers to theory questions and your insights as you attempt the programming questions. Specific items to be included in the write-up are mentioned in each question.
  6. Code: Stick to the function prototypes mentioned in the handout. This makes veri- fying code easier for the TA. If you do want to change a function prototype or add an extra parameter, please talk to the TA.
  7. Submission: Your submission for this assignment should be a zip file, <First_FmilyName.zip>. composed of your write-up, your Matlab implementations (including helper functions). Please make sure to remove the data/ folder and any other files that are not required. Ensure that your submission is of a reasonable size. You may want to use video compression if your videos are huge.

Your final upload should have the files arranged in this layout: 1 CourseNana.COM

<First_FmilyName>.zip CourseNana.COM

<First_FmilyName>/
<First_FmilyName>.pdf CourseNana.COM

matlab/ CourseNana.COM

· LucasKanade.m.m
· affineMBTracker.m
· initAffineMBTracker.m · lk demo.m
· mb demo.m CourseNana.COM

results/
·           car.mp4 CourseNana.COM

Overview CourseNana.COM

One incredibly important aspect of human and animal vision is the ability to follow objects and people in our view. Whether it is a tiger chasing its prey, or you trying to catch a basketball, tracking is so integral to our everyday lives that we forget how much we rely on it. In this assignment, you will be implementing an algorithm that will track an object in a video. CourseNana.COM

You will first implement the Lucas-Kanade tracker, and then a more computationally efficient version called the Matthew-Baker (or inverse compositional) method [1]. This method is one of the most commonly used methods in computer vision due to its simplicity and wide applicability. We have provided two video sequences: a car on a road, and a helicopter approaching a runway. CourseNana.COM

To initialize the tracker you need to define a template by drawing a bounding box around the object to be tracked in the first frame of the video. For each of the subsequent frames the tracker will update an affine transform that warps the current frame so that the template in the first frame is aligned with the warped current frame. CourseNana.COM

Preliminaries CourseNana.COM

An image transformation or warp is an operation that acts on pixel coordinates and maps pixel values from one place to another in an image. Translation, rotation and scaling are all examples of warps. We will use the symbol W to denote warps. A warp function W has a set of parameters p associated with it and maps a pixel with coordinates x = [u v]T to ′′′T x = [u v ] . x= W(x; p) (1) An affine transform is a warp that can include any combination of translation, anisotropic scaling and rotations. An affine warp can be parametrized in terms of 6 parameters p = CourseNana.COM

Note that for convenience when we want to refer to the warp as a function we will use W(x; p) and when we want to refer to the matrix for an affine warp we will use W(p). Table 1 contains a summary of the variables used in the next two sections. It will be useful to keep these in mind. CourseNana.COM

Table 1: Summary of Variables CourseNana.COM

[p1 p2 p3 p4 p5 p6] . One of the convenient things about an affine transformation is that it is linear; its action on a point with coordinates x = [u v]T can be described as a matrix operation CourseNana.COM

Jacobian of affine warp wrt its parameters Jacobian of error function L wrt p Pseudo Hessian of L wrt p CourseNana.COM

Lucas-Kanade: Forward Additive Alignment CourseNana.COM

A Lucas Kanade tracker maintains a warp W(x; p) which aligns a sequence of images It to a template T. We denote pixel locations by x, so I(x) is the pixel value at location x in image I. For the purposes of this derivation, I and T are treated as column vectors (think of them as unrolled image matrices). W(x;p) is the point obtained by warping x with a transform that has parameters p. W can be any transformation that is continuous in its parameters p. Examples of valid warp classes for W include translations (2 parameters), affine transforms (6 parameters) and full projective transforms (8 parameters). The Lucas Kanade tracker minimizes the pixel-wise sum of square difference between the warped image I(W(x; p)) and the template T. CourseNana.COM

In order to align an image or patch to a reference template, we seek to find the parameter vector p that minimizes L, where: CourseNana.COM

L =  [T (x) I (W (x; p))]2 (4) x CourseNana.COM

In general this is a difficult non-linear optimization, but if we assume we already have a close estimate p of the correct warp, then we can assume that a small linear change ∆p is enough to get the best alignment. This is the forward additive form of the warp. The objective can then be written as: CourseNana.COM

L =  [T (x) I (W (x; p + ∆p))]2 (5) x CourseNana.COM

Expanding this to the first order with Taylor Series gives us: CourseNana.COM

Here I(x) = I(x) I(x), which is the vector containing the horizontal and vertical CourseNana.COM

T(x)I(W(x;p))I(x) p p (6) gradient at pixel location x. Rearranging the Taylor expansion, it can be rewritten as a CourseNana.COM

typical least squares approximation ∆p= argmin||Ap b||2 p CourseNana.COM

I p p−{T(x)I(W(x;p))} This can be solved with ∆p= (AT A)1AT b where: CourseNana.COM

p =argmin p x CourseNana.COM

b = T (x) I (W (x; p))
Once ∆
p is computed, the best estimate warp can be updated p p + ∆p, and the CourseNana.COM

whole procedure can be repeated again, stopping when ∆p is less than some threshold. CourseNana.COM

Matthew-Baker: Inverse Compositional Alignment CourseNana.COM

While Lucas-Kanade alignment works very well, it is computationally expensive. The inverse compositional method is similar, but requires less computation, as the Hessian and Jacobian only need to be computed once. One caveat is that the warp needs to be invertible. Since affine warps are invertible, we can use this method. CourseNana.COM

In the previous section, we combined two warps by simply adding one parameter vector to another parameter vector, and produce a new warp W(x,p+p). Another way of combining warps is through composition of warps. After applying a warp W(x; p) to an image, another warp W(x; q) can be applied to the warped image. The resultant (combined) warp is CourseNana.COM

W(x; q) W(x; p) = W(W(x; p), q) (11) Since affine warps can be implemented as matrix multiplications, composing two affine CourseNana.COM

warps reduces to multiplying their corresponding matrices
W(x; q) W(x; p) = W(W(x; p), q) = W(W(p)x, q) = W(q)W(p)x (12) CourseNana.COM

An affine transform can also be inverted. The inverse warp of W(p) is simply the matrix CourseNana.COM

The minimization is performed using an iterative procedure by making a small change (∆p) to p at each iteration. It is computationally more efficient to do the minimization by finding the ∆p that helps align the template to the image, than applying the inverse warp to the image. This is because the image will change with each frame of the video, but the template is fixed at initialization. We will see soon that doing this allows us to write the Hessian and Jacobian in terms of the template, and so this can be computed once at the beginning of the tracking. Hence at each step, we want to find the ∆p to minimize CourseNana.COM

L = [T (W (x; ∆p)) I (W (x; p))]2 (13) x CourseNana.COM

For tracking a patch template, the summation is performed only over the pixels lying inside the template region. We can expand T (W (x; ∆p)) in terms of its first order linear approximation to get CourseNana.COM

inverse of W(p), W(p)
affine warp as a set of 6 parameters in a vector
p and it will sometimes be easier to work with the matrix version W(p). Fortunately, switching between these two forms is easy (Equation 3). CourseNana.COM

Where T(x) = it to zero CourseNana.COM

. In this assignment it will sometimes be simpler to consider an CourseNana.COM

T p T(x)+T(x) p pI(W(x;p)) (15) 6 CourseNana.COM

T(x)+T(x) p pI(W(x;p)) (14) T(x) T(x) CourseNana.COM

To minimize we need to take the derivative of L and set CourseNana.COM

Setting to zero, switching from summation to vector notation and solving for ∆p we get ∆p = H1JT [I(W(x; p)) T] (16) CourseNana.COM

where J is the Jacobian of T(W(x;∆p)), J = TW, H is the approximated Hessian p CourseNana.COM

H = JT J and I (W (x; p)) is the warped image. Note that for a given template, the Jacobian J and Hessian H are independent of p. This means they only need to be computed once and then they can be reused during the entire tracking sequence. CourseNana.COM

Once ∆p has been solved for, it needs to be inverted and composed with p to get the new warp parameters for the next iteration. CourseNana.COM

W(x; p) W(x; p) W(x; ∆p)1 (17) CourseNana.COM

The next iteration solves Equation 16 starting with the new value of p. Possible termi- nation criteria include the absolute value of ∆p falling below some value or running for some fixed number of iterations. CourseNana.COM

1 Theory Questions (25 points) CourseNana.COM

Type down your answers for the following questions in your write-up. Each question should only take a couple of lines. In particular, the “proofs” do not require any lengthy calculations. If you are lost in many lines of complicated algebra you are doing something much too complicated (or wrong). CourseNana.COM

Q1.1: Calculating the Jacobian (15 points) Assuming the affine warp model defined in Equation 3, derive the expression for the CourseNana.COM

CourseNana.COM

Find the computational complexity (Big O notation) for the initialization step (Pre- computing J and H1) and for each runtime iteration (Equation 16) of the Inverse Compositional method. Express your answers in terms of n, m and p where n is the number of pixels in the template T, m is the number of pixels in an input image I and p is the number of parameters used to describe the warp W. How does this compare to the run time of the regular Lucas-Kanade method? CourseNana.COM

Jacobian Matrix J in terms of the warp parameters p = [p1 p2 p3 p4 p5 p6] .
Q1.2: Computational complexity (10 points) CourseNana.COM

8 CourseNana.COM

2 Lucas-Kanade Tracker (60 points) CourseNana.COM

For this section, TA will grade your tracker based on the performance you achieved on the two provided video sequences: (1) data/car1/. The provided script files lk demo.m and mb demo.m handle reading in images, template region marking, making tracker function calls and displaying output onto the screen. The function prototypes provided are guidelines. Please make sure that your code runs functionally with the original script and generates the outputs we are looking for (a frame sequence with the bounding box of the target being tracked on each frame) so that we can replicate your results. CourseNana.COM

Note that the only thing TA would do for you during grading is change the input data directory, and initialize your tracker based on what you mentioned in your write-up. Please submit one video for each of them in the results/ directory, with file name car.mp4. Also, please mention the initialization coordinates of your tracker for both video sequences in your write-up and in your code. CourseNana.COM

Q2.1: Write a Lucas-Kanade Tracker for a Flow Warp (20 points) Write the function with the following function signature:
[u,v] = LucasKanade(It, It1, rect) CourseNana.COM

that computes the optimal local motion from frame It to frame It+1 that minimizes Equation 1. Here It is the image frame It, It1 is the image frame It+1, and rect is the 4×1 vector that represents a rectangle on the image frame It. The four components of the rectangle are [x, y, w, h], where (x, y) is the top-left corner and (w, h) is the width and height of the bounding box. The rectangle is inclusive, i.e., in includes all the four corners. To deal with fractional movement of the template, you will need to interpolate the image using the Matlab function interp2. You will also need to iterate the estimation until the change in warp parameters (u, v) is below a threshold. Use the forward compositional (Lucas-Kanade method) for this question. CourseNana.COM

Q2.2: Initializing the Matthew-Baker Tracker (10 points) Write the function initAffineMBTracker() that initializes the inverse compositional CourseNana.COM

tracker by precomputing important matrices needed to track a template patch. CourseNana.COM

function [affineMBContext] = initAffineMBTracker(img, rect)
The function will input a greyscale image (img) along with a bounding box (rect) (in CourseNana.COM

the format [x y w h]). CourseNana.COM

The function should output a Matlab structure affineMBContext that contains the Jacobian of the affine warp with respect to the 6 affine warp parameters and the inverse of the approximated Hessian matrix (J and H1 in Equation 16). CourseNana.COM

Q2.3: The Main Matthew-Baker Tracker (20 points) Write the function affineMBTracker() that does the actual template tracking. function [Wout] = affineMBTracker(img, tmp, rect, Win, context) CourseNana.COM

The function will input a greyscale image of the current frame (img), the template image (tmp), the bounding box rect that marks the template region in tmp, The affine warp matrix for the previous frame (Win) and the precomputed J and H1 matrices context. CourseNana.COM

The function should output the 3 × 3 matrix Wout that contains the new affine warp matrix updated so that it aligns the current frame with the template. CourseNana.COM

You can either used a fixed number of gradient descent iterations or formulate a stop- ping criteria for the algorithm. You can use the included image warping function to apply affine warps to images. CourseNana.COM

Q2.4: Tracking a Car (10 points) CourseNana.COM

Test your trackers on the short car video sequence (data/car1/) by running the wrap- per scripts lk demo.m and mb demo.m. What sort of templates work well for tracking? At what point does the tracker break down? Why does this happen? CourseNana.COM

In your write-up: Submit your best video of the car being tracked. Save it as results/car.mp4. CourseNana.COM

Get in Touch with Our Experts

WeChat WeChat
Whatsapp WhatsApp
UMD代写,CMSC 828D代写,Fundamentals of Computer Vision代写,Matlab代写,Tracking Objects in Videos代写,UMD代编,CMSC 828D代编,Fundamentals of Computer Vision代编,Matlab代编,Tracking Objects in Videos代编,UMD代考,CMSC 828D代考,Fundamentals of Computer Vision代考,Matlab代考,Tracking Objects in Videos代考,UMDhelp,CMSC 828Dhelp,Fundamentals of Computer Visionhelp,Matlabhelp,Tracking Objects in Videoshelp,UMD作业代写,CMSC 828D作业代写,Fundamentals of Computer Vision作业代写,Matlab作业代写,Tracking Objects in Videos作业代写,UMD编程代写,CMSC 828D编程代写,Fundamentals of Computer Vision编程代写,Matlab编程代写,Tracking Objects in Videos编程代写,UMDprogramming help,CMSC 828Dprogramming help,Fundamentals of Computer Visionprogramming help,Matlabprogramming help,Tracking Objects in Videosprogramming help,UMDassignment help,CMSC 828Dassignment help,Fundamentals of Computer Visionassignment help,Matlabassignment help,Tracking Objects in Videosassignment help,UMDsolution,CMSC 828Dsolution,Fundamentals of Computer Visionsolution,Matlabsolution,Tracking Objects in Videossolution,