Introduction to Kalman Filtering-Part 1

31 03 2011

Why would anyone use it?

Suppose you have constructed a model that predicts the behavior of an actual physical process at any instance of time. This could be the trajectory of a projectile, or the path followed by your car, or any other process. You cannot guarantee that the model that you have constructed is 100% accurate, and you want to check to make sure if it behaves the way you predicted.

One way to do that is to take a couple of measurements of the system at various time intervals. However, the sensors used to take the measurements, or the person who is recording them cannot be trusted 100% of the time. So you are left with a couple of sources of error or noise.

So how do you combine the measurements and incorporate them in your model of the system so that you have a better estimate of the model or system? One way to do that is by employing Kalman Filtering. The upside is that you hopefully end up with some useful data or at least a less noisier version of it. However, the downside to filtering is that you add some delay.

In general the problem can be visually represented as follows:

So what is a filter?

Anything that blocks certain things and lets other things to go through, resulting in an enhancement or reduction of the quality or quantity of the original system.

Linear vs. Non-Linear Systems

Before we move on, it is important to consider what is a linear or a non-linear system, as we would be using these terms from now on. Given the inputs to the system, if the output is directly proportional to the inputs, or is a just a sum of all responses then the system is said to be linear. Whereas in a non-linear system the output can not be expressed as a linear combination (addition or scalar multiplication) of the inputs.

In other words, a function f(x,y) which depends on the variables x and y is said to be linear if and only if the following holds true:

  • f(x+y)=f(x) + f(y). This property if called additive.
  • f(ax)=af(x), where a is just a constant. This property is called Homogeneity, which implies if we double the strength of the input, the output will double up too. If we triple the input, the output’s strength gets tripled too.

These two properties together describe a linear system. Examples of linear systems include amplification or attenuation of signals, motion of a spring-damper system, electrical circuits formed of resistors, capacitors, inductors, etc.

However, many of the processes that we observe in nature are typically non-linear, like a bird’s flight or heat transfer during cooking. In order to simplify calculations, often we approximate them as linear by making certain assumptions.

As shown in the first figure on top, the next state of the system depends on the initial or previous state, the controls applied to change the state and the error sources associated with it. For a linear system, we can generalize this by writing it as:

x_{t+1}=Ax_t + Bu_t + w_t,

where, t is the time instance, x_t is the initial state, x_{t+1} is the next state, u_t is the control applied to change the state, w_t is the associated noise, and A, B are some constants, vectors or matrices. If we are describing a process, w_t is often referred to as process noise.

Now, we consider the internal state of the system x to be hidden, i.e. the estimate of the system state can only be carried out by observing and/or measuring the output of the system, represented by y. However, there will be some noise associated with the measurements themselves. For a linear system, we can generalize this as:

y_t=Cx_t+z_t

where, the output y is a function of the input x, C is a constant, vector or matrix, and z_t is the measurement noise.

A Simple Example

Lets take a simple example of estimating the position of a robot driving on a surface. This would be our System. Making many assumptions here to simplify the case, lets say it moves along in a single direction with a certain constant velocity v_0 of 5m/s and starts at an initial position p_0 of zero. We can now form its equation of motion as:

p = p_0 + v_0* \Delta t

In other words, we can get its position p at any time instant t by adding the distance traveled in the given time v_0* \Delta t to its initial position x_0.

The the state of the system consists of two variables: position and velocity. This can be represented in vector form as:

x_k= \begin{bmatrix} p_k \\ v_k \end{bmatrix}

In MATLAB it is now easy to plot the ground truth positions of the robot for 100 instances with a time step (\Delta t) of 0.1 seconds:

% Initial Position
p_0=0;
% Initial Velocity
v_0=5;
% Number of time instances or samples, in other words the duration of simulation
N=100;
% Time Step per instance/sample
dt=0.1;
% Generate a time vector formed of all tme instances with time steps
t=0:dt:dt*N;
% Generate the equation of motion for the robot and the ground truth
% positions for every time instance
p=p_0 + v_0*t;

% Plot the figure with the ground truth positions of the robot
figure;
plot(t,p,'g');
xlabel('Time (s)');
ylabel('Position (m)');
title('Ground Truth position of the robot');

Now, lets say we got someone to measure the position of the robot every 0.1 seconds. But, due to the inherent noise in the measuring device, or due to the lack of trust in the person recording its position, we get some measurement inaccuracies. We can model them here as Gaussian noise i.e. noise having a continuous probability distribution with a single mean value…or any bell shaped function.

So to simulate we add some random Gaussian noise to each one of the ground truth data points above. This is accomplished with the MATLAB code below:

% Add some random Gaussian noise with mean as the current true value of the
% position and a standard deviation of 3 meters, to each one of the Ground
% Truth positions above
m =zeros(0,N);
for k=1:N+1
    m(k) = p(k)+ 3*randn;
end;

% Plot the figure with the ground truth positions of the robot and the
% measurements with noise
figure;
plot(t,p,'g');
hold on;
plot(t,m,'r');
xlabel('Time (s)');
ylabel('Position (m)');
legend('GroundTruth position of the robot','Measurements with Noise');
title('Ground Truth Position with Noise');

Basics of Kalman Filtering

Kalman filter named after Rudolf E. Kalman is one such filter which can estimate the state of the system given the measurements, prediction, controls, and associated errors.





Correlation based similarity measures-Summary

11 04 2010

Correlation based matching typically produces dense depth maps by calculating the disparity at each pixel within a neighborhood. This is achieved by taking a square window of certain size around the pixel of interest in the reference image and finding the homologous pixel within the window in the target image, while moving along the corresponding scanline. The goal is to find the corresponding (correlated) pixel within a certain disparity range d (d E [0,….,dmax]) that minimizes the associated error and maximizes the similarity.

In brief, the matching process involves computation of the similarity measure for each disparity value, followed by an aggregation and optimization step. Since these steps consume a lot of processing power, there are significant speed-performance advantages to be had in optimizing the matching algorithm.

The images can be matched by taking either left image as the reference (left-to-right matching, also known as direct matching) or right image as the reference (right-to-left matching, also known as reverse matching) [2].

Similarity Measure Formula
Sum of Absolute Differences (SAD) Sum of Absolute Differences
Zero-mean Sum of Absolute Differences (ZSAD)
Locally scaled Sum of Absolute Differences (LSAD)
Sum of Squared Differences (SSD) Sum of Squared Differences
Zero-mean Sum of Squared Differences (ZSSD)
Locally scaled Sum of Squared Differences (LSSD)
Normalized Cross Correlation (NCC) Normalized Cross Correlation
Zero-mean Normalized Cross Correlation (ZNCC)
Sum of Hamming Distances (SHD) Sum of Hamming Distances

Sum of Absolute Differences (SAD) is one of the simplest of the similarity measures which is calculated by subtracting pixels within a square neighborhood between the reference image I1 and the target image I2 followed by the aggregation of absolute differences within the square window, and optimization with the winner-take-all (WTA) strategy [1]. If the left and right images exactly match, the resultant will be zero.

In Sum of Squared Differences (SSD), the differences are squared and aggregated within a square window and later optimized by WTA strategy. This measure has a higher computational complexity compared to SAD algorithm as it involves numerous multiplication operations.

Normalized Cross Correlation is even more complex to both SAD and SSD algorithms as it involves numerous multiplication, division and square root operations.

Sum of Hamming Distances is normally employed for matching census-transformed images (can be used on images that have not been census transformed) by computing bitwise-XOR of the values in left and right images, within a square window. This step is usually followed by a bit-counting operation which results in the final Hamming distance score.

Example: Tsukuba

TsukubaLeftLeft Image TsukubaRightRight Image
SAD Disparity Map ZSAD Disparity Map LSAD Disparity Map
SSD Disparity Map ZSSD Disparity Map LSSD Disparity Map
NCC Disparity Map ZNCC Disparity Map TsukubaSHD9x9DispRange=0-16ColorSHD Disparity Map
TsukubaGroundTruthColor

Ground Truth Disparity Map

Disparity Range: 0-16 ,Window Size: 9×9, The hotter the color, the closer it is; the cooler the color the farther it is.

Generic MATLAB code:

% *************************************************************************
% Title: Function-Compute Correlation between two images using various 
% similarity measures with Left Image as reference.
% Author: Siddhant Ahuja
% Created: March 2010
% Copyright Siddhant Ahuja, 2010
% Inputs: 
% 1. Left Image (var: rightImage), 
% 2. Right Image (var: leftImage),
% 3. Correlation Window Size (var: corrWindowSize), 
% 4. Minimum Disparity in X-direction (var: dMin), 
% 5. Maximum Disparity in X-direction (var: dMax),
% 6. Method used for calculating the correlation scores (var: method)
% Valid values include: 'SAD', 'LSAD', 'ZSAD', 'SSD', 'LSSD', ZSSD', 'NCC',
% 'ZNCC'
% Outputs: 
% 1. Disparity Map (var: dispMap), 
% 2. Time taken (var: timeTaken)
% Example Usage of Function: [dispMap, timeTaken]=denseMatch('tsukuba_left.tiff', 'tsukuba_right.tiff', 9, 0, 16, 'ZNCC');
% *************************************************************************
function [dispMap, timeTaken]=denseMatch(rightImage, leftImage, corrWindowSize, dMin, dMax, method)
% Grab the image information (metadata) of left image using the function imfinfo
leftImageInfo=imfinfo(leftImage);
% Grab the image information (metadata) of right image using the function imfinfo
rightImageInfo=imfinfo(rightImage);
% Since Dense Matching is applied on a grayscale image, determine if the
% input left image is already in grayscale or color
if(getfield(leftImageInfo,'ColorType')=='truecolor')
% Read an image using imread function, convert from RGB color space to
% grayscale using rgb2gray function and assign it to variable leftImage
    leftImage=rgb2gray(imread(leftImage));
else if(getfield(leftImageInfo,'ColorType')=='grayscale')
% If the image is already in grayscale, then just read it.        
        leftImage=imread(leftImage);
    else
        error('The Color Type of Left Image is not acceptable. Acceptable color types are truecolor or grayscale.');
    end
end
% Since Dense Matching is applied on a grayscale image, determine if the
% input right image is already in grayscale or color
if(getfield(rightImageInfo,'ColorType')=='truecolor')
% Read an image using imread function, convert from RGB color space to
% grayscale using rgb2gray function and assign it to variable rightImage
    rightImage=rgb2gray(imread(rightImage));
else if(getfield(rightImageInfo,'ColorType')=='grayscale')
% If the image is already in grayscale, then just read it.        
        rightImage=imread(rightImage);
    else
        error('The Color Type of Right Image is not acceptable. Acceptable color types are truecolor or grayscale.');
    end
end
% Find the size (columns and rows) of the left image and assign the rows to
% variable nrLeft, and columns to variable ncLeft
[nrLeft,ncLeft] = size(leftImage);
% Find the size (columns and rows) of the right image and assign the rows to
% variable nrRight, and columns to variable ncRight
[nrRight,ncRight] = size(rightImage);
% Check to see if both the left and right images have same number of rows
% and columns
if(nrLeft==nrRight && ncLeft==ncRight)
else
    error('Both left and right images should have the same number of rows and columns');
end
% Convert the left and right images from uint8 to double
leftImage=im2double(leftImage);
rightImage=im2double(rightImage);
% Check the size of window to see if it is an odd number.
if (mod(corrWindowSize,2)==0)
    error('The window size must be an odd number.');
end
% Check whether minimum disparity is less than the maximum disparity.
if (dMin>dMax)
    error('Minimum Disparity must be less than the Maximum disparity.');
end
% Create an image of size nrLeft and ncLeft, fill it with zeros and assign
% it to variable dispMap
dispMap=zeros(nrLeft, ncLeft);
% Find out how many rows and columns are to the left/right/up/down of the
% central pixel based on the window size
win=(corrWindowSize-1)/2;
% The objective of CC, NCC and ZNCC is to maxmize the
% correlation score, whereas other methods try to minimize
% it.
maximize = 0;
if strcmp(method,'NCC') || strcmp(method,'ZNCC')
    maximize = 1;
end
tic; % Initialize the timer to calculate the time consumed.
for(i=1+win:1:nrLeft-win)
    % For every row in Left Image
    for(j=1+win:1:ncLeft-win-dMax)
        % For every column in Left Image
        % Initialize the temporary variable to hold the previous
        % correlation score
        if(maximize)
            prevcorrScore = 0.0;
        else
            prevcorrScore = 65532;
        end
        % Initialize the temporary variable to store the best matched
        % disparity score
        bestMatchSoFar = dMin;
        for(d=dMin:dMax)
            % For every disparity value in x-direction
            % Construct a region with window around central/selected pixel in left image
            regionLeft=leftImage(i-win : i+win, j-win : j+win);
            % Construct a region with window around central/selected pixel in right image
            regionRight=rightImage(i-win : i+win, j+d-win : j+d+win);
            % Calculate the local mean in left region
            meanLeft = mean2(regionLeft);
            % Calculate the local mean in right region
            meanRight = mean2(regionRight);
            % Initialize the variable to store temporarily the correlation
            % scores
            tempCorrScore = zeros(size(regionLeft));
            % Calculate the correlation score
            if strcmp(method,'SAD')
                tempCorrScore = abs(regionLeft - regionRight);
            elseif strcmp(method,'ZSAD')
                tempCorrScore = abs(regionLeft - meanLeft - regionRight + meanRight);
            elseif strcmp(method,'LSAD')
                tempCorrScore = abs(regionLeft - meanLeft/meanRight*regionRight);
            elseif strcmp(method,'SSD')
                tempCorrScore = (regionLeft - regionRight).^2;
            elseif strcmp(method,'ZSSD')
                tempCorrScore = (regionLeft - meanLeft - regionRight + meanRight).^2;          
            elseif strcmp(method,'LSSD')
                tempCorrScore = (regionLeft - meanLeft/meanRight*regionRight).^2;
            elseif strcmp(method,'NCC')
                % Calculate the term in the denominator (var: den)
                den = sqrt(sum(sum(regionLeft.^2))*sum(sum(regionRight.^2)));
                tempCorrScore = regionLeft.*regionRight/den;
            elseif strcmp(method,'ZNCC')
                % Calculate the term in the denominator (var: den)
                den = sqrt(sum(sum((regionLeft - meanLeft).^2))*sum(sum((regionRight - meanRight).^2)));
                tempCorrScore = (regionLeft - meanLeft).*(regionRight - meanRight)/den;
            end
            % Compute the final score by summing the values in tempCorrScore,
            % and store it in a temporary variable signifying the distance
            % (var: corrScore)
            corrScore=sum(sum(tempCorrScore));
            if(maximize)
                if(corrScore>prevcorrScore)
                    % If the current disparity value is greater than
                    % previous one, then swap them
                    prevcorrScore=corrScore;
                    bestMatchSoFar=d;
                end
            else
                if (prevcorrScore > corrScore)
                    % If the current disparity value is less than
                    % previous one, then swap them
                    prevcorrScore = corrScore;
                    bestMatchSoFar = d;
                end
            end
        end
        % Store the final matched value in variable dispMap
        dispMap(i,j) = bestMatchSoFar;
    end
end
% Stop the timer to calculate the time consumed.
timeTaken=toc;
end

For other MATLAB Codes, please visit:

1. Sum of Absolute Differences
2. Sum of Squared Differences
3. Normalized Cross Correlation
4. Sum of Hamming Distances

References

1. T. Kanade, H. Kano, and S. Kimura, “Development of a video-rate stereo machine,” in Image UnderstandingWorkshop, Monterey,CA, 1994, p. 549–557.
2. D. Scharstein and R. Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” International Journal of Computer Vision, vol. 47(1/2/3), pp. 7-42, Apr. 2002.
3. S. Chambon and A. Crouzil. évaluation et comparaison de mesures de corrélationrobustes aux occultations. Rapport de recherche 2002-34-R, IRIT, Université PaulSabatier, Toulouse, France, December 2002.





Camera Calibration using Singular Value Decomposition (SVD) in MATLAB

20 02 2010

Before i jump to the section on calibration of a single camera, it is worthwhile to present a refresher on basic concepts of orthogonality, properties of homogeneous linear equations, matrix decomposition and basics of Singular Value Decomposition.

What is an Orthogonal or Orthonormal Matrix?

If two vectors x,y are perpendicular to each other, they are termed orthogonal, represented by x\bot y. Their inner/dot product represented by \langle x,y \rangle is zero. For example vectors (1,3,2) and (3,-1,0) are orthogonal to each other, since their dot product: (1)(3)+(3)(-1)+(2)(0) is equal to zero.

Two vectors are orthonormal if they are orthogonal and are of unit length, i.e. norms of vectors are one. For example, vectors (1,0) and (0,1) are orthonormal because:

  1. Dot product: (1)(0)+(0)(1)=0,
  2. Norm of (1,0)=\|(1,0)\|=\sqrt {1^2+0^2}=1, and
  3. Norm of (0,1)=\|(0,1)\|=\sqrt {0^2+1^2}=1.

A similar concept is applied to matrices. A matrix is said to be orthogonal if:

  1. Matrix elements are real numbers (i.e. all floating point numbers that are not complex in nature),
  2. Matrix is a square matrix (i.e. same number of rows and columns), and
  3. Either column or row vectors are orthogonal.

In other words, if the product of a matrix M and its transpose M^T is equal to an identity matrix I, i.e. M^TM=MM^T=I. This is also true if the transpose of the matrix is the same as its inverse,  i.e. M^T=M^{-1}.

If in such a matrix, column or row vectors are of unit length, then these matrices are called orthonormal matrices.

For example,

M=\begin{bmatrix} 1&0\\ 0&-1 \end{bmatrix} is an orthogonal and orthonormal matrix because,

  1. Dot product of column vectors: (1)(0)+(0)(-1)=0,
  2. Norm of column vector (1,0)=\|(1,0)\|=\sqrt{1^2+0^2}=1, and
  3. Norm of column vector (0,-1)=\|(0,-1)\|=\sqrt{0^2+(-1)^2}=1.

Also,

  1. M^TM= \begin{bmatrix} 1&0\\ 0&-1 \end{bmatrix} \begin{bmatrix} 1&0\\ 0&-1 \end{bmatrix} = MM^T = \begin{bmatrix} 1&0\\ 0&1 \end{bmatrix} = I, and
  2. M^T= \begin{bmatrix} 1&0\\ 0&-1 \end{bmatrix} = M^{-1}

Homogeneous Linear Equations

A set of equations where the constant terms are all zeros, are called homogeneous equations:

a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = 0,\\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = 0,\\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = 0

where, there are n variables x_1, x_2, \cdots x_n. This can be written in the matrix form:

\begin{bmatrix} a_{11}&a_{12}&\cdots &a_{1n} \\ a_{21}&a_{22}&\cdots &a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\  a_{m1}&a_{m2}& \cdots &a_{mn}\end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\x_n \end{bmatrix}=\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} or, AX=0

Note, in the above system of equations, there always exits a solution: x_1=0,x_2=0, \cdots ,x_n=0 or, X=0, called the “trivial solution”.

If the columns in the square matrix A of size n \times n, are linearly independent, i.e. rank(A)=n, then the trivial solution is the one and only one, unique solution. In other words, det(A)\neq 0. This type of matrix is invertible and is referred to as “Regular Matrix”. On the other hand, if for this square matrix A, det(A)=0, then the equations are not linearly independent, i.e. rank(A)<n. Then, there exists infinite number of non-zero solutions, and A is NOT invertible. Such a matrix is called the “Singular Matrix”.

What is Matrix Decomposition?

Matrix decomposition refers to reduction of the matrix to the simplest and most significant form-”basic building blocks”, without any loss of generality. For example, number 15 can be reduced or factorized into a multiple of 5*3, where both 5 and 3 are primes and can not be reduced further. A similar concept is applied to matrices, where a matrix can be represented by a multiple of basic matrices which can not be reduced further. One of the reasons why we decompose a matrix is to reduce a complex form into products of simpler forms, for example, a complex transformation applied to an image can be reduced to products of simpler transformations.

Singular Value Decomposition (SVD)

Singular Value Decomposition (SVD) is one of the many methods of factorization of a matrix with applications in least squares fitting of data, solving linear equations, computing the rank, range and null space of matrix, etc. A set of linear equations can be written in the form of Ax=b, where A is an m \times n matrix where rows m are greater than the columns n. Then we can factorize A using SVD, where A=USV^T such that:

  1. U is a column orthogonal matrix of size m \times n,
  2. S is a diagonal matrix with positive or zero elements of size n \times n, and
  3. V is an orthogonal matrix of size n \times n.

The above is termed as the Singular Value Decomposition (SVD) of A. Let S=diag\{\sigma _1 , \sigma _2 , \cdots , \sigma _n \}, where \sigma _1 \geq \sigma _2 \geq \cdots \geq \sigma _n \geq 0, then \sigma _1, \sigma _2 , \cdots , \sigma _n are called the singular values of A. Columns of U and V are left and right singular vectors for the corresponding singular values respectively.

These singular values are not the same as the eigenvalues. For a matrix A, matrix A^HA is normal with non-negative eigenvalues where singular values of A are square roots of the eigenvalues of A^HA.

For a set of homogeneous linear equations Ax=0, any vector x which falls in the null space of A is a solution. Any column V where corresponding singular values approaches zero is a solution vector.

For a set of non-homogeneous linear equations Ax=b \neq 0, the goal is to find the solution x with the smallest length |x|^2, where x=V[diag( \sigma _1 ^ {-2}, \sigma _2 ^ {-1}, \cdots , \sigma _n ^{-1} )](U^Tb). In this case, for every singular value \sigma _i where \sigma _i=0, \sigma _i ^{-1} is replaced by 0. If b is not in the range of A then there exists no vector x to satisfy the equation. Thus it is not possible to obtain an exact solution.

If n=m, i.e. A is a square matrix, SVD is given by A=USV^T, where both U and V are orthogonal matrices. Thus, A^{-1}=V[diag( \sigma _1 ^ {-2}, \sigma _2 ^ {-1}, \cdots , \sigma _n ^{-1} )](U^T).

The range of A has the dimension of rank of A. If A is singular, then rank of A will be less than n, where n=rank(A)+null(A). The columns of U corresponding to singular values not equal to zero, are an orthonormal set for the range of A, and columns of V corresponding to singular values not equal to zero, are an orthonormal set for the null space of A.

Calculating SVD

SVD can be calculated by first computing a bi-diagonal matrix of A using Householder reflections, then computing SVD using an iterative method like QR algorithm for computation of eigenvalues [1]. In MATLAB there is a limit of 75 QR step iterations to get the singular values [2]. Sometimes, solution may not converge. LAPACK subroutines can be used for SVD : real double precision A-DGESVD, complex double precision A-ZGESVD, real single precision A-SGESVD, and complex single prevision A-CGESVD [2].

Camera Calibration

Here i am assuming you are familiar with concepts of camera calibration. For more details, please refer to Calibration slides by Bryan Morse at Brigham Young University. The work presented here is based on the Pinhole camera model.

If we know the 3D points in the world model, and the 2D projected pixel coordinates, then we can linearly solve (using SVD) for the calibration matrix, rotation matrix, translation vector, parameters for scaling in X and Y directions, and the optical center of the image (x,y). Using this we can re-project the 3D points and by comparing to the 2D points, figure out the average pixel error in the X and Y direction.

I have used the 3D points and 2D points provided by Dr. Boufama, at the University of Windsor. You can download these files from the links below:

  1. 3D Points
  2. 2D Points

Here is a visual representation of the points:

3D Points

2D Points

Here is the MATLAB Code:

% *************************************************************************
% Title: Function-Produce Calibration Matrix and Average Pixel Error using
% Singular value decomposition (SVD)
% Author: Siddhant Ahuja
% Created: February 2010
% Copyright Siddhant Ahuja, 2010
% ***Inputs***
% 1. File3D: 3-D File containing world coordinates
% 2. File2D: 2-D File containing pixel coordinates
% ***Outputs***
% calibMatrix: 3x4 Calibration Matrix
% rotationMatrix: 3x3 Rotation Matrix
% translationVector: 3x1 Translation vector
% alpha_u: Parameter for scaling in X-direction
% alpha_v: Parameter for scaling in Y-direction
% u_0: Optical Centre of the image (x value)
% v_0: Optical Centre of the image (y value)
% reprojMatrix: Reprojection of 3D points to 2D using Calibration Matrix
% avgError_u: Average Pixel Error in x direction
% avgError_v: Average Pixel Error in y direction
% timeTaken: Time taken by the code
% Example Usage of Function: 
% [calibMatrix, rotationMatrix, translationVector, alpha_u, alpha_v, u_0,
% v_0, reprojMatrix, avgError_u, avgError_v, timeTaken]= funcCalibrate('3D.txt', '2D.txt');
% *************************************************************************
function [calibMatrix, rotationMatrix, translationVector, alpha_u, alpha_v, u_0, v_0, reprojMatrix, avgError_u, avgError_v, timeTaken] = funcCalibrate(File3D, File2D)
 
% Assuming no Radial Distortion is present.
% Initialize the timer to calculate the time consumed.
tic;
% Read 3-D file
[matrix3D]=textread(File3D);
% Delete First row of the 3D Matrix as it only contains number of data points
matrix3D(1,:)=[]; 
% Find the size of the 3D Matrix
[nR3D, nC3D]=size(matrix3D);
% Check to make sure matrix has 3 columns
if(nC3D~=3)
    error('The matrix for 3D points does not have 3 columns.');
end
% Separate out the X values from the matrix
X = matrix3D(:,1);
% Separate out the X values from the matrix
Y = matrix3D(:,2);
% Separate out the X values from the matrix
Z = matrix3D(:,3);
% Plot the points in 3-D
figure;
plot3(matrix3D(:,1),matrix3D(:,2),matrix3D(:,3),'.');
% Read 2-D file
[matrix2D]=textread(File2D);
% Delete First row of the 2D Matrix as it only contains number of data points
matrix2D(1,:)=[]; 
% Find the size of the 2D Matrix
[nR2D, nC2D]=size(matrix2D);
% Check to make sure matrix has 2 columns
if(nC2D~=2)
    error('The matrix for 2D points does not have 2 columns.');
end
% Separate out the u values from the matrix
u_values=matrix2D(:,1);
% Separate out the v values from the matrix
v_values=matrix2D(:,2);
% Plot the points in 2-D
figure;
plot(matrix2D(:,1),matrix2D(:,2),'.');
% Check to make sure number of rows of the 3D Matrix is the same as the
% number of rows of the 2D Matrix
if(nR3D~=nR2D)
    error('Please make sure number of 3D and 2D points is the same.');
end
% ***Linear Solution of the Calibration Matrix***
% Let Calibtration Matrix be denoted by (M)
% Writing linear equations of the form AV=0, where A is a 2nx12 measurement
% matrix and V is a 12-element unknown vector.
% Create a Matrix of ones (o) with the same length as that of the u_values
% vector
o = ones(size(u_values));
% Create a Matrix of zeros (z) with the same length as that of the u_values
% vector
z = zeros(size(u_values));
% Populate the odd rows for A
AoddRows  = [ X Y Z o z z z z -u_values.*X -u_values.*Y -u_values.*Z -u_values ];
% Populate the even rows for A
AevenRows = [ z z z z X Y Z o -v_values.*X -v_values.*Y -v_values.*Z -v_values ];
% Concatenate odd and even rows of A
A=[AoddRows; AevenRows];
% Compute SVD on the matrix
[U, S, V] = svd(A,0);
% Assuming no noise, since the elements of the diagonal matrix S are in descending order, to
% get the eigenvectors corresponding to the smallest eignevalue, we can
% just grab the last column of matrix 
m = V(:,end);
% Construct the camera calibration matrix M
M = reshape(m,4,3)';
calibMatrix=M;
% Since the norm of Projection matrix M is equal to 1, we can calculate the
% absolute scale factor lambda
abs_lambda=sqrt(M(3,1)^2 + M(3,2)^2 + M(3,3)^2);
% Scale the Matrix with the scale factor
M = M / abs_lambda;
% In case the origin of the world frame is in front of the camera, we have
% s=lambda/abs_lambda, or From T_z=s*m_34, we can re-write s in the form of
% sign value of m_34. Here we assume it is in the front.
inFront=1;
if inFront
    s = sign(M(3,4));
else
    s = -sign(M(3,4));
end
% Thus, we can now calculate T_z or T(3) 
T(3) = s*M(3,4);
% Create a 3x3 Rotation matrix and fill it with zeros
R = zeros(3,3);
% From equations, last row of the rotation matrix is the same as the first
% three elements of the last row of the calibration matrix
R(3,:)=s*M(3,1:3);
% Matrix M can be written as:
% M=( m1'   )
%   ( m2' m4)
%   ( m3'   )
% We can now calculate mi, where mi is a 3 element vector
m1 = M(1,1:3)';
m2 = M(2,1:3)';
m3 = M(3,1:3)';
m4 = M(1:3,4);
% Now, we can calculate the centres of projection u_0 and v_0
u_0 = m1'*m3;
v_0 = m2'*m3;
% Calculating the alpha values in u and v directions,
alpha_u=sqrt( m1'*m1 - u_0^2 );
alpha_v=sqrt( m2'*m2 - v_0^2 );
% We can now calculate the first and second rows of the rotation matrix
R(1,:) = s*(u_0*M(3,1:3) - M(1,1:3) ) / alpha_u;
R(2,:) = s*(v_0*M(3,1:3) - M(2,1:3) ) / alpha_v;
% We can also calculate the first and second elements of the Translation
% vector
T(1) = s*(u_0*M(3,4) - M(1,4) ) / alpha_u;
T(2) = s*(v_0*M(3,4) - M(2,4) ) / alpha_v;
T = T';
translationVector=T;
% The rotation matrix R obtained with this estimation procedure is not guaranteed to be orthogonal. 
% Therefore we calculate the rotation matrix that is closest to the estimated matrix (in the 
% Frobenius norm sense). Let R = UDV'T then R = UV'T.
[U,D,V] = svd(R);
R = U*V';
rotationMatrix=R;
% Reproject 3Dpoints to 2D points using the calibration matrix to calculate average pixel errors 
newMatrix2D=zeros(nR3D,2);
for i=1:1:nR3D
    num_u=M(1,1)*matrix3D(i,1) + M(1,2)*matrix3D(i,2) + M(1,3)*matrix3D(i,3) + M(1,4);
    num_v=M(2,1)*matrix3D(i,1) + M(2,2)*matrix3D(i,2) + M(2,3)*matrix3D(i,3) + M(2,4);
    den=M(3,1)*matrix3D(i,1) + M(3,2)*matrix3D(i,2) + M(3,3)*matrix3D(i,3) + M(3,4);
    newMatrix2D(i,1)=num_u/den;
    newMatrix2D(i,2)=num_v/den;
end
% Reprojected matrix
reprojMatrix=newMatrix2D;
% Calculate difference between the reprojectedMatrix and the original 2D
% Matrix
errorDiff=reprojMatrix-matrix2D;
% Calculate average pixel error in x direction
avgError_u=mean(errorDiff(:,1));
% Calculate average pixel error in y direction
avgError_v=mean(errorDiff(:,2));
% Stop the timer to calculate the time consumed.
timeTaken=toc;

To run the above code, please type in the following commands in MATLAB:

% Run this first
clc
clear all
[calibMatrix, rotationMatrix, translationVector, alpha_u, alpha_v, u_0, v_0, reprojMatrix, avgError_u, avgError_v, timeTaken]= funcCalibrate('3D.txt', '2D.txt');

Here are some of the results obtained by running the code:

  1. Calibration Matrix:

    \begin{pmatrix} -0.0276 & -1.2074e-14 &     -0.0071 & -0.7065 \\ 1.8319e-15 & -0.0276 & -0.0071 & -0.7065 \\ 6.5052e-19 & 7.0039e-17 & -2.7599e-05 & -0.0028 \end{pmatrix}

  2. Rotation Matrix:

    \begin{pmatrix} -1.0000 & -5.7371e-13 & -2.3566e-14 \\ 5.7365e-13 & -1.0000 & -2.5378e-12 \\ -2.3566e-14 & -2.5378e-12 & 1 \end{pmatrix}

  3. Translation Vector:

    \begin{pmatrix} -7.0104e-11 \\ -2.6114e-10 \\ 99.9998 \end{pmatrix}

  4. Parameter for scaling in X-direction: 999.9978
  5. Parameter for scaling in Y-direction: 999.9978
  6. Optical center of the image: (256.0000, 256.0000)
  7. Average Pixel Error in X direction: -3.7259e-11
  8. Average Pixel Error in Y direction: -1.9027e-11

References

[1] Trefethen, Lloyd N.; Bau III, David (1997), Numerical linear algebra, Philadelphia: Society for Industrial and Applied Mathematics, ISBN 978-0-89871-361-9.
[2] http://www.mathworks.com/access/helpdesk/help/techdoc/ref/svd.html





Adding Vignetting Effect to Images

21 06 2009

When the lens size is smaller than the dimensions of the actual image sensor, or inadequately covering it, it leads to an effect whereby the center of the image is clear, and a fading/darkening of the image borders is seen. This effect is commonly referred to as the Vignetting Effect. In graphics, it is used to de-emphasize the distracting background, and shed light on the foreground objects.

Example

Baby1LeftColorVignette1

Scale factor at image border=1.0

Baby1LeftColorVignette0.8

Scale factor at image border=0.8

Baby1LeftColorVignette0.6

Scale factor at image border=0.6

Baby1LeftColorVignette0.4

Scale factor at image border=0.4

Baby1LeftColorVignette0.2

Scale factor at image border=0.2

Baby1LeftColorVignette0.1

Scale factor at image border=0.1

MATLAB Code

% *************************************************************************
% Title: Function-Introduce Vignetting effect to the image
% Author: Siddhant Ahuja
% Created: September 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Input Image (var: inputImage), Scale Change level (var:
% scaleLevel) Valid values for scale change should go from 0.1 to 1
% Outputs: Vignetting effect added image (var: vignettImg, Time taken (var: timeTaken)
% Example Usage of Function: [vignettImg, timeTaken]=funcVignettingEffect('Baby1LeftColor.png', 0.5);
% *************************************************************************
function [vignettImg, timeTaken]=funcVignettingEffect(inputImage,scaleLevel)
% Read the input image
try
    % Read an image using imread function
    inputImage=imread(inputImage);
    % grab the number of rows, columns, and channels
    [nr, nc, nChannels]=size(inputImage);
     % Grab the image information (metadata) of input image using the function imfinfo
    inputImageInfo=imfinfo(inputImage);
    % Determine if input left image is already in grayscale or color
    if(getfield(inputImageInfo,'ColorType')=='truecolor')
        colored=1;
    else if(getfield(inputImageInfo,'ColorType')=='grayscale')
        colored=0;
        else
        error('The Color Type of Left Image is not acceptable. Acceptable color types are truecolor or grayscale.');
        end
    end
catch
    % if it is not an image but a variable
    inputImage=inputImage;
    % grab the number of channels
    [nr, nc, nChannels]=size(inputImage);
    if(nChannels)>1
        colored=1;
    else
        colored=0;
    end
end
if scaleLevel <= 0
    error('Scale value must be > 0');
end
vignettImg=inputImage;
tic; % Initialize the timer to calculate the time consumed.
imgCntX = nc/2;
imgCntY = nr/2;
maxDistance = sqrt (imgCntY^2 + imgCntX^2);
if(colored==1)
    for (i=1:nr)
        for (j=1:nc)
           dis = sqrt (abs(i-imgCntY)^2 + abs(j-imgCntX)^2);          
           %% reduce brighness of pixel based on distance from the image center
           vignettImg(i,j,1) = vignettImg(i,j,1)* (1 - (1-scaleLevel)*(dis/maxDistance) );
           vignettImg(i,j,2) = vignettImg(i,j,2)* (1 - (1-scaleLevel)*(dis/maxDistance) );
           vignettImg(i,j,3) = vignettImg(i,j,3)* (1 - (1-scaleLevel)*(dis/maxDistance) );
       end
    end
else
%gray
     for (i=1:nr)
        for (j=1:nc)
           dis = sqrt (abs(i-imgCntY)^2 + abs(j-imgCntX)^2);          
           %% reduce brighness of pixel based on distance from the image center
           vignettImg(i,j) = vignettImg(i,j)* (1 - (1-scaleLevel)*(dis/maxDistance) );
       end
    end
end
% Stop the timer to calculate the time consumed.
timeTaken=toc;





Left-Right Consistency (LRC) Check

13 06 2009

Left-Right Consistency (LRC) check is performed to get rid of the half-occluded (objects scene in one image, and not in other) pixels in the final disparity map. This is computed by taking the computed disparity value in one image, and re-projecting it in the other image. If the difference in the values is less than a given threshold, then the pixels are half-occluded.

Example

Aloe Left Image
Aloe-Left Image
Aloe Right Image
Aloe-Right Image
Aloe Ground Truth Disparity Left-to-Right
Aloe-Ground Truth
Left to Right Disparity Map
Aloe Ground Truth Disparity Right-to-Left
Aloe-Ground Truth
Right to Left Disparity Map
AloeSADL2R15x15DispRange=0-70Gray
Aloe-Left to Right
SAD Disparity Map
Window Size: 15×15
Disparity Range: 0-70
AloeSADR2L15x15DispRange=0-70Gray
Aloe-Right to Left
SAD Disparity Map
Window Size: 15×15
Disparity Range: 0-70
AloeOccluded4,2
Aloe-Ground Truth
Occlusion Map
AloeSADLRC15x15DispRange=0-70Gray
Aloe-LRC Map
(threshold: 2)

MATLAB Code

% *************************************************************************
% Title: Function-Left/Right Consistency Check
% Author: Siddhant Ahuja
% Created: May 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Left Image (var: leftImage), Right Image (var: rightImage),
% Window Size (var: windowSize), Minimum Disparity (dispMin), Maximum
% Disparity (dispMax), Threshold for the check (var: thresh) typically 2.0
% Outputs: Disparity Map (var: dispMap), Time taken (var: timeTaken)
% Example Usage of Function: [dispMapLRC, timeTaken]=funcLRCCheck('TsukubaLeft.jpg', 'TsukubaRight.jpg', 9, 0, 16,2);
% *************************************************************************
function [dispMapLRC, timeTaken]=funcLRCCheck(leftImage, rightImage, windowSize, dispMin, dispMax, thresh)
% Initiate the Timer to calculate the time consumed.
tic;
% Perform SAD Correlation based matching (Right to Left)
[dispMapR2L, timeTakenR2L]=funcSADR2L(leftImage, rightImage, windowSize, dispMin, dispMax);
% Perform SAD Correlation based matching (Left to Right)
[dispMapL2R, timeTakenL2R]=funcSADL2R(leftImage, rightImage, windowSize,dispMin , dispMax);
% Prepare matrix for subtraction and scale it for comparison
dispMapL2R=-dispMapL2R;
% Find the size (columns and rows) of the L2R Disparity map and assign the rows to
% variable nrLRCCheck, and columns to variable ncLRCCheck
[nrLRCCheck,ncLRCCheck] = size(dispMapL2R);
% Create an image of size nrLRCCheck and ncLRCCheck, fill it with zeros and assign
% it to variable dispMapLRC
dispMapLRC=zeros(nrLRCCheck,ncLRCCheck);
% Find out how many rows and columns are to the left/right/up/down of the
% central pixel based on the window size
win=(windowSize-1)/2;
for(i=1:1:nrLRCCheck)
    for(j=1:1:ncLRCCheck)
        xl=j;
        xr=xl+dispMapL2R(i,xl);
        if (xr>ncLRCCheck||xr<1)
            dispMapLRC(i,j) = 0; %% occluded pixel
        else            
            xlp=xr+dispMapR2L(i,xr);
            if (abs(xl-xlp)<thresh)
                dispMapLRC(i,j) = -dispMapL2R(i,j);  %% non-occluded pixel            
            else
                dispMapLRC(i,j) = 0; %% occluded pixel                        
            end
        end
    end
end
% Terminate the Timer to calculate the time consumed.
timeTaken=toc;





Compute Variance Map of an Image

8 06 2009

Variance map of an image is calculated by taking a square window of a set size around a center pixel, and calculating the variance of the values of the pixels.

The variance within the window can be calculated from the following equation:
Variance Equation

Example

MonopolyLeftColorMonopoly-Left Image

MonopolyLeft Variance Map 5,9Monopoly-Left Image-Variance Map

(Pixels marked as black are low-texture regions)

MATLAB Code

% *************************************************************************
% Title: Function-Compute Variance map of the image
% Author: Siddhant Ahuja
% Created: May 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Input Image (var: inputImage), Window Size (var: windowSize),
% Threshold (var: thresh) Typical value is 140
% Outputs: Variance Map (var: varianceImg) , Time taken (var: timeTaken)
% Example Usage of Function: [varianceImg, timeTaken]=funcVarianceMap('MonopolyLeftColor.png', 5, 9);
% *************************************************************************
function [varianceImg, timeTaken]=funcVarianceMap(inputImage, windowSize, thresh)
try 
    % Grab the image information (metadata) of input image using the function imfinfo
    inputImageInfo=imfinfo(inputImage);
    if(getfield(inputImageInfo,'ColorType')=='truecolor')
    % Read an image using imread function, convert from RGB color space to
    % grayscale using rgb2gray function and assign it to variable inputImage
        inputImage=rgb2gray(imread(inputImage));
        % Convert the image from uint8 to double
        inputImage=double(inputImage);
    else if(getfield(inputImageInfo,'ColorType')=='grayscale')
    % If the image is already in grayscale, then just read it.        
            inputImage=imread(inputImage);
            % Convert the image from uint8 to double
            inputImage=double(inputImage);
        else
            error('The Color Type of Left Image is not acceptable. Acceptable color types are truecolor or grayscale.');
        end
    end
catch
    % if it is not an image but a variable
    inputImage=inputImage;
end
% Find the size (columns and rows) of the input image and assign the rows to
% variable nr, and columns to variable nc
[nr,nc] = size(inputImage);
% Check the size of window to see if it is an odd number.
if (mod(windowSize,2)==0)
    error('The window size must be an odd number.');
end
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable meanImg
meanImg=zeros(nr, nc);
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable varianceImg
varianceImg=zeros(nr, nc);
% Find out how many rows and columns are to the left/right/up/down of the
% central pixel based on the window size
win=(windowSize-1)/2;
tic; % Initialize the timer to calculate the time consumed.
% Compute a map of mean values
for(i=1+win:1:nr-win)
    for(j=1+win:1:nc-win)
        sum=0.0;
        for(a=-win:1:win)
            for(b=-win:1:win)
                sum=sum+inputImage(i+a,j+b);
            end
        end
        meanImg(i,j)=sum/(windowSize*windowSize);
    end
end
% Compute a map of variance values
for(i=1+win:1:nr-win)
    for(j=1+win:1:nc-win)
        sum=0.0;
        for(a=-win:1:win)
            for(b=-win:1:win)
                sum=sum+((inputImage(i+a,j+b)-meanImg(i,j))^2);
            end
        end         
        var=sum/((windowSize*windowSize)-1);
        % Apply threshold to produce a binarized variance map
        if (var > thresh)
            varianceImg(i,j) = 255;
        else
            varianceImg(i,j) = 0;
        end
    end
end
% Stop the timer to calculate the time consumed.
timeTaken=toc;





Finding Low-texture or textureless regions in images

2 06 2009

Stereo vision algorithms typically compute erroneous results in regions where there is a little or no texture in the scene. They are defined in [1] as regions where the squared horizontal intensity gradient averaged over a square window of a given size is below a given threshold.

Example

Bowling1-Left ImageBowling1-Left Image

Bowling1-Left Image-Textureless MapBowling1-Left Image-Textureless Map

(Pixels marked as white are low-texture regions)

MATLAB Code

% *************************************************************************
% Title: Function-Find Textureless regions of an image
% Notes: Textureless regions are defined as regions where the squared horizontal
% intensity gradient averaged over a square window of a given size 
% (windowSize) is below a given threshold (thresh);
% Author: Siddhant Ahuja
% Created: May 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Input Image (var: inputImage), Window Size (var: windowSize),
% Threshold (var: thresh) Typical value is 4
% Outputs: Textureless Map (var: texturelessImg) , Time taken (var: timeTaken)
% Example Usage of Function: [texturelessImg, timeTaken]=funcTexturelessRegions('imL.png', 9, 4);
% *************************************************************************
function [texturelessImg, timeTaken]=funcTexturelessRegions(inputImage, windowSize, thresh)
% Read the input image
try
    % Read an image using imread function
    inputImage=imread(inputImage);
    % grab the number of rows, columns, and channels
    [nr, nc, nChannels]=size(inputImage);
     % Grab the image information (metadata) of input image using the function imfinfo
    inputImageInfo=imfinfo(inputImage);
    % Determine if input left image is already in grayscale or color
    if(getfield(inputImageInfo,'ColorType')=='truecolor')
        colored=1;
    else if(getfield(inputImageInfo,'ColorType')=='grayscale')
        colored=0;
        else
        error('The Color Type of Left Image is not acceptable. Acceptable color types are truecolor or grayscale.');
        end
    end
catch
    % if it is not an image but a variable
    % grab the number of channels
    [nr, nc, nChannels]=size(inputImage);
    if(nChannels)>1
        colored=1;
    else
        colored=0;
    end
end
% Check the size of window to see if it is an odd number.
if (mod(windowSize,2)==0)
    error('The window size must be an odd number.');
end
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable texturelessImg
texturelessImg=zeros(nr, nc);
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable sqGradImg
sqGradImg=zeros(nr,nc);
% Find out how many rows and columns are to the left/right/up/down of the
% central pixel based on the window size
win=(windowSize-1)/2;
inputImage=double(inputImage);
tic; % Initialize the timer to calculate the time consumed.
% Produce Squared Horizontal Gradient image sqGradImg
for (i=1:1:nr)
    for (j=1:1:nc-1)
        sum = 0.0;        
        for (k=1:1:nChannels)
            diff =  inputImage(i,j,k) - inputImage(i,j+1,k);
            sum = sum + (diff*diff); 
        end
        sum = sum / nChannels;
        sqGradImg(i,j+1) = sum;
        if (j==1)
            sqGradImg(i,j) = sum;
        end
        if (sum > sqGradImg(i,j))
            sqGradImg(i,j) = sum;
        end
    end
end
% Compute average within predefined box window of size windowSize x
% windowSize
for (i=1+win:nr-win)
    for (j=1+win:nc-win)       
        % go over the square window
        sum = 0.0;
        avg = 0.0;
        for (a=-win:1:win)
            for (b=-win:1:win)                
                sum = sum + sqGradImg(i+a,j+b);
            end
        end           
        % Compute the average
        avg = sum / (windowSize*windowSize);
        % Apply threshold
        if (avg < (thresh*thresh))
            texturelessImg(i,j) = 255; % mark detected textureless pixel as white
        end
    end
end
% Stop the timer to calculate the time consumed.
timeTaken=toc;

References

[1] D. Scharstein and R. Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” International Journal of Computer Vision, vol. 47(1/2/3), pp. 7-42, Apr. 2002.





Finding Depth Discontinuous regions in stereo-images

31 05 2009

Stereo vision algorithms typically compute erroneous results in regions where there is a sudden change in the depth between objects in the scene. They are defined in [1] as regions where neighboring disparities differ by more than a certain gap, dilated by a window of a given width.

Example

Flowerpots Left ImageFlowerpots-Left Image

Flowerpots Right ImageFlowerpots-Right Image

Flowerpots, Ground Truth Left-to-RightFlowerpots-Ground Truth

Disparity Map=Left-to-Right Image

Flowerpots, Ground Truth Right-to-LeftFlowerpots-Ground Truth

Disparity Map-Right-to-Left Image

Flowerpots-Depth-DiscontinuousFlowerpots-Depth-Discontinuous Regions

(Pixels marked as white are depth-discontinuous)

MATLAB Code

% *************************************************************************
% Title: Function-Find Discontinuous regions of an image
% Notes: 
% 1. According to paper [A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms], Depth Discontinuous regions are defined as pixels whose neighboring
% disparities differ by more than a certain gap, dilated
% by a window of width windowSize.
% 2. According to paper [ An Experimental Comparison of Stereo Algorithms], A pixel is a depth 
% discontinuity if any of its (4-connected) neighbors has a disparity that differs by more than 1 from 
% its disparity. Neighboring pixels that are part of a sloped surface can easily differ by 1 pixel, but
% should not be counted as discontinuities.
% Author: Siddhant Ahuja
% Created: May 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Input Image-Depth Map (var: inputImage), Window Size (var: windowSize),
% Threshold (var: thresh) Typical value is 2
% Outputs: Discontinuous Map (var: discontinuousImg) , Time taken (var: timeTaken)
% Example Usage of Function: [discontinuousImg, timeTaken]=funcDiscontinuousRegions('disp_l_r.png', 9, 2);
% *************************************************************************
function [discontinuousImg, timeTaken]=funcDiscontinuousRegions(inputImage, windowSize, thresh)
% Read the input image
try
    % Read an image using imread function
    inputImage=imread(inputImage);
    % grab the number of rows, columns, and channels
    [nr, nc, nChannels]=size(inputImage);
     % Grab the image information (metadata) of input image using the function imfinfo
    inputImageInfo=imfinfo(inputImage);
    % Determine if input left image is already in grayscale or color
    if(getfield(inputImageInfo,'ColorType')=='truecolor')
        inputImage=rgb2gray(inputImage);
    else if(getfield(inputImageInfo,'ColorType')=='grayscale')
        inputImage=inputImage;
        else
        error('The Color Type of Left Image is not acceptable. Acceptable color types are truecolor or grayscale.');
        end
    end
catch
    % if it is not an image but a variable
    % grab the number of channels
    [nr, nc, nChannels]=size(inputImage);
    if(nChannels)>1
        inputImage=rgb2gray(inputImage);
    else
        inputImage=inputImage;
    end
end
% Check the size of window to see if it is an odd number.
if (mod(windowSize,2)==0)
    error('The window size must be an odd number.');
end
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable discontinuousImg
discontinuousImg=zeros(nr, nc);
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable edgeImg
edgeImg=zeros(nr, nc);
% Find out how many rows and columns are to the left/right/up/down of the
% central pixel based on the window size
win=(windowSize-1)/2;
inputImage=double(inputImage);
tic; % Initialize the timer to calculate the time consumed.
% Find variation/edges in disparity values in horizontal and vertical directions
for (i=1:1:nr-1)
    for (j=1:1:nc-1)
        % Traverse in horizontal direction
        if (abs(inputImage(i,j)-inputImage(i,j+1)) > thresh )
            edgeImg(i,j) = 255;
            edgeImg(i,j+1) = 255;
        end
        % Traverse in vertical direction
        if (abs(inputImage(i,j)-inputImage(i+1,j)) > thresh )
            edgeImg(i,j) = 255;
            edgeImg(i+1,j) = 255;
        end
    end
end
% Dilate within the window
for (i=1+win:nr-win)
    for (j=1+win:nc-win)       
        % Go over the square window
        sum = 0.0;
        for (a=-win:1:win)
            for (b=-win:1:win)                
                sum = sum + edgeImg(i+a,j+b);
            end
        end  
        % Apply Threshold
        if (sum>0)
            discontinuousImg(i,j) = 255;
        else
            discontinuousImg(i,j) = 0;
        end
    end
end
% Stop the timer to calculate the time consumed.
timeTaken=toc;

References

[1] D. Scharstein and R. Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” International Journal of Computer Vision, vol. 47(1/2/3), pp. 7-42, Apr. 2002.





Finding regions that are occluded in left and right images

30 05 2009

Stereo vision algorithms typically compute erroneous results when the objects in the scene are fully occluded or half occluded. They are defined in [1] as regions where the left-to-right disparity lands at a location with a larger disparity i.e. regions that are visible in one image, and not in the other image.

Example

Aloe Left Image

Aloe-Left Image

Aloe Right Image

Aloe-Right Image

Aloe Ground Truth Disparity Left-to-Right

Aloe-Ground Truth

Disparity Map (Left-to-Right)

Aloe Ground Truth Disparity Right-to-Left

Aloe-Ground Truth

Disparity Map (Right-to-Left)

AloeOccluded4,2

Aloe-Occluded Regions

(Pixels marked as black are occluded)

MATLAB Code

% *************************************************************************
% Title: Function-Find Occluded regions of an image
% Notes: Occluded regions are defined as regions that are occluded in the
% matching image, i.e., where the forward-mapped disparity
% lands at a location with a larger (nearer) disparity.
% Author: Siddhant Ahuja
% Created: September 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Left to Right Disparity Image (var: dispMapL2R), Right to Left Disparity Image (var: dispMapR2L),
% Scale factor of ground truth map (var: scale) typically 4.0, Threshold for the check (var: thresh) typically 2.0
% Outputs: Disparity Map (var: dispMap), Time taken (var: timeTaken)
% Example Usage of Function: [occludedImg, timeTaken]=funcOccludedRegions('disp_l_r.png', 'disp_r_l.png', 4, 2);
% *************************************************************************
function [occludedImg, timeTaken]=funcOccludedRegions(dispMapL2R, dispMapR2L, scale, thresh)
% Initiate the Timer to calculate the time consumed.
tic;
dispMapL2R=imread(dispMapL2R);
dispMapR2L=imread(dispMapR2L);
dispMapL2R = floor(double (dispMapL2R)/scale);
dispMapR2L = floor(double (dispMapR2L)/scale);
% Prepare matrix for subtraction and scale it for comparison
dispMapL2R=-dispMapL2R;
% Find the size (columns and rows) of the L2R Disparity map and assign the rows to
% variable nrLRCCheck, and columns to variable ncLRCCheck
[nrLRCCheck,ncLRCCheck] = size(dispMapL2R);
% Create an image of size nrLRCCheck and ncLRCCheck, fill it with zeros and assign
% it to variable occludedImg
occludedImg=zeros(nrLRCCheck,ncLRCCheck);
for(i=1:1:nrLRCCheck)
    for(j=1:1:ncLRCCheck)
        xl=j;
        xr=xl+dispMapL2R(i,xl);
        if (xr>ncLRCCheck||xr<1)
            occludedImg(i,j) = 0; %% occluded pixel
        else            
            xlp=xr+dispMapR2L(i,xr);
            if (abs(xl-xlp)<thresh)
                occludedImg(i,j) = 255;  %% non-occluded pixel            
            else
                occludedImg(i,j) = 0; %% occluded pixel                        
            end
        end
    end
end
% Terminate the Timer to calculate the time consumed.
timeTaken=toc;

References

[1] D. Scharstein and R. Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” International Journal of Computer Vision, vol. 47(1/2/3), pp. 7-42, Apr. 2002.





Correlation based similarity measure-Sum of Hamming Distances (SHD)

30 05 2009

Example 1: Tsukuba

TsukubaLeftLeft Image TsukubaRightRight Image
TsukubaSHD9x9DispRange=0-16ColorSHD Disparity Map
Disparity Range: 0-16
Window Size: 9×9
The hotter the color, the closer it is; the cooler the color the farther it is.

Example 2: Stereogram

StereogramLeftLeft Image StereogramRightRight Image
StereogramSHD9x9DispRange=0-16ColorSHD Disparity Map
Disparity Range: 0-16
Window Size: 9×9
The hotter the color, the closer it is; the cooler the color the farther it is.

MATLAB Code-SHD Right to Left Matching

% *************************************************************************
% Title: Function-Compute Correlation between two images using the 
% similarity measure of Sum of Hamming Distances (SHD) with Right Image 
% as reference.
% Author: Siddhant Ahuja
% Created: May 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Left Image (var: leftImage), Right Image (var: rightImage),
% Window Size (var: windowSize), Minimum Disparity (dispMin), Maximum
% Disparity (dispMax)
% Outputs: Disparity Map (var: dispMap), Time taken (var: timeTaken)
% Example Usage of Function: [dispMap, timeTaken]=funcSHDR2L('TsukubaLeft.jpg', 'TsukubaRight.jpg', 9, 0, 16);
% *************************************************************************
function [dispMap, timeTaken]=funcSHDR2L(leftImage, rightImage, windowSize, dispMin, dispMax)
try 
    % Grab the image information (metadata) of left image using the function imfinfo
    leftImageInfo=imfinfo(leftImage);
    % Since SHDR2L is applied on a grayscale image, determine if the
    % input left image is already in grayscale or color
    if(getfield(leftImageInfo,'ColorType')=='truecolor')
    % Read an image using imread function, convert from RGB color space to
    % grayscale using rgb2gray function and assign it to variable leftImage
        leftImage=rgb2gray(imread(leftImage));
    else if(getfield(leftImageInfo,'ColorType')=='grayscale')
    % If the image is already in grayscale, then just read it.        
            leftImage=imread(leftImage);
        else
            error('The Color Type of Left Image is not acceptable. Acceptable color types are truecolor or grayscale.');
        end
    end
catch
    % if it is not an image but a variable
    leftImage=leftImage;
end
try
    % Grab the image information (metadata) of right image using the function imfinfo
    rightImageInfo=imfinfo(rightImage);
    % Since SHDR2L is applied on a grayscale image, determine if the
    % input right image is already in grayscale or color
    if(getfield(rightImageInfo,'ColorType')=='truecolor')
    % Read an image using imread function, convert from RGB color space to
    % grayscale using rgb2gray function and assign it to variable rightImage
        rightImage=rgb2gray(imread(rightImage));
    else if(getfield(rightImageInfo,'ColorType')=='grayscale')
    % If the image is already in grayscale, then just read it.        
            rightImage=imread(rightImage);
        else
            error('The Color Type of Right Image is not acceptable. Acceptable color types are truecolor or grayscale.');
        end
    end
catch
    % if it is not an image but a variable
    rightImage=rightImage;
end
% Find the size (columns and rows) of the left image and assign the rows to
% variable nrLeft, and columns to variable ncLeft
[nrLeft,ncLeft] = size(leftImage);
% Find the size (columns and rows) of the right image and assign the rows to
% variable nrRight, and columns to variable ncRight
[nrRight,ncRight] = size(rightImage);
% Check to see if both the left and right images have same number of rows
% and columns
if(nrLeft==nrRight && ncLeft==ncRight)
else
    error('Both left and right images should have the same number of rows and columns');
end
% Check the size of window to see if it is an odd number.
if (mod(windowSize,2)==0)
    error('The window size must be an odd number.');
end
% Check whether minimum disparity is less than the maximum disparity.
if (dispMin>dispMax)
    error('Minimum Disparity must be less than the Maximum disparity.');
end
% Create an image of size nrLeft and ncLeft, fill it with zeros and assign
% it to variable dispMap
dispMap=zeros(nrLeft, ncLeft);
% Find out how many rows and columns are to the left/right/up/down of the
% central pixel based on the window size
win=(windowSize-1)/2;
numberOfBits=8;
tic; % Initialize the timer to calculate the time consumed.
for(i=1+win:1:nrLeft-win)
    for(j=1+win:1:ncLeft-win-dispMax)
        min=0;
        position=0;
        rightWindow=rightImage(i-win:i+win, j-win:j+win);
        for(dispRange=dispMin:1:dispMax)
            if (j+win+dispRange <= ncLeft)
                leftWindow=leftImage(i-win:i+win, j-win+dispRange:j+win+dispRange);
                bloc3=bitxor(rightWindow,leftWindow);
                distance=uint8(zeros(windowSize,windowSize));
                for (k=1:1:numberOfBits)
                    distance=distance+bitget(bloc3,k);
                end
                dif=sum(sum(distance));
                if (dispRange==0)
                    min=dif;
                elseif (min>dif)
                    min=dif;
                    position=dispRange;
                end
            end 
        end
        dispMap(i,j) = position;
    end
end
% Stop the timer to calculate the time consumed.
timeTaken=toc;








Follow

Get every new post delivered to your Inbox.

Join 180 other followers