Correlation based similarity measures-Summary

11 04 2010

Correlation based matching typically produces dense depth maps by calculating the disparity at each pixel within a neighborhood. This is achieved by taking a square window of certain size around the pixel of interest in the reference image and finding the homologous pixel within the window in the target image, while moving along the corresponding scanline. The goal is to find the corresponding (correlated) pixel within a certain disparity range d (d E [0,….,dmax]) that minimizes the associated error and maximizes the similarity.

In brief, the matching process involves computation of the similarity measure for each disparity value, followed by an aggregation and optimization step. Since these steps consume a lot of processing power, there are significant speed-performance advantages to be had in optimizing the matching algorithm.

The images can be matched by taking either left image as the reference (left-to-right matching, also known as direct matching) or right image as the reference (right-to-left matching, also known as reverse matching) [2].

Similarity Measure Formula
Sum of Absolute Differences (SAD) Sum of Absolute Differences
Zero-mean Sum of Absolute Differences (ZSAD)
Locally scaled Sum of Absolute Differences (LSAD)
Sum of Squared Differences (SSD) Sum of Squared Differences
Zero-mean Sum of Squared Differences (ZSSD)
Locally scaled Sum of Squared Differences (LSSD)
Normalized Cross Correlation (NCC) Normalized Cross Correlation
Zero-mean Normalized Cross Correlation (ZNCC)
Sum of Hamming Distances (SHD) Sum of Hamming Distances

Sum of Absolute Differences (SAD) is one of the simplest of the similarity measures which is calculated by subtracting pixels within a square neighborhood between the reference image I1 and the target image I2 followed by the aggregation of absolute differences within the square window, and optimization with the winner-take-all (WTA) strategy [1]. If the left and right images exactly match, the resultant will be zero.

In Sum of Squared Differences (SSD), the differences are squared and aggregated within a square window and later optimized by WTA strategy. This measure has a higher computational complexity compared to SAD algorithm as it involves numerous multiplication operations.

Normalized Cross Correlation is even more complex to both SAD and SSD algorithms as it involves numerous multiplication, division and square root operations.

Sum of Hamming Distances is normally employed for matching census-transformed images (can be used on images that have not been census transformed) by computing bitwise-XOR of the values in left and right images, within a square window. This step is usually followed by a bit-counting operation which results in the final Hamming distance score.

Example: Tsukuba

TsukubaLeftLeft Image TsukubaRightRight Image
SAD Disparity Map ZSAD Disparity Map LSAD Disparity Map
SSD Disparity Map ZSSD Disparity Map LSSD Disparity Map
NCC Disparity Map ZNCC Disparity Map TsukubaSHD9x9DispRange=0-16ColorSHD Disparity Map
TsukubaGroundTruthColor

Ground Truth Disparity Map

Disparity Range: 0-16 ,Window Size: 9×9, The hotter the color, the closer it is; the cooler the color the farther it is.

Generic MATLAB code:

% *************************************************************************
% Title: Function-Compute Correlation between two images using various 
% similarity measures with Left Image as reference.
% Author: Siddhant Ahuja
% Created: March 2010
% Copyright Siddhant Ahuja, 2010
% Inputs: 
% 1. Left Image (var: rightImage), 
% 2. Right Image (var: leftImage),
% 3. Correlation Window Size (var: corrWindowSize), 
% 4. Minimum Disparity in X-direction (var: dMin), 
% 5. Maximum Disparity in X-direction (var: dMax),
% 6. Method used for calculating the correlation scores (var: method)
% Valid values include: 'SAD', 'LSAD', 'ZSAD', 'SSD', 'LSSD', ZSSD', 'NCC',
% 'ZNCC'
% Outputs: 
% 1. Disparity Map (var: dispMap), 
% 2. Time taken (var: timeTaken)
% Example Usage of Function: [dispMap, timeTaken]=denseMatch('tsukuba_left.tiff', 'tsukuba_right.tiff', 9, 0, 16, 'ZNCC');
% *************************************************************************
function [dispMap, timeTaken]=denseMatch(rightImage, leftImage, corrWindowSize, dMin, dMax, method)
% Grab the image information (metadata) of left image using the function imfinfo
leftImageInfo=imfinfo(leftImage);
% Grab the image information (metadata) of right image using the function imfinfo
rightImageInfo=imfinfo(rightImage);
% Since Dense Matching is applied on a grayscale image, determine if the
% input left image is already in grayscale or color
if(getfield(leftImageInfo,'ColorType')=='truecolor')
% Read an image using imread function, convert from RGB color space to
% grayscale using rgb2gray function and assign it to variable leftImage
    leftImage=rgb2gray(imread(leftImage));
else if(getfield(leftImageInfo,'ColorType')=='grayscale')
% If the image is already in grayscale, then just read it.        
        leftImage=imread(leftImage);
    else
        error('The Color Type of Left Image is not acceptable. Acceptable color types are truecolor or grayscale.');
    end
end
% Since Dense Matching is applied on a grayscale image, determine if the
% input right image is already in grayscale or color
if(getfield(rightImageInfo,'ColorType')=='truecolor')
% Read an image using imread function, convert from RGB color space to
% grayscale using rgb2gray function and assign it to variable rightImage
    rightImage=rgb2gray(imread(rightImage));
else if(getfield(rightImageInfo,'ColorType')=='grayscale')
% If the image is already in grayscale, then just read it.        
        rightImage=imread(rightImage);
    else
        error('The Color Type of Right Image is not acceptable. Acceptable color types are truecolor or grayscale.');
    end
end
% Find the size (columns and rows) of the left image and assign the rows to
% variable nrLeft, and columns to variable ncLeft
[nrLeft,ncLeft] = size(leftImage);
% Find the size (columns and rows) of the right image and assign the rows to
% variable nrRight, and columns to variable ncRight
[nrRight,ncRight] = size(rightImage);
% Check to see if both the left and right images have same number of rows
% and columns
if(nrLeft==nrRight && ncLeft==ncRight)
else
    error('Both left and right images should have the same number of rows and columns');
end
% Convert the left and right images from uint8 to double
leftImage=im2double(leftImage);
rightImage=im2double(rightImage);
% Check the size of window to see if it is an odd number.
if (mod(corrWindowSize,2)==0)
    error('The window size must be an odd number.');
end
% Check whether minimum disparity is less than the maximum disparity.
if (dMin>dMax)
    error('Minimum Disparity must be less than the Maximum disparity.');
end
% Create an image of size nrLeft and ncLeft, fill it with zeros and assign
% it to variable dispMap
dispMap=zeros(nrLeft, ncLeft);
% Find out how many rows and columns are to the left/right/up/down of the
% central pixel based on the window size
win=(corrWindowSize-1)/2;
% The objective of CC, NCC and ZNCC is to maxmize the
% correlation score, whereas other methods try to minimize
% it.
maximize = 0;
if strcmp(method,'NCC') || strcmp(method,'ZNCC')
    maximize = 1;
end
tic; % Initialize the timer to calculate the time consumed.
for(i=1+win:1:nrLeft-win)
    % For every row in Left Image
    for(j=1+win:1:ncLeft-win-dMax)
        % For every column in Left Image
        % Initialize the temporary variable to hold the previous
        % correlation score
        if(maximize)
            prevcorrScore = 0.0;
        else
            prevcorrScore = 65532;
        end
        % Initialize the temporary variable to store the best matched
        % disparity score
        bestMatchSoFar = dMin;
        for(d=dMin:dMax)
            % For every disparity value in x-direction
            % Construct a region with window around central/selected pixel in left image
            regionLeft=leftImage(i-win : i+win, j-win : j+win);
            % Construct a region with window around central/selected pixel in right image
            regionRight=rightImage(i-win : i+win, j+d-win : j+d+win);
            % Calculate the local mean in left region
            meanLeft = mean2(regionLeft);
            % Calculate the local mean in right region
            meanRight = mean2(regionRight);
            % Initialize the variable to store temporarily the correlation
            % scores
            tempCorrScore = zeros(size(regionLeft));
            % Calculate the correlation score
            if strcmp(method,'SAD')
                tempCorrScore = abs(regionLeft - regionRight);
            elseif strcmp(method,'ZSAD')
                tempCorrScore = abs(regionLeft - meanLeft - regionRight + meanRight);
            elseif strcmp(method,'LSAD')
                tempCorrScore = abs(regionLeft - meanLeft/meanRight*regionRight);
            elseif strcmp(method,'SSD')
                tempCorrScore = (regionLeft - regionRight).^2;
            elseif strcmp(method,'ZSSD')
                tempCorrScore = (regionLeft - meanLeft - regionRight + meanRight).^2;          
            elseif strcmp(method,'LSSD')
                tempCorrScore = (regionLeft - meanLeft/meanRight*regionRight).^2;
            elseif strcmp(method,'NCC')
                % Calculate the term in the denominator (var: den)
                den = sqrt(sum(sum(regionLeft.^2))*sum(sum(regionRight.^2)));
                tempCorrScore = regionLeft.*regionRight/den;
            elseif strcmp(method,'ZNCC')
                % Calculate the term in the denominator (var: den)
                den = sqrt(sum(sum((regionLeft - meanLeft).^2))*sum(sum((regionRight - meanRight).^2)));
                tempCorrScore = (regionLeft - meanLeft).*(regionRight - meanRight)/den;
            end
            % Compute the final score by summing the values in tempCorrScore,
            % and store it in a temporary variable signifying the distance
            % (var: corrScore)
            corrScore=sum(sum(tempCorrScore));
            if(maximize)
                if(corrScore>prevcorrScore)
                    % If the current disparity value is greater than
                    % previous one, then swap them
                    prevcorrScore=corrScore;
                    bestMatchSoFar=d;
                end
            else
                if (prevcorrScore > corrScore)
                    % If the current disparity value is less than
                    % previous one, then swap them
                    prevcorrScore = corrScore;
                    bestMatchSoFar = d;
                end
            end
        end
        % Store the final matched value in variable dispMap
        dispMap(i,j) = bestMatchSoFar;
    end
end
% Stop the timer to calculate the time consumed.
timeTaken=toc;
end

For other MATLAB Codes, please visit:

1. Sum of Absolute Differences
2. Sum of Squared Differences
3. Normalized Cross Correlation
4. Sum of Hamming Distances

References

1. T. Kanade, H. Kano, and S. Kimura, “Development of a video-rate stereo machine,” in Image UnderstandingWorkshop, Monterey,CA, 1994, p. 549–557.
2. D. Scharstein and R. Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” International Journal of Computer Vision, vol. 47(1/2/3), pp. 7-42, Apr. 2002.
3. S. Chambon and A. Crouzil. évaluation et comparaison de mesures de corrélationrobustes aux occultations. Rapport de recherche 2002-34-R, IRIT, Université PaulSabatier, Toulouse, France, December 2002.





How to select a proper Disparity Value for Stereo-Vision

24 07 2009

I have been asked many times how to determine the proper disparity range? If you are taking the images (left, right and ground truth disparity maps) from Middlebury Stereo’s website, then the answer is pretty easy. The minimum disparity value is 0. The maximum disparity value is the maximum value of the pixel in the ground truth map, divided by the scale factor. If you have a stereo vision set-up taking real-world imagery, then you have to do a bit of math to calculate the values.

Here is one of the equations:
range equation

In the above equation,

r is the range of the object that you are trying to determine,
b is the baseline of the two cameras, i.e. the distance between the centers of the cameras,
f is the focal length of the image sensor,
x is the pixel size of the image sensor,
N is the maximum disparity value.

For example, lets say that the image sensor has the pixel size of 17um, focal length of 2.8mm, baseline of 28mm, and maximum disparity value is from 5-35. Range vs. disparity values can be plotted in a graphical form as below:

range vs disparity graph

As can be seen above, for the characteristics of the stereo vision system selected, the maximum range of the object that you are trying to detect really depends on the range of disparity values.  For disparity values between 5-35, the detectable range of objects is roughly 15-100 cm.

However, there is another trade-off that people forget about, primarily between the uncertainty in detecting objects at a particular range and the actual range itself. Here is the equation which relates both these variables:

range uncertainty vs range equation

In the above equation, there are two new variables: Δr and ΔN.

Δr is the uncertainty in detecting the object at a certain range,
ΔN is the change in disparity value.

Taking the previous example, lets say that the image sensor has the pixel size of 17um, focal length of 2.8mm, baseline of 28mm, maximum disparity values between 5-35, and the range of 15-100cm, the uncertainty in detecting object at the desired range vs. range values can be plotted in a graphical form as below:

range uncertainty vs range graph

From the above, and for the characteristics of the stereo vision system selected, for a range of 50cm, the uncertainty in the range is 5cm, i.e. if an object has been detected at 50cm, the object could be anywhere from 47.5cm to 52.5cm with a delta of 5cm.

Thus you can see that there are some trade-offs that you should consider while developing a stereo-vision system, and selecting a proper disparity value.

For more information, please refer to:

[1] Khaleghi B., Ahuja S., Wu Q.M.J., “An improved real-time miniaturized embedded stereo vision system (MESVS-II),” IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2008 (CVPRW’08), pp.1-8, 23-28 June 2008.
[2] Khaleghi B., Ahuja S., Wu Q.M.J., “A New Miniaturized Embedded Stereo-Vision System (MESVS-I)”, Canadian Conference on Computer and Robot Vision, 2008 (CRV ‘08), pp.26-33, 28-30 May 2008





Left-Right Consistency (LRC) Check

13 06 2009

Left-Right Consistency (LRC) check is performed to get rid of the half-occluded (objects scene in one image, and not in other) pixels in the final disparity map. This is computed by taking the computed disparity value in one image, and re-projecting it in the other image. If the difference in the values is less than a given threshold, then the pixels are half-occluded.

Example

Aloe Left Image
Aloe-Left Image
Aloe Right Image
Aloe-Right Image
Aloe Ground Truth Disparity Left-to-Right
Aloe-Ground Truth
Left to Right Disparity Map
Aloe Ground Truth Disparity Right-to-Left
Aloe-Ground Truth
Right to Left Disparity Map
AloeSADL2R15x15DispRange=0-70Gray
Aloe-Left to Right
SAD Disparity Map
Window Size: 15×15
Disparity Range: 0-70
AloeSADR2L15x15DispRange=0-70Gray
Aloe-Right to Left
SAD Disparity Map
Window Size: 15×15
Disparity Range: 0-70
AloeOccluded4,2
Aloe-Ground Truth
Occlusion Map
AloeSADLRC15x15DispRange=0-70Gray
Aloe-LRC Map
(threshold: 2)

MATLAB Code

% *************************************************************************
% Title: Function-Left/Right Consistency Check
% Author: Siddhant Ahuja
% Created: May 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Left Image (var: leftImage), Right Image (var: rightImage),
% Window Size (var: windowSize), Minimum Disparity (dispMin), Maximum
% Disparity (dispMax), Threshold for the check (var: thresh) typically 2.0
% Outputs: Disparity Map (var: dispMap), Time taken (var: timeTaken)
% Example Usage of Function: [dispMapLRC, timeTaken]=funcLRCCheck('TsukubaLeft.jpg', 'TsukubaRight.jpg', 9, 0, 16,2);
% *************************************************************************
function [dispMapLRC, timeTaken]=funcLRCCheck(leftImage, rightImage, windowSize, dispMin, dispMax, thresh)
% Initiate the Timer to calculate the time consumed.
tic;
% Perform SAD Correlation based matching (Right to Left)
[dispMapR2L, timeTakenR2L]=funcSADR2L(leftImage, rightImage, windowSize, dispMin, dispMax);
% Perform SAD Correlation based matching (Left to Right)
[dispMapL2R, timeTakenL2R]=funcSADL2R(leftImage, rightImage, windowSize,dispMin , dispMax);
% Prepare matrix for subtraction and scale it for comparison
dispMapL2R=-dispMapL2R;
% Find the size (columns and rows) of the L2R Disparity map and assign the rows to
% variable nrLRCCheck, and columns to variable ncLRCCheck
[nrLRCCheck,ncLRCCheck] = size(dispMapL2R);
% Create an image of size nrLRCCheck and ncLRCCheck, fill it with zeros and assign
% it to variable dispMapLRC
dispMapLRC=zeros(nrLRCCheck,ncLRCCheck);
% Find out how many rows and columns are to the left/right/up/down of the
% central pixel based on the window size
win=(windowSize-1)/2;
for(i=1:1:nrLRCCheck)
    for(j=1:1:ncLRCCheck)
        xl=j;
        xr=xl+dispMapL2R(i,xl);
        if (xr>ncLRCCheck||xr<1)
            dispMapLRC(i,j) = 0; %% occluded pixel
        else            
            xlp=xr+dispMapR2L(i,xr);
            if (abs(xl-xlp)<thresh)
                dispMapLRC(i,j) = -dispMapL2R(i,j);  %% non-occluded pixel            
            else
                dispMapLRC(i,j) = 0; %% occluded pixel                        
            end
        end
    end
end
% Terminate the Timer to calculate the time consumed.
timeTaken=toc;





Finding Low-texture or textureless regions in images

2 06 2009

Stereo vision algorithms typically compute erroneous results in regions where there is a little or no texture in the scene. They are defined in [1] as regions where the squared horizontal intensity gradient averaged over a square window of a given size is below a given threshold.

Example

Bowling1-Left ImageBowling1-Left Image

Bowling1-Left Image-Textureless MapBowling1-Left Image-Textureless Map

(Pixels marked as white are low-texture regions)

MATLAB Code

% *************************************************************************
% Title: Function-Find Textureless regions of an image
% Notes: Textureless regions are defined as regions where the squared horizontal
% intensity gradient averaged over a square window of a given size 
% (windowSize) is below a given threshold (thresh);
% Author: Siddhant Ahuja
% Created: May 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Input Image (var: inputImage), Window Size (var: windowSize),
% Threshold (var: thresh) Typical value is 4
% Outputs: Textureless Map (var: texturelessImg) , Time taken (var: timeTaken)
% Example Usage of Function: [texturelessImg, timeTaken]=funcTexturelessRegions('imL.png', 9, 4);
% *************************************************************************
function [texturelessImg, timeTaken]=funcTexturelessRegions(inputImage, windowSize, thresh)
% Read the input image
try
    % Read an image using imread function
    inputImage=imread(inputImage);
    % grab the number of rows, columns, and channels
    [nr, nc, nChannels]=size(inputImage);
     % Grab the image information (metadata) of input image using the function imfinfo
    inputImageInfo=imfinfo(inputImage);
    % Determine if input left image is already in grayscale or color
    if(getfield(inputImageInfo,'ColorType')=='truecolor')
        colored=1;
    else if(getfield(inputImageInfo,'ColorType')=='grayscale')
        colored=0;
        else
        error('The Color Type of Left Image is not acceptable. Acceptable color types are truecolor or grayscale.');
        end
    end
catch
    % if it is not an image but a variable
    % grab the number of channels
    [nr, nc, nChannels]=size(inputImage);
    if(nChannels)>1
        colored=1;
    else
        colored=0;
    end
end
% Check the size of window to see if it is an odd number.
if (mod(windowSize,2)==0)
    error('The window size must be an odd number.');
end
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable texturelessImg
texturelessImg=zeros(nr, nc);
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable sqGradImg
sqGradImg=zeros(nr,nc);
% Find out how many rows and columns are to the left/right/up/down of the
% central pixel based on the window size
win=(windowSize-1)/2;
inputImage=double(inputImage);
tic; % Initialize the timer to calculate the time consumed.
% Produce Squared Horizontal Gradient image sqGradImg
for (i=1:1:nr)
    for (j=1:1:nc-1)
        sum = 0.0;        
        for (k=1:1:nChannels)
            diff =  inputImage(i,j,k) - inputImage(i,j+1,k);
            sum = sum + (diff*diff); 
        end
        sum = sum / nChannels;
        sqGradImg(i,j+1) = sum;
        if (j==1)
            sqGradImg(i,j) = sum;
        end
        if (sum > sqGradImg(i,j))
            sqGradImg(i,j) = sum;
        end
    end
end
% Compute average within predefined box window of size windowSize x
% windowSize
for (i=1+win:nr-win)
    for (j=1+win:nc-win)       
        % go over the square window
        sum = 0.0;
        avg = 0.0;
        for (a=-win:1:win)
            for (b=-win:1:win)                
                sum = sum + sqGradImg(i+a,j+b);
            end
        end           
        % Compute the average
        avg = sum / (windowSize*windowSize);
        % Apply threshold
        if (avg < (thresh*thresh))
            texturelessImg(i,j) = 255; % mark detected textureless pixel as white
        end
    end
end
% Stop the timer to calculate the time consumed.
timeTaken=toc;

References

[1] D. Scharstein and R. Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” International Journal of Computer Vision, vol. 47(1/2/3), pp. 7-42, Apr. 2002.





Finding Depth Discontinuous regions in stereo-images

31 05 2009

Stereo vision algorithms typically compute erroneous results in regions where there is a sudden change in the depth between objects in the scene. They are defined in [1] as regions where neighboring disparities differ by more than a certain gap, dilated by a window of a given width.

Example

Flowerpots Left ImageFlowerpots-Left Image

Flowerpots Right ImageFlowerpots-Right Image

Flowerpots, Ground Truth Left-to-RightFlowerpots-Ground Truth

Disparity Map=Left-to-Right Image

Flowerpots, Ground Truth Right-to-LeftFlowerpots-Ground Truth

Disparity Map-Right-to-Left Image

Flowerpots-Depth-DiscontinuousFlowerpots-Depth-Discontinuous Regions

(Pixels marked as white are depth-discontinuous)

MATLAB Code

% *************************************************************************
% Title: Function-Find Discontinuous regions of an image
% Notes: 
% 1. According to paper [A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms], Depth Discontinuous regions are defined as pixels whose neighboring
% disparities differ by more than a certain gap, dilated
% by a window of width windowSize.
% 2. According to paper [ An Experimental Comparison of Stereo Algorithms], A pixel is a depth 
% discontinuity if any of its (4-connected) neighbors has a disparity that differs by more than 1 from 
% its disparity. Neighboring pixels that are part of a sloped surface can easily differ by 1 pixel, but
% should not be counted as discontinuities.
% Author: Siddhant Ahuja
% Created: May 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Input Image-Depth Map (var: inputImage), Window Size (var: windowSize),
% Threshold (var: thresh) Typical value is 2
% Outputs: Discontinuous Map (var: discontinuousImg) , Time taken (var: timeTaken)
% Example Usage of Function: [discontinuousImg, timeTaken]=funcDiscontinuousRegions('disp_l_r.png', 9, 2);
% *************************************************************************
function [discontinuousImg, timeTaken]=funcDiscontinuousRegions(inputImage, windowSize, thresh)
% Read the input image
try
    % Read an image using imread function
    inputImage=imread(inputImage);
    % grab the number of rows, columns, and channels
    [nr, nc, nChannels]=size(inputImage);
     % Grab the image information (metadata) of input image using the function imfinfo
    inputImageInfo=imfinfo(inputImage);
    % Determine if input left image is already in grayscale or color
    if(getfield(inputImageInfo,'ColorType')=='truecolor')
        inputImage=rgb2gray(inputImage);
    else if(getfield(inputImageInfo,'ColorType')=='grayscale')
        inputImage=inputImage;
        else
        error('The Color Type of Left Image is not acceptable. Acceptable color types are truecolor or grayscale.');
        end
    end
catch
    % if it is not an image but a variable
    % grab the number of channels
    [nr, nc, nChannels]=size(inputImage);
    if(nChannels)>1
        inputImage=rgb2gray(inputImage);
    else
        inputImage=inputImage;
    end
end
% Check the size of window to see if it is an odd number.
if (mod(windowSize,2)==0)
    error('The window size must be an odd number.');
end
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable discontinuousImg
discontinuousImg=zeros(nr, nc);
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable edgeImg
edgeImg=zeros(nr, nc);
% Find out how many rows and columns are to the left/right/up/down of the
% central pixel based on the window size
win=(windowSize-1)/2;
inputImage=double(inputImage);
tic; % Initialize the timer to calculate the time consumed.
% Find variation/edges in disparity values in horizontal and vertical directions
for (i=1:1:nr-1)
    for (j=1:1:nc-1)
        % Traverse in horizontal direction
        if (abs(inputImage(i,j)-inputImage(i,j+1)) > thresh )
            edgeImg(i,j) = 255;
            edgeImg(i,j+1) = 255;
        end
        % Traverse in vertical direction
        if (abs(inputImage(i,j)-inputImage(i+1,j)) > thresh )
            edgeImg(i,j) = 255;
            edgeImg(i+1,j) = 255;
        end
    end
end
% Dilate within the window
for (i=1+win:nr-win)
    for (j=1+win:nc-win)       
        % Go over the square window
        sum = 0.0;
        for (a=-win:1:win)
            for (b=-win:1:win)                
                sum = sum + edgeImg(i+a,j+b);
            end
        end  
        % Apply Threshold
        if (sum>0)
            discontinuousImg(i,j) = 255;
        else
            discontinuousImg(i,j) = 0;
        end
    end
end
% Stop the timer to calculate the time consumed.
timeTaken=toc;

References

[1] D. Scharstein and R. Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” International Journal of Computer Vision, vol. 47(1/2/3), pp. 7-42, Apr. 2002.





Finding regions that are occluded in left and right images

30 05 2009

Stereo vision algorithms typically compute erroneous results when the objects in the scene are fully occluded or half occluded. They are defined in [1] as regions where the left-to-right disparity lands at a location with a larger disparity i.e. regions that are visible in one image, and not in the other image.

Example

Aloe Left Image

Aloe-Left Image

Aloe Right Image

Aloe-Right Image

Aloe Ground Truth Disparity Left-to-Right

Aloe-Ground Truth

Disparity Map (Left-to-Right)

Aloe Ground Truth Disparity Right-to-Left

Aloe-Ground Truth

Disparity Map (Right-to-Left)

AloeOccluded4,2

Aloe-Occluded Regions

(Pixels marked as black are occluded)

MATLAB Code

% *************************************************************************
% Title: Function-Find Occluded regions of an image
% Notes: Occluded regions are defined as regions that are occluded in the
% matching image, i.e., where the forward-mapped disparity
% lands at a location with a larger (nearer) disparity.
% Author: Siddhant Ahuja
% Created: September 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Left to Right Disparity Image (var: dispMapL2R), Right to Left Disparity Image (var: dispMapR2L),
% Scale factor of ground truth map (var: scale) typically 4.0, Threshold for the check (var: thresh) typically 2.0
% Outputs: Disparity Map (var: dispMap), Time taken (var: timeTaken)
% Example Usage of Function: [occludedImg, timeTaken]=funcOccludedRegions('disp_l_r.png', 'disp_r_l.png', 4, 2);
% *************************************************************************
function [occludedImg, timeTaken]=funcOccludedRegions(dispMapL2R, dispMapR2L, scale, thresh)
% Initiate the Timer to calculate the time consumed.
tic;
dispMapL2R=imread(dispMapL2R);
dispMapR2L=imread(dispMapR2L);
dispMapL2R = floor(double (dispMapL2R)/scale);
dispMapR2L = floor(double (dispMapR2L)/scale);
% Prepare matrix for subtraction and scale it for comparison
dispMapL2R=-dispMapL2R;
% Find the size (columns and rows) of the L2R Disparity map and assign the rows to
% variable nrLRCCheck, and columns to variable ncLRCCheck
[nrLRCCheck,ncLRCCheck] = size(dispMapL2R);
% Create an image of size nrLRCCheck and ncLRCCheck, fill it with zeros and assign
% it to variable occludedImg
occludedImg=zeros(nrLRCCheck,ncLRCCheck);
for(i=1:1:nrLRCCheck)
    for(j=1:1:ncLRCCheck)
        xl=j;
        xr=xl+dispMapL2R(i,xl);
        if (xr>ncLRCCheck||xr<1)
            occludedImg(i,j) = 0; %% occluded pixel
        else            
            xlp=xr+dispMapR2L(i,xr);
            if (abs(xl-xlp)<thresh)
                occludedImg(i,j) = 255;  %% non-occluded pixel            
            else
                occludedImg(i,j) = 0; %% occluded pixel                        
            end
        end
    end
end
% Terminate the Timer to calculate the time consumed.
timeTaken=toc;

References

[1] D. Scharstein and R. Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” International Journal of Computer Vision, vol. 47(1/2/3), pp. 7-42, Apr. 2002.





Quality metrics in Stereo-Vision

12 05 2009

Root-Mean-Squared (RMS) Error

This measure is computed by the following formula:

RMS Error

where, N is the total number of pixels in the image, dc is the computed disparity map, and dt is the ground truth disparity map.

MATLAB Code

% *************************************************************************
% Title: Function-Compute Root Mean Squared (RMS) Error
% Author: Siddhant Ahuja
% Created: May 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Computed Disparity Map (var: computedDisparityMap), Ground Truth Disparity Map (var: groundTruthDisparityMap),
% How many pixels to ignore at the border (var: borderPixelsToIgnore). For
% a 9x9 windowSize used, border pixels to ignore should be (9-1)/2 or 4
% pixels, Disparity range (var: dispRange), Scale factor for groundtruth
% (var: scale)
% Outputs: RMS Error (var: rmsError), Time taken (var: timeTaken)
% Example Usage of Function: [rmsError, timeTaken]= RMSError('TsukubaSAD9x9DispRange=0-16.png','TsukubaGroundTruth.png',4,16,8);
% *************************************************************************
function [rmsError, timeTaken]= funcRMSError(computedDisparityMap,groundTruthDisparityMap,borderPixelsToIgnore,dispRange, scale)
% Read an image using imread function, and assign it to variable
% computedDisparityMap
try 
    computedDisparityMap=imread(computedDisparityMap);
catch
    % if it is not an image but a variable
    computedDisparityMap=computedDisparityMap;
end
% Convert the image from uint8 to double
computedDisparityMap=double(computedDisparityMap);
% Read an image using imread function, and assign it to variable
% groundTruthDisparityMap
groundTruthDisparityMap=imread(groundTruthDisparityMap);
% Convert the image from uint8 to double
groundTruthDisparityMap=double(groundTruthDisparityMap);
% Find the size (columns and rows) of the left image and assign the rows to
% variable nrComputedDisparityMap, and columns to variable ncComputedDisparityMap
[nrComputedDisparityMap,ncComputedDisparityMap] = size(computedDisparityMap);
% Find the size (columns and rows) of the image and assign the rows to
% variable nrGroundTruthDisparityMap, and columns to variable ncGroundTruthDisparityMap
[nrGroundTruthDisparityMap,ncGroundTruthDisparityMap] = size(groundTruthDisparityMap);
% Check to see if both the left and right images have same number of rows
% and columns
if(nrComputedDisparityMap==nrGroundTruthDisparityMap && ncComputedDisparityMap==ncGroundTruthDisparityMap)
else
    error('Both Computed Disparity Map and Groundtruth Disparity Map images should have the same number of rows and columns');
end
numPixels=0;
rmsError=0.0;
tic; % Initialize the timer to calculate the time consumed.
% Calculate rms error
for (i=borderPixelsToIgnore:1:nrComputedDisparityMap-borderPixelsToIgnore)
    for(j=borderPixelsToIgnore:1:ncComputedDisparityMap-borderPixelsToIgnore-dispRange)
        if(groundTruthDisparityMap(i,j)~=0.0) % Ignore Pixels with unknown disparity in the groundTruthDisparityMap
            rmsError= rmsError+(abs((computedDisparityMap(i,j)*scale)-groundTruthDisparityMap(i,j))^2);
            numPixels=numPixels+1;
        end
    end
end
rmsError=sqrt(rmsError/numPixels);
% Stop the timer to calculate the time consumed.
timeTaken=toc;

Percentage of Bad Matching Pixels

This measure is computed by the following formula:

Percentage Bad Matching

where, N is the total number of pixels in an image, dc is the computed disparity map, and dt is the ground truth disparity map, and δ_thresh is the threshold for evaluating bad matched pixels (usually attains the value of 1.0).

MATLAB Code

% *************************************************************************
% Title: Function-Compute Percentage of bad matching pixels
% Author: Siddhant Ahuja
% Created: May 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Computed Disparity Map (var: computedDisparityMap), Ground Truth Disparity Map (var: groundTruthDisparityMap),
% How many pixels to ignore at the border (var: borderPixelsToIgnore). For
% a 9x9 windowSize used, border pixels to ignore should be (9-1)/2 or 4
% pixels, Disparity range (var: dispRange), Threshold (var: thresh), Scale
% factor for groundtruth (var: scale)
% Outputs: Percentage Bad matching pixels (var: perBADMatch), Time taken
% (var: timeTaken)
% Example Usage of Function: [perBADMatch, timeTaken]=
% funcPercentBadMatchingPixels(dispMap,'VenusGroundTruthL2R.png',4,16,1,8)
% *************************************************************************
function [perBADMatch, timeTaken]= funcPercentBadMatchingPixels(computedDisparityMap,groundTruthDisparityMap,borderPixelsToIgnore,dispRange,thresh, scale)
% Read an image using imread function, and assign it to variable
% computedDisparityMap
try 
    computedDisparityMap=imread(computedDisparityMap);
catch
    % if it is not an image but a variable
    computedDisparityMap=computedDisparityMap;
end
% Convert the image from uint8 to double
computedDisparityMap=double(computedDisparityMap);
% Read an image using imread function, and assign it to variable
% groundTruthDisparityMap
groundTruthDisparityMap=imread(groundTruthDisparityMap);
% Convert the image from uint8 to double
groundTruthDisparityMap=double(groundTruthDisparityMap);
% Find the size (columns and rows) of the left image and assign the rows to
% variable nrComputedDisparityMap, and columns to variable ncComputedDisparityMap
[nrComputedDisparityMap,ncComputedDisparityMap] = size(computedDisparityMap);
% Find the size (columns and rows) of the image and assign the rows to
% variable nrGroundTruthDisparityMap, and columns to variable ncGroundTruthDisparityMap
[nrGroundTruthDisparityMap,ncGroundTruthDisparityMap] = size(groundTruthDisparityMap);
% Check to see if both the left and right images have same number of rows
% and columns
if(nrComputedDisparityMap==nrGroundTruthDisparityMap && ncComputedDisparityMap==ncGroundTruthDisparityMap)
else
    error('Both Computed Disparity Map and Groundtruth Disparity Map images should have the same number of rows and columns');
end
numPixels=0;
perBADMatch=0.0;
tic; % Initialize the timer to calculate the time consumed.
% Calculate Percentage Bad Matching Pixels
for (i=borderPixelsToIgnore:1:nrComputedDisparityMap-borderPixelsToIgnore)
    for(j=borderPixelsToIgnore:1:ncComputedDisparityMap-borderPixelsToIgnore-dispRange)
        if(groundTruthDisparityMap(i,j)~=0.0) % Ignore Pixels with unknown disparity in the groundTruthDisparityMap
            if(abs((computedDisparityMap(i,j)*scale)-groundTruthDisparityMap(i,j))>thresh)
                perBADMatch=perBADMatch+1;
            end
            numPixels=numPixels+1;
        end
    end
end
perBADMatch=perBADMatch/numPixels;
% Stop the timer to calculate the time consumed.
timeTaken=toc;





Video Demos: Stereo-Vision, Area Based Algorithm

17 04 2009

This is a demo of the raw-result obtained by using the area based algorithm for Stereo-vision.

Result after filtering the results:








Follow

Get every new post delivered to your Inbox.

Join 180 other followers