How to select a proper Disparity Value for Stereo-Vision

24 07 2009

I have been asked many times how to determine the proper disparity range? If you are taking the images (left, right and ground truth disparity maps) from Middlebury Stereo’s website, then the answer is pretty easy. The minimum disparity value is 0. The maximum disparity value is the maximum value of the pixel in the ground truth map, divided by the scale factor. If you have a stereo vision set-up taking real-world imagery, then you have to do a bit of math to calculate the values.

Here is one of the equations:
range equation

In the above equation,

r is the range of the object that you are trying to determine,
b is the baseline of the two cameras, i.e. the distance between the centers of the cameras,
f is the focal length of the image sensor,
x is the pixel size of the image sensor,
N is the maximum disparity value.

For example, lets say that the image sensor has the pixel size of 17um, focal length of 2.8mm, baseline of 28mm, and maximum disparity value is from 5-35. Range vs. disparity values can be plotted in a graphical form as below:

range vs disparity graph

As can be seen above, for the characteristics of the stereo vision system selected, the maximum range of the object that you are trying to detect really depends on the range of disparity values.  For disparity values between 5-35, the detectable range of objects is roughly 15-100 cm.

However, there is another trade-off that people forget about, primarily between the uncertainty in detecting objects at a particular range and the actual range itself. Here is the equation which relates both these variables:

range uncertainty vs range equation

In the above equation, there are two new variables: Δr and ΔN.

Δr is the uncertainty in detecting the object at a certain range,
ΔN is the change in disparity value.

Taking the previous example, lets say that the image sensor has the pixel size of 17um, focal length of 2.8mm, baseline of 28mm, maximum disparity values between 5-35, and the range of 15-100cm, the uncertainty in detecting object at the desired range vs. range values can be plotted in a graphical form as below:

range uncertainty vs range graph

From the above, and for the characteristics of the stereo vision system selected, for a range of 50cm, the uncertainty in the range is 5cm, i.e. if an object has been detected at 50cm, the object could be anywhere from 47.5cm to 52.5cm with a delta of 5cm.

Thus you can see that there are some trade-offs that you should consider while developing a stereo-vision system, and selecting a proper disparity value.

For more information, please refer to:

[1] Khaleghi B., Ahuja S., Wu Q.M.J., “An improved real-time miniaturized embedded stereo vision system (MESVS-II),” IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2008 (CVPRW’08), pp.1-8, 23-28 June 2008.
[2] Khaleghi B., Ahuja S., Wu Q.M.J., “A New Miniaturized Embedded Stereo-Vision System (MESVS-I)”, Canadian Conference on Computer and Robot Vision, 2008 (CRV ‘08), pp.26-33, 28-30 May 2008





Adding Vignetting Effect to Images

21 06 2009

When the lens size is smaller than the dimensions of the actual image sensor, or inadequately covering it, it leads to an effect whereby the center of the image is clear, and a fading/darkening of the image borders is seen. This effect is commonly referred to as the Vignetting Effect. In graphics, it is used to de-emphasize the distracting background, and shed light on the foreground objects.

Example

Baby1LeftColorVignette1

Scale factor at image border=1.0

Baby1LeftColorVignette0.8

Scale factor at image border=0.8

Baby1LeftColorVignette0.6

Scale factor at image border=0.6

Baby1LeftColorVignette0.4

Scale factor at image border=0.4

Baby1LeftColorVignette0.2

Scale factor at image border=0.2

Baby1LeftColorVignette0.1

Scale factor at image border=0.1

MATLAB Code

% *************************************************************************
% Title: Function-Introduce Vignetting effect to the image
% Author: Siddhant Ahuja
% Created: September 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Input Image (var: inputImage), Scale Change level (var:
% scaleLevel) Valid values for scale change should go from 0.1 to 1
% Outputs: Vignetting effect added image (var: vignettImg, Time taken (var: timeTaken)
% Example Usage of Function: [vignettImg, timeTaken]=funcVignettingEffect('Baby1LeftColor.png', 0.5);
% *************************************************************************
function [vignettImg, timeTaken]=funcVignettingEffect(inputImage,scaleLevel)
% Read the input image
try
    % Read an image using imread function
    inputImage=imread(inputImage);
    % grab the number of rows, columns, and channels
    [nr, nc, nChannels]=size(inputImage);
     % Grab the image information (metadata) of input image using the function imfinfo
    inputImageInfo=imfinfo(inputImage);
    % Determine if input left image is already in grayscale or color
    if(getfield(inputImageInfo,'ColorType')=='truecolor')
        colored=1;
    else if(getfield(inputImageInfo,'ColorType')=='grayscale')
        colored=0;
        else
        error('The Color Type of Left Image is not acceptable. Acceptable color types are truecolor or grayscale.');
        end
    end
catch
    % if it is not an image but a variable
    inputImage=inputImage;
    % grab the number of channels
    [nr, nc, nChannels]=size(inputImage);
    if(nChannels)>1
        colored=1;
    else
        colored=0;
    end
end
if scaleLevel <= 0
    error('Scale value must be > 0');
end
vignettImg=inputImage;
tic; % Initialize the timer to calculate the time consumed.
imgCntX = nc/2;
imgCntY = nr/2;
maxDistance = sqrt (imgCntY^2 + imgCntX^2);
if(colored==1)
    for (i=1:nr)
        for (j=1:nc)
           dis = sqrt (abs(i-imgCntY)^2 + abs(j-imgCntX)^2);          
           %% reduce brighness of pixel based on distance from the image center
           vignettImg(i,j,1) = vignettImg(i,j,1)* (1 - (1-scaleLevel)*(dis/maxDistance) );
           vignettImg(i,j,2) = vignettImg(i,j,2)* (1 - (1-scaleLevel)*(dis/maxDistance) );
           vignettImg(i,j,3) = vignettImg(i,j,3)* (1 - (1-scaleLevel)*(dis/maxDistance) );
       end
    end
else
%gray
     for (i=1:nr)
        for (j=1:nc)
           dis = sqrt (abs(i-imgCntY)^2 + abs(j-imgCntX)^2);          
           %% reduce brighness of pixel based on distance from the image center
           vignettImg(i,j) = vignettImg(i,j)* (1 - (1-scaleLevel)*(dis/maxDistance) );
       end
    end
end
% Stop the timer to calculate the time consumed.
timeTaken=toc;





Left-Right Consistency (LRC) Check

13 06 2009

Left-Right Consistency (LRC) check is performed to get rid of the half-occluded (objects scene in one image, and not in other) pixels in the final disparity map. This is computed by taking the computed disparity value in one image, and re-projecting it in the other image. If the difference in the values is less than a given threshold, then the pixels are half-occluded.

Example

Aloe Left Image
Aloe-Left Image
Aloe Right Image
Aloe-Right Image
Aloe Ground Truth Disparity Left-to-Right
Aloe-Ground Truth
Left to Right Disparity Map
Aloe Ground Truth Disparity Right-to-Left
Aloe-Ground Truth
Right to Left Disparity Map
AloeSADL2R15x15DispRange=0-70Gray
Aloe-Left to Right
SAD Disparity Map
Window Size: 15×15
Disparity Range: 0-70
AloeSADR2L15x15DispRange=0-70Gray
Aloe-Right to Left
SAD Disparity Map
Window Size: 15×15
Disparity Range: 0-70
AloeOccluded4,2
Aloe-Ground Truth
Occlusion Map
AloeSADLRC15x15DispRange=0-70Gray
Aloe-LRC Map
(threshold: 2)

MATLAB Code

% *************************************************************************
% Title: Function-Left/Right Consistency Check
% Author: Siddhant Ahuja
% Created: May 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Left Image (var: leftImage), Right Image (var: rightImage),
% Window Size (var: windowSize), Minimum Disparity (dispMin), Maximum
% Disparity (dispMax), Threshold for the check (var: thresh) typically 2.0
% Outputs: Disparity Map (var: dispMap), Time taken (var: timeTaken)
% Example Usage of Function: [dispMapLRC, timeTaken]=funcLRCCheck('TsukubaLeft.jpg', 'TsukubaRight.jpg', 9, 0, 16,2);
% *************************************************************************
function [dispMapLRC, timeTaken]=funcLRCCheck(leftImage, rightImage, windowSize, dispMin, dispMax, thresh)
% Initiate the Timer to calculate the time consumed.
tic;
% Perform SAD Correlation based matching (Right to Left)
[dispMapR2L, timeTakenR2L]=funcSADR2L(leftImage, rightImage, windowSize, dispMin, dispMax);
% Perform SAD Correlation based matching (Left to Right)
[dispMapL2R, timeTakenL2R]=funcSADL2R(leftImage, rightImage, windowSize,dispMin , dispMax);
% Prepare matrix for subtraction and scale it for comparison
dispMapL2R=-dispMapL2R;
% Find the size (columns and rows) of the L2R Disparity map and assign the rows to
% variable nrLRCCheck, and columns to variable ncLRCCheck
[nrLRCCheck,ncLRCCheck] = size(dispMapL2R);
% Create an image of size nrLRCCheck and ncLRCCheck, fill it with zeros and assign
% it to variable dispMapLRC
dispMapLRC=zeros(nrLRCCheck,ncLRCCheck);
% Find out how many rows and columns are to the left/right/up/down of the
% central pixel based on the window size
win=(windowSize-1)/2;
for(i=1:1:nrLRCCheck)
    for(j=1:1:ncLRCCheck)
        xl=j;
        xr=xl+dispMapL2R(i,xl);
        if (xr>ncLRCCheck||xr<1)
            dispMapLRC(i,j) = 0; %% occluded pixel
        else            
            xlp=xr+dispMapR2L(i,xr);
            if (abs(xl-xlp)<thresh)
                dispMapLRC(i,j) = -dispMapL2R(i,j);  %% non-occluded pixel            
            else
                dispMapLRC(i,j) = 0; %% occluded pixel                        
            end
        end
    end
end
% Terminate the Timer to calculate the time consumed.
timeTaken=toc;





Compute Variance Map of an Image

8 06 2009

Variance map of an image is calculated by taking a square window of a set size around a center pixel, and calculating the variance of the values of the pixels.

The variance within the window can be calculated from the following equation:
Variance Equation

Example

MonopolyLeftColorMonopoly-Left Image

MonopolyLeft Variance Map 5,9Monopoly-Left Image-Variance Map

(Pixels marked as black are low-texture regions)

MATLAB Code

% *************************************************************************
% Title: Function-Compute Variance map of the image
% Author: Siddhant Ahuja
% Created: May 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Input Image (var: inputImage), Window Size (var: windowSize),
% Threshold (var: thresh) Typical value is 140
% Outputs: Variance Map (var: varianceImg) , Time taken (var: timeTaken)
% Example Usage of Function: [varianceImg, timeTaken]=funcVarianceMap('MonopolyLeftColor.png', 5, 9);
% *************************************************************************
function [varianceImg, timeTaken]=funcVarianceMap(inputImage, windowSize, thresh)
try 
    % Grab the image information (metadata) of input image using the function imfinfo
    inputImageInfo=imfinfo(inputImage);
    if(getfield(inputImageInfo,'ColorType')=='truecolor')
    % Read an image using imread function, convert from RGB color space to
    % grayscale using rgb2gray function and assign it to variable inputImage
        inputImage=rgb2gray(imread(inputImage));
        % Convert the image from uint8 to double
        inputImage=double(inputImage);
    else if(getfield(inputImageInfo,'ColorType')=='grayscale')
    % If the image is already in grayscale, then just read it.        
            inputImage=imread(inputImage);
            % Convert the image from uint8 to double
            inputImage=double(inputImage);
        else
            error('The Color Type of Left Image is not acceptable. Acceptable color types are truecolor or grayscale.');
        end
    end
catch
    % if it is not an image but a variable
    inputImage=inputImage;
end
% Find the size (columns and rows) of the input image and assign the rows to
% variable nr, and columns to variable nc
[nr,nc] = size(inputImage);
% Check the size of window to see if it is an odd number.
if (mod(windowSize,2)==0)
    error('The window size must be an odd number.');
end
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable meanImg
meanImg=zeros(nr, nc);
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable varianceImg
varianceImg=zeros(nr, nc);
% Find out how many rows and columns are to the left/right/up/down of the
% central pixel based on the window size
win=(windowSize-1)/2;
tic; % Initialize the timer to calculate the time consumed.
% Compute a map of mean values
for(i=1+win:1:nr-win)
    for(j=1+win:1:nc-win)
        sum=0.0;
        for(a=-win:1:win)
            for(b=-win:1:win)
                sum=sum+inputImage(i+a,j+b);
            end
        end
        meanImg(i,j)=sum/(windowSize*windowSize);
    end
end
% Compute a map of variance values
for(i=1+win:1:nr-win)
    for(j=1+win:1:nc-win)
        sum=0.0;
        for(a=-win:1:win)
            for(b=-win:1:win)
                sum=sum+((inputImage(i+a,j+b)-meanImg(i,j))^2);
            end
        end         
        var=sum/((windowSize*windowSize)-1);
        % Apply threshold to produce a binarized variance map
        if (var > thresh)
            varianceImg(i,j) = 255;
        else
            varianceImg(i,j) = 0;
        end
    end
end
% Stop the timer to calculate the time consumed.
timeTaken=toc;





Finding Low-texture or textureless regions in images

2 06 2009

Stereo vision algorithms typically compute erroneous results in regions where there is a little or no texture in the scene. They are defined in [1] as regions where the squared horizontal intensity gradient averaged over a square window of a given size is below a given threshold.

Example

Bowling1-Left ImageBowling1-Left Image

Bowling1-Left Image-Textureless MapBowling1-Left Image-Textureless Map

(Pixels marked as white are low-texture regions)

MATLAB Code

% *************************************************************************
% Title: Function-Find Textureless regions of an image
% Notes: Textureless regions are defined as regions where the squared horizontal
% intensity gradient averaged over a square window of a given size 
% (windowSize) is below a given threshold (thresh);
% Author: Siddhant Ahuja
% Created: May 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Input Image (var: inputImage), Window Size (var: windowSize),
% Threshold (var: thresh) Typical value is 4
% Outputs: Textureless Map (var: texturelessImg) , Time taken (var: timeTaken)
% Example Usage of Function: [texturelessImg, timeTaken]=funcTexturelessRegions('imL.png', 9, 4);
% *************************************************************************
function [texturelessImg, timeTaken]=funcTexturelessRegions(inputImage, windowSize, thresh)
% Read the input image
try
    % Read an image using imread function
    inputImage=imread(inputImage);
    % grab the number of rows, columns, and channels
    [nr, nc, nChannels]=size(inputImage);
     % Grab the image information (metadata) of input image using the function imfinfo
    inputImageInfo=imfinfo(inputImage);
    % Determine if input left image is already in grayscale or color
    if(getfield(inputImageInfo,'ColorType')=='truecolor')
        colored=1;
    else if(getfield(inputImageInfo,'ColorType')=='grayscale')
        colored=0;
        else
        error('The Color Type of Left Image is not acceptable. Acceptable color types are truecolor or grayscale.');
        end
    end
catch
    % if it is not an image but a variable
    % grab the number of channels
    [nr, nc, nChannels]=size(inputImage);
    if(nChannels)>1
        colored=1;
    else
        colored=0;
    end
end
% Check the size of window to see if it is an odd number.
if (mod(windowSize,2)==0)
    error('The window size must be an odd number.');
end
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable texturelessImg
texturelessImg=zeros(nr, nc);
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable sqGradImg
sqGradImg=zeros(nr,nc);
% Find out how many rows and columns are to the left/right/up/down of the
% central pixel based on the window size
win=(windowSize-1)/2;
inputImage=double(inputImage);
tic; % Initialize the timer to calculate the time consumed.
% Produce Squared Horizontal Gradient image sqGradImg
for (i=1:1:nr)
    for (j=1:1:nc-1)
        sum = 0.0;        
        for (k=1:1:nChannels)
            diff =  inputImage(i,j,k) - inputImage(i,j+1,k);
            sum = sum + (diff*diff); 
        end
        sum = sum / nChannels;
        sqGradImg(i,j+1) = sum;
        if (j==1)
            sqGradImg(i,j) = sum;
        end
        if (sum > sqGradImg(i,j))
            sqGradImg(i,j) = sum;
        end
    end
end
% Compute average within predefined box window of size windowSize x
% windowSize
for (i=1+win:nr-win)
    for (j=1+win:nc-win)       
        % go over the square window
        sum = 0.0;
        avg = 0.0;
        for (a=-win:1:win)
            for (b=-win:1:win)                
                sum = sum + sqGradImg(i+a,j+b);
            end
        end           
        % Compute the average
        avg = sum / (windowSize*windowSize);
        % Apply threshold
        if (avg < (thresh*thresh))
            texturelessImg(i,j) = 255; % mark detected textureless pixel as white
        end
    end
end
% Stop the timer to calculate the time consumed.
timeTaken=toc;

References

[1] D. Scharstein and R. Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” International Journal of Computer Vision, vol. 47(1/2/3), pp. 7-42, Apr. 2002.





Proof of Lens Law

2 06 2009

Following graphic illustrates a simple lens model:

Simple Lens Model

where,

h= height of the object

h’= height of the object projected in an image

G and C = focal points

f= focal distance

u= Distance between the object and the focal point

O= Centre of the lens

v= Distance between the centre of the lens and image plane



Assumptions

  1. Lens is very thin
  2. Optical axis is perpendicular to image plane



To Prove

1/f=1/u + 1/v



Proof

In ΔAHO, tanα=h/u

In ΔEDO, tanα=h’/v

∴ tanα=h/u=h’/v

⇒ h’/h=v/u ————- (1)

In ΔBOC, tanβ=h/f

In ΔEDC, tanβ=h’/(v-f)

∴ tanβ=h’/(v-f)=h/f

⇒ h’/h=(v-f)/f ———- (2)

Equating (1) and (2),

v/u=(v-f)/f

⇒ v/u=v/f -1

Dividing both sides by v,

1/u=1/f – 1/v

or,

1/f=1/u + 1/v

Hence proved.



Notes

h’/h is often referred to as the Magnification factor M. If M is negative, the projected image is real but inverted.





Finding Depth Discontinuous regions in stereo-images

31 05 2009

Stereo vision algorithms typically compute erroneous results in regions where there is a sudden change in the depth between objects in the scene. They are defined in [1] as regions where neighboring disparities differ by more than a certain gap, dilated by a window of a given width.

Example

Flowerpots Left ImageFlowerpots-Left Image

Flowerpots Right ImageFlowerpots-Right Image

Flowerpots, Ground Truth Left-to-RightFlowerpots-Ground Truth

Disparity Map=Left-to-Right Image

Flowerpots, Ground Truth Right-to-LeftFlowerpots-Ground Truth

Disparity Map-Right-to-Left Image

Flowerpots-Depth-DiscontinuousFlowerpots-Depth-Discontinuous Regions

(Pixels marked as white are depth-discontinuous)

MATLAB Code

% *************************************************************************
% Title: Function-Find Discontinuous regions of an image
% Notes: 
% 1. According to paper [A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms], Depth Discontinuous regions are defined as pixels whose neighboring
% disparities differ by more than a certain gap, dilated
% by a window of width windowSize.
% 2. According to paper [ An Experimental Comparison of Stereo Algorithms], A pixel is a depth 
% discontinuity if any of its (4-connected) neighbors has a disparity that differs by more than 1 from 
% its disparity. Neighboring pixels that are part of a sloped surface can easily differ by 1 pixel, but
% should not be counted as discontinuities.
% Author: Siddhant Ahuja
% Created: May 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Input Image-Depth Map (var: inputImage), Window Size (var: windowSize),
% Threshold (var: thresh) Typical value is 2
% Outputs: Discontinuous Map (var: discontinuousImg) , Time taken (var: timeTaken)
% Example Usage of Function: [discontinuousImg, timeTaken]=funcDiscontinuousRegions('disp_l_r.png', 9, 2);
% *************************************************************************
function [discontinuousImg, timeTaken]=funcDiscontinuousRegions(inputImage, windowSize, thresh)
% Read the input image
try
    % Read an image using imread function
    inputImage=imread(inputImage);
    % grab the number of rows, columns, and channels
    [nr, nc, nChannels]=size(inputImage);
     % Grab the image information (metadata) of input image using the function imfinfo
    inputImageInfo=imfinfo(inputImage);
    % Determine if input left image is already in grayscale or color
    if(getfield(inputImageInfo,'ColorType')=='truecolor')
        inputImage=rgb2gray(inputImage);
    else if(getfield(inputImageInfo,'ColorType')=='grayscale')
        inputImage=inputImage;
        else
        error('The Color Type of Left Image is not acceptable. Acceptable color types are truecolor or grayscale.');
        end
    end
catch
    % if it is not an image but a variable
    % grab the number of channels
    [nr, nc, nChannels]=size(inputImage);
    if(nChannels)>1
        inputImage=rgb2gray(inputImage);
    else
        inputImage=inputImage;
    end
end
% Check the size of window to see if it is an odd number.
if (mod(windowSize,2)==0)
    error('The window size must be an odd number.');
end
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable discontinuousImg
discontinuousImg=zeros(nr, nc);
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable edgeImg
edgeImg=zeros(nr, nc);
% Find out how many rows and columns are to the left/right/up/down of the
% central pixel based on the window size
win=(windowSize-1)/2;
inputImage=double(inputImage);
tic; % Initialize the timer to calculate the time consumed.
% Find variation/edges in disparity values in horizontal and vertical directions
for (i=1:1:nr-1)
    for (j=1:1:nc-1)
        % Traverse in horizontal direction
        if (abs(inputImage(i,j)-inputImage(i,j+1)) > thresh )
            edgeImg(i,j) = 255;
            edgeImg(i,j+1) = 255;
        end
        % Traverse in vertical direction
        if (abs(inputImage(i,j)-inputImage(i+1,j)) > thresh )
            edgeImg(i,j) = 255;
            edgeImg(i+1,j) = 255;
        end
    end
end
% Dilate within the window
for (i=1+win:nr-win)
    for (j=1+win:nc-win)       
        % Go over the square window
        sum = 0.0;
        for (a=-win:1:win)
            for (b=-win:1:win)                
                sum = sum + edgeImg(i+a,j+b);
            end
        end  
        % Apply Threshold
        if (sum>0)
            discontinuousImg(i,j) = 255;
        else
            discontinuousImg(i,j) = 0;
        end
    end
end
% Stop the timer to calculate the time consumed.
timeTaken=toc;

References

[1] D. Scharstein and R. Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” International Journal of Computer Vision, vol. 47(1/2/3), pp. 7-42, Apr. 2002.





Finding regions that are occluded in left and right images

30 05 2009

Stereo vision algorithms typically compute erroneous results when the objects in the scene are fully occluded or half occluded. They are defined in [1] as regions where the left-to-right disparity lands at a location with a larger disparity i.e. regions that are visible in one image, and not in the other image.

Example

Aloe Left Image

Aloe-Left Image

Aloe Right Image

Aloe-Right Image

Aloe Ground Truth Disparity Left-to-Right

Aloe-Ground Truth

Disparity Map (Left-to-Right)

Aloe Ground Truth Disparity Right-to-Left

Aloe-Ground Truth

Disparity Map (Right-to-Left)

AloeOccluded4,2

Aloe-Occluded Regions

(Pixels marked as black are occluded)

MATLAB Code

% *************************************************************************
% Title: Function-Find Occluded regions of an image
% Notes: Occluded regions are defined as regions that are occluded in the
% matching image, i.e., where the forward-mapped disparity
% lands at a location with a larger (nearer) disparity.
% Author: Siddhant Ahuja
% Created: September 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Left to Right Disparity Image (var: dispMapL2R), Right to Left Disparity Image (var: dispMapR2L),
% Scale factor of ground truth map (var: scale) typically 4.0, Threshold for the check (var: thresh) typically 2.0
% Outputs: Disparity Map (var: dispMap), Time taken (var: timeTaken)
% Example Usage of Function: [occludedImg, timeTaken]=funcOccludedRegions('disp_l_r.png', 'disp_r_l.png', 4, 2);
% *************************************************************************
function [occludedImg, timeTaken]=funcOccludedRegions(dispMapL2R, dispMapR2L, scale, thresh)
% Initiate the Timer to calculate the time consumed.
tic;
dispMapL2R=imread(dispMapL2R);
dispMapR2L=imread(dispMapR2L);
dispMapL2R = floor(double (dispMapL2R)/scale);
dispMapR2L = floor(double (dispMapR2L)/scale);
% Prepare matrix for subtraction and scale it for comparison
dispMapL2R=-dispMapL2R;
% Find the size (columns and rows) of the L2R Disparity map and assign the rows to
% variable nrLRCCheck, and columns to variable ncLRCCheck
[nrLRCCheck,ncLRCCheck] = size(dispMapL2R);
% Create an image of size nrLRCCheck and ncLRCCheck, fill it with zeros and assign
% it to variable occludedImg
occludedImg=zeros(nrLRCCheck,ncLRCCheck);
for(i=1:1:nrLRCCheck)
    for(j=1:1:ncLRCCheck)
        xl=j;
        xr=xl+dispMapL2R(i,xl);
        if (xr>ncLRCCheck||xr<1)
            occludedImg(i,j) = 0; %% occluded pixel
        else            
            xlp=xr+dispMapR2L(i,xr);
            if (abs(xl-xlp)<thresh)
                occludedImg(i,j) = 255;  %% non-occluded pixel            
            else
                occludedImg(i,j) = 0; %% occluded pixel                        
            end
        end
    end
end
% Terminate the Timer to calculate the time consumed.
timeTaken=toc;

References

[1] D. Scharstein and R. Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” International Journal of Computer Vision, vol. 47(1/2/3), pp. 7-42, Apr. 2002.





Correlation based similarity measures-Summary

30 05 2009

Correlation based matching typically produces dense depth maps by calculating the disparity at each pixel within a neighborhood. This is achieved by taking a square window of certain size around the pixel of interest in the reference image and finding the homologous pixel within the window in the target image, while moving along the corresponding scanline. The goal is to find the corresponding (correlated) pixel within a certain disparity range d (d E [0,….,dmax]) that minimizes the associated error and maximizes the similarity.

In brief, the matching process involves computation of the similarity measure for each disparity value, followed by an aggregation and optimization step. Since these steps consume a lot of processing power, there are significant speed-performance advantages to be had in optimizing the matching algorithm.

The images can be matched by taking either left image as the reference (left-to-right matching, also known as direct matching) or right image as the reference (right-to-left matching, also known as reverse matching) [2].

Similarity Measure Formula
Sum of Absolute Differences (SAD)  Sum of Absolute Differences
Sum of Squared Differences (SSD)  Sum of Squared Differences
Normalized Cross Correlation (NCC)  Normalized Cross Correlation
Sum of Hamming Distances (SHD)  Sum of Hamming Distances

Sum of Absolute Differences (SAD) is one of the simplest of the similarity measures which is calculated by subtracting pixels within a square neighborhood between the reference image I1 and the target image I2 followed by the aggregation of absolute differences within the square window, and optimization with the winner-take-all (WTA) strategy [1]. If the left and right images exactly match, the resultant will be zero.

In Sum of Squared Differences (SSD), the differences are squared and aggregated within a square window and later optimized by WTA strategy. This measure has a higher computational complexity compared to SAD algorithm as it involves numerous multiplication operations.

Normalized Cross Correlation is even more complex to both SAD and SSD algorithms as it involves numerous multiplication, division and square root operations.

Sum of Hamming Distances is normally employed for matching census-transformed images (can be used on images that have not been census transformed) by computing bitwise-XOR of the values in left and right images, within a square window. This step is usually followed by a bit-counting operation which results in the final Hamming distance score.

Example: Tsukuba

TsukubaLeftLeft Image TsukubaRightRight Image
TsukubaSAD9x9DispRange=0-16ColorSAD Disparity Map TsukubaSSD9x9DispRange=0-16ColorSSD Disparity Map
TsukubaNCC9x9DispRange=0-16Color

NCC Disparity Map

TsukubaSHD9x9DispRange=0-16Color

SHD Disparity Map

TsukubaGroundTruthColor

Ground Truth Disparity Map

Disparity Range: 0-16 ,Window Size: 9×9, The hotter the color, the closer it is; the cooler the color the farther it is.

For MATLAB Codes, please visit:

1. Sum of Absolute Differences
2. Sum of Squared Differences
3. Normalized Cross Correlation
4. Sum of Hamming Distances

References

1. T. Kanade, H. Kano, and S. Kimura, “Development of a video-rate stereo machine,” in Image UnderstandingWorkshop, Monterey,CA, 1994, p. 549–557.
2. D. Scharstein and R. Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” International Journal of Computer Vision, vol. 47(1/2/3), pp. 7-42, Apr. 2002.





Correlation based similarity measure-Sum of Hamming Distances (SHD)

30 05 2009

Example 1: Tsukuba

TsukubaLeftLeft Image TsukubaRightRight Image
TsukubaSHD9x9DispRange=0-16ColorSHD Disparity Map
Disparity Range: 0-16
Window Size: 9×9
The hotter the color, the closer it is; the cooler the color the farther it is.

Example 2: Stereogram

StereogramLeftLeft Image StereogramRightRight Image
StereogramSHD9x9DispRange=0-16ColorSHD Disparity Map
Disparity Range: 0-16
Window Size: 9×9
The hotter the color, the closer it is; the cooler the color the farther it is.

MATLAB Code-SHD Right to Left Matching

% *************************************************************************
% Title: Function-Compute Correlation between two images using the 
% similarity measure of Sum of Hamming Distances (SHD) with Right Image 
% as reference.
% Author: Siddhant Ahuja
% Created: May 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Left Image (var: leftImage), Right Image (var: rightImage),
% Window Size (var: windowSize), Minimum Disparity (dispMin), Maximum
% Disparity (dispMax)
% Outputs: Disparity Map (var: dispMap), Time taken (var: timeTaken)
% Example Usage of Function: [dispMap, timeTaken]=funcSHDR2L('TsukubaLeft.jpg', 'TsukubaRight.jpg', 9, 0, 16);
% *************************************************************************
function [dispMap, timeTaken]=funcSHDR2L(leftImage, rightImage, windowSize, dispMin, dispMax)
try 
    % Grab the image information (metadata) of left image using the function imfinfo
    leftImageInfo=imfinfo(leftImage);
    % Since SHDR2L is applied on a grayscale image, determine if the
    % input left image is already in grayscale or color
    if(getfield(leftImageInfo,'ColorType')=='truecolor')
    % Read an image using imread function, convert from RGB color space to
    % grayscale using rgb2gray function and assign it to variable leftImage
        leftImage=rgb2gray(imread(leftImage));
    else if(getfield(leftImageInfo,'ColorType')=='grayscale')
    % If the image is already in grayscale, then just read it.        
            leftImage=imread(leftImage);
        else
            error('The Color Type of Left Image is not acceptable. Acceptable color types are truecolor or grayscale.');
        end
    end
catch
    % if it is not an image but a variable
    leftImage=leftImage;
end
try
    % Grab the image information (metadata) of right image using the function imfinfo
    rightImageInfo=imfinfo(rightImage);
    % Since SHDR2L is applied on a grayscale image, determine if the
    % input right image is already in grayscale or color
    if(getfield(rightImageInfo,'ColorType')=='truecolor')
    % Read an image using imread function, convert from RGB color space to
    % grayscale using rgb2gray function and assign it to variable rightImage
        rightImage=rgb2gray(imread(rightImage));
    else if(getfield(rightImageInfo,'ColorType')=='grayscale')
    % If the image is already in grayscale, then just read it.        
            rightImage=imread(rightImage);
        else
            error('The Color Type of Right Image is not acceptable. Acceptable color types are truecolor or grayscale.');
        end
    end
catch
    % if it is not an image but a variable
    rightImage=rightImage;
end
% Find the size (columns and rows) of the left image and assign the rows to
% variable nrLeft, and columns to variable ncLeft
[nrLeft,ncLeft] = size(leftImage);
% Find the size (columns and rows) of the right image and assign the rows to
% variable nrRight, and columns to variable ncRight
[nrRight,ncRight] = size(rightImage);
% Check to see if both the left and right images have same number of rows
% and columns
if(nrLeft==nrRight && ncLeft==ncRight)
else
    error('Both left and right images should have the same number of rows and columns');
end
% Check the size of window to see if it is an odd number.
if (mod(windowSize,2)==0)
    error('The window size must be an odd number.');
end
% Check whether minimum disparity is less than the maximum disparity.
if (dispMin>dispMax)
    error('Minimum Disparity must be less than the Maximum disparity.');
end
% Create an image of size nrLeft and ncLeft, fill it with zeros and assign
% it to variable dispMap
dispMap=zeros(nrLeft, ncLeft);
% Find out how many rows and columns are to the left/right/up/down of the
% central pixel based on the window size
win=(windowSize-1)/2;
numberOfBits=8;
tic; % Initialize the timer to calculate the time consumed.
for(i=1+win:1:nrLeft-win)
    for(j=1+win:1:ncLeft-win-dispMax)
        min=0;
        position=0;
        rightWindow=rightImage(i-win:i+win, j-win:j+win);
        for(dispRange=dispMin:1:dispMax)
            if (j+win+dispRange <= ncLeft)
                leftWindow=leftImage(i-win:i+win, j-win+dispRange:j+win+dispRange);
                bloc3=bitxor(rightWindow,leftWindow);
                distance=uint8(zeros(windowSize,windowSize));
                for (k=1:1:numberOfBits)
                    distance=distance+bitget(bloc3,k);
                end
                dif=sum(sum(distance));
                if (dispRange==0)
                    min=dif;
                elseif (min>dif)
                    min=dif;
                    position=dispRange;
                end
            end 
        end
        dispMap(i,j) = position;
    end
end
% Stop the timer to calculate the time consumed.
timeTaken=toc;