Working with ROS and OpenCV

20 07 2011

The goal of this tutorial is to get you up and running with ROS (Robot Operating System) environment and open source computer vision library OpenCV as fast as possible.

My environment

ROS supports many camera drivers which let us control the camera parameters like resolution, frame rate etc., and publish raw images for further processing by the code. Here, i will be using uvc_cam USB camera driver written by Eric Perko.

We will:

  1. Launch the uvc_cam driver to start publishing the images.
  2. Subscribe to these images using image_transport
  3. Convert the images from ROS image message format to OpenCV image format using cv_bridge
  4. Process them using OpenCV commands
  5. Convert the images from OpenCV image message format to ROS image message using cv_bridge
  6. Publish the processed images using image_transport

Software for Testing Webcam

Before moving ahead, it is important to make sure whether your webcam works or not in linux. There are some open-source programs like:

  1. Cheeseuses the gstreamer library, which utlilizes the video4linux2 API to capture video and stills from a webcam. It can also apply some special effects. Type
    $ sudo apt-get install cheese

    to install it.

  2. GUVCView is a graphical front-end for UVC drivers built using GTK+. It is way better than cheesefor controlling webcam and recording audio/video. Type
    $ sudo apt-get install guvcview

    to install it.

Install Camera Drivers to work with ROS

  1. Open the terminal window, go to the directory where you would like to install the camera driver package. In my computer, I installed it in the home directory (go there by typing cd in the terminal window).
  2. Download the uvc_campackage:
    $ git clone https://github.com/ericperko/uvc_cam.git

    This will create a directory called uvc_cam.

  3. In order to compile and make/build this package you would need elevated permissions. Type:
    $ sudo bash

    and then enter the password. You should now have root access.

  4. Before making the package you need to tell ROS where to find it. Type:
    # cd
    # gedit .bashrc

    A text editor should open up. Next add the following line at the end:

    export ROS_PACKAGE_PATH=~/uvc_cam:$ROS_PACKAGE_PATH

    Save and Exit the text editor.

  5. Restart the terminal, get root access again and type:
    # rosmake --rosdep-install uvc_cam

    In the rosmake results you should notice:

    [ rosmake ] Results:
    [ rosmake ] Built 49 packages with 0 failures.

    If you notice some failures, you may not have root access.

Find which device is the camera connected to.

  1. Open Terminal Window
  2. Disconnect the camera from the computer and type the following in the terminal:
    $ ls /dev/video*

    If you dont have any other cameras connected to your computer you would see no output, else you would see something like

    /dev/video0

    Remember the number of devices listed in this step.

  3. Now connect the camera to your computer and repeat step 2. You should see a new device added to the list like:
    /dev/video0

    So, the camera is connected to /dev/video0. We will used this information in the next section.

Create a new package for image processing

  1. Open a new terminal window and in the home directory, type the following:
    $ roscreate-pkg tutorialROSOpenCV image_transport roscpp std_msgs opencv2 cv_bridge uvc_cam

    This will create a new directory called tutorialROSOpenCV.

  2. We will now create a launch directory where we would store the file used to launch the uvc_camnode in order for it to start publishing the images. Change directory:
    $ cd tutorialROSOpenCV

    Now make a new directory under it called launch:

    $ mkdir launch

    This is where we will store the launch file.

  3. Change directory:
    $ cd launch

    Create the new launch file by typing:

    $ gedit uvcCamLaunch.launch

    Paste the following in there:

    <launch>
        <node name="uvc_cam_node" pkg="uvc_cam" type="uvc_cam_node" output="screen">
            <param name="device" value="/dev/video0" />
            <param name="width" value="320" />
            <param name="height" value="240" />
            <param name="frame_rate" value="20" />
        </node>
    </launch>
    

    Save and exit. Notice the parameter device with value: /dev/video0. Change this to the actual device path. Refer to previous section.

  4. Before making this package you need to tell ROS where to find it. Gain root access in the terminal and type:
    # cd
    # gedit .bashrc

    A text editor should open up. Next add the following line at the end:

    export ROS_PACKAGE_PATH=~/tutorialROSOpenCV:$ROS_PACKAGE_PATH

    Save and Exit the text editor.

  5. Restart the terminal, gain root access, and type:
    # rosmake tutorialROSOpenCV

    In the rosmake results you should notice:

    [ rosmake ] Results:
    [ rosmake ] Built 50 packages with 0 failures.

    If you notice some failures, you may not have root access.

Test the Launch script
In order to make sure that the package uvc_cam is in fact publishing the images, we will now run the node and check the output.

  1. In a new terminal type:
    $ roslaunch tutorialROSOpenCV uvcCamLaunch.launch

    If your webcam has an integrated light indicator, you should see it light up.

  2. Now we will make sure that the images are being published. Open up another terminal window and type:
    $ rostopic list

    You should see the following topics being published:

    /camera/camera_info
    /camera/image_raw
    /camera/image_raw/compressed
    /camera/image_raw/compressed/parameter_descriptions
    /camera/image_raw/compressed/parameter_updates
    /camera/image_raw/theora
    /camera/image_raw/theora/parameter_descriptions
    /camera/image_raw/theora/parameter_updates
    /rosout
    /rosout_agg
    /uvc_cam_node/parameter_descriptions
    /uvc_cam_node/parameter_updates
  3. Lets look at the topic /camera/image_raw. Type:
    $ rosrun image_view image_view image:=/camera/image_raw

    You should see a new window pop-up with a continuous streaming video capture output of the webcam. Press Ctrl-Cto exit. Lets look at the format of this topic. Type

    $ rostopic info /camera/image_raw

    You should see something like this:

    Type: sensor_msgs/Image
    
    Publishers:
     * /uvc_cam_node (http://username-computername:port/)
    
    Subscribers: None

    sensor_msgs/Image is a ROS image format which contains the header, height, width, step, and image data. Refer to: http://www.ros.org/doc/api/sensor_msgs/html/msg/Image.html for more information.

  4. Lets look at the topic /camera/camera_info. Type:
    $ rostopic info /camera/camera_info

    You should see something like this:

    Type: sensor_msgs/CameraInfo
    
    Publishers:
     * /uvc_cam_node (http://username-computername:port/)
    
    Subscribers: None

    sensor_msgs/CameraInfo is a ROS information format which contains the header, height, width, distortion model params, binning params and ROI (region of interest) params. Refer to: http://www.ros.org/doc/api/sensor_msgs/html/msg/CameraInfo.htmlfor more info. You can view the contents of this topic by typing:

    rostopic echo -c /camera/camera_info

Write the code for image processing

  1. In a new terminal window, type the following:
    $ roscd tutorialROSOpenCV/src

    We will now create the source file which will contain the code. Type:

    $ gedit main.cpp

    The following program inverts the colors of the image captured from the webcam. Paste the following code in the file:

    //Includes all the headers necessary to use the most common public pieces of the ROS system.
    #include <ros/ros.h>
    //Use image_transport for publishing and subscribing to images in ROS
    #include <image_transport/image_transport.h>
    //Use cv_bridge to convert between ROS and OpenCV Image formats
    #include <cv_bridge/cv_bridge.h>
    //Include some useful constants for image encoding. Refer to: http://www.ros.org/doc/api/sensor_msgs/html/namespacesensor__msgs_1_1image__encodings.html for more info.
    #include <sensor_msgs/image_encodings.h>
    //Include headers for OpenCV Image processing
    #include <opencv2/imgproc/imgproc.hpp>
    //Include headers for OpenCV GUI handling
    #include <opencv2/highgui/highgui.hpp>
    
    //Store all constants for image encodings in the enc namespace to be used later.
    namespace enc = sensor_msgs::image_encodings;
    
    //Declare a string with the name of the window that we will create using OpenCV where processed images will be displayed.
    static const char WINDOW[] = "Image Processed";
    
    //Use method of ImageTransport to create image publisher
    image_transport::Publisher pub;
    
    //This function is called everytime a new image is published
    void imageCallback(const sensor_msgs::ImageConstPtr& original_image)
    {
    	//Convert from the ROS image message to a CvImage suitable for working with OpenCV for processing
    	cv_bridge::CvImagePtr cv_ptr;
    	try
    	{
    		//Always copy, returning a mutable CvImage
    		//OpenCV expects color images to use BGR channel order.
    		cv_ptr = cv_bridge::toCvCopy(original_image, enc::BGR8);
    	}
    	catch (cv_bridge::Exception& e)
    	{
    		//if there is an error during conversion, display it
    		ROS_ERROR("tutorialROSOpenCV::main.cpp::cv_bridge exception: %s", e.what());
    		return;
    	}
    
    	//Invert Image
    	//Go through all the rows
    	for(int i=0; i<cv_ptr->image.rows; i++)
    	{
    		//Go through all the columns
    		for(int j=0; j<cv_ptr->image.cols; j++)
    		{
    			//Go through all the channels (b, g, r)
    			for(int k=0; k<cv_ptr->image.channels(); k++)
    			{
    				//Invert the image by subtracting image data from 255				
    				cv_ptr->image.data[i*cv_ptr->image.rows*4+j*3 + k] = 255-cv_ptr->image.data[i*cv_ptr->image.rows*4+j*3 + k];
    			}
    		}
    	}
    	
    
    	//Display the image using OpenCV
    	cv::imshow(WINDOW, cv_ptr->image);
    	//Add some delay in miliseconds. The function only works if there is at least one HighGUI window created and the window is active. If there are several HighGUI windows, any of them can be active.
    	cv::waitKey(3);
    	/**
    	* The publish() function is how you send messages. The parameter
    	* is the message object. The type of this object must agree with the type
    	* given as a template parameter to the advertise<>() call, as was done
    	* in the constructor in main().
    	*/
    	//Convert the CvImage to a ROS image message and publish it on the "camera/image_processed" topic.
            pub.publish(cv_ptr->toImageMsg());
    }
    
    /**
    * This tutorial demonstrates simple image conversion between ROS image message and OpenCV formats and image processing
    */
    int main(int argc, char **argv)
    {
    	/**
    	* The ros::init() function needs to see argc and argv so that it can perform
    	* any ROS arguments and name remapping that were provided at the command line. For programmatic
    	* remappings you can use a different version of init() which takes remappings
    	* directly, but for most command-line programs, passing argc and argv is the easiest
    	* way to do it.  The third argument to init() is the name of the node. Node names must be unique in a running system.
    	* The name used here must be a base name, ie. it cannot have a / in it.
    	* You must call one of the versions of ros::init() before using any other
    	* part of the ROS system.
    	*/
            ros::init(argc, argv, "image_processor");
    	/**
    	* NodeHandle is the main access point to communications with the ROS system.
    	* The first NodeHandle constructed will fully initialize this node, and the last
    	* NodeHandle destructed will close down the node.
    	*/
            ros::NodeHandle nh;
    	//Create an ImageTransport instance, initializing it with our NodeHandle.
            image_transport::ImageTransport it(nh);
    	//OpenCV HighGUI call to create a display window on start-up.
    	cv::namedWindow(WINDOW, CV_WINDOW_AUTOSIZE);
    	/**
    	* Subscribe to the "camera/image_raw" base topic. The actual ROS topic subscribed to depends on which transport is used. 
    	* In the default case, "raw" transport, the topic is in fact "camera/image_raw" with type sensor_msgs/Image. ROS will call 
    	* the "imageCallback" function whenever a new image arrives. The 2nd argument is the queue size.
    	* subscribe() returns an image_transport::Subscriber object, that you must hold on to until you want to unsubscribe. 
    	* When the Subscriber object is destructed, it will automatically unsubscribe from the "camera/image_raw" base topic.
    	*/
            image_transport::Subscriber sub = it.subscribe("camera/image_raw", 1, imageCallback);
    	//OpenCV HighGUI call to destroy a display window on shut-down.
    	cv::destroyWindow(WINDOW);
    	/**
    	* The advertise() function is how you tell ROS that you want to
    	* publish on a given topic name. This invokes a call to the ROS
    	* master node, which keeps a registry of who is publishing and who
    	* is subscribing. After this advertise() call is made, the master
    	* node will notify anyone who is trying to subscribe to this topic name,
    	* and they will in turn negotiate a peer-to-peer connection with this
    	* node.  advertise() returns a Publisher object which allows you to
    	* publish messages on that topic through a call to publish().  Once
    	* all copies of the returned Publisher object are destroyed, the topic
    	* will be automatically unadvertised.
    	*
    	* The second parameter to advertise() is the size of the message queue
    	* used for publishing messages.  If messages are published more quickly
    	* than we can send them, the number here specifies how many messages to
    	* buffer up before throwing some away.
    	*/
            pub = it.advertise("camera/image_processed", 1);
    	/**
    	* In this application all user callbacks will be called from within the ros::spin() call. 
    	* ros::spin() will not return until the node has been shutdown, either through a call 
    	* to ros::shutdown() or a Ctrl-C.
    	*/
            ros::spin();
    	//ROS_INFO is the replacement for printf/cout.
    	ROS_INFO("tutorialROSOpenCV::main.cpp::No error.");
    
    }

    Save and Exit.

  2. Before we make the program, we have to edit the CMakeLists.txtfile to ensure it will build and make an executable file of the program above. Gain root access, and type:
    # roscd tutorialROSOpenCV/

    Then we will edit the CMakeLists file. Type:

    # gedit CMakeLists.txt

    At the end of the file add the following line:

    rosbuild_add_executable(tutorialROSOpenCV src/main.cpp)

    Save and Exit.

  3. Now we can compile/make the package. Type:
    # rosmake tutorialROSOpenCV

    You should notice the following:

    [ rosmake ] Results:
    [ rosmake ] Built 50 packages with 0 failures.
    [ rosmake ] Summary output to directory
  4. To run the project, type:
    # rosrun tutorialROSOpenCV tutorialROSOpenCV

    You should see a window being created titled Image Processed containing the inverted image from the webcam

  5. In order to ensure that the processed image has been published as a rostopic, in a new terminal window type:
    # rostopic list

    You would see the following:

    /camera/camera_info
    /camera/image_processed
    /camera/image_processed/compressed
    /camera/image_processed/compressed/parameter_descriptions
    /camera/image_processed/compressed/parameter_updates
    /camera/image_processed/theora
    /camera/image_processed/theora/parameter_descriptions
    /camera/image_processed/theora/parameter_updates
    /camera/image_raw
    /camera/image_raw/compressed
    /camera/image_raw/compressed/parameter_descriptions
    /camera/image_raw/compressed/parameter_updates
    /camera/image_raw/theora
    /camera/image_raw/theora/parameter_descriptions
    /camera/image_raw/theora/parameter_updates
    /rosout
    /rosout_agg
    /uvc_cam_node/parameter_descriptions
    /uvc_cam_node/parameter_updates

    To check the image data in rostopic /camera/image_processed, type:

    $ rosrun image_view image_view image:=/camera/image_processed

    You should see a new window pop-up with a continuous streaming video capture output of the webcam with image inverted. Press Ctrl-C to exit.





Getting Started with OpenCV 2.3 in Microsoft Visual Studio 2010 in Windows 7 64-bit

18 07 2011

Prepared by Siddhant Ahuja

I was trying to install OpenCV 2.3 on my Windows 7 64-bit machine and i kept getting the error: The application was unable to start correctly (0xc000007b)

I tried chasing the problem down using freely available program Dependency Walker and it looked like my executable project was x86 whereas all the dll files it was loading up were x64. So after many tries including re-building OpenCV using CMake, changing dll links to x86, etc., i finally figured out how to solve this problem, which i have presented as a tutorial below.

This tutorial assumes that you have installed Microsoft Visual Studio 2010 Professional.  If you are a student you can get it for free at https://www.dreamspark.com/Products/Product.aspx?ProductId=25.

This tutorial contains the following:

  • How to download and install the OpenCV library
  • How to manually change the system path to include the bin file
  • How to create a project
  • How to configure Visual Studio to use OpenCV
  • A sample program that displays a video from a camera

Downloading and Installing OpenCV

OpenCV is a computer vision library written mainly in C that focuses on real time video processing.  It is open source or free to the public.

  1. Go to http://sourceforge.net/projects/opencvlibrary/files/opencv-win/2.3/
  2. Select the OpenCV-2.3.0-win-superpack.exe file and download it.
  3. Go the folder where you downloaded the executable file in Step 2 and select “Run as Administrator”.
  4. The executable file is essentially an archive of files with an extractor built in. It will ask you to select the location where you would like it to extract all its contents. Select: C:\Program Files\ as the path and click Extract. It will create a folder called OpenCV2.3 with the path: C:\Program Files\OpenCV2.3

Manually Changing the System Path to Include the Bin File

  1. To access the system path in Vista go to Control Panel\System and Security\System and on the left hand side select Advanced system settings this should bring up the Systems Properties dialogue.  Click on the Advanced tab and then the Environment Variables button.
  2. This will bring up the Environment Variables dialogue.  In the System Variables box select Path and then Edit
  3. When modifying the path know that it is sensitive to all characters including spaces.  To add the new bin, type a semicolon directly after the text that is there without adding in a space.  Next add the path to the bin file.  The path is the same as where you chose to install OpenCV back in step 4 of Downloading and Installing OpenCV section plus \bin. For example if you chose to install OpenCV in:

    C:\Program Files\OpenCV2.3

    For 64 bit Windows, add the following to the system path:

    ;C:\Program Files\OpenCV2.3\build\bin\;C:\Program Files\OpenCV2.3\build\x64\vc10\bin\

    Make sure to restart your computer so the changes can take effect.

Creating a Project

  1. Click FileNewProject
  2. In the “Installed Templates”  box choose Visual C++Win32
  3. On the right choose Win32 Console Application
  4. Enter a name for the project and click OK
  5. Click finish in the dialogue box which appears

Configuring Visual Studio

  1. Click ProjectProperties to access the properties dialog
  2. In the left box click on Configuration Properties and on the top right click on Configuration Manager

  3. In the Configuration Manager dialog box that opens up, under Active Solution Platform combo box select New.
  4. In the New Solution Platform dialog box that opens up, under Type or select the new platform, select x64, copy settings from Win32 and make sure that Create new project platforms is selected. Click OK.
  5. You will notice that the in the Configuration Manager dialog box x64 platform has now been selected. Click Close.
  6. In the left box choose Configuration Properties C++ General
  7. In the right box, next to Additional Include Directories type the following text:
    C:\Program Files\OpenCV2.3\build\include;C:\Program Files\OpenCV2.3\build\include\opencv;%(AdditionalIncludeDirectories)

    IMPORTANT note that all these paths assume that you installed in the default location, if you installed in a different location; for example Program Files (x86) instead of Program Files, make sure you change these paths to reflect that.

  8. Next in the felt box choose Configuration Properties → Linker → Input
  9. In the right box, next to Additional Dependencies type the following text:
    "C:\Program Files\OpenCV2.3\build\x64\vc10\lib\opencv_core230d.lib";"C:\Program Files\OpenCV2.3\build\x64\vc10\lib\opencv_highgui230d.lib";"C:\Program Files\OpenCV2.3\build\x64\vc10\lib\opencv_video230d.lib";"C:\Program Files\OpenCV2.3\build\x64\vc10\lib\opencv_ml230d.lib";"C:\Program Files\OpenCV2.3\build\x64\vc10\lib\opencv_legacy230d.lib";"C:\Program Files\OpenCV2.3\build\x64\vc10\lib\opencv_imgproc230d.lib";%(AdditionalDependencies)

    IMPORTANT note that all these paths assume that you installed in the default location, if you installed in a different location; for example Program Files (x86) instead of Program Files, make sure you change these paths to reflect that.

  10. Click Apply then OK

Sample Program

This program will capture video from a camera and display it.  Press escape to stop capturing and displaying frames.

It may be necessary to change the number in this line of code to select the camera source:

CvCapture* capture = cvCaptureFromCAM(1); // capture from video device #1

If you do not know the video device number just try 0, then 1,  2 or else -1. Copy and paste the following code:

#include &quot;stdafx.h&quot;
#include &lt;highgui.h&gt;

int _tmain(int argc, _TCHAR* argv[])
{
	int c;
	// allocate memory for an image
	IplImage *img;
	// capture from video device #1
	CvCapture* capture = cvCaptureFromCAM(1);
	// create a window to display the images
	cvNamedWindow(&quot;mainWin&quot;, CV_WINDOW_AUTOSIZE);
	// position the window
	cvMoveWindow(&quot;mainWin&quot;, 5, 5);
	while(1)
	{
		// retrieve the captured frame
		img=cvQueryFrame(capture);
		// show the image in the window
		cvShowImage(&quot;mainWin&quot;, img );
		// wait 10 ms for a key to be pressed
		c=cvWaitKey(10);
		// escape key terminates program
		if(c == 27)
		break;
	}
 return 0;
}





Correlation based similarity measures-Summary

11 04 2010

Correlation based matching typically produces dense depth maps by calculating the disparity at each pixel within a neighborhood. This is achieved by taking a square window of certain size around the pixel of interest in the reference image and finding the homologous pixel within the window in the target image, while moving along the corresponding scanline. The goal is to find the corresponding (correlated) pixel within a certain disparity range d (d E [0,….,dmax]) that minimizes the associated error and maximizes the similarity.

In brief, the matching process involves computation of the similarity measure for each disparity value, followed by an aggregation and optimization step. Since these steps consume a lot of processing power, there are significant speed-performance advantages to be had in optimizing the matching algorithm.

The images can be matched by taking either left image as the reference (left-to-right matching, also known as direct matching) or right image as the reference (right-to-left matching, also known as reverse matching) [2].

Similarity Measure Formula
Sum of Absolute Differences (SAD) Sum of Absolute Differences
Zero-mean Sum of Absolute Differences (ZSAD)
Locally scaled Sum of Absolute Differences (LSAD)
Sum of Squared Differences (SSD) Sum of Squared Differences
Zero-mean Sum of Squared Differences (ZSSD)
Locally scaled Sum of Squared Differences (LSSD)
Normalized Cross Correlation (NCC) Normalized Cross Correlation
Zero-mean Normalized Cross Correlation (ZNCC)
Sum of Hamming Distances (SHD) Sum of Hamming Distances

Sum of Absolute Differences (SAD) is one of the simplest of the similarity measures which is calculated by subtracting pixels within a square neighborhood between the reference image I1 and the target image I2 followed by the aggregation of absolute differences within the square window, and optimization with the winner-take-all (WTA) strategy [1]. If the left and right images exactly match, the resultant will be zero.

In Sum of Squared Differences (SSD), the differences are squared and aggregated within a square window and later optimized by WTA strategy. This measure has a higher computational complexity compared to SAD algorithm as it involves numerous multiplication operations.

Normalized Cross Correlation is even more complex to both SAD and SSD algorithms as it involves numerous multiplication, division and square root operations.

Sum of Hamming Distances is normally employed for matching census-transformed images (can be used on images that have not been census transformed) by computing bitwise-XOR of the values in left and right images, within a square window. This step is usually followed by a bit-counting operation which results in the final Hamming distance score.

Example: Tsukuba

TsukubaLeftLeft Image TsukubaRightRight Image
SAD Disparity Map ZSAD Disparity Map LSAD Disparity Map
SSD Disparity Map ZSSD Disparity Map LSSD Disparity Map
NCC Disparity Map ZNCC Disparity Map TsukubaSHD9x9DispRange=0-16ColorSHD Disparity Map
TsukubaGroundTruthColor

Ground Truth Disparity Map

Disparity Range: 0-16 ,Window Size: 9×9, The hotter the color, the closer it is; the cooler the color the farther it is.

Generic MATLAB code:

% *************************************************************************
% Title: Function-Compute Correlation between two images using various 
% similarity measures with Left Image as reference.
% Author: Siddhant Ahuja
% Created: March 2010
% Copyright Siddhant Ahuja, 2010
% Inputs: 
% 1. Left Image (var: rightImage), 
% 2. Right Image (var: leftImage),
% 3. Correlation Window Size (var: corrWindowSize), 
% 4. Minimum Disparity in X-direction (var: dMin), 
% 5. Maximum Disparity in X-direction (var: dMax),
% 6. Method used for calculating the correlation scores (var: method)
% Valid values include: 'SAD', 'LSAD', 'ZSAD', 'SSD', 'LSSD', ZSSD', 'NCC',
% 'ZNCC'
% Outputs: 
% 1. Disparity Map (var: dispMap), 
% 2. Time taken (var: timeTaken)
% Example Usage of Function: [dispMap, timeTaken]=denseMatch('tsukuba_left.tiff', 'tsukuba_right.tiff', 9, 0, 16, 'ZNCC');
% *************************************************************************
function [dispMap, timeTaken]=denseMatch(rightImage, leftImage, corrWindowSize, dMin, dMax, method)
% Grab the image information (metadata) of left image using the function imfinfo
leftImageInfo=imfinfo(leftImage);
% Grab the image information (metadata) of right image using the function imfinfo
rightImageInfo=imfinfo(rightImage);
% Since Dense Matching is applied on a grayscale image, determine if the
% input left image is already in grayscale or color
if(getfield(leftImageInfo,'ColorType')=='truecolor')
% Read an image using imread function, convert from RGB color space to
% grayscale using rgb2gray function and assign it to variable leftImage
    leftImage=rgb2gray(imread(leftImage));
else if(getfield(leftImageInfo,'ColorType')=='grayscale')
% If the image is already in grayscale, then just read it.        
        leftImage=imread(leftImage);
    else
        error('The Color Type of Left Image is not acceptable. Acceptable color types are truecolor or grayscale.');
    end
end
% Since Dense Matching is applied on a grayscale image, determine if the
% input right image is already in grayscale or color
if(getfield(rightImageInfo,'ColorType')=='truecolor')
% Read an image using imread function, convert from RGB color space to
% grayscale using rgb2gray function and assign it to variable rightImage
    rightImage=rgb2gray(imread(rightImage));
else if(getfield(rightImageInfo,'ColorType')=='grayscale')
% If the image is already in grayscale, then just read it.        
        rightImage=imread(rightImage);
    else
        error('The Color Type of Right Image is not acceptable. Acceptable color types are truecolor or grayscale.');
    end
end
% Find the size (columns and rows) of the left image and assign the rows to
% variable nrLeft, and columns to variable ncLeft
[nrLeft,ncLeft] = size(leftImage);
% Find the size (columns and rows) of the right image and assign the rows to
% variable nrRight, and columns to variable ncRight
[nrRight,ncRight] = size(rightImage);
% Check to see if both the left and right images have same number of rows
% and columns
if(nrLeft==nrRight && ncLeft==ncRight)
else
    error('Both left and right images should have the same number of rows and columns');
end
% Convert the left and right images from uint8 to double
leftImage=im2double(leftImage);
rightImage=im2double(rightImage);
% Check the size of window to see if it is an odd number.
if (mod(corrWindowSize,2)==0)
    error('The window size must be an odd number.');
end
% Check whether minimum disparity is less than the maximum disparity.
if (dMin>dMax)
    error('Minimum Disparity must be less than the Maximum disparity.');
end
% Create an image of size nrLeft and ncLeft, fill it with zeros and assign
% it to variable dispMap
dispMap=zeros(nrLeft, ncLeft);
% Find out how many rows and columns are to the left/right/up/down of the
% central pixel based on the window size
win=(corrWindowSize-1)/2;
% The objective of CC, NCC and ZNCC is to maxmize the
% correlation score, whereas other methods try to minimize
% it.
maximize = 0;
if strcmp(method,'NCC') || strcmp(method,'ZNCC')
    maximize = 1;
end
tic; % Initialize the timer to calculate the time consumed.
for(i=1+win:1:nrLeft-win)
    % For every row in Left Image
    for(j=1+win:1:ncLeft-win-dMax)
        % For every column in Left Image
        % Initialize the temporary variable to hold the previous
        % correlation score
        if(maximize)
            prevcorrScore = 0.0;
        else
            prevcorrScore = 65532;
        end
        % Initialize the temporary variable to store the best matched
        % disparity score
        bestMatchSoFar = dMin;
        for(d=dMin:dMax)
            % For every disparity value in x-direction
            % Construct a region with window around central/selected pixel in left image
            regionLeft=leftImage(i-win : i+win, j-win : j+win);
            % Construct a region with window around central/selected pixel in right image
            regionRight=rightImage(i-win : i+win, j+d-win : j+d+win);
            % Calculate the local mean in left region
            meanLeft = mean2(regionLeft);
            % Calculate the local mean in right region
            meanRight = mean2(regionRight);
            % Initialize the variable to store temporarily the correlation
            % scores
            tempCorrScore = zeros(size(regionLeft));
            % Calculate the correlation score
            if strcmp(method,'SAD')
                tempCorrScore = abs(regionLeft - regionRight);
            elseif strcmp(method,'ZSAD')
                tempCorrScore = abs(regionLeft - meanLeft - regionRight + meanRight);
            elseif strcmp(method,'LSAD')
                tempCorrScore = abs(regionLeft - meanLeft/meanRight*regionRight);
            elseif strcmp(method,'SSD')
                tempCorrScore = (regionLeft - regionRight).^2;
            elseif strcmp(method,'ZSSD')
                tempCorrScore = (regionLeft - meanLeft - regionRight + meanRight).^2;          
            elseif strcmp(method,'LSSD')
                tempCorrScore = (regionLeft - meanLeft/meanRight*regionRight).^2;
            elseif strcmp(method,'NCC')
                % Calculate the term in the denominator (var: den)
                den = sqrt(sum(sum(regionLeft.^2))*sum(sum(regionRight.^2)));
                tempCorrScore = regionLeft.*regionRight/den;
            elseif strcmp(method,'ZNCC')
                % Calculate the term in the denominator (var: den)
                den = sqrt(sum(sum((regionLeft - meanLeft).^2))*sum(sum((regionRight - meanRight).^2)));
                tempCorrScore = (regionLeft - meanLeft).*(regionRight - meanRight)/den;
            end
            % Compute the final score by summing the values in tempCorrScore,
            % and store it in a temporary variable signifying the distance
            % (var: corrScore)
            corrScore=sum(sum(tempCorrScore));
            if(maximize)
                if(corrScore>prevcorrScore)
                    % If the current disparity value is greater than
                    % previous one, then swap them
                    prevcorrScore=corrScore;
                    bestMatchSoFar=d;
                end
            else
                if (prevcorrScore > corrScore)
                    % If the current disparity value is less than
                    % previous one, then swap them
                    prevcorrScore = corrScore;
                    bestMatchSoFar = d;
                end
            end
        end
        % Store the final matched value in variable dispMap
        dispMap(i,j) = bestMatchSoFar;
    end
end
% Stop the timer to calculate the time consumed.
timeTaken=toc;
end

For other MATLAB Codes, please visit:

1. Sum of Absolute Differences
2. Sum of Squared Differences
3. Normalized Cross Correlation
4. Sum of Hamming Distances

References

1. T. Kanade, H. Kano, and S. Kimura, “Development of a video-rate stereo machine,” in Image UnderstandingWorkshop, Monterey,CA, 1994, p. 549–557.
2. D. Scharstein and R. Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” International Journal of Computer Vision, vol. 47(1/2/3), pp. 7-42, Apr. 2002.
3. S. Chambon and A. Crouzil. évaluation et comparaison de mesures de corrélationrobustes aux occultations. Rapport de recherche 2002-34-R, IRIT, Université PaulSabatier, Toulouse, France, December 2002.





Camera Calibration using Singular Value Decomposition (SVD) in MATLAB

20 02 2010

Before i jump to the section on calibration of a single camera, it is worthwhile to present a refresher on basic concepts of orthogonality, properties of homogeneous linear equations, matrix decomposition and basics of Singular Value Decomposition.

What is an Orthogonal or Orthonormal Matrix?

If two vectors x,y are perpendicular to each other, they are termed orthogonal, represented by x\bot y. Their inner/dot product represented by \langle x,y \rangle is zero. For example vectors (1,3,2) and (3,-1,0) are orthogonal to each other, since their dot product: (1)(3)+(3)(-1)+(2)(0) is equal to zero.

Two vectors are orthonormal if they are orthogonal and are of unit length, i.e. norms of vectors are one. For example, vectors (1,0) and (0,1) are orthonormal because:

  1. Dot product: (1)(0)+(0)(1)=0,
  2. Norm of (1,0)=\|(1,0)\|=\sqrt {1^2+0^2}=1, and
  3. Norm of (0,1)=\|(0,1)\|=\sqrt {0^2+1^2}=1.

A similar concept is applied to matrices. A matrix is said to be orthogonal if:

  1. Matrix elements are real numbers (i.e. all floating point numbers that are not complex in nature),
  2. Matrix is a square matrix (i.e. same number of rows and columns), and
  3. Either column or row vectors are orthogonal.

In other words, if the product of a matrix M and its transpose M^T is equal to an identity matrix I, i.e. M^TM=MM^T=I. This is also true if the transpose of the matrix is the same as its inverse,  i.e. M^T=M^{-1}.

If in such a matrix, column or row vectors are of unit length, then these matrices are called orthonormal matrices.

For example,

M=\begin{bmatrix} 1&0\\ 0&-1 \end{bmatrix} is an orthogonal and orthonormal matrix because,

  1. Dot product of column vectors: (1)(0)+(0)(-1)=0,
  2. Norm of column vector (1,0)=\|(1,0)\|=\sqrt{1^2+0^2}=1, and
  3. Norm of column vector (0,-1)=\|(0,-1)\|=\sqrt{0^2+(-1)^2}=1.

Also,

  1. M^TM= \begin{bmatrix} 1&0\\ 0&-1 \end{bmatrix} \begin{bmatrix} 1&0\\ 0&-1 \end{bmatrix} = MM^T = \begin{bmatrix} 1&0\\ 0&1 \end{bmatrix} = I, and
  2. M^T= \begin{bmatrix} 1&0\\ 0&-1 \end{bmatrix} = M^{-1}

Homogeneous Linear Equations

A set of equations where the constant terms are all zeros, are called homogeneous equations:

a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = 0,\\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = 0,\\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = 0

where, there are n variables x_1, x_2, \cdots x_n. This can be written in the matrix form:

\begin{bmatrix} a_{11}&a_{12}&\cdots &a_{1n} \\ a_{21}&a_{22}&\cdots &a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\  a_{m1}&a_{m2}& \cdots &a_{mn}\end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\x_n \end{bmatrix}=\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} or, AX=0

Note, in the above system of equations, there always exits a solution: x_1=0,x_2=0, \cdots ,x_n=0 or, X=0, called the “trivial solution”.

If the columns in the square matrix A of size n \times n, are linearly independent, i.e. rank(A)=n, then the trivial solution is the one and only one, unique solution. In other words, det(A)\neq 0. This type of matrix is invertible and is referred to as “Regular Matrix”. On the other hand, if for this square matrix A, det(A)=0, then the equations are not linearly independent, i.e. rank(A)<n. Then, there exists infinite number of non-zero solutions, and A is NOT invertible. Such a matrix is called the “Singular Matrix”.

What is Matrix Decomposition?

Matrix decomposition refers to reduction of the matrix to the simplest and most significant form-”basic building blocks”, without any loss of generality. For example, number 15 can be reduced or factorized into a multiple of 5*3, where both 5 and 3 are primes and can not be reduced further. A similar concept is applied to matrices, where a matrix can be represented by a multiple of basic matrices which can not be reduced further. One of the reasons why we decompose a matrix is to reduce a complex form into products of simpler forms, for example, a complex transformation applied to an image can be reduced to products of simpler transformations.

Singular Value Decomposition (SVD)

Singular Value Decomposition (SVD) is one of the many methods of factorization of a matrix with applications in least squares fitting of data, solving linear equations, computing the rank, range and null space of matrix, etc. A set of linear equations can be written in the form of Ax=b, where A is an m \times n matrix where rows m are greater than the columns n. Then we can factorize A using SVD, where A=USV^T such that:

  1. U is a column orthogonal matrix of size m \times n,
  2. S is a diagonal matrix with positive or zero elements of size n \times n, and
  3. V is an orthogonal matrix of size n \times n.

The above is termed as the Singular Value Decomposition (SVD) of A. Let S=diag\{\sigma _1 , \sigma _2 , \cdots , \sigma _n \}, where \sigma _1 \geq \sigma _2 \geq \cdots \geq \sigma _n \geq 0, then \sigma _1, \sigma _2 , \cdots , \sigma _n are called the singular values of A. Columns of U and V are left and right singular vectors for the corresponding singular values respectively.

These singular values are not the same as the eigenvalues. For a matrix A, matrix A^HA is normal with non-negative eigenvalues where singular values of A are square roots of the eigenvalues of A^HA.

For a set of homogeneous linear equations Ax=0, any vector x which falls in the null space of A is a solution. Any column V where corresponding singular values approaches zero is a solution vector.

For a set of non-homogeneous linear equations Ax=b \neq 0, the goal is to find the solution x with the smallest length |x|^2, where x=V[diag( \sigma _1 ^ {-2}, \sigma _2 ^ {-1}, \cdots , \sigma _n ^{-1} )](U^Tb). In this case, for every singular value \sigma _i where \sigma _i=0, \sigma _i ^{-1} is replaced by 0. If b is not in the range of A then there exists no vector x to satisfy the equation. Thus it is not possible to obtain an exact solution.

If n=m, i.e. A is a square matrix, SVD is given by A=USV^T, where both U and V are orthogonal matrices. Thus, A^{-1}=V[diag( \sigma _1 ^ {-2}, \sigma _2 ^ {-1}, \cdots , \sigma _n ^{-1} )](U^T).

The range of A has the dimension of rank of A. If A is singular, then rank of A will be less than n, where n=rank(A)+null(A). The columns of U corresponding to singular values not equal to zero, are an orthonormal set for the range of A, and columns of V corresponding to singular values not equal to zero, are an orthonormal set for the null space of A.

Calculating SVD

SVD can be calculated by first computing a bi-diagonal matrix of A using Householder reflections, then computing SVD using an iterative method like QR algorithm for computation of eigenvalues [1]. In MATLAB there is a limit of 75 QR step iterations to get the singular values [2]. Sometimes, solution may not converge. LAPACK subroutines can be used for SVD : real double precision A-DGESVD, complex double precision A-ZGESVD, real single precision A-SGESVD, and complex single prevision A-CGESVD [2].

Camera Calibration

Here i am assuming you are familiar with concepts of camera calibration. For more details, please refer to Calibration slides by Bryan Morse at Brigham Young University. The work presented here is based on the Pinhole camera model.

If we know the 3D points in the world model, and the 2D projected pixel coordinates, then we can linearly solve (using SVD) for the calibration matrix, rotation matrix, translation vector, parameters for scaling in X and Y directions, and the optical center of the image (x,y). Using this we can re-project the 3D points and by comparing to the 2D points, figure out the average pixel error in the X and Y direction.

I have used the 3D points and 2D points provided by Dr. Boufama, at the University of Windsor. You can download these files from the links below:

  1. 3D Points
  2. 2D Points

Here is a visual representation of the points:

3D Points

2D Points

Here is the MATLAB Code:

% *************************************************************************
% Title: Function-Produce Calibration Matrix and Average Pixel Error using
% Singular value decomposition (SVD)
% Author: Siddhant Ahuja
% Created: February 2010
% Copyright Siddhant Ahuja, 2010
% ***Inputs***
% 1. File3D: 3-D File containing world coordinates
% 2. File2D: 2-D File containing pixel coordinates
% ***Outputs***
% calibMatrix: 3x4 Calibration Matrix
% rotationMatrix: 3x3 Rotation Matrix
% translationVector: 3x1 Translation vector
% alpha_u: Parameter for scaling in X-direction
% alpha_v: Parameter for scaling in Y-direction
% u_0: Optical Centre of the image (x value)
% v_0: Optical Centre of the image (y value)
% reprojMatrix: Reprojection of 3D points to 2D using Calibration Matrix
% avgError_u: Average Pixel Error in x direction
% avgError_v: Average Pixel Error in y direction
% timeTaken: Time taken by the code
% Example Usage of Function: 
% [calibMatrix, rotationMatrix, translationVector, alpha_u, alpha_v, u_0,
% v_0, reprojMatrix, avgError_u, avgError_v, timeTaken]= funcCalibrate('3D.txt', '2D.txt');
% *************************************************************************
function [calibMatrix, rotationMatrix, translationVector, alpha_u, alpha_v, u_0, v_0, reprojMatrix, avgError_u, avgError_v, timeTaken] = funcCalibrate(File3D, File2D)
 
% Assuming no Radial Distortion is present.
% Initialize the timer to calculate the time consumed.
tic;
% Read 3-D file
[matrix3D]=textread(File3D);
% Delete First row of the 3D Matrix as it only contains number of data points
matrix3D(1,:)=[]; 
% Find the size of the 3D Matrix
[nR3D, nC3D]=size(matrix3D);
% Check to make sure matrix has 3 columns
if(nC3D~=3)
    error('The matrix for 3D points does not have 3 columns.');
end
% Separate out the X values from the matrix
X = matrix3D(:,1);
% Separate out the X values from the matrix
Y = matrix3D(:,2);
% Separate out the X values from the matrix
Z = matrix3D(:,3);
% Plot the points in 3-D
figure;
plot3(matrix3D(:,1),matrix3D(:,2),matrix3D(:,3),'.');
% Read 2-D file
[matrix2D]=textread(File2D);
% Delete First row of the 2D Matrix as it only contains number of data points
matrix2D(1,:)=[]; 
% Find the size of the 2D Matrix
[nR2D, nC2D]=size(matrix2D);
% Check to make sure matrix has 2 columns
if(nC2D~=2)
    error('The matrix for 2D points does not have 2 columns.');
end
% Separate out the u values from the matrix
u_values=matrix2D(:,1);
% Separate out the v values from the matrix
v_values=matrix2D(:,2);
% Plot the points in 2-D
figure;
plot(matrix2D(:,1),matrix2D(:,2),'.');
% Check to make sure number of rows of the 3D Matrix is the same as the
% number of rows of the 2D Matrix
if(nR3D~=nR2D)
    error('Please make sure number of 3D and 2D points is the same.');
end
% ***Linear Solution of the Calibration Matrix***
% Let Calibtration Matrix be denoted by (M)
% Writing linear equations of the form AV=0, where A is a 2nx12 measurement
% matrix and V is a 12-element unknown vector.
% Create a Matrix of ones (o) with the same length as that of the u_values
% vector
o = ones(size(u_values));
% Create a Matrix of zeros (z) with the same length as that of the u_values
% vector
z = zeros(size(u_values));
% Populate the odd rows for A
AoddRows  = [ X Y Z o z z z z -u_values.*X -u_values.*Y -u_values.*Z -u_values ];
% Populate the even rows for A
AevenRows = [ z z z z X Y Z o -v_values.*X -v_values.*Y -v_values.*Z -v_values ];
% Concatenate odd and even rows of A
A=[AoddRows; AevenRows];
% Compute SVD on the matrix
[U, S, V] = svd(A,0);
% Assuming no noise, since the elements of the diagonal matrix S are in descending order, to
% get the eigenvectors corresponding to the smallest eignevalue, we can
% just grab the last column of matrix 
m = V(:,end);
% Construct the camera calibration matrix M
M = reshape(m,4,3)';
calibMatrix=M;
% Since the norm of Projection matrix M is equal to 1, we can calculate the
% absolute scale factor lambda
abs_lambda=sqrt(M(3,1)^2 + M(3,2)^2 + M(3,3)^2);
% Scale the Matrix with the scale factor
M = M / abs_lambda;
% In case the origin of the world frame is in front of the camera, we have
% s=lambda/abs_lambda, or From T_z=s*m_34, we can re-write s in the form of
% sign value of m_34. Here we assume it is in the front.
inFront=1;
if inFront
    s = sign(M(3,4));
else
    s = -sign(M(3,4));
end
% Thus, we can now calculate T_z or T(3) 
T(3) = s*M(3,4);
% Create a 3x3 Rotation matrix and fill it with zeros
R = zeros(3,3);
% From equations, last row of the rotation matrix is the same as the first
% three elements of the last row of the calibration matrix
R(3,:)=s*M(3,1:3);
% Matrix M can be written as:
% M=( m1'   )
%   ( m2' m4)
%   ( m3'   )
% We can now calculate mi, where mi is a 3 element vector
m1 = M(1,1:3)';
m2 = M(2,1:3)';
m3 = M(3,1:3)';
m4 = M(1:3,4);
% Now, we can calculate the centres of projection u_0 and v_0
u_0 = m1'*m3;
v_0 = m2'*m3;
% Calculating the alpha values in u and v directions,
alpha_u=sqrt( m1'*m1 - u_0^2 );
alpha_v=sqrt( m2'*m2 - v_0^2 );
% We can now calculate the first and second rows of the rotation matrix
R(1,:) = s*(u_0*M(3,1:3) - M(1,1:3) ) / alpha_u;
R(2,:) = s*(v_0*M(3,1:3) - M(2,1:3) ) / alpha_v;
% We can also calculate the first and second elements of the Translation
% vector
T(1) = s*(u_0*M(3,4) - M(1,4) ) / alpha_u;
T(2) = s*(v_0*M(3,4) - M(2,4) ) / alpha_v;
T = T';
translationVector=T;
% The rotation matrix R obtained with this estimation procedure is not guaranteed to be orthogonal. 
% Therefore we calculate the rotation matrix that is closest to the estimated matrix (in the 
% Frobenius norm sense). Let R = UDV'T then R = UV'T.
[U,D,V] = svd(R);
R = U*V';
rotationMatrix=R;
% Reproject 3Dpoints to 2D points using the calibration matrix to calculate average pixel errors 
newMatrix2D=zeros(nR3D,2);
for i=1:1:nR3D
    num_u=M(1,1)*matrix3D(i,1) + M(1,2)*matrix3D(i,2) + M(1,3)*matrix3D(i,3) + M(1,4);
    num_v=M(2,1)*matrix3D(i,1) + M(2,2)*matrix3D(i,2) + M(2,3)*matrix3D(i,3) + M(2,4);
    den=M(3,1)*matrix3D(i,1) + M(3,2)*matrix3D(i,2) + M(3,3)*matrix3D(i,3) + M(3,4);
    newMatrix2D(i,1)=num_u/den;
    newMatrix2D(i,2)=num_v/den;
end
% Reprojected matrix
reprojMatrix=newMatrix2D;
% Calculate difference between the reprojectedMatrix and the original 2D
% Matrix
errorDiff=reprojMatrix-matrix2D;
% Calculate average pixel error in x direction
avgError_u=mean(errorDiff(:,1));
% Calculate average pixel error in y direction
avgError_v=mean(errorDiff(:,2));
% Stop the timer to calculate the time consumed.
timeTaken=toc;

To run the above code, please type in the following commands in MATLAB:

% Run this first
clc
clear all
[calibMatrix, rotationMatrix, translationVector, alpha_u, alpha_v, u_0, v_0, reprojMatrix, avgError_u, avgError_v, timeTaken]= funcCalibrate('3D.txt', '2D.txt');

Here are some of the results obtained by running the code:

  1. Calibration Matrix:

    \begin{pmatrix} -0.0276 & -1.2074e-14 &     -0.0071 & -0.7065 \\ 1.8319e-15 & -0.0276 & -0.0071 & -0.7065 \\ 6.5052e-19 & 7.0039e-17 & -2.7599e-05 & -0.0028 \end{pmatrix}

  2. Rotation Matrix:

    \begin{pmatrix} -1.0000 & -5.7371e-13 & -2.3566e-14 \\ 5.7365e-13 & -1.0000 & -2.5378e-12 \\ -2.3566e-14 & -2.5378e-12 & 1 \end{pmatrix}

  3. Translation Vector:

    \begin{pmatrix} -7.0104e-11 \\ -2.6114e-10 \\ 99.9998 \end{pmatrix}

  4. Parameter for scaling in X-direction: 999.9978
  5. Parameter for scaling in Y-direction: 999.9978
  6. Optical center of the image: (256.0000, 256.0000)
  7. Average Pixel Error in X direction: -3.7259e-11
  8. Average Pixel Error in Y direction: -1.9027e-11

References

[1] Trefethen, Lloyd N.; Bau III, David (1997), Numerical linear algebra, Philadelphia: Society for Industrial and Applied Mathematics, ISBN 978-0-89871-361-9.
[2] http://www.mathworks.com/access/helpdesk/help/techdoc/ref/svd.html





Image Segmentation Dataset

21 11 2009

Here is a dataset that i prepared for testing image segmentation algorithms. All images are synthetic of size 128×128 (i.e. 16384 pixels in total) and grayscale.

Trin

(4 classes)

A

(5 classes)

B

(3 classes)

Jigsaw

(5 classes)

Snowflake

(5 classes)

Sun

(4 classes)

To download the dataset, please Click Here (13 KB, *.rar format)





Adding Vignetting Effect to Images

21 06 2009

When the lens size is smaller than the dimensions of the actual image sensor, or inadequately covering it, it leads to an effect whereby the center of the image is clear, and a fading/darkening of the image borders is seen. This effect is commonly referred to as the Vignetting Effect. In graphics, it is used to de-emphasize the distracting background, and shed light on the foreground objects.

Example

Baby1LeftColorVignette1

Scale factor at image border=1.0

Baby1LeftColorVignette0.8

Scale factor at image border=0.8

Baby1LeftColorVignette0.6

Scale factor at image border=0.6

Baby1LeftColorVignette0.4

Scale factor at image border=0.4

Baby1LeftColorVignette0.2

Scale factor at image border=0.2

Baby1LeftColorVignette0.1

Scale factor at image border=0.1

MATLAB Code

% *************************************************************************
% Title: Function-Introduce Vignetting effect to the image
% Author: Siddhant Ahuja
% Created: September 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Input Image (var: inputImage), Scale Change level (var:
% scaleLevel) Valid values for scale change should go from 0.1 to 1
% Outputs: Vignetting effect added image (var: vignettImg, Time taken (var: timeTaken)
% Example Usage of Function: [vignettImg, timeTaken]=funcVignettingEffect('Baby1LeftColor.png', 0.5);
% *************************************************************************
function [vignettImg, timeTaken]=funcVignettingEffect(inputImage,scaleLevel)
% Read the input image
try
    % Read an image using imread function
    inputImage=imread(inputImage);
    % grab the number of rows, columns, and channels
    [nr, nc, nChannels]=size(inputImage);
     % Grab the image information (metadata) of input image using the function imfinfo
    inputImageInfo=imfinfo(inputImage);
    % Determine if input left image is already in grayscale or color
    if(getfield(inputImageInfo,'ColorType')=='truecolor')
        colored=1;
    else if(getfield(inputImageInfo,'ColorType')=='grayscale')
        colored=0;
        else
        error('The Color Type of Left Image is not acceptable. Acceptable color types are truecolor or grayscale.');
        end
    end
catch
    % if it is not an image but a variable
    inputImage=inputImage;
    % grab the number of channels
    [nr, nc, nChannels]=size(inputImage);
    if(nChannels)>1
        colored=1;
    else
        colored=0;
    end
end
if scaleLevel <= 0
    error('Scale value must be > 0');
end
vignettImg=inputImage;
tic; % Initialize the timer to calculate the time consumed.
imgCntX = nc/2;
imgCntY = nr/2;
maxDistance = sqrt (imgCntY^2 + imgCntX^2);
if(colored==1)
    for (i=1:nr)
        for (j=1:nc)
           dis = sqrt (abs(i-imgCntY)^2 + abs(j-imgCntX)^2);          
           %% reduce brighness of pixel based on distance from the image center
           vignettImg(i,j,1) = vignettImg(i,j,1)* (1 - (1-scaleLevel)*(dis/maxDistance) );
           vignettImg(i,j,2) = vignettImg(i,j,2)* (1 - (1-scaleLevel)*(dis/maxDistance) );
           vignettImg(i,j,3) = vignettImg(i,j,3)* (1 - (1-scaleLevel)*(dis/maxDistance) );
       end
    end
else
%gray
     for (i=1:nr)
        for (j=1:nc)
           dis = sqrt (abs(i-imgCntY)^2 + abs(j-imgCntX)^2);          
           %% reduce brighness of pixel based on distance from the image center
           vignettImg(i,j) = vignettImg(i,j)* (1 - (1-scaleLevel)*(dis/maxDistance) );
       end
    end
end
% Stop the timer to calculate the time consumed.
timeTaken=toc;





Left-Right Consistency (LRC) Check

13 06 2009

Left-Right Consistency (LRC) check is performed to get rid of the half-occluded (objects scene in one image, and not in other) pixels in the final disparity map. This is computed by taking the computed disparity value in one image, and re-projecting it in the other image. If the difference in the values is less than a given threshold, then the pixels are half-occluded.

Example

Aloe Left Image
Aloe-Left Image
Aloe Right Image
Aloe-Right Image
Aloe Ground Truth Disparity Left-to-Right
Aloe-Ground Truth
Left to Right Disparity Map
Aloe Ground Truth Disparity Right-to-Left
Aloe-Ground Truth
Right to Left Disparity Map
AloeSADL2R15x15DispRange=0-70Gray
Aloe-Left to Right
SAD Disparity Map
Window Size: 15×15
Disparity Range: 0-70
AloeSADR2L15x15DispRange=0-70Gray
Aloe-Right to Left
SAD Disparity Map
Window Size: 15×15
Disparity Range: 0-70
AloeOccluded4,2
Aloe-Ground Truth
Occlusion Map
AloeSADLRC15x15DispRange=0-70Gray
Aloe-LRC Map
(threshold: 2)

MATLAB Code

% *************************************************************************
% Title: Function-Left/Right Consistency Check
% Author: Siddhant Ahuja
% Created: May 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Left Image (var: leftImage), Right Image (var: rightImage),
% Window Size (var: windowSize), Minimum Disparity (dispMin), Maximum
% Disparity (dispMax), Threshold for the check (var: thresh) typically 2.0
% Outputs: Disparity Map (var: dispMap), Time taken (var: timeTaken)
% Example Usage of Function: [dispMapLRC, timeTaken]=funcLRCCheck('TsukubaLeft.jpg', 'TsukubaRight.jpg', 9, 0, 16,2);
% *************************************************************************
function [dispMapLRC, timeTaken]=funcLRCCheck(leftImage, rightImage, windowSize, dispMin, dispMax, thresh)
% Initiate the Timer to calculate the time consumed.
tic;
% Perform SAD Correlation based matching (Right to Left)
[dispMapR2L, timeTakenR2L]=funcSADR2L(leftImage, rightImage, windowSize, dispMin, dispMax);
% Perform SAD Correlation based matching (Left to Right)
[dispMapL2R, timeTakenL2R]=funcSADL2R(leftImage, rightImage, windowSize,dispMin , dispMax);
% Prepare matrix for subtraction and scale it for comparison
dispMapL2R=-dispMapL2R;
% Find the size (columns and rows) of the L2R Disparity map and assign the rows to
% variable nrLRCCheck, and columns to variable ncLRCCheck
[nrLRCCheck,ncLRCCheck] = size(dispMapL2R);
% Create an image of size nrLRCCheck and ncLRCCheck, fill it with zeros and assign
% it to variable dispMapLRC
dispMapLRC=zeros(nrLRCCheck,ncLRCCheck);
% Find out how many rows and columns are to the left/right/up/down of the
% central pixel based on the window size
win=(windowSize-1)/2;
for(i=1:1:nrLRCCheck)
    for(j=1:1:ncLRCCheck)
        xl=j;
        xr=xl+dispMapL2R(i,xl);
        if (xr>ncLRCCheck||xr<1)
            dispMapLRC(i,j) = 0; %% occluded pixel
        else            
            xlp=xr+dispMapR2L(i,xr);
            if (abs(xl-xlp)<thresh)
                dispMapLRC(i,j) = -dispMapL2R(i,j);  %% non-occluded pixel            
            else
                dispMapLRC(i,j) = 0; %% occluded pixel                        
            end
        end
    end
end
% Terminate the Timer to calculate the time consumed.
timeTaken=toc;





Compute Variance Map of an Image

8 06 2009

Variance map of an image is calculated by taking a square window of a set size around a center pixel, and calculating the variance of the values of the pixels.

The variance within the window can be calculated from the following equation:
Variance Equation

Example

MonopolyLeftColorMonopoly-Left Image

MonopolyLeft Variance Map 5,9Monopoly-Left Image-Variance Map

(Pixels marked as black are low-texture regions)

MATLAB Code

% *************************************************************************
% Title: Function-Compute Variance map of the image
% Author: Siddhant Ahuja
% Created: May 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Input Image (var: inputImage), Window Size (var: windowSize),
% Threshold (var: thresh) Typical value is 140
% Outputs: Variance Map (var: varianceImg) , Time taken (var: timeTaken)
% Example Usage of Function: [varianceImg, timeTaken]=funcVarianceMap('MonopolyLeftColor.png', 5, 9);
% *************************************************************************
function [varianceImg, timeTaken]=funcVarianceMap(inputImage, windowSize, thresh)
try 
    % Grab the image information (metadata) of input image using the function imfinfo
    inputImageInfo=imfinfo(inputImage);
    if(getfield(inputImageInfo,'ColorType')=='truecolor')
    % Read an image using imread function, convert from RGB color space to
    % grayscale using rgb2gray function and assign it to variable inputImage
        inputImage=rgb2gray(imread(inputImage));
        % Convert the image from uint8 to double
        inputImage=double(inputImage);
    else if(getfield(inputImageInfo,'ColorType')=='grayscale')
    % If the image is already in grayscale, then just read it.        
            inputImage=imread(inputImage);
            % Convert the image from uint8 to double
            inputImage=double(inputImage);
        else
            error('The Color Type of Left Image is not acceptable. Acceptable color types are truecolor or grayscale.');
        end
    end
catch
    % if it is not an image but a variable
    inputImage=inputImage;
end
% Find the size (columns and rows) of the input image and assign the rows to
% variable nr, and columns to variable nc
[nr,nc] = size(inputImage);
% Check the size of window to see if it is an odd number.
if (mod(windowSize,2)==0)
    error('The window size must be an odd number.');
end
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable meanImg
meanImg=zeros(nr, nc);
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable varianceImg
varianceImg=zeros(nr, nc);
% Find out how many rows and columns are to the left/right/up/down of the
% central pixel based on the window size
win=(windowSize-1)/2;
tic; % Initialize the timer to calculate the time consumed.
% Compute a map of mean values
for(i=1+win:1:nr-win)
    for(j=1+win:1:nc-win)
        sum=0.0;
        for(a=-win:1:win)
            for(b=-win:1:win)
                sum=sum+inputImage(i+a,j+b);
            end
        end
        meanImg(i,j)=sum/(windowSize*windowSize);
    end
end
% Compute a map of variance values
for(i=1+win:1:nr-win)
    for(j=1+win:1:nc-win)
        sum=0.0;
        for(a=-win:1:win)
            for(b=-win:1:win)
                sum=sum+((inputImage(i+a,j+b)-meanImg(i,j))^2);
            end
        end         
        var=sum/((windowSize*windowSize)-1);
        % Apply threshold to produce a binarized variance map
        if (var > thresh)
            varianceImg(i,j) = 255;
        else
            varianceImg(i,j) = 0;
        end
    end
end
% Stop the timer to calculate the time consumed.
timeTaken=toc;





Finding Low-texture or textureless regions in images

2 06 2009

Stereo vision algorithms typically compute erroneous results in regions where there is a little or no texture in the scene. They are defined in [1] as regions where the squared horizontal intensity gradient averaged over a square window of a given size is below a given threshold.

Example

Bowling1-Left ImageBowling1-Left Image

Bowling1-Left Image-Textureless MapBowling1-Left Image-Textureless Map

(Pixels marked as white are low-texture regions)

MATLAB Code

% *************************************************************************
% Title: Function-Find Textureless regions of an image
% Notes: Textureless regions are defined as regions where the squared horizontal
% intensity gradient averaged over a square window of a given size 
% (windowSize) is below a given threshold (thresh);
% Author: Siddhant Ahuja
% Created: May 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Input Image (var: inputImage), Window Size (var: windowSize),
% Threshold (var: thresh) Typical value is 4
% Outputs: Textureless Map (var: texturelessImg) , Time taken (var: timeTaken)
% Example Usage of Function: [texturelessImg, timeTaken]=funcTexturelessRegions('imL.png', 9, 4);
% *************************************************************************
function [texturelessImg, timeTaken]=funcTexturelessRegions(inputImage, windowSize, thresh)
% Read the input image
try
    % Read an image using imread function
    inputImage=imread(inputImage);
    % grab the number of rows, columns, and channels
    [nr, nc, nChannels]=size(inputImage);
     % Grab the image information (metadata) of input image using the function imfinfo
    inputImageInfo=imfinfo(inputImage);
    % Determine if input left image is already in grayscale or color
    if(getfield(inputImageInfo,'ColorType')=='truecolor')
        colored=1;
    else if(getfield(inputImageInfo,'ColorType')=='grayscale')
        colored=0;
        else
        error('The Color Type of Left Image is not acceptable. Acceptable color types are truecolor or grayscale.');
        end
    end
catch
    % if it is not an image but a variable
    % grab the number of channels
    [nr, nc, nChannels]=size(inputImage);
    if(nChannels)>1
        colored=1;
    else
        colored=0;
    end
end
% Check the size of window to see if it is an odd number.
if (mod(windowSize,2)==0)
    error('The window size must be an odd number.');
end
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable texturelessImg
texturelessImg=zeros(nr, nc);
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable sqGradImg
sqGradImg=zeros(nr,nc);
% Find out how many rows and columns are to the left/right/up/down of the
% central pixel based on the window size
win=(windowSize-1)/2;
inputImage=double(inputImage);
tic; % Initialize the timer to calculate the time consumed.
% Produce Squared Horizontal Gradient image sqGradImg
for (i=1:1:nr)
    for (j=1:1:nc-1)
        sum = 0.0;        
        for (k=1:1:nChannels)
            diff =  inputImage(i,j,k) - inputImage(i,j+1,k);
            sum = sum + (diff*diff); 
        end
        sum = sum / nChannels;
        sqGradImg(i,j+1) = sum;
        if (j==1)
            sqGradImg(i,j) = sum;
        end
        if (sum > sqGradImg(i,j))
            sqGradImg(i,j) = sum;
        end
    end
end
% Compute average within predefined box window of size windowSize x
% windowSize
for (i=1+win:nr-win)
    for (j=1+win:nc-win)       
        % go over the square window
        sum = 0.0;
        avg = 0.0;
        for (a=-win:1:win)
            for (b=-win:1:win)                
                sum = sum + sqGradImg(i+a,j+b);
            end
        end           
        % Compute the average
        avg = sum / (windowSize*windowSize);
        % Apply threshold
        if (avg < (thresh*thresh))
            texturelessImg(i,j) = 255; % mark detected textureless pixel as white
        end
    end
end
% Stop the timer to calculate the time consumed.
timeTaken=toc;

References

[1] D. Scharstein and R. Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” International Journal of Computer Vision, vol. 47(1/2/3), pp. 7-42, Apr. 2002.





Finding Depth Discontinuous regions in stereo-images

31 05 2009

Stereo vision algorithms typically compute erroneous results in regions where there is a sudden change in the depth between objects in the scene. They are defined in [1] as regions where neighboring disparities differ by more than a certain gap, dilated by a window of a given width.

Example

Flowerpots Left ImageFlowerpots-Left Image

Flowerpots Right ImageFlowerpots-Right Image

Flowerpots, Ground Truth Left-to-RightFlowerpots-Ground Truth

Disparity Map=Left-to-Right Image

Flowerpots, Ground Truth Right-to-LeftFlowerpots-Ground Truth

Disparity Map-Right-to-Left Image

Flowerpots-Depth-DiscontinuousFlowerpots-Depth-Discontinuous Regions

(Pixels marked as white are depth-discontinuous)

MATLAB Code

% *************************************************************************
% Title: Function-Find Discontinuous regions of an image
% Notes: 
% 1. According to paper [A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms], Depth Discontinuous regions are defined as pixels whose neighboring
% disparities differ by more than a certain gap, dilated
% by a window of width windowSize.
% 2. According to paper [ An Experimental Comparison of Stereo Algorithms], A pixel is a depth 
% discontinuity if any of its (4-connected) neighbors has a disparity that differs by more than 1 from 
% its disparity. Neighboring pixels that are part of a sloped surface can easily differ by 1 pixel, but
% should not be counted as discontinuities.
% Author: Siddhant Ahuja
% Created: May 2008
% Copyright Siddhant Ahuja, 2008
% Inputs: Input Image-Depth Map (var: inputImage), Window Size (var: windowSize),
% Threshold (var: thresh) Typical value is 2
% Outputs: Discontinuous Map (var: discontinuousImg) , Time taken (var: timeTaken)
% Example Usage of Function: [discontinuousImg, timeTaken]=funcDiscontinuousRegions('disp_l_r.png', 9, 2);
% *************************************************************************
function [discontinuousImg, timeTaken]=funcDiscontinuousRegions(inputImage, windowSize, thresh)
% Read the input image
try
    % Read an image using imread function
    inputImage=imread(inputImage);
    % grab the number of rows, columns, and channels
    [nr, nc, nChannels]=size(inputImage);
     % Grab the image information (metadata) of input image using the function imfinfo
    inputImageInfo=imfinfo(inputImage);
    % Determine if input left image is already in grayscale or color
    if(getfield(inputImageInfo,'ColorType')=='truecolor')
        inputImage=rgb2gray(inputImage);
    else if(getfield(inputImageInfo,'ColorType')=='grayscale')
        inputImage=inputImage;
        else
        error('The Color Type of Left Image is not acceptable. Acceptable color types are truecolor or grayscale.');
        end
    end
catch
    % if it is not an image but a variable
    % grab the number of channels
    [nr, nc, nChannels]=size(inputImage);
    if(nChannels)>1
        inputImage=rgb2gray(inputImage);
    else
        inputImage=inputImage;
    end
end
% Check the size of window to see if it is an odd number.
if (mod(windowSize,2)==0)
    error('The window size must be an odd number.');
end
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable discontinuousImg
discontinuousImg=zeros(nr, nc);
% Create an image of size nr and nc, fill it with zeros and assign
% it to variable edgeImg
edgeImg=zeros(nr, nc);
% Find out how many rows and columns are to the left/right/up/down of the
% central pixel based on the window size
win=(windowSize-1)/2;
inputImage=double(inputImage);
tic; % Initialize the timer to calculate the time consumed.
% Find variation/edges in disparity values in horizontal and vertical directions
for (i=1:1:nr-1)
    for (j=1:1:nc-1)
        % Traverse in horizontal direction
        if (abs(inputImage(i,j)-inputImage(i,j+1)) > thresh )
            edgeImg(i,j) = 255;
            edgeImg(i,j+1) = 255;
        end
        % Traverse in vertical direction
        if (abs(inputImage(i,j)-inputImage(i+1,j)) > thresh )
            edgeImg(i,j) = 255;
            edgeImg(i+1,j) = 255;
        end
    end
end
% Dilate within the window
for (i=1+win:nr-win)
    for (j=1+win:nc-win)       
        % Go over the square window
        sum = 0.0;
        for (a=-win:1:win)
            for (b=-win:1:win)                
                sum = sum + edgeImg(i+a,j+b);
            end
        end  
        % Apply Threshold
        if (sum>0)
            discontinuousImg(i,j) = 255;
        else
            discontinuousImg(i,j) = 0;
        end
    end
end
% Stop the timer to calculate the time consumed.
timeTaken=toc;

References

[1] D. Scharstein and R. Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” International Journal of Computer Vision, vol. 47(1/2/3), pp. 7-42, Apr. 2002.








Follow

Get every new post delivered to your Inbox.

Join 180 other followers