Computer Vision Overview


Computer vision is the science of extracting useful information from digital images. A digital camera or camcorder captures an image, the image is processed, and the useful information gathered from it (e.g. detected objects) is relayed to the main control program. The club has used the OpenCV library to develop most of its vision code.



OpenCV is a computer vision library originally developed by Intel. It provides common computer vision tools such as filters, convolution, transforms and thresholding. Two helpful resources if you want to become more familiar with OpenCV are Introduction to programming with OpenCV and the CV Reference Manual.


On a yum-based Linux distribution you can install OpenCV with two commands: "yum install opencv.i386" and "yum install opencv-devel.i386".

If you are building OpenCV manually, grab the latest Linux release from SourceForge. Extract the contents by running the command "tar -xvf [file name]". Move into the extracted directory and run the command "./configure --prefix=/usr". This command checks that you have all the dependencies OpenCV needs and that your system settings support it. After configuration, run the "make" command to build the library, then run "make install" to install it onto your system.

Source(s): Computer solutions

Path Detection

The robotics club uses path detection for Penn State Abington's Mini Grand Challenge. By thresholding on Hue, Saturation and Value (HSV), a path can be distinguished from the surrounding terrain. (add sample images of image -> HSV -> warp)
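To make the HSV thresholding idea concrete, here is a minimal sketch in plain C++ that converts one RGB pixel to HSV and tests it against a range. It is not the club's code: the names Hsv, HsvRange, rgbToHsv and inRange are hypothetical, and the range values a real course needs would have to be tuned from sample images.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hue in degrees [0, 360), saturation and value in [0, 1].
struct Hsv { double h, s, v; };

// Hypothetical container for the six threshold limits.
struct HsvRange { double hMin, hMax, sMin, sMax, vMin, vMax; };

// Convert one 8-bit RGB pixel to HSV with the standard hexcone formulas.
Hsv rgbToHsv(int r, int g, int b) {
    assert(r >= 0 && r <= 255 && g >= 0 && g <= 255 && b >= 0 && b <= 255);
    double rf = r / 255.0, gf = g / 255.0, bf = b / 255.0;
    double maxC = std::max({rf, gf, bf});
    double minC = std::min({rf, gf, bf});
    double delta = maxC - minC;

    Hsv out{0.0, 0.0, maxC};
    if (maxC > 0.0) out.s = delta / maxC;
    if (delta > 0.0) {
        if (maxC == rf)      out.h = 60.0 * std::fmod((gf - bf) / delta, 6.0);
        else if (maxC == gf) out.h = 60.0 * ((bf - rf) / delta + 2.0);
        else                 out.h = 60.0 * ((rf - gf) / delta + 4.0);
        if (out.h < 0.0) out.h += 360.0;   // wrap negative hues
    }
    return out;
}

// A pixel counts as "path" when all three channels fall inside the range.
bool inRange(const Hsv& p, const HsvRange& range) {
    return p.h >= range.hMin && p.h <= range.hMax &&
           p.s >= range.sMin && p.s <= range.sMax &&
           p.v >= range.vMin && p.v <= range.vMax;
}
```

Applying inRange to every pixel yields a binary mask of candidate path pixels. A low-saturation range, for example, would accept grey pavement and reject saturated green grass, but that choice of range is only an illustration.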

Lane Detection

Lane detection was required for the AUVSI IGVC. The club tried three different methods until we found one that worked properly in the widest range of conditions.


Thresholding

Our first method was thresholding. Since we had experience using thresholding for the Mini Grand Challenge, we wanted to reuse that code for lane detection. Thresholding turned out to be a poor choice because light reflecting off blades of grass can appear white, the same color as the lane markers. The reflections produce false positives, and the robot then has nowhere to navigate because of the numerous incorrect obstacles surrounding it.

Edge Detection

A Canny Edge Detector is a popular and useful method in computer vision for finding the edges of an object. The image is first smoothed, and then the intensity gradient is found in the x and y directions. Next, non-maximum suppression thins each edge down to the pixels where the gradient is strongest. Finally, hysteresis thresholding with a high and a low threshold limits the number of false positives while keeping edges that have continuity.
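Two of these steps can be sketched in a few lines of plain C++: the x/y gradient (using the Sobel operator, a common choice) and the high/low threshold classification. This is only an illustration under those assumptions. It omits smoothing, non-maximum suppression and the linking of weak edges to strong ones, and the names Image, sobelMagnitude, EdgeClass and classify are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Greyscale image stored as rows of intensity values.
using Image = std::vector<std::vector<double>>;

// Gradient magnitude at an interior pixel using the 3x3 Sobel kernels.
double sobelMagnitude(const Image& img, int row, int col) {
    assert(row > 0 && col > 0);
    assert(row + 1 < static_cast<int>(img.size()));
    assert(col + 1 < static_cast<int>(img[row].size()));
    // Horizontal change: right column minus left column.
    double gx = (img[row - 1][col + 1] + 2 * img[row][col + 1] + img[row + 1][col + 1])
              - (img[row - 1][col - 1] + 2 * img[row][col - 1] + img[row + 1][col - 1]);
    // Vertical change: bottom row minus top row.
    double gy = (img[row + 1][col - 1] + 2 * img[row + 1][col] + img[row + 1][col + 1])
              - (img[row - 1][col - 1] + 2 * img[row - 1][col] + img[row - 1][col + 1]);
    return std::sqrt(gx * gx + gy * gy);
}

enum class EdgeClass { None, Weak, Strong };

// Double threshold: strong pixels are edges outright, weak pixels are
// kept by a full detector only if they connect to a strong pixel.
EdgeClass classify(double magnitude, double low, double high) {
    if (magnitude >= high) return EdgeClass::Strong;
    if (magnitude >= low)  return EdgeClass::Weak;
    return EdgeClass::None;
}
```

The weak class is what gives the detector its continuity: a faint stretch of a lane line survives if it touches a strong stretch, while an isolated faint response is discarded.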

Hough Transform

The Hough Transform proved to be the simplest and most robust of the methods. First the image is converted to a grayscale image. Next the Hough Transform is applied to the grayscale image. The Hough Transform finds lines of a given length using thresholding, point separation and direction, and connects the points that lie on the same line. Connecting the points was useful because the camera did not always capture the entire line. Also, since the points have to lie on a line and in a specific direction, the noise from reflections that caused problems with the other two methods was no longer an issue.
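The reason scattered reflections stop mattering is easiest to see in the voting step. The sketch below is a minimal standard Hough accumulator in plain C++, not the OpenCV routine the club used: every point votes for all (rho, theta) lines passing through it, and only collinear points pile their votes into one bin. The names Point, Line and strongestLine, the 1-degree and 1-pixel bin sizes, and the maxRho parameter are assumptions for illustration, and the sketch does not do the segment linking described above.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Point { int x, y; };

// A line in normal form: x*cos(theta) + y*sin(theta) = rho.
struct Line { double rho; double thetaDeg; int votes; };

// Vote in (rho, theta) space and return the bin with the most votes.
// Ties keep the bin that reached the count first.
Line strongestLine(const std::vector<Point>& points, int maxRho) {
    assert(maxRho > 0);
    const double kPi = 3.14159265358979323846;
    const int thetaBins = 180;             // 0..179 degrees
    const int rhoBins = 2 * maxRho + 1;    // rho in [-maxRho, maxRho]
    std::vector<int> acc(thetaBins * rhoBins, 0);

    Line best{0.0, 0.0, 0};
    for (const Point& p : points) {
        for (int t = 0; t < thetaBins; ++t) {
            double theta = t * kPi / 180.0;
            int rho = static_cast<int>(
                std::lround(p.x * std::cos(theta) + p.y * std::sin(theta)));
            if (rho < -maxRho || rho > maxRho) continue;
            int votes = ++acc[t * rhoBins + (rho + maxRho)];
            if (votes > best.votes) {
                best = {static_cast<double>(rho), static_cast<double>(t), votes};
            }
        }
    }
    return best;
}
```

Accepting a line only when its vote count exceeds a threshold is what rejects isolated bright pixels: a reflection contributes one vote to many bins but never enough to any single one.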

Obstacle Avoidance

In order to avoid an object, a robot must first identify the object and then determine where the object is with respect to itself. For the AUVSI IGVC competition we used the Player/Stage environment. After locating an object or lane marker, a homography matrix was created in order to map the image coordinates to world coordinates. This step allowed us to provide the robot with accurate obstacle distances. Through Player/Stage we used the SICK LIDAR driver to transfer the information.
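Once a homography matrix is known, mapping a pixel to the ground plane is a single matrix multiplication followed by a division. The sketch below shows only that step in plain C++. The names Mat3, Vec2 and imageToWorld are hypothetical, and the matrix itself is assumed to come from a separate calibration, which is not shown.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;

struct Vec2 { double x, y; };

// Map image pixel (u, v) to world coordinates with homography H.
// The pixel is treated as the homogeneous vector (u, v, 1); the result
// is divided by its third component to get back to 2D.
Vec2 imageToWorld(const Mat3& H, double u, double v) {
    double x = H[0][0] * u + H[0][1] * v + H[0][2];
    double y = H[1][0] * u + H[1][1] * v + H[1][2];
    double w = H[2][0] * u + H[2][1] * v + H[2][2];
    assert(std::fabs(w) > 1e-12);   // w == 0 means a point at infinity
    return {x / w, y / w};
}
```

The division by w is what models perspective: pixels nearer the horizon get a smaller w and therefore map to points farther from the robot.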

Further Reference

If you find computer vision interesting, you might want to consider taking Penn State's EE/CMPEN 454 course. This course provides the fundamental math as well as concepts to better understand the basics of computer vision.

Sample Code

// Time-lapse photo software. Created by Jeremy Bridon
// Uses the legacy OpenCV C API and the Windows Sleep() call
#include <cctype>	// toupper
#include <cstdio>	// sprintf
#include <iostream>
#include <windows.h>	// Sleep
#include "cv.h"
#include "cvaux.h"
#include "highgui.h"
using namespace std;

int main()
{
	int totalTime;	// Entered in minutes, stored in seconds
	int timeDelay;	// In seconds
	int currentTime = 0;
	char fileFormat;
	cout << "Enter total time in minutes: ";
	cin >> totalTime;
	totalTime *= 60; // Convert minutes to seconds
	cout << "Enter delay time in seconds: ";
	cin >> timeDelay;
	cout << "Save file format - Enter I for Image or M for Movie: ";
	cin >> fileFormat;
	fileFormat = (char)toupper((unsigned char)fileFormat);
	// A non-positive delay would make the capture loop run forever
	if(!cin || timeDelay <= 0)
	{
		cout << "Invalid time values..." << endl;
		return -1;
	}
	// Initialize the device (default index is 0, will usually find the device)
	CvCapture *myCamera = cvCaptureFromCAM(0);
	if(myCamera == NULL)
	{
		cout << "Unable to find camera..." << endl;
		return -1;
	}
	// Create the Image or Video file format
	CvVideoWriter* myVideo = NULL;
	if(fileFormat == 'M')
	{
		// Create the movie file, sized from a first frame of the camera.
		// A codec of -1 lets the user pick one from a dialog on Windows.
		IplImage *sizeImage = cvQueryFrame(myCamera);
		if(sizeImage != NULL)
			myVideo = cvCreateVideoWriter("TimeLapseVideo.avi", -1, 24, cvGetSize(sizeImage));
		// Error check
		if(myVideo == NULL)
		{
			cout << "Unable to create video format..." << endl;
			cvReleaseCapture(&myCamera);
			return -1;
		}
	}
	else if(fileFormat != 'I')
	{
		// Image output needs no setup, anything else is an error
		cout << "Wrong output format choice..." << endl;
		cvReleaseCapture(&myCamera);
		return -1;
	}
	// Keep looping until we finish the time
	cout << "Starting time-lapse camera" << endl;
	int frameCount = 0;
	while(currentTime < totalTime)
	{
		// Wait the timeDelay (Sleep takes milliseconds)
		Sleep(1000 * timeDelay);
		// Add to time
		currentTime += timeDelay;
		// Take a picture from the device (these two calls are the same as cvQueryFrame())
		IplImage* myImage = NULL;
		if(cvGrabFrame(myCamera))
			myImage = cvRetrieveFrame(myCamera);
		if(myImage == NULL)
		{
			cout << "Unable to grab frame..." << endl;
			break;
		}
		// Add to frame count
		frameCount++;
		cout << "Took picture number: " << frameCount << "." << endl;
		// Now that we have an image we can save it as an IMAGE or put it into a MOVIE
		if(myVideo != NULL)
		{
			cvWriteFrame(myVideo, myImage);
		}
		else
		{
			// Builds names such as CameraImage_1.jpg
			char fileName[64];
			sprintf(fileName, "CameraImage_%d.jpg", frameCount);
			cvSaveImage(fileName, myImage);
		}
	}
	// Release the video writer and camera resources
	if(myVideo != NULL)
		cvReleaseVideoWriter(&myVideo);
	cvReleaseCapture(&myCamera);
	return 0;
}