Date: Sat, 24 Jun 2006 23:09:50 -0500 (CDT)
Subject: road segmentation with YUV chroma

Road segmentation using simple two-channel chroma histogram thresholding (in YUV space, not RGB) in a circular image buffer works well enough for the robot. It just has to be integrated with the rest of the control system.

http://golem5.org/robot1/video/roadseg20060624.mpg

The original images came from a digital camera mounted on an upside-down Bogen tripod that I held while walking down the path. The weight of the tripod helped to stabilize the camera and keep it from bouncing too much with my walk. The upper image is the original, while the lower is the output from CircularImageBuffer::outputRoad().

There is very little source code for this. Not counting third-party open source libraries like jpeglib, the entire onboard robot control and image processing system will probably fit in under 2000 lines of code. Only a few hundred lines of that will be for vision.

In the output, all black means road. The blue line indicates the end of the road. It approximates the direction and speed the robot should take. This will be the output of the monocular vision system.

The road segmentation key area is a small rectangular patch at the bottom center of the frame. This is presumed to be road. The rest of the image is compared against this key in YUV space using only the U and V chroma channels; the Y intensity channel is ignored entirely. This works well enough, although shadows are still a problem: the segmentation is never really accurate when there are any shadows. To address shadows properly, I suspect that information in the Y channel is necessary.

I've also tried running old data sets from previous robot drives through the system. They are far more problematic. One set was done just after dawn and is underexposed.
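
As a rough illustration of the key-patch comparison described above, here is a minimal C++ sketch. It is not the code behind CircularImageBuffer::outputRoad(). The YuvFrame struct, the planar layout, the 32x32 bin size, and the minCount threshold are all assumptions made for the example. It builds a 2D histogram of (U, V) from a patch at the bottom center of the frame, then marks every pixel whose chroma bin appeared often enough in that patch as road.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical planar YUV frame, one byte per pixel per plane (an assumption).
struct YuvFrame {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> y, u, v;
};

constexpr int kBins = 32;  // 256 chroma levels / 8 levels per bin (assumed)
using ChromaHist = std::array<int, kBins * kBins>;

// Map a (U, V) pair to its cell in the 2D chroma histogram.
inline std::size_t binOf(std::uint8_t u, std::uint8_t v) {
    return static_cast<std::size_t>(u >> 3) * kBins + (v >> 3);
}

// Histogram the chroma of a patchW x patchH patch at the bottom center.
ChromaHist keyHistogram(const YuvFrame& f, int patchW, int patchH) {
    assert(patchW > 0 && patchW <= f.width);
    assert(patchH > 0 && patchH <= f.height);
    ChromaHist hist{};  // all bins start at zero
    const int x0 = (f.width - patchW) / 2;
    const int y0 = f.height - patchH;
    for (int row = y0; row < f.height; ++row) {
        for (int col = x0; col < x0 + patchW; ++col) {
            const std::size_t i = static_cast<std::size_t>(row) * f.width + col;
            ++hist[binOf(f.u[i], f.v[i])];
        }
    }
    return hist;
}

// A pixel counts as road when its chroma bin was seen at least minCount
// times in the key patch. The Y plane is never read.
std::vector<bool> segmentRoad(const YuvFrame& f, const ChromaHist& hist,
                              int minCount) {
    assert(f.u.size() == f.v.size());
    std::vector<bool> road(f.u.size());
    for (std::size_t i = 0; i < f.u.size(); ++i) {
        road[i] = hist[binOf(f.u[i], f.v[i])] >= minCount;
    }
    return road;
}
```

A caller would compute keyHistogram() once per frame and pass the result to segmentRoad(). Because only U and V are consulted, a shadow that mostly changes Y may still pass, while a shadow that also shifts chroma will not, which is consistent with the shadow trouble noted above.
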
The dawn images lack sufficient dynamic range for histogram thresholding techniques to work well. The other set was done in the middle of a parking lot, with drivable surface boundaries formed from concrete instead of green grass. The common factor is insufficient dynamic range in the image chroma histogram for easy segmentation. That can result from the available light and the camera sensor, or from an environment with too little difference in color.

I believe that good field vision systems are possible, but they are not easy. It is like chess computers: the concept is simple, but implementing a grandmaster-level system requires significant resources and optimized search. A vision system is similar in that it must use many techniques and somehow evaluate a best guess.
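
One way to make the "insufficient dynamic range" observation measurable is to look at how wide the occupied part of each chroma histogram is. The sketch below is only an illustration of that idea, not part of the robot code. The function names, the tail fraction, and the minimum spread are hypothetical values that would need tuning against real frames.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Width of the band holding the central mass of one chroma plane's histogram.
// tailFrac is the fraction of pixels ignored at each end so that a few
// outlier pixels do not inflate the result (an illustrative choice).
int chromaSpread(const std::vector<std::uint8_t>& plane, double tailFrac) {
    assert(!plane.empty());
    assert(tailFrac >= 0.0 && tailFrac < 0.5);
    std::array<std::size_t, 256> hist{};
    for (std::uint8_t value : plane) ++hist[value];

    const auto tail = static_cast<std::size_t>(tailFrac * plane.size());

    // Walk up from level 0 until more than `tail` pixels have been passed.
    int lo = 0;
    std::size_t seen = hist[0];
    while (lo < 255 && seen <= tail) seen += hist[++lo];

    // Walk down from level 255 in the same way.
    int hi = 255;
    seen = hist[255];
    while (hi > 0 && seen <= tail) seen += hist[--hi];

    return hi - lo;
}

// Treat a frame as segmentable when at least one chroma plane spans
// minSpread levels or more. minSpread is a hypothetical tuning value.
bool hasUsableChroma(const std::vector<std::uint8_t>& u,
                     const std::vector<std::uint8_t>& v,
                     double tailFrac, int minSpread) {
    return chromaSpread(u, tailFrac) >= minSpread ||
           chromaSpread(v, tailFrac) >= minSpread;
}
```

A check like this could flag frames such as the underexposed dawn set or the concrete parking lot before segmentation is attempted, so the control system knows not to trust the vision output for that frame.
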