Examples using the sample tools

Note: The ImageMagick tool display is used throughout. ImageMagick is an excellent free, open source set of general image processing tools.

Image segmentation - finding the road or driveable ground

The ppm2seg tool segments an image using a specified "ground truth" patch. Every pixel in the image is compared with this patch. Pixels sufficiently similar in either luminosity (Y channel) or chromaticity (Cb and Cr channels) are marked as part of the same image segment the patch came from. It would be better to segment using both luma and chroma at the same time (in RGB, with appropriate statistical analysis). Otsu's method automatically computes an optimal segmentation threshold: the one that maximizes the statistical variance between the cluster of segment pixels and the cluster of non-segment pixels. This threshold can also be manually overridden.
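To make the thresholding step concrete, here is a minimal Python sketch of Otsu's method working on a list of integer values. It is only an illustration, not the ppm2seg source: the function name otsu_threshold is hypothetical, and it assumes the values being thresholded are each pixel's distance from the key patch.

```python
def otsu_threshold(values, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Values <= t form one class (the segment), values > t the other.
    Assumption: `values` are integers in the range 0 .. levels - 1.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of values in the low class
    sum0 = 0    # sum of the values in the low class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue        # low class still empty
        w1 = total - w0
        if w1 == 0:
            break           # high class empty, nothing left to split
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

Note that the function only ever looks at the histogram. Shuffling the input values does not change the result.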

We will use an old trick. The surface immediately in front of us, at the bottom of the image frame, is taken as the ground truth key for the road. Assuming we are already on a road, this is probably a safe choice (unless we are turned sideways and heading off the road). First, let's see if this can possibly work by examining the chromaticity image.

cat parkpath.jpg | ./jpg2ppm | ./ppm2chroma -s 8 | display -

Usage:    cat input.ppm | ./ppm2chroma [-s number] > output.ppm
  reduce edge magnitudes by a power of 2 (default is -s 5)
      -s number of bits to right shift

The ppm2chroma command above shows the Cb channel as blue and the Cr channel as red. The shift factor of 8 attenuates detected luminosity (Y channel) edges in green. Normally, it is convenient to see where the blobs of pure chroma align with the edge outlines of the image. That indicates if what we think are objects really do have distinct colors. But this time, we'll tone down the luma edges to make chroma stand out more.
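For readers who want to see where the Y, Cb and Cr channels come from, here is a small Python sketch of the RGB to YCbCr conversion. It uses the full-range equations from the JPEG/JFIF convention. Whether ppm2chroma uses exactly these coefficients is an assumption, and the function name rgb_to_ycbcr is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to 8-bit YCbCr (full-range JFIF equations).

    Assumption: the sample tools may use a different variant of this
    conversion; this is only meant to show the idea.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round and clamp each channel to the 0 .. 255 range.
    return tuple(min(255, max(0, round(c))) for c in (y, cb, cr))
```

A neutral gray pixel lands on Cb = Cr = 128 (no chroma). Bluish pixels push Cb above 128 and reddish pixels push Cr above 128, which is why the two channels can be displayed as blue and red.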

The road surface appears very different from the surrounding grass. It is almost entirely a single color and only has a few small "holes" in it. This means that segmenting the road based on chroma will probably work. If the road were made up of many different blotches of color or filled with large holes, then segmentation alone would not work very well.

cat parkpath.jpg | ./jpg2ppm | ./ppm2seg -x 320 -y 480 -p 16 | display -

Usage:    cat input.ppm | ./ppm2seg [-x column] [-y row] [-p size] [-t threshold] [-l|-c] > output.ppm
  location of key patch (default upper left corner -x 0 -y 0)
      -x column
      -y row
  size of key patch (default pixel square -p 16)
      -p number pixels per side (square size must be multiple of 8)
  segmentation threshold radius (default -t 1)
      -t distance from key patch
  segment in luma or chroma (default is -c)
      -l luminosity, the Y channel
      -c chroma, the Cb and Cr channels

The first attempt at segmenting the road from the image does not work very well. The key patch region is a square 16 pixels on a side at the bottom center of the image. Otsu's method gave a threshold of 3. The road is partially segmented, along with lots of tree branches. But at least none of the grass is touched.
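The idea of comparing every pixel with a key patch can be sketched in a few lines of Python. This is a guess at the general approach, not the actual ppm2seg code. It assumes the key color is the per-channel median of the patch, that similarity is the Euclidean distance in the (Cb, Cr) plane, and that (x, y) names the patch's upper left corner with the patch lying fully inside the image. The function name segment_by_patch is hypothetical.

```python
from statistics import median


def segment_by_patch(chroma, x, y, size, threshold):
    """Mark pixels whose chroma lies within `threshold` of the key patch.

    chroma: 2D list (rows) of (cb, cr) tuples.
    (x, y): upper left corner of the square key patch (assumption).
    Returns a 2D list of booleans, True where the pixel is in the segment.
    """
    patch = [chroma[row][col]
             for row in range(y, y + size)
             for col in range(x, x + size)]
    # Assumption: the key color is the median of each chroma channel.
    key_cb = median(p[0] for p in patch)
    key_cr = median(p[1] for p in patch)

    mask = []
    for row in chroma:
        mask.append([
            ((cb - key_cb) ** 2 + (cr - key_cr) ** 2) ** 0.5 <= threshold
            for cb, cr in row
        ])
    return mask
```

With a small patch, a few outlying pixels can pull the key color away from the typical road color, so fewer road pixels fall inside the threshold radius.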

cat parkpath.jpg | ./jpg2ppm | ./ppm2seg -x 320 -y 480 -p 128 | display -

This time, we try a much larger key patch region. It is a square 128 pixels on a side. Otsu's threshold is still 3. This means that despite the road appearing to be a uniform color, there is still a lot of variation in the pixel values. A fairly large representative sample is required to accurately characterize the road color. The first time, the key patch happened to include outlying pixels too far away from the center of the road pixel values. With a larger sample of the road, the median value more accurately reflects the road's chromaticity.

Still, tree trunks and branches are also being segmented out of the image. Let's see what happens if we segment in luminosity instead of chromaticity.

cat parkpath.jpg | ./jpg2ppm | ./ppm2seg -x 320 -y 480 -p 128 -l | display -

Hmm, this doesn't seem to work very well. The trees are not segmented this time. But now we have random specks in the road and a patch of sky in the distance. Let's try again. This time, we will force the segmentation threshold higher (to include more) than the level of 2 that Otsu's method generated.

cat parkpath.jpg | ./jpg2ppm | ./ppm2seg -x 320 -y 480 -p 128 -l -t 20 | display -

This looks much better. Now we can see the road clearly segmented. And there are only a few isolated specks in the trees (in the reduced JPEG on this page, they are invisible, sorry). On the downside, there is a good bit of the sky segmented too. That's why Otsu's method stops at a lower threshold than the one we just used. It works purely from the pixel value histogram. It has no concept of pixel locations in the image to give proximity or "connectedness" of regions. And of course, it knows nothing about the shape of things like roads in real life. It is purely a statistical technique.

So chroma segmentation identified the road and trees. Luma segmentation identified the road and sky. If we take the intersection, then all we have is the road! That's what we want. Obviously, image segmentation is best done in both luminosity and chromaticity rather than one or the other alone. Color and light intensity both classify the road.
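Taking the intersection of two segmentations is simple once each one is a boolean mask. The Python sketch below assumes both masks are 2D lists of booleans with the same dimensions. The function name intersect_masks is hypothetical and is not one of the sample tools.

```python
def intersect_masks(mask_a, mask_b):
    """Keep only the pixels that are set in both masks.

    Both arguments are 2D lists of booleans with identical dimensions.
    """
    return [[a and b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(mask_a, mask_b)]
```

If the chroma mask covers road and trees while the luma mask covers road and sky, only the road pixels survive the intersection.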

Still, even with pure color image segmentation, it is possible to improve the results. We will again use image segmentation with chromaticity. But now the result is cleaned up with morphological operations (region closing and opening). The thin segmentation "noise" in the trees can be eroded away while the thick road segment we are interested in remains. The ppm2morph tool erodes, dilates, opens and closes using a 5x5 structuring element.

cat parkpath.jpg | ./jpg2ppm | ./ppm2seg -x 320 -y 480 -p 128 | ./ppm2morph -g -o 1 | display -

Usage:    cat input.ppm | ./ppm2morph [-r|-g|-b] [-d num|-e num|-o num|-c num] > output.ppm
  which color channel to process (default is all channels)
      -r red image channel only
      -g green image channel only
      -b blue image channel only
  morphological operation (default is -e 1)
      -d number of times repeat dilation (expand region)
      -e number of times repeat erosion (shrink region)
      -o number of times repeat opening (remove small blobs)
      -c number of times repeat closing (fill in holes)

The image segment has been morphologically "opened" - eroded and then dilated. Much of the thin segmentation noise in the trees disappears during erosion. Of course, erosion also affects the main part of the segment over the road. But dilation restores this for us. While we see improvement, there are still a lot of small segmentation bits in the trees. Morphological opening should be more aggressive to remove them.
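Here is a minimal Python sketch of erosion, dilation and opening on a boolean mask with a square 5x5 structuring element. It is not the ppm2morph implementation. In particular, it treats positions outside the image as unset, which is an assumption about edge handling, and the function names are hypothetical.

```python
def erode(mask, k=5):
    """A pixel stays set only if its whole k x k neighborhood is set.

    Assumption: neighbors outside the image count as unset.
    """
    h, w = len(mask), len(mask[0])
    r = k // 2
    return [[all(0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                 for dy in range(-r, r + 1)
                 for dx in range(-r, r + 1))
             for x in range(w)]
            for y in range(h)]


def dilate(mask, k=5):
    """A pixel becomes set if any pixel in its k x k neighborhood is set."""
    h, w = len(mask), len(mask[0])
    r = k // 2
    return [[any(0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                 for dy in range(-r, r + 1)
                 for dx in range(-r, r + 1))
             for x in range(w)]
            for y in range(h)]


def opening(mask, times=1, k=5):
    """Erode `times` times, then dilate `times` times."""
    for _ in range(times):
        mask = erode(mask, k)
    for _ in range(times):
        mask = dilate(mask, k)
    return mask
```

A thick block survives the erosion as a smaller core and is grown back by the dilation. An isolated speck or a one pixel wide line has no core left after erosion, so dilation has nothing to restore.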

cat parkpath.jpg | ./jpg2ppm | ./ppm2seg -x 320 -y 480 -p 128 | ./ppm2morph -g -o 3 | display -

This time, the morphological opening is more aggressive. The erosion is repeated 3 times. Then it is followed by 3 dilation steps. The stray bits of segment in the trees are gone. The road is clearly segmented. However, there are still some holes in it. Some of that is inevitable as lanes are painted with dashes along the center of the path. But we can still do better (at the cost of more computation - really, none of what we are doing is free and at some point, we just need results that are good enough for a robot or computer vision system to use).

NOTE: The segmented pixels along the left and right sides of the image are a side effect of how the morphological operations are implemented. Edges are really a special case that I decided, for now, not to handle (in the interest of simplicity and speed).

cat parkpath.jpg | ./jpg2ppm | ./ppm2seg -x 320 -y 480 -p 128 | ./ppm2morph -g -o 3 | ./ppm2morph -g -c 3 | display -

An extra round of morphological closing fills in some of the small holes in the road segment. But the larger holes are still there. At this point, we've reached diminishing returns. Another strategy is probably required. We could use luminosity-based road segmentation to improve accuracy. This might be enough to close up some holes in the road segment. Or we could use our knowledge of what a road looks like to ignore the holes we still see. From the very little I've read about Stanford's DARPA Grand Challenge robot named Stanley, it uses all of these strategies to accurately segment the road (and multiple SICK LIDAR units to establish unambiguous ground truth - a luxury many robots and computer vision systems do not have).
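To see why closing fills small holes but leaves large ones, here is a short Python sketch of closing (dilation followed by erosion) with a 5x5 structuring element. As before, this is an illustration under assumptions and not the ppm2morph code: positions outside the image count as unset, and the names morph and closing are hypothetical.

```python
def morph(mask, combine, k=5):
    """Apply `combine` over each k x k window of a boolean mask.

    combine=all gives erosion, combine=any gives dilation.
    Assumption: window positions outside the image count as unset.
    """
    h, w = len(mask), len(mask[0])
    r = k // 2
    return [[combine(0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                     for dy in range(-r, r + 1)
                     for dx in range(-r, r + 1))
             for x in range(w)]
            for y in range(h)]


def closing(mask, times=1, k=5):
    """Dilate `times` times, then erode `times` times."""
    for _ in range(times):
        mask = morph(mask, any, k)   # dilation
    for _ in range(times):
        mask = morph(mask, all, k)   # erosion
    return mask
```

A hole smaller than the structuring element is completely covered during dilation, so the erosion cannot reopen it. A hole wider than the structuring element keeps an unset center after dilation, and the erosion grows it back to its original size.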