Date: Sat, 13 May 2006 19:27:06 -0500 (CDT)
Subject: image segmentation and classifier problems
Content-Type: IMAGE/JPEG; name="img2896.jpg"

Some broad details of Stanley's software (Stanley is Stanford's DARPA Grand Challenge robot, a drive-by-wire VW Touareg) have been made public. For instance, there is a page about road segmentation that combines LIDAR ground truth with an RGB colorspace threshold classifier. Stanford is never mentioned by name, but it is obvious the robot is Stanley.

http://www.intel.com/technology/itj/2005/volume09issue02/art03_learning_vision/p05_road_segmentation.htm

I think that there are enough details in public to duplicate Stanley's vision system. That's what I'm going to do. It will be used first on my robot. Then I'll give it away for free as open source.

In the longer run, a general-purpose image segmentation and feature extraction classifier engine is of great value. There is a lot of technology floating around that requires expert-level knowledge and skill to use. It would be nice to mainstream these ideas as off-the-shelf software.

This is a significant change in the robot project. In a sense, the robot as a physical entity is secondary to software and theory development. What is valuable and useful is the software.

In preparation, I picked up _Image Processing, The Fundamentals_ at Nerdbooks (the warehouse is in Richardson and they will sell directly to you there). The main attraction of this book is in the attached picture. I have learned that very hard books with too much intellectual machismo have little value to me. I cannot understand them!

Like electronics and control theory, image processing is a very deep subject. The big surprise for me is how heavily stochastic methods are used. In university, linear algebra and matrix theory are largely separate from probability. But for control systems and image processing, they overlap.
Unfortunately for me, I am very uncomfortable with the multivariate stochastic mathematics involved, as I never had it in school.

Last week, I saw a wall-sized poster that diagrammed the Al-Qaeda network based on public information from 1992 to 2003. It was the semantic web for that organization. Hundreds of people and events were interrelated in a massive multi-layered graph. This diagram has been shown on television but does not appear to be widely available. One of the guys who made it at Raytheon explained it to me. If you've never seen this diagram, just think of a show like "Alias" or "La Femme Nikita" with some very busy looking graphical display of the terrorist network - that's exactly what this poster was, except that it was real.

What does this poster have to do with robot vision? The connection is that both problems require classifiers, and a classifier needs some way of measuring the distance between things: documents, images, pixels, etc. The robot needs to know which pixels represent road and which are non-road. Then it can know where it is safe to go. The data mining intelligence system needs to derive a basis of actors and events from documents. Then it needs to estimate the interrelationships between them. Another major application of these techniques is search engines.

These problems often lead to representing things as vectors and relationships as transform matrices. All of the stuff about finding eigenvalues then becomes a spectral method for culling out the information from high-dimensional sparse matrices. It is strange how much stuff ends up back in linear algebra and matrix theory at a basic level.
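
To make the "distance between things" idea concrete for the road/non-road case, here is a minimal sketch in Python. It is not Stanley's code and is not taken from the Intel page. It assumes that some pixels are already known to be road (standing in for what the LIDAR would mark as drivable), fits a per-channel mean and standard deviation to their RGB values, and calls a pixel road if it lies within a chosen number of standard deviations of that model. The function names, the sample pixel values, and the threshold of 3.0 are all made up for illustration.

```python
# Hypothetical sketch: label pixels as road / non-road by their distance
# from a color model fitted to pixels already known to be road.
# Assumptions (not from the original post): the per-channel Gaussian model,
# the function names, the sample pixels, and the threshold of 3.0.
import math


def fit_road_model(road_pixels):
    """Return a list of per-channel (mean, std) pairs for (r, g, b) samples."""
    n = len(road_pixels)
    model = []
    for ch in range(3):
        values = [p[ch] for p in road_pixels]
        mean = sum(values) / n
        var = sum((v - mean) ** 2 for v in values) / n
        # Floor the std at 1.0 so a uniform channel cannot divide by zero.
        model.append((mean, max(math.sqrt(var), 1.0)))
    return model


def color_distance(pixel, model):
    """Distance from the road model, measured in standard deviations."""
    return math.sqrt(sum(((pixel[ch] - mean) / std) ** 2
                         for ch, (mean, std) in enumerate(model)))


def is_road(pixel, model, threshold=3.0):
    """True if the pixel's color is within `threshold` of the road model."""
    return color_distance(pixel, model) <= threshold


if __name__ == "__main__":
    # Made-up gray-ish samples standing in for LIDAR-confirmed road pixels.
    known_road = [(100, 100, 100), (110, 108, 105), (95, 97, 99), (105, 103, 101)]
    model = fit_road_model(known_road)
    for pixel in [(102, 101, 100), (30, 160, 40)]:
        print(pixel, "road" if is_road(pixel, model) else "non-road")
```

Treating the three color channels independently keeps the sketch short. A fuller version would use the full covariance matrix of the road samples, which is exactly where the multivariate statistics and the matrix theory meet.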