Thursday, May 21, 2015

Road and Lane Detection: Different Scenarios and Models

Advanced Driver Assistance Systems are an integral part of vehicles today. They can be passive, merely alerting the driver in an emergency, or active, taking over vehicle controls during emergency scenarios. Such systems are expected to reach full autonomy during the next decade. The two major fields of interest in the problem are road and lane perception, and obstacle perception. The former involves detecting road and lane markers to ensure that the vehicle position is correct and to prevent lane departures. Obstacle detection is necessary to prevent collisions with other traffic or with real-world objects such as streetlights, stray animals and pedestrians.

Problem Scope


Road and lane perception includes detecting the extent of the road, the number and position of lanes, and merging and splitting lanes, across different scenarios such as urban, highway or cross-country driving. While the problem may seem trivial given recent advances in image processing and feature detection algorithms, it is complicated by several challenges, such as:

•    Case diversity: Due to a variety of real-world parameters, the system has to be tolerant of a huge diversity of incoming conditions. These include:
  1.     Lane and Road appearance: Color, texture and width of lanes. Road color, width and curvature differences.
  2.     Image clarity: Presence of other vehicles, shadows cast by objects, sudden changes in illumination.
  3.     Visibility conditions: Wet roads, presence of fog or rain, night-time conditions.
•    High reliability demands: In order to be useful and acceptable, the assistance system should achieve very low error rates. A high rate of false positives will lead to driver irritation and rejection, while false negatives will compromise the system and reduce its reliability.

Modalities Used


State-of-the-art research and commercial systems use several perception modalities as sensors. A quick overview of their operation, with pros and cons, is presented here:

1.    Vision: Perhaps the most intuitive approach is to use vision-based systems, as lane and road markers are already optimized for detection by human vision. The use of front-mounted cameras is a nearly standard approach in almost all systems, and it can be argued that since most of the signature of lane marks is in the visual domain, no detection system can totally ignore the vision modality. However, it must be stressed that the robustness of the current state-of-the-art processing algorithms is far from satisfactory, and they also lack the adaptive power of a human driver.

2.    LIDAR: The most prominent emerging technology is the use of Light Detection and Ranging sensors, which can produce a 3D structure of the vehicle's surroundings, thereby increasing robustness, as obstacles are more easily detected in 3D. In addition, LIDARs are active sources, so they are less sensitive to ambient illumination. LIDAR sensors are, however, very expensive.

3.    Stereo-vision: Stereo-vision uses two cameras to obtain 3D information, which is much cheaper in terms of hardware but requires significant software overhead. It also has poorer accuracy and a higher probability of error.

4.    Geographic Information Systems: The use of a prior geographic database together with a known host-vehicle position can, in effect, replace the on-board processing requirement and enable worldwide ‘blind’ autonomous driving. However, such a system needs very accurate, high-resolution positioning of the vehicle, as well as real-time updates of the geographic database with changing traffic dynamics and obstacle positions, from either satellite imagery or GPS measurements. The uncertainty in obtaining and updating highly accurate map information over large terrains has limited it to being a complementary tool to on-board processing.

5.    Vehicle Dynamics: Sensors such as Inertial Measurement Units (IMUs) provide insight into the motion parameters of the vehicle, such as speed, yaw rate and acceleration. This information is used in the temporal integration module to relate data across several time-frames.
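
To make the stereo-vision accuracy trade-off more concrete, here is a minimal sketch based on the standard pinhole relation for a rectified stereo pair, Z = f * B / d (depth from focal length, baseline and disparity). The rig parameters (800 px focal length, 0.5 m baseline, 0.25 px matching error) and the function names are assumptions chosen purely for illustration; they do not describe any particular system.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth Z = f * B / d for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px, baseline_m, depth_m, disparity_error_px):
    """First-order depth uncertainty: dZ ~= Z^2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


# Hypothetical rig, for illustration only.
FOCAL_PX, BASELINE_M, DISP_ERR_PX = 800.0, 0.5, 0.25

for disparity in (40.0, 10.0, 5.0):
    z = depth_from_disparity(FOCAL_PX, BASELINE_M, disparity)
    err = depth_error(FOCAL_PX, BASELINE_M, z, DISP_ERR_PX)
    print(f"disparity {disparity:5.1f} px -> depth {z:5.1f} m, +/- {err:.2f} m")
```

Under these assumptions the same quarter-pixel matching error costs a few centimetres at 10 m but several metres at 80 m, since the uncertainty grows with the square of the distance. This is one reason stereo depth tends to be less accurate at range than an active sensor.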

Generic Solutions


The road and lane detection problem can be broken into the following functional modules. The implementation of these modules differs across research and commercially available systems, but the ‘generic system’ presented here serves as a common skeleton for them.

1.    Image Cleaning: A pre-filter is applied to the image to remove most of the noise and clutter arising from obstacles, shadows, over- and under-exposure, lens flare and vehicle artifacts. If training data is available, or data from previous frames is harnessed, a suitable region of interest can be extracted from the image to reduce processing.

2.    Feature Extraction: Based on the required subtask, low-level features such as road texture, lane marker color and gradient are extracted.

3.    Model Fitting: Based on the evidence gathered, a road-lane model is fitted to the data.

4.    Temporal Integration: The model so obtained is reconciled with the models from previous frames, or with GPS data if available for the region. The new hypothesis is accepted if the difference can be explained by the vehicle dynamics.

5.    Post Processing: After computation of the model, this step involves translation from image to ground coordinates, and data gathering for use in processing of subsequent frames.
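
As a rough illustration of how steps 2 to 4 above fit together, the following sketch runs on a tiny synthetic image: it extracts lane-marker points from horizontal intensity gradients, fits a straight line by least squares, and accepts the new model only if it stays close to the previous one. Everything here is an assumption made for the example: the synthetic frame, the straight-line model, the function names and the fixed thresholds (which stand in for bounds that a real system would derive from vehicle dynamics). Image cleaning and the image-to-ground transform are left out.

```python
def make_synthetic_frame(height=20, width=50, slope=1.0, offset=5.0):
    """Dark road (value 50) with one bright, 2 px wide lane marker (value 200)."""
    frame = [[50] * width for _ in range(height)]
    for y in range(height):
        x = int(round(slope * y + offset))
        for dx in (0, 1):
            if 0 <= x + dx < width:
                frame[y][x + dx] = 200
    return frame


def extract_marker_points(frame, grad_threshold=100):
    """Feature extraction: per row, find a rising and a falling edge
    and return (row, centre column of the bright run between them)."""
    points = []
    for y, row in enumerate(frame):
        rising = falling = None
        for x in range(1, len(row)):
            grad = row[x] - row[x - 1]
            if grad >= grad_threshold and rising is None:
                rising = x
            elif grad <= -grad_threshold and rising is not None:
                falling = x
                break
        if rising is not None and falling is not None:
            points.append((y, (rising + falling - 1) / 2.0))
    return points


def fit_line(points):
    """Model fitting: least-squares fit of x = slope * y + intercept."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_y = sum(y for y, _ in points) / n
    mean_x = sum(x for _, x in points) / n
    var_y = sum((y - mean_y) ** 2 for y, _ in points)
    cov = sum((y - mean_y) * (x - mean_x) for y, x in points)
    slope = cov / var_y
    return slope, mean_x - slope * mean_y


def accept_model(new, previous, max_slope_jump=0.2, max_offset_jump_px=3.0):
    """Temporal integration: accept the new (slope, intercept) only if it
    differs from the previous model by less than the assumed bounds."""
    if previous is None:
        return True
    return (abs(new[0] - previous[0]) <= max_slope_jump
            and abs(new[1] - previous[1]) <= max_offset_jump_px)


previous_model = None
for offset in (5.0, 6.0, 20.0):  # the third frame simulates a spurious detection
    model = fit_line(extract_marker_points(make_synthetic_frame(offset=offset)))
    if accept_model(model, previous_model):
        previous_model = model
        status = "accepted"
    else:
        status = "rejected"
    print(f"slope {model[0]:.2f}, intercept {model[1]:.2f}: {status}")
```

In this toy run the first two frames are accepted and the third, whose marker jumps by many pixels between frames, is rejected. A practical system would typically use a richer road model (for example curved or multi-lane) and a proper tracking filter rather than fixed thresholds.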

Future Prospects


In conclusion, road and lane segmentation are fundamental problems of Driver Assistance Systems. The extent of complexity can range from passive Lane Departure Warning systems to fully autonomous ‘blind’ drivers. The next step forward is to extend the scope of current detection techniques into new domains, and to improve their reliability. The first requires a better understanding and development of new road-scene models that can capture multiple lanes, non-linear topographies and other non-idealities successfully. The reliability challenge is harder, especially for closed-loop systems, where even small error rates may propagate. It might become essential to include modalities other than vision, and to incorporate machine learning to train algorithms better.