VLBV 98 Panel 1: Image/Video feature extraction and segmentation

VLBV 98: Oct 8-9, 1998, University of Illinois at Urbana-Champaign


B. S. Manjunath  (Panel Statement of Purpose):

Image and video segmentation is a crucial initial step for many vision and multimedia applications. With a good segmentation, it is possible to access and manipulate objects in the image and video. It also enables high-level image analysis such as object recognition and scene interpretation. Important applications of segmentation include the emerging object-based video coding standard MPEG-4, and MPEG-7, which is related to content-based retrieval from multimedia databases. Good segmentation tools are crucial to the success of these future standards.

Main Issues:

Generality vs Application Specific: Constraining the problem to specific applications enables the development of robust schemes. This is particularly true for images with well-defined figure-ground separation. However, much work is needed on segmenting more general scenes. Robustness in segmenting large data sets needs to be demonstrated.

Accuracy: Applications such as entertainment, where individual objects are manipulated, require a high level of accuracy. If an automated technique gets 99% of the job right, correcting the remaining 1% may take as much time as the whole job, and this is not acceptable.

However, in other cases, such as applications of MPEG-7, the required precision of segmentation depends on the context. Perhaps different segmentation tools are needed to meet these contrasting requirements.

Image/video features for segmentation: Closer integration of the various image features is needed to achieve better segmentation than is obtained by combining segmentation results based on individual features. This is also true for spatio-temporal segmentation.

Simplicity: Very little parameter tuning is desirable. If it takes a few hours to fine-tune the parameters for each image, the method is not going to be very useful.

Applications enabled by MPEG-4 and MPEG-7 demand simple yet robust segmentation tools, with demonstrated capabilities on large and diverse image collections. Standard data sets with ground truth are needed to benchmark the performance of the algorithms, and researchers should be encouraged to provide results on significant collections of such data.
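
As one illustration of what benchmarking against ground truth could look like, the sketch below scores predicted object masks against ground-truth masks over a collection. The panel statement does not prescribe a metric; intersection over union (IoU) is used here only as one common choice. The function names `iou` and `benchmark`, the binary-mask representation, and the 0.5 pass threshold are all assumptions made for this example.

```python
from typing import Dict, Sequence

# Assumed representation: a 2-D binary mask, 1 = object pixel, 0 = background.
Mask = Sequence[Sequence[int]]


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union of two binary masks of the same shape."""
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")
    inter = 0
    union = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Two empty masks are treated as being in perfect agreement.
    return 1.0 if union == 0 else inter / union


def benchmark(
    preds: Sequence[Mask], truths: Sequence[Mask], threshold: float = 0.5
) -> Dict[str, float]:
    """Summarize per-image IoU over a collection (threshold is illustrative)."""
    if len(preds) != len(truths):
        raise ValueError("need exactly one ground-truth mask per prediction")
    if not preds:
        raise ValueError("empty collection")
    scores = [iou(p, t) for p, t in zip(preds, truths)]
    return {
        "mean_iou": sum(scores) / len(scores),
        "worst_iou": min(scores),
        "pass_rate": sum(s >= threshold for s in scores) / len(scores),
    }


if __name__ == "__main__":
    # Toy data, not real results: two 2x2 images.
    predictions = [[[1, 1], [0, 0]], [[0, 1], [0, 1]]]
    ground_truth = [[[1, 0], [0, 0]], [[0, 1], [0, 1]]]
    print(benchmark(predictions, ground_truth))
```

Reporting the worst case alongside the mean reflects the accuracy concern raised above: an average can hide the few images on which a method fails badly.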