Visual Content Analysis and Coding: The Consumer Perspective
M. Ibrahim Sezan
Sharp Laboratories of America
A Position Statement for Panel 4: Applications
The 1998 International Workshop on Very Low Bitrate Video Coding

Visual content analysis ranges from simple forms, such as video editing, to more advanced forms, such as content-based indexing and retrieval for database applications. Currently, content analysis in any form is difficult and expensive for the consumer.

To appreciate the difficulty, let's start from the very first necessary step: transferring digital video from a digital video (DV) camcorder to a PC. (Consider a DV camcorder since it provides higher visual quality than its analog counterpart.) Recent DV camcorders are equipped with IEEE 1394 connections for digital video transfer to PCs. To use this option, one has to invest in the required PC hardware and software, as well as additional hard disk space to manage the video once it is on the PC platform. Given the high price of the camera itself, this entire setup may be an expensive proposition for the consumer. Even if all the necessary components are available, it is still not an easy task for the consumer to get the digital video into the PC. In short, current solutions are expensive and difficult to use even for the relatively "simple" end goal of video editing. It is reasonable to expect that the situation will improve in the near future as prices drop and the hardware and software become more stable and easier to use.

Now consider the video content analysis problem, perhaps aimed at advanced applications such as database indexing and retrieval. Content analysis aimed at extracting semantic information (the kind most useful for consumers) on the basis of low-level features is especially difficult. In any case, the consumer will have neither the expertise nor the patience to execute complex video analysis algorithms.

So far, I have painted a pessimistic picture of the current situation. The difficulty may suggest that content analysis is a hopeless task for consumer applications. I do not think this should be the case. Content analysis offers great benefits, so innovations will emerge that make these benefits accessible to consumers. Such innovation, however, will not be easy.

For discussion, I would like to propose two major directions that can be adopted to bring content analysis within the reach of the consumer. First, digital video appliances that acquire or receive video should improve in functionality and help the content analysis process. Currently, such video appliances mostly duplicate the functionalities of their analog counterparts. They should become smarter by fully utilizing the capabilities offered by the digital framework. They should do more than capture images or receive broadcasts. They should be capable of absorbing and utilizing additional information that may be readily available (e.g., data services carrying information related to content in DTV broadcast) or easily computable. Such video appliances can provide valuable information for the subsequent content analysis step, which may significantly reduce the complexity of the content analysis task, thereby reducing the burden on the user.

The second direction is not new. Practical computer vision algorithms should be developed that intelligently utilize the available information and the specifics of a well-defined problem to provide a robust solution (instead of addressing problems in their most general form or attacking the holy grail problems of computer vision for information extraction and recognition).

In conclusion, content analysis is currently difficult for the consumer. This situation will change with help from smarter video appliances and practical, problem-specific algorithms.