References:
[1] M. Flickner, et al., "Query by Image and Video Content: The QBIC System,"
IEEE Computer, Vol. 28, No. 9, pp. 23-32, 1995.
[2] H.D. Wactlar, T. Kanade, M.A. Smith and S.M. Stevens, "Intelligent
Access to Digital Video: Informedia Project," IEEE Computer, Vol. 29,
No. 5, pp. 46-52, 1996.
Introduction
CBR in Image and Video Databases
Some Existing CBR Systems/Applications
Digital Library
- An evolution from small databases, to image databases, ..., to
Digital Library
- Tremendous potential for, and challenges to, effective multimedia
information retrieval
e.g., Chabot [Ogle and Stonebraker] -- image database for California
Dept. of Water Resources, and
IBM Digital Library
used in Vatican Library, Indiana University School of Music, etc.
Content-Based Retrieval (CBR)
7.2.1. Basic problems
- How to store multimedia data?
Most modern databases (e.g., Postgres) support
complex types such as variable-length arrays and images.
For digital images stored as BLOBs (binary large objects), one defines the
data format (like a C structure), index strategies, operators, and other methods.
- How to specify search?
- Keyword based
- (Color) histogram based
- Edge-feature based
- Edge pixels
- Sketches (lines and curves)
- Shape (junctions, surfaces, polyhedra, ...)
- Texture based
- Statistics on gray-level images --> co-occurrence
matrix, autocorrelation
- Edge-based methods --> "edgeness" (i.e., edge
density), mean/variance/percentage on edge
orientations/separations
- Motion based (especially in video)
- Knowledge based
- geometric models, e.g., lines and corners on soccer fields
- "Concept queries" in Chabot, e.g., "sunset"
- Possible Improvements:
- Multiresolution approach (low resolution to be faster, focus on
certain areas at higher resolutions for details or motion, ...)
- Working on compressed image and video. Take advantage of
useful information in JPEG, MPEG (DC, AC, motion vectors, ...).
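As an illustration of the texture statistics listed above, here is a minimal sketch of a gray-level co-occurrence matrix and one statistic derived from it. The function names, the offset parameters `dx` and `dy`, and the choice of contrast as the statistic are assumptions made for this example, not part of the original notes.

```python
from collections import Counter

def cooccurrence(image, dx=1, dy=0):
    """Count pairs (a, b): gray level a with gray level b at offset (dx, dy)."""
    counts = Counter()
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height:
                counts[(image[y][x], image[ny][nx])] += 1
    return counts

def contrast(counts):
    """Mean squared gray-level difference over all counted pairs."""
    total = sum(counts.values())
    return sum(n * (a - b) ** 2 for (a, b), n in counts.items()) / total

# Tiny illustrative images: vertical stripes versus a uniform patch.
stripes = [[0, 1, 0, 1],
           [0, 1, 0, 1]]
flat = [[1, 1, 1, 1],
        [1, 1, 1, 1]]

stripes_counts = cooccurrence(stripes)
flat_counts = cooccurrence(flat)
```

With a horizontal offset, the striped image yields a higher contrast than the uniform one, which is the kind of difference a texture query can exploit.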
7.2.2. A Note on Edge Detection
- In digital images, edges mark significant local intensity
changes (discontinuities).
- We will only discuss the gradient-based edge detection method which
relies on first derivatives.
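- Given an image f(x, y) (where f denotes the
intensity), the gradient image can be defined by the following:
\nabla f(x, y) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right)
S(x, y) = \sqrt{ \left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2 }
\theta(x, y) = \arctan\left( \frac{\partial f / \partial y}{\partial f / \partial x} \right)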
Implementation of the gradient-based edge detection
- The partial derivative images can be generated using simple
edge operators (masks consisting of 1 and -1).
- If S(x, y) [the edge magnitude at (x, y)] is greater
than a certain threshold value, then an edge is detected at
(x, y), and its corresponding edge direction is theta.
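The implementation steps above can be sketched as follows. This is a minimal example that assumes S(x, y) is the magnitude of the gradient vector and theta is its angle, and that the derivatives are approximated with forward differences (the mask [-1, 1] applied horizontally and vertically). The function name `detect_edges` and the test image are hypothetical.

```python
import math

def detect_edges(image, threshold):
    """Return {(x, y): theta} for pixels whose edge magnitude exceeds threshold."""
    edges = {}
    height, width = len(image), len(image[0])
    for y in range(height - 1):
        for x in range(width - 1):
            # Partial derivatives from simple masks consisting of -1 and 1.
            fx = image[y][x + 1] - image[y][x]
            fy = image[y + 1][x] - image[y][x]
            magnitude = math.hypot(fx, fy)
            if magnitude > threshold:
                edges[(x, y)] = math.atan2(fy, fx)
    return edges

# A vertical step edge between columns 1 and 2.
step = [[0, 0, 9, 9],
        [0, 0, 9, 9],
        [0, 0, 9, 9]]
edges = detect_edges(step, threshold=5)
```

Only the pixels just left of the intensity step exceed the threshold, and their direction is 0 radians because the intensity changes along x only.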
- Features currently supported by QBIC: query by color, texture, simple
shape, and keywords
- QBIC is used in IBM's Ultimedia Manager on OS/2, and in
DB2 Extenders
Query by example using color histograms
- Example image:
- One of the color histograms for the example image:
- Store a histogram vector for each image in the database,
compare images using mean squared difference,
return images sorted by this metric.
- We'll find these images:
- +: Very, very fast
+: Easily implemented
+: Invariant to small changes in camera angle (and sometimes large ones!)
- -: Sensitive to illumination changes.
-: Sensitive to different levels of gamma correction.
-: Doesn't account for location of color. These two are equivalent:
- Could use three additional color edge images and their
histograms to solve this problem.
- Color constancy problems: two otherwise equivalent scenes will
have different histograms depending on the color of the incident light.
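The query-by-example scheme described above (store one histogram vector per image, compare with mean squared difference, return images sorted by that metric) can be sketched as below. The bin count, the function names, and the toy database are assumptions for illustration; this is not QBIC's actual implementation.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized histogram over quantized (r, g, b) values in 0..255."""
    step = 256 // bins_per_channel
    hist = [0.0] * bins_per_channel ** 3
    for r, g, b in pixels:
        index = ((r // step) * bins_per_channel + g // step) * bins_per_channel + b // step
        hist[index] += 1
    return [count / len(pixels) for count in hist]

def mean_squared_difference(h1, h2):
    return sum((a - b) ** 2 for a, b in zip(h1, h2)) / len(h1)

def query_by_example(example_pixels, index):
    """Return image names sorted from most to least similar to the example."""
    target = color_histogram(example_pixels)
    scores = {name: mean_squared_difference(target, hist)
              for name, hist in index.items()}
    return sorted(scores, key=scores.get)

RED, BLUE = (250, 10, 10), (10, 10, 250)
images = {
    "mostly_red": [RED, RED, RED, BLUE],
    "mostly_blue": [BLUE, BLUE, BLUE, RED],
    "shuffled_red": [BLUE, RED, RED, RED],  # same colors, different locations
}
# Histograms are computed once and stored, as the notes suggest.
index = {name: color_histogram(pixels) for name, pixels in images.items()}
ranking = query_by_example([RED, RED, RED, BLUE], index)
```

The two red-dominated images tie with a distance of zero even though their pixels are arranged differently, which illustrates both the speed of the method and its blindness to color location.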
Possible variations on the theme
- Use other metrics: mean squared difference, mean
absolute difference, cosine of angle, etc.
- Use dominant colors -- the k bins in the histogram that have
the largest counts
- Use other color spaces, and/or uneven weights (e.g., 8 bits for
green, 5 red, 3 blue) --> perceptual weighting
- Also, color histogramming can be augmented by other simple
statistical measures, e.g., mean brightness, moments, etc.
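Two of the variations listed above, alternative metrics and dominant colors, might look like this in code. The function names and sample histograms are hypothetical, and the histograms are assumed to be equal-length lists of bin weights.

```python
import math

def mean_absolute_difference(h1, h2):
    return sum(abs(a - b) for a, b in zip(h1, h2)) / len(h1)

def cosine_similarity(h1, h2):
    """Cosine of the angle between two histogram vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(h1, h2))
    norm = math.sqrt(sum(a * a for a in h1)) * math.sqrt(sum(b * b for b in h2))
    return dot / norm if norm else 0.0

def dominant_bins(hist, k):
    """Indices of the k bins with the largest counts, largest first."""
    return sorted(range(len(hist)), key=lambda i: hist[i], reverse=True)[:k]

h1 = [0.5, 0.3, 0.2, 0.0]
h2 = [0.0, 0.2, 0.3, 0.5]

mad = mean_absolute_difference(h1, h2)
cos_same = cosine_similarity(h1, h1)
top_h1 = dominant_bins(h1, 2)
top_h2 = dominant_bins(h2, 2)
```

Note that the cosine is a similarity (larger means closer), so a ranking based on it sorts in the opposite order from the difference-based metrics.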
Other Queries in QBIC
- By texture: coarseness ("edgeness"), contrast, and edge directionality.
- By sketching: Sketch a shape and use it for a pattern match
(the current demo version doesn't work well).
- By keywords: Often there's a set of keywords and other textual data
associated with a picture.
7.3.2. CBR in Video
References:
[1] R. Zabih, J. Miller and K. Mai, "A feature-based algorithm for
detecting and classifying scene breaks," Proc. ACM Multimedia '95,
pp. 189-200, 1995.
[2] H.J. Zhang, C.Y. Low, S.W. Smoliar and J.H. Wu, "Video parsing,
retrieval and browsing: an integrated and content-based solution,"
Proc. ACM Multimedia '95, pp. 15-24, 1995.
(1) Video Parsing (Temporal Segmentation)
- Scene breaks/transitions (cuts, fades, dissolves, wipes, etc.)
- Based on global representation (e.g., color, intensity
histogram)
+: insensitive to camera or object motions
-: misses scene breaks when the adjacent shots have similar distributions
- Local feature based (e.g., entering/exiting edges in [Zabih,
et al.])
+: capable of using more detailed spatial information
-: sensitive to moving objects and cameras
- Recover camera motion, zoom, pan
- Key frame extraction often follows the temporal
segmentation. In [Zhang, et al.], on average 2-3 key frames are
extracted for each shot.
- Salient still -- all the frames in a shot are reduced to
a single image which is to capture the content and context of the
entire shot (difficult and open problem).
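The global-representation approach to temporal segmentation can be sketched as a comparison of intensity histograms between consecutive frames. This is a simplified example under stated assumptions: frames are flat lists of gray levels in 0..255, the bin count and threshold are arbitrary, and only abrupt cuts are handled (not fades, dissolves, or wipes). The function names are hypothetical.

```python
def intensity_histogram(frame, bins=4):
    """Normalized histogram of gray levels (0..255) in a flat list of pixels."""
    hist = [0.0] * bins
    for value in frame:
        hist[value * bins // 256] += 1
    return [count / len(frame) for count in hist]

def detect_cuts(frames, threshold=0.5):
    """Return indices i where frame i appears to start a new shot."""
    cuts = []
    previous = intensity_histogram(frames[0])
    for i in range(1, len(frames)):
        current = intensity_histogram(frames[i])
        # Sum of absolute bin differences between consecutive frames.
        difference = sum(abs(a - b) for a, b in zip(previous, current))
        if difference > threshold:
            cuts.append(i)
        previous = current
    return cuts

dark, bright = [10, 20, 30, 40], [200, 210, 220, 230]
moved_dark = [40, 30, 20, 10]  # same content with pixels rearranged (motion)
frames = [dark, moved_dark, dark, bright, bright]
cuts = detect_cuts(frames)
```

The rearranged frame does not trigger a cut, matching the "+" noted above; by the same logic, two different shots with similar histograms would be missed.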
(2) Domain Knowledge Based Retrieval
Example 1: Model-based content extraction of soccer games
Example 2: "Concept queries" in Chabot
- When the user chooses the query "sunset" from a menu, the system generates a
query that contains the following:
q.desc ~ "sunset" or
q.hist ~ "mostly Red" or
q.hist ~ "mostly Yellow" or
...
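To make the idea of a concept query concrete, the sketch below evaluates a disjunction of only the three conditions shown above over a few records. Everything here is hypothetical: the record layout, the function names, and in particular the definition of "mostly" as a color covering more than half of the image. It does not reproduce Chabot's actual query language or thresholds.

```python
def mostly(color, threshold=0.5):
    """Hypothetical predicate: `color` covers more than `threshold` of the image."""
    return lambda record: record["hist"].get(color, 0.0) > threshold

def description_matches(word):
    """Predicate: the textual description mentions `word`."""
    return lambda record: word in record["desc"].lower()

def sunset_query(record):
    # Disjunction of the three listed conditions; any elided ones are left out.
    conditions = [description_matches("sunset"), mostly("red"), mostly("yellow")]
    return any(condition(record) for condition in conditions)

# Made-up records for illustration only.
records = [
    {"id": 1, "desc": "Sunset over the reservoir", "hist": {"blue": 0.6, "red": 0.2}},
    {"id": 2, "desc": "Untitled", "hist": {"red": 0.7, "yellow": 0.2}},
    {"id": 3, "desc": "Dam spillway", "hist": {"gray": 0.8}},
]
matches = [r["id"] for r in records if sunset_query(r)]
```

Record 1 matches through its description and record 2 through its color content, which shows how a concept query combines keyword and histogram criteria.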
Example 3: Automatic indexing of real-time video by NTT, Japan
- The goal is to generate an index for the 24-hour programming of a TV station:
Suppose SC = {S1, S2, ..., Sn}, where Si is the time at which the i-th scene
break occurs. Then the set of commercial advertisements can be defined
as:
{(Si, Sj) | j - i > 3, Sj - Si = 15 (or 30)}, which states that if
there are more than 3 scene breaks in a period of 15 (or 30) seconds, then
that period is a commercial.
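The set definition above translates almost directly into code. The sketch below assumes the scene-break times are sorted and given in whole seconds, and it tests for exact equality as the definition does; a practical detector would presumably allow some tolerance. The function name, parameter names, and sample data are hypothetical.

```python
def find_commercials(scene_breaks, lengths=(15, 30), min_index_gap=3):
    """Return (Si, Sj) pairs with j - i > min_index_gap and Sj - Si in lengths."""
    pairs = []
    for i, start in enumerate(scene_breaks):
        # Starting at i + min_index_gap + 1 enforces j - i > min_index_gap.
        for j in range(i + min_index_gap + 1, len(scene_breaks)):
            if scene_breaks[j] - start in lengths:
                pairs.append((start, scene_breaks[j]))
    return pairs

# Illustrative break times in seconds: a burst of cuts between 120 s and 135 s.
breaks = [0, 120, 124, 127, 131, 133, 135, 400]
commercials = find_commercials(breaks)
```

In this sample, only the interval from 120 s to 135 s qualifies: it lasts exactly 15 seconds and its endpoints are five scene breaks apart.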
Further Exploration
QBIC,
IBM Digital Library