
Large Scale Image Processing with Hadoop

Brandyn White (bwhite@cs.umd.edu)
Advisor: Prof. Larry Davis

Outline
- 'Big Data' in Computer Vision
- Map/Reduce and Computer Vision
- Map/Reduce Image Search
- Application: Screenshot Retrieval

'Big Data' in Vision


Traditional Vision: Focus on the model
- Pose Estimation: 2D image -> virtual 3D model + camera
  - Under-constrained, slow, sensitive to noise
- Object Recognition: SVM + features
  - Breaks down with many classes (e.g., every flickr tag)
New Trend: Focus on the data
- DB of images (w/ metadata) -> query image
- Problem becomes similar-image search
- Transfer metadata from DB images to query image
- KNN methods are simple and scalable
  - Clustering, hashing, metric learning
- NLP: rule-based models -> statistical models

Example: Image Search -> Metadata

Query Image -> Retrieved Images (flickr) -> Output Metadata

Metadata on each retrieved image: Tags, Location (GPS), Title, Date, Groups, Comments, Owner, Views
Output metadata for the query image: Tags, Location (GPS)
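
The metadata transfer above can be sketched as a k-nearest-neighbor vote. This is a minimal illustration, not the method used in the talk: the function name knn_transfer_tags, the (feature_vector, tags) record layout, and the plain Euclidean distance are all assumptions.

```python
from collections import Counter
import math

def knn_transfer_tags(query, database, k=3, top_n=2):
    """Return the top_n tags voted for by the k nearest database images.

    `database` is a list of (feature_vector, tags) pairs. Both the layout
    and the Euclidean distance are assumptions made for illustration.
    """
    nearest = sorted(database, key=lambda rec: math.dist(query, rec[0]))[:k]
    votes = Counter(tag for _, tags in nearest for tag in tags)
    return [tag for tag, _ in votes.most_common(top_n)]

# Toy database: three similar "beach" images and one unrelated image.
database = [
    ([0.0, 0.1], ["beach", "sunset"]),
    ([0.1, 0.0], ["beach", "ocean"]),
    ([0.2, 0.2], ["beach", "sunset"]),
    ([5.0, 5.0], ["city", "night"]),
]
tags = knn_transfer_tags([0.05, 0.05], database)
print(tags)
```

The same voting scheme would apply to other metadata fields such as GPS location, with the vote replaced by a suitable aggregate.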

Big Data in Vision: Pose Estimation


Goal: Given an image of a person, estimate 3D pose.

G. Shakhnarovich, P. Viola, and T. Darrell, "Fast pose estimation with parameter-sensitive hashing," October 2003.

Big Data in Vision: Scene Completion


Goal: Given an image and a selected region, fill the region with a plausible texture.

J. Hays and A. A. Efros, "Scene completion using millions of photographs," in SIGGRAPH '07: ACM SIGGRAPH 2007 papers. New York, NY, USA: ACM, 2007, pp. 4+.

Big Data in Vision: IM2GPS


Goal: Given an image, guess where in the world it was taken.

J. Hays and A. A. Efros, "Im2gps: estimating geographic information from a single image," Computer Vision and Pattern Recognition, IEEE Computer Society Conference on, vol. 0, pp. 1-8, 2008.

Big Data in Vision: Object Recognition


Goal: Given an image, select a noun that describes it.

A. Torralba, R. Fergus, and W. T. Freeman, "80 million tiny images: A large data set for nonparametric object and scene recognition," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 30, no. 11, pp. 1958-1970, May 2008.

Big Data in Vision: Pixel Annotation


Goal: Given an image, annotate every pixel (e.g., building).

C. Liu, J. Yuen, and A. Torralba, "Nonparametric scene parsing: Label transfer via dense scene alignment," Computer Vision and Pattern Recognition, IEEE Computer Society Conference on, vol. 0, pp. 1972-1979, 2009.

Big Data in Vision: One Frame Motion


Goal: Given an image, estimate the pixel motion.

C. Liu, J. Yuen, A. Torralba, J. Sivic, and W. T. Freeman, "Sift flow: Dense correspondence across different scenes," in ECCV '08: Proceedings of the 10th European Conference on Computer Vision. Berlin, Heidelberg: Springer-Verlag, 2008, pp. 28-42.


Hadoop+CV: No Reducer

[Diagram: three independent Map tasks, no Reduce]

Example Maps
- Object Detection (e.g., cars, faces)
- Feature Computation (e.g., SIFT)
- Sliding Windows (given a region + image)
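
A map-only job can be sketched in a few lines. This is a toy local runner, not Hadoop code: run_map_only and histogram_mapper are hypothetical names, and a gray-level histogram stands in for a real feature such as SIFT.

```python
def histogram_mapper(image_id, pixels, bins=4):
    """Map: emit (image_id, normalized gray-level histogram).

    The histogram is a stand-in for a real local feature; pixel values
    are assumed to be integers in 0..255.
    """
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    total = float(len(pixels)) or 1.0
    yield image_id, [c / total for c in counts]

def run_map_only(records, mapper):
    """Toy runner: with no reducer, the map output is the job output."""
    out = []
    for key, value in records:
        out.extend(mapper(key, value))
    return out

records = [("img_a", [0, 64, 128, 255]), ("img_b", [10, 20, 30, 40])]
result = run_map_only(records, histogram_mapper)
print(result)
```

Because each record is processed independently, the work splits across machines with no shuffle or sort step.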

Hadoop+CV: Model Creation

[Diagram: three Map tasks feeding one Reduce]

Map: Feature Computation
Reduce: Model Creation
Examples
- Classifiers (e.g., SVM, Bayes)
- Geometry Problems (e.g., RANSAC, SfM)
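
The pattern above can be sketched with a constant map output key, so that a single reduce call sees every feature and fits the model. A nearest-centroid classifier stands in for an SVM or Bayes classifier here. All names (feature_mapper, model_reducer, run_job, classify) and the toy (mean, max) feature are assumptions for illustration.

```python
from collections import defaultdict
import math

def feature_mapper(label, pixels):
    """Map: compute a toy 2-D feature (mean, max) and route it to one reducer."""
    yield "model", (label, [sum(pixels) / len(pixels), float(max(pixels))])

def model_reducer(key, values):
    """Reduce: fit a nearest-centroid model from all (label, feature) pairs."""
    sums, counts = {}, defaultdict(int)
    for label, feat in values:
        acc = sums.setdefault(label, [0.0] * len(feat))
        for i, x in enumerate(feat):
            acc[i] += x
        counts[label] += 1
    yield key, {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def run_job(records, mapper, reducer):
    """Toy local runner: map, group by key, reduce."""
    groups = defaultdict(list)
    for k, v in records:
        for out_k, out_v in mapper(k, v):
            groups[out_k].append(out_v)
    return [out for k in sorted(groups) for out in reducer(k, groups[k])]

def classify(model, feat):
    """Pick the label whose centroid is closest to the feature."""
    return min(model, key=lambda lab: math.dist(model[lab], feat))

records = [("dark", [10, 20]), ("dark", [30, 40]), ("bright", [200, 220])]
model = run_job(records, feature_mapper, model_reducer)[0][1]
print(model, classify(model, [205.0, 210.0]))
```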

Hadoop+CV: Expectation Maximization


[Diagram: input vectors Vec0, Vec1, Vec2 each go to a Map task; the current parameter estimate (in JAR or cache) is available to every Map; all Maps feed one Reduce]

Map: Fit data to model given parameters (E-Step)
Reduce: Compute new model parameters given data (M-Step)
Iterate until stopping conditions are met.
Examples
- Clustering (e.g., K-Means)
- Mixture Models (e.g., MoG)
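
The iterate-until-converged loop can be sketched with 1-D K-Means, where the E-step is the map and the M-step is the reduce. This is a local simulation under stated assumptions: the names e_step, m_step, and iterate, the tolerance, and the iteration cap are all hypothetical. In a real deployment each pass would be a separate job, with the current means shipped to the mappers.

```python
from collections import defaultdict

def e_step(x, means):
    """Map: assign a point to its nearest mean (the current parameters)."""
    nearest = min(range(len(means)), key=lambda j: abs(x - means[j]))
    yield nearest, x

def m_step(cluster, points):
    """Reduce: re-estimate one cluster mean from its assigned points."""
    yield cluster, sum(points) / len(points)

def iterate(data, means, max_iters=20, tol=1e-9):
    """Driver: run map/reduce passes until the means stop moving."""
    for _ in range(max_iters):
        groups = defaultdict(list)
        for x in data:
            for k, v in e_step(x, means):
                groups[k].append(v)
        new_means = list(means)  # clusters with no points keep their old mean
        for k, pts in groups.items():
            for cluster, mean in m_step(k, pts):
                new_means[cluster] = mean
        if max(abs(a - b) for a, b in zip(means, new_means)) < tol:
            return new_means
        means = new_means
    return means

means = iterate([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [0.0, 5.0])
print(means)
```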


Image Retrieval with Hadoop


Analogies between image and text retrieval
- Bag of Words -> Bag of Features
- Document -> Image
- Visual Word: cluster of similar visual features
Compute local image features (e.g., SIFT)
Cluster features (i.e., create visual words)
- Find cluster medians
Make Hamming embeddings (compact feature) [1]
- Efficient binary code (256 -> 8 bytes per feature)
- Compared by Hamming distance
- Benefit: small size means more features fit in memory
Inverted index

[1] H. Jegou, M. Douze, and C. Schmid, "Hamming embedding and weak geometric consistency for large scale image search," in ECCV '08: Proceedings of the 10th European Conference on Computer Vision. Berlin, Heidelberg: Springer-Verlag, 2008, pp. 304-317.
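
The text-retrieval analogy can be made concrete with a small inverted index over visual words. This is a sketch under assumptions: the function names, the 2-D toy features, and the index layout (visual word -> {image_id: term frequency}) are illustrative and not taken from the talk.

```python
from collections import Counter, defaultdict
import math

def nearest_word(feature, centers):
    """Quantize a feature to the index of its nearest cluster center."""
    return min(range(len(centers)), key=lambda j: math.dist(feature, centers[j]))

def build_inverted_index(images, centers):
    """Map each visual word to {image_id: term frequency}."""
    index = defaultdict(dict)
    for image_id, features in images.items():
        words = Counter(nearest_word(f, centers) for f in features)
        for word, tf in words.items():
            index[word][image_id] = tf
    return dict(index)

centers = [[0.0, 0.0], [10.0, 10.0]]          # two visual words
images = {
    "img_a": [[0.1, 0.2], [0.3, 0.1], [9.8, 10.1]],
    "img_b": [[10.2, 9.9]],
}
index = build_inverted_index(images, centers)
print(index)
```

As with a text index, a query only touches the posting lists of the visual words it contains, rather than every database image.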

Hadoop Job Workflow


(Database Images)
-> Image Features (SURF 64D)
-> Remove Dupes (Curr./Prev.)
-> K-Means Clustering (Initial)
-> K-Means Clustering
-> Median Computation
-> Hamming Embedding

Hadoop Job Workflow: Image Features


(Database Images) -> Image Features (SURF 64D)

Map In: (image_url, image_hash, image_data, image_tags)
Map Out: (image_hash, image_url, image_features)

Hadoop Job Workflow: Remove Dupes


Image Features (SURF 64D) -> Remove Dupes (Curr./Prev.)

Map In: [image_hash, image_url, image_features]
  or Map In: [image_hash] (for images already in the DB)
Map Out Key: image_hash
Map Out Val: image_features
Reduce Out: [image_hash, image_feature]
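
One way to read this job: every record is keyed by image_hash, and the reducer drops any hash that also arrived as a hash-only record (already in the DB) while keeping a single copy of duplicates within the current batch. The sketch below simulates that locally. Using None as the marker for DB-only records, the run_job helper, and the example URLs are assumptions.

```python
from collections import defaultdict

def dedupe_mapper(record):
    """Map: key every record by image_hash; DB-only records carry no features."""
    if len(record) == 1:            # [image_hash] for an image already in the DB
        yield record[0], None
    else:                           # [image_hash, image_url, image_features]
        image_hash, _url, features = record
        yield image_hash, features

def dedupe_reducer(image_hash, values):
    """Reduce: skip hashes already in the DB; keep one copy of new ones."""
    values = list(values)
    if any(v is None for v in values):
        return
    for feature in values[0]:       # one output record per feature
        yield image_hash, feature

def run_job(records, mapper, reducer):
    """Toy local runner: map, group by key, reduce."""
    groups = defaultdict(list)
    for rec in records:
        for k, v in mapper(rec):
            groups[k].append(v)
    return [out for k in sorted(groups) for out in reducer(k, groups[k])]

records = [
    ["h1", "http://example.com/1.jpg", [[1.0], [2.0]]],
    ["h1", "http://example.com/1-copy.jpg", [[1.0], [2.0]]],  # duplicate in batch
    ["h2", "http://example.com/2.jpg", [[3.0]]],
    ["h2"],                                                   # already in the DB
]
out = run_job(records, dedupe_mapper, dedupe_reducer)
print(out)
```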

Hadoop Job Workflow: K-Means (init)


Remove Dupes (Curr./Prev.) -> K-Means Clustering (Initial)

Map In: [image_hash, image_feature]
Map Out Key: random value in [0,1]
Map Out Val: image_feature (extended by 1 dim to get count)
1 Reducer (outputs once per cluster)
Reduce Out: [cluster_num, cluster_mean]
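
The slide does not say how the single reducer turns random keys into clusters. One plausible reading is random-partition initialization: the random key selects a bucket, and each bucket is averaged using the appended count. The sketch below follows that reading. The bucket rule int(key * k), the names, and the seeded generator are assumptions.

```python
import random

def init_mapper(image_hash, feature, rng):
    """Map: key each feature by a random number in [0, 1); append a count of 1."""
    yield rng.random(), list(feature) + [1.0]

def init_reducer(keyed_values, k):
    """Single reducer: bucket by random key, then average each bucket."""
    sums = {}
    for key, value in keyed_values:
        cluster = min(int(key * k), k - 1)   # assumed bucket rule
        acc = sums.setdefault(cluster, [0.0] * len(value))
        for i, x in enumerate(value):
            acc[i] += x
    for cluster in sorted(sums):
        *total, count = sums[cluster]        # last slot holds the count
        yield cluster, [t / count for t in total]

rng = random.Random(0)                       # seeded only to make the demo repeatable
features = [("h%d" % i, [float(i), float(2 * i)]) for i in range(100)]
keyed = [kv for h, f in features for kv in init_mapper(h, f, rng)]
means = dict(init_reducer(keyed, k=4))
print(means)
```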

Hadoop Job Workflow: K-Means


K-Means Clustering (Initial) -> K-Means Clustering

File: cluster_means
Map In: [image_hash, image_feature]
Map Out Key: cluster_num (nearest cluster)
Map Out Val: image_feature (extended by 1 dim to get count)
Reduce Out: [cluster_num, cluster_mean]
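
Appending a count of 1 to each feature makes the map output summable, so partial sums can be combined before the shuffle and the reducer only has to divide. The sketch below simulates one pass over two input splits. The combiner step and all names are assumptions, since the slide lists only the map and reduce signatures.

```python
import math

def kmeans_mapper(feature, means):
    """Map: emit (nearest cluster, feature extended with a count of 1)."""
    cluster = min(range(len(means)), key=lambda j: math.dist(feature, means[j]))
    yield cluster, list(feature) + [1.0]

def sum_vectors(vectors):
    """Combiner-style partial sum; the last slot accumulates the count."""
    return [sum(col) for col in zip(*vectors)]

def kmeans_reducer(cluster, vectors):
    """Reduce: divide the summed feature by the summed count."""
    *total, count = sum_vectors(vectors)
    yield cluster, [t / count for t in total]

means = [[0.0, 0.0], [10.0, 10.0]]           # contents of the cluster_means file
split_a = [[0.0, 2.0], [9.0, 11.0]]          # features seen by one map task
split_b = [[2.0, 0.0], [11.0, 9.0]]          # features seen by another

partials = {}
for split in (split_a, split_b):
    local = {}
    for f in split:
        for c, v in kmeans_mapper(f, means):
            local.setdefault(c, []).append(v)
    for c, vs in local.items():              # combine locally before the shuffle
        partials.setdefault(c, []).append(sum_vectors(vs))

new_means = dict(
    out for c in sorted(partials) for out in kmeans_reducer(c, partials[c])
)
print(new_means)
```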

Hadoop Job Workflow: Medians


K-Means Clustering -> Median Computation

File: cluster_means
Map In: [image_hash, image_feature]
Map Out Key: cluster_num (nearest cluster)
Map Out Val: image_feature
Reduce Out: [cluster_num, cluster_median]
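
A reducer for this job might compute a per-dimension median over all features assigned to a cluster. The sketch assumes that reading of cluster_median, and median_reducer is a hypothetical name.

```python
import statistics

def median_reducer(cluster, features):
    """Reduce: per-dimension median of all features assigned to a cluster."""
    yield cluster, [statistics.median(col) for col in zip(*features)]

features = [[1.0, 10.0], [2.0, 30.0], [9.0, 20.0]]
cluster, median = next(median_reducer(0, features))
print(cluster, median)
```

Unlike the mean, an exact median cannot be built from partial sums, which may be why the map output here carries the raw feature without the extra count dimension.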

Hadoop Job Workflow: Ham. Emb.


Median Computation -> Hamming Embedding

File: cluster_means, cluster_medians
Map In: [image_hash, image_feature]
Map Out Key: cluster_num (nearest cluster)
Map Out Val: hamming_embedding
Reduce Out: [cluster_num, hamming_embedding]
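
A simplified version of the embedding step: threshold each of the 64 dimensions against the cluster median and pack the resulting bits into one integer. This is an assumption-laden sketch. The method in [1] first applies a projection to the descriptor, which is omitted here, and the function names are hypothetical. The 256 -> 8 byte figure is consistent with a 64-dimensional descriptor stored as 4-byte floats (64 x 4 = 256 bytes) reduced to 64 bits.

```python
def hamming_embed(feature, median):
    """Pack one bit per dimension: 1 where the feature exceeds the median."""
    code = 0
    for i, (x, m) in enumerate(zip(feature, median)):
        if x > m:
            code |= 1 << i
    return code

def hamming_distance(a, b):
    """Number of differing bits between two codes."""
    return bin(a ^ b).count("1")

median = [0.5] * 64                      # toy cluster median
f1 = [1.0] * 64
f2 = [1.0] * 60 + [0.0] * 4              # differs from f1 in the last 4 dims
c1, c2 = hamming_embed(f1, median), hamming_embed(f2, median)
packed = c1.to_bytes(8, "little")        # 64 bits fit in 8 bytes
print(hamming_distance(c1, c2), len(packed))
```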

Image Retrieval Overview: Query


(Query Image) -> Image Features (SURF 64D)

For each feature...
- Find nearest cluster
- Compute Hamming embedding (using cluster median)
- Vote (tf-idf) for a DB image if one of its features in that cluster has Hamming dist < Thresh
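
The voting step can be sketched against an inverted index keyed by visual word. This is an illustration only: the index layout (visual word -> list of (image_id, code)), the idf-only vote weight used in place of a full tf-idf score, and the function names are assumptions.

```python
from collections import defaultdict
import math

def hamming_distance(a, b):
    """Number of differing bits between two codes."""
    return bin(a ^ b).count("1")

def query_votes(query_features, index, num_images, thresh):
    """Score DB images; query_features is a list of (visual_word, code)."""
    scores = defaultdict(float)
    for word, code in query_features:
        postings = index.get(word, [])
        if not postings:
            continue
        images_with_word = len({img for img, _ in postings})
        idf = math.log(num_images / images_with_word)
        for image_id, db_code in postings:
            if hamming_distance(code, db_code) < thresh:
                scores[image_id] += idf      # simplified vote weight
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

index = {
    7: [("img_a", 0b1111), ("img_b", 0b0000)],
    9: [("img_a", 0b1010), ("img_b", 0b1011), ("img_c", 0b0101)],
}
ranked = query_votes([(7, 0b1110), (9, 0b1010)], index, num_images=4, thresh=2)
print(ranked)
```

The Hamming threshold filters out features that fall in the same cluster but are far apart within it, which plain visual-word matching would count as matches.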


Current Work: PC Help Doc. Retrieval


Goal: Take a screenshot and retrieve books and websites that provide relevant help documentation.

Tom Yeh, Brandyn White, Larry Davis, and Boris Katz


Conclusion
- Vision has 'Big Data' applications
- Many of them are image search applications
- There are common design patterns for M/R + Vision
- Hadoop is useful for image search

References
[1] P. Duygulu, K. Barnard, J. de Freitas, and D. Forsyth, "Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary," in Computer Vision - ECCV 2002, ser. Lecture Notes in Computer Science, 2002, ch. 7, pp. 349-354.
[2] A. Makadia, V. Pavlovic, and S. Kumar, "A new baseline for image annotation," in ECCV '08: Proceedings of the 10th European Conference on Computer Vision. Berlin, Heidelberg: Springer-Verlag, 2008, pp. 316-329.
[3] M. Guillaumin, T. Mensink, J. Verbeek, and C. Schmid, "Tagprop: Discriminative metric learning in nearest neighbor models for image auto-annotation," ICCV 2009.
[4] A. Torralba, R. Fergus, and W. T. Freeman, "80 million tiny images: A large data set for nonparametric object and scene recognition," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 30, no. 11, pp. 1958-1970, May 2008.
