Large Scale Image Processing with Hadoop
Brandyn White bwhite@cs.umd.edu Advisor: Prof. Larry Davis
Outline
'Big Data' in Computer Vision
Map/Reduce and Computer Vision
Map/Reduce Image Search
Application: Screenshot Retrieval
'Big Data' in Vision

Traditional Vision: Focus on the model
  Pose Est.: 2D Image -> Virtual 3D model + Camera
    Under-constrained, slow, sensitive to noise
  Object Recognition: SVM + features
    Breaks with many classes (e.g., every flickr tag)
New Trend: Focus on the data
  DB of images (w/ metadata) -> query image
  Problem becomes similar image search
  Transfer metadata from DB images to query image
  KNN methods simple and scalable
    Clustering, hashing, metric learning
  NLP: rule-based models -> statistical models
Example: Image Search -> Metadata

[Figure: a query image is matched against retrieved images (flickr). Each retrieved image carries metadata: Tags, Location (GPS), Title, Date, Groups, Comments, Owner, Views. Selected metadata (e.g., Tags, Location (GPS)) is transferred to the query image as Output Metadata.]
Big Data in Vision: Pose Estimation

Goal: Given an image of a person, estimate 3D pose.
G. Shakhnarovich, P. Viola, and T. Darrell, "Fast pose estimation with parameter-sensitive hashing," October 2003.
Big Data in Vision: Scene Completion

Goal: Given an image and a selected region, fill the region with a plausible texture.
J. Hays and A. A. Efros, "Scene completion using millions of photographs," in SIGGRAPH '07: ACM SIGGRAPH 2007 papers. New York, NY, USA: ACM, 2007, pp. 4+.
Big Data in Vision: IM2GPS

Goal: Given an image, guess where in the world it was taken.
J. Hays and A. A. Efros, "Im2gps: estimating geographic information from a single image," Computer Vision and Pattern Recognition, IEEE Computer Society Conference on, vol. 0, pp. 1-8, 2008.
Big Data in Vision: Object Recognition

Goal: Given an image, select a noun that describes it.
A. Torralba, R. Fergus, and W. T. Freeman, "80 million tiny images: A large data set for nonparametric object and scene recognition," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 30, no. 11, pp. 1958-1970, May 2008.
Big Data in Vision: Pixel Annotation

Goal: Given an image, annotate every pixel (e.g., building).
C. Liu, J. Yuen, and A. Torralba, "Nonparametric scene parsing: Label transfer via dense scene alignment," Computer Vision and Pattern Recognition, IEEE Computer Society Conference on, vol. 0, pp. 1972-1979, 2009.
Big Data in Vision: One Frame Motion

Goal: Given an image, estimate the pixel motion.
C. Liu, J. Yuen, A. Torralba, J. Sivic, and W. T. Freeman, "Sift flow: Dense correspondence across different scenes," in ECCV '08: Proceedings of the 10th European Conference on Computer Vision. Berlin, Heidelberg: Springer-Verlag, 2008, pp. 28-42.
Hadoop+CV: No Reducer
[Diagram: three independent Map tasks, no Reduce stage]
Example Maps
  Object Detection (e.g., cars, faces)
  Feature Computation (e.g., SIFT)
  Sliding Windows (given a region+image)
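
A map-only job can be sketched locally as follows. This is a minimal illustration, not the deck's code: `toy_histogram` is a hypothetical stand-in for a real feature such as SIFT, and `run_map_only` simply imitates a Hadoop job configured with zero reducers.

```python
def toy_histogram(pixels, bins=4, max_value=256):
    """Hypothetical stand-in feature: a normalized intensity histogram."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    total = float(len(pixels)) or 1.0
    return [c / total for c in counts]


def feature_mapper(image_id, pixels):
    """Map step: one image in, one feature record out. No reducer is needed."""
    yield image_id, toy_histogram(pixels)


def run_map_only(records, mapper):
    """Local imitation of a job with zero reducers: mapper output is the result."""
    for key, value in records:
        for out in mapper(key, value):
            yield out


# Each image is independent, so the map tasks could run on different machines.
images = [("img_a", [0, 10, 200, 255]), ("img_b", [64, 64, 128, 192])]
features = dict(run_map_only(images, feature_mapper))
```

Because no record depends on any other, this pattern scales by simply adding more map tasks.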
Hadoop+CV: Model Creation
[Diagram: three Map tasks feeding one Reduce task]
Map: Feature Computation
Reduce: Model Creation
Examples
  Classifiers (e.g., SVM, Bayes)
  Geometry Problems (e.g., RANSAC, SfM)
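
The map-then-reduce pattern for model creation can be sketched as below. The model here is a deliberately simple nearest-class-mean classifier, chosen as an assumption to keep the example short; the slide's examples (SVM, Bayes, RANSAC) would replace the reducer body. All function names are hypothetical.

```python
from collections import defaultdict


def feature_mapper(label, pixels):
    """Map: compute a small feature vector (mean intensity, intensity range)."""
    yield label, [sum(pixels) / len(pixels), max(pixels) - min(pixels)]


def model_reducer(label, features):
    """Reduce: build the per-class model, here just the mean feature vector."""
    features = list(features)
    dim = len(features[0])
    yield label, [sum(f[d] for f in features) / len(features) for d in range(dim)]


def run_job(records, mapper, reducer):
    """Local imitation of map, shuffle (group by key), and reduce."""
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    results = []
    for key in sorted(groups):
        results.extend(reducer(key, groups[key]))
    return results


training = [
    ("dark", [0, 10, 20]),
    ("dark", [10, 20, 30]),
    ("bright", [200, 220, 240]),
]
model = dict(run_job(training, feature_mapper, model_reducer))
```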
Hadoop+CV: Expectation Maximization

[Diagram: Map tasks over Vec0, Vec1, Vec2, each reading the current Parameter Estimate (in JAR or cache), feeding a Reduce task]
Map: Fit data to model given parameters (E-Step)
Reduce: Compute new model parameters given data (M-Step)
Iterate until stopping conditions are met.
Examples
  Clustering (e.g., K-Means)
  Mixture Models (e.g., MoG)
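
The iterate-until-converged loop can be sketched with one-dimensional K-Means, where the hard assignment plays the role of the E-step. This is an illustrative sketch under assumed names (`e_step`, `m_step`, `iterate`) and an assumed stopping rule (maximum iterations or a small shift in the means); on Hadoop each loop iteration would be a separate job that reads the previous parameter estimate.

```python
def e_step(point, means):
    """Map: assign a point to its nearest mean under the current parameters."""
    nearest = min(range(len(means)), key=lambda i: (point - means[i]) ** 2)
    return nearest, point


def m_step(points):
    """Reduce: the new parameter for one cluster is the mean of its points."""
    return sum(points) / len(points)


def iterate(data, means, max_iters=20, tol=1e-6):
    """Driver: run one map/reduce round per iteration until the means settle."""
    for _ in range(max_iters):
        groups = {}
        for x in data:
            cluster, value = e_step(x, means)
            groups.setdefault(cluster, []).append(value)
        # Clusters that received no points keep their previous mean.
        new_means = [
            m_step(groups[i]) if i in groups else means[i]
            for i in range(len(means))
        ]
        shift = max(abs(a - b) for a, b in zip(means, new_means))
        means = new_means
        if shift < tol:
            break
    return means


data = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
final_means = iterate(data, [0.0, 5.0])
```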
Image Retrieval with Hadoop

Analogies between image and text retrieval
  Bag of Words -> Bag of Features
  Document -> Image
  Visual Word: Cluster of similar visual features
Compute Local Image Features (e.g., SIFT)
Cluster Features (i.e., create visual words)
Find cluster medians
Make Hamming Embeddings (compact feature) [1]
  Efficient binary code (256 -> 8 Bytes per feature)
  Hamming Distance
  Benefit: Small size means more in memory
Inverted Index
[1] H. Jegou, M. Douze, and C. Schmid, "Hamming embedding and weak geometric consistency for large scale image search," in ECCV '08: Proceedings of the 10th European Conference on Computer Vision. Berlin, Heidelberg: Springer-Verlag, 2008, pp. 304-317
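
The text-retrieval analogy can be made concrete with a small sketch: features are quantized to visual words, an image becomes a bag of word counts, and an inverted index maps each word to the images that contain it. The two-dimensional features and the helper names are assumptions for illustration only.

```python
from collections import Counter, defaultdict


def quantize(feature, centers):
    """Map a feature to its visual word: the index of the nearest cluster center."""
    return min(
        range(len(centers)),
        key=lambda i: sum((f - c) ** 2 for f, c in zip(feature, centers[i])),
    )


def bag_of_features(features, centers):
    """Image analogue of a bag of words: visual word -> occurrence count."""
    return Counter(quantize(f, centers) for f in features)


def build_inverted_index(image_features, centers):
    """Visual word -> set of images containing it, as in text retrieval."""
    index = defaultdict(set)
    for image_id, features in image_features.items():
        for word in bag_of_features(features, centers):
            index[word].add(image_id)
    return index


centers = [[0.0, 0.0], [10.0, 10.0]]
image_features = {
    "a": [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]],
    "b": [[10.0, 11.0]],
}
bag_a = bag_of_features(image_features["a"], centers)
index = build_inverted_index(image_features, centers)
```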
Hadoop Job Workflow

(Database Images)
Image Features (SURF 64D)
Remove Dupes (Curr./Prev.)
K-Means Clustering (Initial)
K-Means Clustering
Median Computation
Hamming Embedding
Hadoop Job Workflow: Image Features

(Database Images) -> Image Features (SURF 64D)
Map In: (image_url, image_hash, image_data, image_tags)
Map Out: (image_hash, image_url, image_features)
Hadoop Job Workflow: Remove Dupes

Image Features (SURF 64D) -> Remove Dupes (Curr./Prev.)
Map In: [image_hash, image_url, image_features]
  or Map In: [image_hash] (for images already in the DB)
Map Out Key: image_hash
Map Out Val: image_features
Reduce Out: [image_hash, image_feature]
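
One plausible reading of this job, sketched below, is that grouping by `image_hash` lets the reducer emit each new image once and skip any hash that already exists in the DB. That interpretation of "Curr./Prev.", the `ALREADY_IN_DB` marker, and the example URLs are all assumptions, not details confirmed by the slides.

```python
from collections import defaultdict

ALREADY_IN_DB = None  # hypothetical marker value for hash-only records


def dedupe_mapper(record):
    """Map: key every record by image_hash; hash-only records carry a marker."""
    if len(record) == 1:
        yield record[0], ALREADY_IN_DB
    else:
        image_hash, _image_url, image_features = record
        yield image_hash, image_features


def dedupe_reducer(image_hash, values):
    """Reduce: drop hashes already in the DB, else keep one copy of the features."""
    values = list(values)
    if any(v is ALREADY_IN_DB for v in values):
        return
    yield image_hash, values[0]


def run_job(records):
    """Local imitation of map, group by key, and reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in dedupe_mapper(record):
            groups[key].append(value)
    out = []
    for key in sorted(groups):
        out.extend(dedupe_reducer(key, groups[key]))
    return out


records = [
    ("h1", "http://example.com/a.jpg", [0.1, 0.2]),
    ("h1", "http://example.com/a_copy.jpg", [0.1, 0.2]),  # duplicate in this batch
    ("h2", "http://example.com/b.jpg", [0.3, 0.4]),
    ("h2",),  # h2 is already in the DB
    ("h3", "http://example.com/c.jpg", [0.5, 0.6]),
]
unique = run_job(records)
```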
Hadoop Job Workflow: K-Means (init)

Remove Dupes (Curr./Prev.) -> K-Means Clustering (Initial)
Map In: [image_hash, image_feature]
Map Out Key: random [0,1]
Map Out Val: image_feature (extended by 1 dim to get count)
1 Reducer (outputs once per cluster)
Reduce Out: [cluster_num, cluster_mean]
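
The slide does not spell out how the single reducer turns randomly keyed features into initial means. One simple possibility, assumed here, is that sorting by the random key shuffles the features and the reducer takes the first k as initial cluster means. The count dimension mentioned on the slide is omitted in this sketch, and the fixed seed is only there to make the example repeatable.

```python
import random


def init_mapper(image_feature, rng):
    """Map: attach a random key in [0, 1) so the shuffle randomizes the order."""
    yield rng.random(), image_feature


def init_reducer(pairs_sorted_by_key, k):
    """Single reducer: emit one initial mean per cluster from the first k features."""
    for cluster_num, (_key, feature) in enumerate(pairs_sorted_by_key):
        if cluster_num >= k:
            break
        yield cluster_num, feature


def pick_initial_means(features, k, seed=0):
    """Local imitation of the job: map, sort by key, then reduce once."""
    rng = random.Random(seed)
    pairs = [pair for f in features for pair in init_mapper(f, rng)]
    pairs.sort(key=lambda kv: kv[0])
    return list(init_reducer(pairs, k))


features = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [9.0, 9.0]]
initial_means = pick_initial_means(features, k=2)
```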
Hadoop Job Workflow: K-Means

K-Means Clustering (Initial) -> K-Means Clustering
File: cluster_means
Map In: [image_hash, image_feature]
Map Out Key: cluster_num (nearest cluster)
Map Out Val: image_feature (extended by 1 dim to get count)
Reduce Out: [cluster_num, cluster_mean]
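
The "extended by 1 dim to get count" trick can be sketched as follows: the mapper appends a 1 to each feature, so summing vectors accumulates both the coordinate sums and the number of points, and the reducer divides by the last component. Function names and the two-dimensional data are illustrative assumptions.

```python
from collections import defaultdict


def nearest(feature, means):
    """Index of the closest mean by squared Euclidean distance."""
    return min(
        range(len(means)),
        key=lambda i: sum((f - m) ** 2 for f, m in zip(feature, means[i])),
    )


def kmeans_mapper(feature, means):
    """Map: key by nearest cluster; append 1.0 so sums also carry a count."""
    yield nearest(feature, means), list(feature) + [1.0]


def vector_sum(vectors):
    """Element-wise sum; usable as a combiner since partial sums can be re-summed."""
    return [sum(column) for column in zip(*vectors)]


def kmeans_reducer(cluster_num, vectors):
    """Reduce: divide the summed coordinates by the summed count (last dim)."""
    total = vector_sum(vectors)
    count = total[-1]
    yield cluster_num, [s / count for s in total[:-1]]


means = [[0.0, 0.0], [10.0, 10.0]]  # contents of the cluster_means file
features = [[1.0, 1.0], [3.0, 1.0], [9.0, 11.0], [11.0, 9.0]]
groups = defaultdict(list)
for feature in features:
    for cluster_num, extended in kmeans_mapper(feature, means):
        groups[cluster_num].append(extended)
new_means = dict(
    pair for num in sorted(groups) for pair in kmeans_reducer(num, groups[num])
)
```

Carrying the count inside the vector means a combiner can pre-sum values on each map node without losing the information needed to compute the mean.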
Hadoop Job Workflow: Medians

K-Means Clustering -> Median Computation
File: cluster_means
Map In: [image_hash, image_feature]
Map Out Key: cluster_num (nearest cluster)
Map Out Val: image_feature
Reduce Out: [cluster_num, cluster_median]
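
The reduce side of this job can be sketched as a per-dimension median over all features assigned to a cluster. Unlike the mean, the median cannot be computed from running sums, so this sketch assumes the reducer can hold one cluster's features in memory; the function name is hypothetical.

```python
from statistics import median


def median_reducer(cluster_num, features):
    """Reduce: per-dimension median of all features assigned to this cluster."""
    features = list(features)
    yield cluster_num, [median(column) for column in zip(*features)]


# Features already grouped under cluster 0 by the map and shuffle stages.
cluster_features = [[1.0, 5.0], [2.0, 9.0], [3.0, 7.0], [100.0, 6.0]]
cluster_num, cluster_median = next(median_reducer(0, cluster_features))
```

Note that the outlier value 100.0 barely moves the median, which is one reason a median makes a stable binarization threshold.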
Hadoop Job Workflow: Ham. Emb.

Median Computation -> Hamming Embedding
File: cluster_means, cluster_medians
Map In: [image_hash, image_feature]
Map Out Key: cluster_num (nearest cluster)
Map Out Val: hamming_embedding
Reduce Out: [cluster_num, hamming_embedding]
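
A simplified sketch of the embedding step: each of the 64 dimensions contributes one bit, set when the feature value exceeds the cluster median in that dimension, giving a 64-bit (8-byte) code. The Hamming embedding method of [1] also applies a projection before thresholding; that step is left out here as a simplifying assumption, and the function names are hypothetical.

```python
import struct


def hamming_embedding(feature, cluster_median):
    """Set bit i when feature[i] exceeds the cluster median in dimension i."""
    code = 0
    for i, (value, threshold) in enumerate(zip(feature, cluster_median)):
        if value > threshold:
            code |= 1 << i
    return code


def pack_code(code):
    """Store a 64-bit code in 8 bytes (little-endian unsigned 64-bit integer)."""
    return struct.pack("<Q", code)


def hamming_distance(a, b):
    """Number of differing bits between two codes."""
    return bin(a ^ b).count("1")


cluster_median = [0.5] * 64
feature_a = [0.9] * 32 + [0.1] * 32
feature_b = [0.9] * 30 + [0.1] * 34
code_a = hamming_embedding(feature_a, cluster_median)
code_b = hamming_embedding(feature_b, cluster_median)
```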
Image Retrieval Overview: Query

(Query Image)
Image Features (SURF 64D)
For each feature...
  Find Nearest Cluster
  Compute hamming embedding (using cluster median)
  Vote (tf-idf) for a DB image if one of its features in that cluster has hamming dist < Thresh
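
The query-time voting can be sketched as below. The slides only say the vote is tf-idf weighted, so the exact weighting is an assumption: each matching DB feature adds an idf weight of log(N / number of images containing the word). The index layout (visual word -> list of (image_id, code) pairs) and all names are hypothetical.

```python
import math
from collections import defaultdict


def hamming_distance(a, b):
    """Number of differing bits between two binary codes."""
    return bin(a ^ b).count("1")


def query(query_features, index, num_images, threshold):
    """Score DB images by idf-weighted votes from close Hamming matches.

    query_features: list of (visual_word, code) for the query image.
    index: visual_word -> list of (image_id, code) for DB features.
    """
    scores = defaultdict(float)
    for word, code in query_features:
        postings = index.get(word, [])
        if not postings:
            continue
        images_with_word = len({image_id for image_id, _ in postings})
        idf = math.log(num_images / images_with_word)  # assumed weighting
        for image_id, db_code in postings:
            if hamming_distance(code, db_code) < threshold:
                scores[image_id] += idf
    return sorted(scores.items(), key=lambda kv: -kv[1])


index = {
    0: [("a", 0b1111), ("b", 0b0000)],
    1: [("a", 0b1010)],
}
ranked = query([(0, 0b1110), (1, 0b1010)], index, num_images=3, threshold=2)
```

Only features that fall in the same cluster as a query feature are compared, so the inverted index keeps the number of Hamming comparisons small.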
Current Work: PC Help Doc. Retrieval

Goal: Take a screenshot and retrieve books and websites that provide relevant help documentation.
Tom Yeh, Brandyn White, Larry Davis, and Boris Katz
Conclusion
Vision has 'Big Data' applications
Many image search applications
Common design patterns for M/R+Vision
Hadoop is useful for image search
References
[1] P. Duygulu, K. Barnard, J. de Freitas, and D. Forsyth, "Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary," in Computer Vision - ECCV 2002, ser. Lecture Notes in Computer Science, 2002, ch. 7, pp. 349-354.
[2] A. Makadia, V. Pavlovic, and S. Kumar, "A new baseline for image annotation," in ECCV '08: Proceedings of the 10th European Conference on Computer Vision. Berlin, Heidelberg: Springer-Verlag, 2008, pp. 316-329.
[3] M. Guillaumin, T. Mensink, J. Verbeek, and C. Schmid, "Tagprop: Discriminative metric learning in nearest neighbor models for image auto-annotation," in ICCV 2009.
[4] A. Torralba, R. Fergus, and W. T. Freeman, "80 million tiny images: A large data set for nonparametric object and scene recognition," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 30, no. 11, pp. 1958-1970, May 2008.
