
Doctor in Information Technology 1

Advance Database Management Systems

Assignment 003

Note:
Read the following paper to answer the essay questions:
- Human-powered Sorts and Joins by Marcus et al.
Check for the link to the paper in the Week 5 course module.

Essay Questions:
1. What was the authors’ approach in the execution of joins and sorts?
2. Discuss the facets of this study that made it favorable over similar research in
crowdsourced databases.

Answers:

1. The authors took an experimental approach to executing joins and sorts. For the join
operation, they evaluated four approaches: simple join, naive batching, smart batching, and
alternative join algorithms. For sorting, they evaluated three approaches: comparison-based,
rating-based, and a hybrid of the two. The authors also proposed a number of optimizations,
including task batching, replacing pairwise comparisons with numerical ratings, and
pre-filtering tables before joining them, which dramatically reduce the overall cost of running
sorts and joins on the crowd. The study implemented joins and sorts, two of the most important
database operators, in a system called Qurk. It was the first study to systematically examine
the implementation of these operators in a crowdsourced database.
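
To illustrate why batching and ratings lower cost, the following is a minimal sketch that counts crowd tasks under a simplified model. The function names, the grid layout assumed for smart batching, and the one-rating-per-item assumption are illustrative and are not taken from Qurk itself; the model also ignores the fact that each task is usually assigned to several workers.

```python
import math

def simple_join_tasks(n_left, n_right):
    # Simple join: one candidate pair per task (assumed model).
    return n_left * n_right

def naive_batch_tasks(n_left, n_right, pairs_per_task):
    # Naive batching: several candidate pairs are packed into one task.
    return math.ceil(n_left * n_right / pairs_per_task)

def smart_batch_tasks(n_left, n_right, rows, cols):
    # Smart batching: each task shows `rows` items from one table and
    # `cols` items from the other, and the worker marks matching pairs.
    return math.ceil(n_left / rows) * math.ceil(n_right / cols)

def pairwise_comparisons(n_items):
    # Comparison-based sort: every unordered pair is compared once.
    return n_items * (n_items - 1) // 2

def ratings_needed(n_items):
    # Rating-based sort: each item receives one numerical rating.
    return n_items

if __name__ == "__main__":
    print(simple_join_tasks(30, 30))
    print(naive_batch_tasks(30, 30, 10))
    print(smart_batch_tasks(30, 30, 3, 3))
    print(pairwise_comparisons(40), ratings_needed(40))
```

Under this model, the number of comparisons grows quadratically with the number of items while the number of ratings grows linearly, which is the intuition behind replacing pairwise comparisons with ratings.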

2. The study was very comprehensive, and the authors provided a clear methodology for joining and
sorting. To address cost-based optimization, and to introduce an exciting new landscape for
query optimization and execution research, the authors have built a system called “Qurk”, a
declarative query processing system designed to run queries over a crowd of workers, with
crowd-based filter, join, and sort operators that optimize for some of the parameters. Qurk’s
executor can choose the best implementation or user interface for different operators
depending on the type of question or properties of the data. The executor combines human
computation and traditional relational processing (e.g., filtering images by date before
presenting them to the crowd). Qurk’s declarative interface enables platform independence
with respect to the crowd providing work. Finally, Qurk automatically translates queries into
Human Intelligence Tasks (HITs) and collects the answers in tabular form as they are completed
by workers. The study also provided clear discussions, unlike other MTurk-based studies of
joins and sorts, which do not provide a detailed discussion of implementation alternatives or
performance tradeoffs (e.g., Franklin et al., who used CrowdDB).
Lastly, the results of the study are highly favorable to a company or an organization: the
overall cost fell from $67 in a naive implementation to about $3 without substantially
affecting accuracy or latency, and the end-to-end experiment reduced cost by a factor of 14.5.
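
The idea of combining relational processing with human computation (e.g., filtering images by date before presenting them to the crowd) can be sketched as follows. This is only an illustration: the sample records, the field names, and the helper functions are hypothetical and do not come from the paper or from Qurk.

```python
from datetime import date

# Hypothetical image records; only the date is machine-readable.
photos = [
    {"id": "a", "taken": date(2011, 1, 5)},
    {"id": "b", "taken": date(2011, 3, 9)},
    {"id": "c", "taken": date(2010, 7, 1)},
    {"id": "d", "taken": date(2009, 2, 14)},
]

def prefilter_by_date(rows, start, end):
    # Relational step: evaluated by the machine at no crowd cost.
    return [r for r in rows if start <= r["taken"] <= end]

def candidate_pairs(left, right):
    # Crowd step: each pair would be shown to workers for a join decision.
    return [(l["id"], r["id"]) for l in left for r in right]

if __name__ == "__main__":
    recent = prefilter_by_date(photos, date(2011, 1, 1), date(2011, 12, 31))
    print(len(candidate_pairs(photos, photos)))  # pairs without filtering
    print(len(candidate_pairs(recent, recent)))  # pairs after filtering
```

Because the crowd is paid per task, shrinking the input tables before the join reduces the number of candidate pairs that workers must examine.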
