
Exercise

Show that quicksort’s best-case running time is Ω(n lg n).

Exercise

Show that Randomized-quicksort’s expected running time is Ω(n lg n).

Exercise

A sequence of stack operations is performed on a stack whose size never exceeds k. After every k operations, a copy
of the entire stack is made for backup purposes. Show that the cost of n stack operations, including copying the
stack, is O(n) by assigning suitable amortized costs to the various stack operations.

Solution:

Charge $2 for each PUSH and POP operation and $0 for each COPY. When we call PUSH, we use $1 to pay for
the operation and store the other $1 on the item pushed. When we call POP, we again use $1 to pay for
the operation and store the other $1 in the stack itself. Because the stack size never exceeds k, the actual
cost of a COPY operation is at most $k, which is paid with the $k of credit found on the items in the stack and in the stack
itself. Since there are k PUSH and POP operations between two consecutive COPY operations, $k of
credit has been stored, either on individual items (from PUSH operations) or in the stack itself (from POP operations), by
the time a COPY occurs. Since the amortized cost of each operation is O(1) and the amount of credit never goes
negative, the total cost of n operations is O(n).
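
In symbols (a short check using only the charges defined above, where $\hat{c}_i$ is the amortized cost and $c_i$ the actual cost of the $i$-th operation):

\[
\sum_{i=1}^{n} c_i \;\le\; \sum_{i=1}^{n} \hat{c}_i \;=\; 2\,(\#\mathrm{PUSH} + \#\mathrm{POP}) + 0 \cdot \#\mathrm{COPY} \;\le\; 2n \;=\; O(n).
\]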

Exercise

Discuss and exemplify the Las Vegas and Monte Carlo randomized approaches.
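
For instance (an illustrative C sketch, not a prescribed solution; the function names las_vegas_find and monte_carlo_pi are invented for this example): a Las Vegas algorithm, such as Randomized-Quicksort from the previous exercise, always returns a correct result and only its running time is random, whereas a Monte Carlo algorithm runs within a fixed time budget but its result may be wrong or only approximate.

#include <stdlib.h>

/* Las Vegas: the answer is always correct, the running time is random.
   Example: find the index of a value known to be present by random probing. */
int las_vegas_find(const int * a, int n, int value)
{
    for(;;)
    {
        int i = rand() % n;    /* random probe */
        if(a[i] == value)
            return i;          /* always a correct answer when it returns */
    }
}

/* Monte Carlo: the running time is bounded, the answer is only
   approximately/probably correct. Example: estimate pi from a fixed
   number of random samples in the unit square. */
double monte_carlo_pi(int samples)
{
    int hits = 0, s;
    for(s = 0; s < samples; s++)
    {
        double x = (double) rand() / RAND_MAX;
        double y = (double) rand() / RAND_MAX;
        if(x * x + y * y <= 1.0)
            hits++;            /* point falls inside the quarter circle */
    }
    return 4.0 * hits / samples;
}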
Exercise

Given an algorithm with the following structure:

[Task dependency graph: Task1 precedes Task2, Task3 and Task4; Task2 and Task3 precede Task5; Task5 and Task4 precede Task6.]

and the following values of work: W(Task1) = 100, W(Task2) = 90, W(Task3) = 50, W(Task4) = 250, W(Task5) = 70,
W(Task6) = 300

1. Compute the WORK and the PARALLELISM of the algorithm (Report the complete equations)
2. Suppose that each task corresponds to a call to a function with the same name: write the OpenMP
implementation of the algorithm
(Remember that an OpenMP pragma has the form
#pragma omp <directive>
where <directive> can be parallel, sections, section, master, single, for)
3. Questions (briefly justify your answers):
a. Ideally, what is the minimum number of threads needed to guarantee the maximum parallelism?
b. How many of them are alive during the call of Task5?
c. Does this implementation perform better than the sequential code?
4. Starting from the structure of the proposed algorithm, is there a better parallel algorithm?

Answers:

1. W = 100 + 90 + 50 + 250 + 70 + 300 = 860
   S = 100 + MAX(MAX(90, 50) + 70, 250) + 300 = 100 + 250 + 300 = 650
   P = W/S = 860/650 ≈ 1.32
2. Task1();
   #pragma omp parallel
   {
       #pragma omp sections
       {
           #pragma omp section
           {
               #pragma omp parallel
               {
                   #pragma omp sections
                   {
                       #pragma omp section
                       {
                           Task2();
                       }
                       #pragma omp section
                       {
                           Task3();
                       }
                   }
               }
               Task5();
           }
           #pragma omp section
           {
               Task4();
           }
       }
   }
   Task6();
3. The answers are:
a. The minimum number of threads needed to guarantee the maximum parallelism is 3 (the maximum
number of tasks that can run in parallel: Task2, Task3 and Task4).
b. 2 threads are alive during the call of Task5, since Task5 is executed inside the outer sections construct,
which contains only 2 sections (the inner parallel region has already terminated).
c. It depends on the overhead introduced by the creation and destruction of threads: if this overhead
is too large, the parallel implementation can perform worse than the sequential code.
4. The following algorithm has the same WORK and the same SPAN (Task2 and Task3 are serialized on one
thread, and 90 + 50 + 70 = 210 is still less than W(Task4) = 250), but it requires only two threads to
achieve the maximum parallelism:

[Alternative task graph: Task1 precedes Task3 and Task4; Task3 precedes Task2; Task2 precedes Task5; Task5 and Task4 precede Task6.]
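
A possible OpenMP sketch of this alternative structure (illustrative only; it assumes the tasks are still plain function calls as in point 2, and the num_threads clause is just one way of requesting two threads):

Task1();
#pragma omp parallel num_threads(2)
{
    #pragma omp sections
    {
        #pragma omp section
        {
            Task3();
            Task2();
            Task5();
        }
        #pragma omp section
        {
            Task4();
        }
    }
}
Task6();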
Exercise

Given the following pthread implementation of merge sort, answer these questions:

1. How many threads will execute the application (justify your answer)?
2. How many times does each thread wait at the barrier? How many other threads does each of them wait for (justify your
answer)?
3. Which threads write to even_data[0], odd_data[0], even_data[7] and odd_data[7]?
4. What are the messages printed by thread[2]?
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Data array */
int even_data[8], odd_data[8] = {8,7,6,5,4,3,2,1};
/* Size of data array */
unsigned int array_size = 8;
/* Number of threads */
unsigned int num_threads = 4;
/* Barrier */
pthread_barrier_t barrier;
/* Number of the current round */
unsigned int current_round = 0;
/* Variable used to store the return value of thread 0 */
unsigned int return_thread;

/**
* Merge two subarrays already sorted
* @param input_data is the array containing the two subarrays to be merged
* @param starting_cell is the index of the first cell of the first subarray
* @param size is the sum of the sizes of the two arrays
* @param output_data is where result has to be stored
*/
void bottom_up_merge(int * input_data, int starting_cell, int size, int * output_data);
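
/*
 * NOTE: the body of bottom_up_merge is not given in the original listing; the
 * following is only a possible sketch, assuming the standard two-way merge
 * described by the comment above: the two halves
 * input_data[starting_cell .. starting_cell + size/2 - 1] and
 * input_data[starting_cell + size/2 .. starting_cell + size - 1], already
 * sorted, are merged into output_data[starting_cell .. starting_cell + size - 1].
 * It prints nothing.
 */
void bottom_up_merge(int * input_data, int starting_cell, int size, int * output_data)
{
    /* Cursors in the left and right halves */
    int left = starting_cell;
    int right = starting_cell + size / 2;
    int out;
    for(out = starting_cell; out < starting_cell + size; out++)
    {
        /* Take from the left half while it is not exhausted and its current
           element is not larger than the current element of the right half */
        if(left < starting_cell + size / 2 &&
           (right == starting_cell + size || input_data[left] <= input_data[right]))
            output_data[out] = input_data[left++];
        else
            output_data[out] = input_data[right++];
    }
}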

/**
* Sort a subarray
* @param starting_cell is the index of the first cell of the subarray
* @param size is the size of the subarray to be sorted
* @return true if final data are stored in odd_data
*/
bool bottom_up_sort(unsigned int starting_cell, unsigned int size)
{
    printf("Sorting cell %d %d\n", starting_cell, starting_cell + size);
    /* The size of the subsequence to be sorted in the current iteration */
    int width = 0;
    /* The number of the current iteration */
    int iteration = 0;
    for(width = 2; width < size*2; width = width * 2, iteration++)
    {
        /* The index of the subsequence to be considered */
        int sequence = 0;
        for(sequence = 0; sequence < size/width; sequence++)
        {
            /* Even iteration: the result is stored in even_data */
            if(iteration%2 == 0)
            {
                bottom_up_merge(odd_data, starting_cell + sequence * width, width, even_data);
            }
            else
            {
                bottom_up_merge(even_data, starting_cell + sequence * width, width, odd_data);
            }
        }
    }
    return iteration%2 == 0;
}

/**
* Sort an array
* @param local_data is a pointer to the index identifying the thread
*/
void * sort_array(void * local_data)
{
    /* The index of the current thread */
    const unsigned int thread_index = *((unsigned int *) local_data);
    /* Compute the size of the subarray to be processed */
    unsigned int subarray_size = array_size/num_threads;
    /* The starting cell */
    const unsigned int starting_cell = thread_index * subarray_size;
    /* Sort the assigned array portion */
    bool odd = bottom_up_sort(starting_cell, subarray_size);
    /* The inverse ratio of active threads */
    int active_ratio = 2;
    while(active_ratio <= num_threads)
    {
        printf("Active ratio %d\n", active_ratio);
        /* Wait for all the other threads */
        pthread_barrier_wait(&barrier);
        /* True if this thread has to merge its result with the part on its right */
        const bool merge = thread_index%active_ratio == 0;
        if(merge)
        {
            if(odd)
                bottom_up_merge(odd_data, starting_cell, subarray_size * active_ratio, even_data);
            else
                bottom_up_merge(even_data, starting_cell, subarray_size * active_ratio, odd_data);
        }
        odd = !odd;
        /* Halve the number of active threads */
        active_ratio *= 2;
    }
    if(thread_index == 0)
    {
        return_thread = odd;
        printf("Exiting from thread 0\n");
        fflush(stdout);
        pthread_exit(&return_thread);
    }
    return 0;
}

int main(int argc, char ** argv)
{
    /* The return value of a thread */
    void * return_value;
    /* Thread data structures */
    pthread_t threads[num_threads];
    unsigned int indexes[num_threads];
    pthread_attr_t attr;
    /* Loop index */
    unsigned int index;
    /* Initialize the thread attributes and the barrier */
    pthread_attr_init(&attr);
    pthread_barrier_init(&barrier, NULL, num_threads);
    for(index = 0; index < num_threads; index++)
    {
        indexes[index] = index;
        pthread_create(&threads[index], &attr, sort_array, (void *) &indexes[index]);
    }
    /* Wait for the end of the first thread */
    pthread_join(threads[0], &return_value);
    const int * output_data = *((int *)(return_value)) == 1 ? odd_data : even_data;
    /* Print the final result */
    for(index = 0; index < array_size; index++)
    {
        printf("%d ", output_data[index]);
    }
    printf("\n");
    return 0;
}
Answers:

1. 5 threads: the 4 created threads plus the initial (main) thread.
2. Two times, since the while loop of sort_array is executed twice (active_ratio is 2 before the first iteration, 4 at the
end of the first iteration and 8 at the end of the second iteration, when the loop exits). The barrier was initialized
with num_threads = 4, so 4 threads synchronize at it each time and each of them waits for the 3 other threads.
3. threads[0] is the only thread that writes to odd_data[0] and even_data[0]; odd_data[7] is written only by
threads[2] (when it merges even_data[4..7] into odd_data[4..7]); even_data[7] is written by threads[3] (during its
initial bottom_up_sort) and by threads[0] (during the final merge of odd_data[0..7] into even_data).
4. Sorting cell 4 6
   Active ratio 2
   Active ratio 4
   (bottom_up_sort, which prints the "Sorting cell" message, is called only once by each thread, while the
   "Active ratio" message is printed at the beginning of both iterations of the while loop; bottom_up_merge
   prints nothing.)
