
Data Structures for Java

William H. Ford William R. Topp

Chapter 21 Hashing as Map Implementation


Bret Ford 2005, Prentice Hall
2005 Pearson Education, Inc., Upper Saddle River, NJ. All rights reserved.

Hash Table


Definition

A hash table is a data structure that offers very fast insertion and searching (and also deletion).

Definition

A hash table is also called a map, lookup table, associative array, or dictionary. It is a container that allows direct access by an index of any type. A hash table works like an array, but its index does not have to be an integer. A simple example of a hash table is a dictionary.

Disadvantages of Hash Tables

A hash table is array-based, and an array is hard to enlarge once it has been created. For some kinds of hash table, performance degrades as the table becomes full, so the programmer must estimate from the start how much data will be stored.

Introduction to Hashing

A hash function maps a value to an index in the table. The function provides access to an element, just as an index provides access to an element of an array. Hash tables provide implementations of the Set and Map interfaces.

Introduction to Hashing (continued)

Hashing is a data storage scheme that yields O(1) average retrieval time. Access to an item is therefore independent of the number of other items in the collection.

Introduction to Hashing (continued)

The hash table is an array of references.

Associated with the hash table is a hash function that takes a key as its argument and returns an integer value. Taking the remainder after dividing the hash value by the table size gives a mapping from a key to an index in the table.

Introduction to Hashing (concluded)


Hash value: hf(key) = hashValue
Hash table index: hashValue % n = i

[Figure: a table with slots 0, 1, ..., i, ..., n-1; the entry for key is stored in slot i.]

Using a Hash Function

Suppose the hash function is hf(x) = x, where x is a nonnegative integer (the identity function). Assume the table is the array tableEntry with n = 7 elements (indices 0 to 6).

hf(22) = 22    22 % 7 = 1    tableEntry[1]
hf(4) = 4      4 % 7 = 4     tableEntry[4]

Using a Hash Function (concluded)

With hash function hf() and table size n, the table index for a key is i = hf(key) % n. A collision occurs when two different keys produce hash values that leave the same remainder when divided by n, so both keys map to the same table index.

hf(22) = 22    22 % 7 = 1    tableEntry[1]
hf(36) = 36    36 % 7 = 1    tableEntry[1]
hf(5) = 5      5 % 7 = 5     tableEntry[5]
hf(33) = 33    33 % 7 = 5    tableEntry[5]
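
The index computation and the two collisions above can be reproduced with a few lines of Java. This is only a sketch: the class name CollisionDemo and the helper indexFor() are illustrative names that do not come from the slides.

```java
// Illustrative sketch: identity hash function hf(x) = x with table size n = 7.
// CollisionDemo and indexFor are hypothetical names used only for this example.
class CollisionDemo {
    // identity hash function from the slide
    static int hf(int x) {
        return x;
    }

    // table index for a key in a table of size n
    static int indexFor(int key, int n) {
        return hf(key) % n;
    }

    public static void main(String[] args) {
        int n = 7;
        int[] keys = {22, 36, 5, 33};
        // 22 and 36 share one index, 5 and 33 share another
        for (int key : keys)
            System.out.println("key " + key + " -> tableEntry[" + indexFor(key, n) + "]");
    }
}
```

Keys 22 and 36 land in the same slot, as do 5 and 33, which is exactly the situation a collision strategy has to resolve.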

Designing Hash Functions

Some principles for designing a hash function:

Evaluating the hash function should be efficient.
The hash function should produce hash values that are uniformly distributed. This spreads the hash table indices across the table, which minimizes collisions.

Designing Hash Functions (continued)

The Java programming language provides a hashing function through the method hashCode() in the Object superclass.

public int hashCode()
{ ... }

Designing Hash Functions (continued)

The Object version of hashCode() typically converts the internal address of the object into an integer value. This has limited application because two different objects have different hashCode() values even when they store the same data.

// one and two hold the same data; with the address-based Object
// version, distinct objects holding equal data would not give the
// same integer values for one.hashCode() and two.hashCode()
String one = "java", two = "java";

Designing Hash Functions (continued)

The Integer class provides the identity function for hashCode().

public int hashCode()
{ return value; }

Unless the integer data is randomly distributed, this is not a good hash function.

Designing Hash Functions (continued)

In most hash table applications, the key is a string.

To build an efficient hash function, we must combine the sequence of characters in the string to form an integer.

public int hashCode()
{
   int hash = 0;

   // s holds the n characters of the string
   for (int i = 0; i < n; i++)
      hash = 31*hash + s[i];

   return hash;
}
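
A runnable version of the character-combining loop is sketched below. The names StringHashDemo and polyHash() are illustrative; the loop uses the same multiplier 31 as the slide, and the result can be compared with the value returned by the standard String.hashCode().

```java
// Illustrative sketch of the multiplier-31 string hash shown above.
// StringHashDemo and polyHash are hypothetical names for this example.
class StringHashDemo {
    // combine the characters of s, left to right, using multiplier 31;
    // int arithmetic wraps around silently on overflow
    static int polyHash(String s) {
        int hash = 0;
        for (int i = 0; i < s.length(); i++)
            hash = 31 * hash + s.charAt(i);
        return hash;
    }

    public static void main(String[] args) {
        String[] words = {"and", "uncharacteristically", "algorithm"};
        for (String w : words)
            System.out.println(w + ": polyHash = " + polyHash(w)
                    + ", String.hashCode() = " + w.hashCode());
    }
}
```

For long strings the running value overflows the int range, so a negative result is possible.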

Designing Hash Tables

When two or more data items hash to the same index, they cannot occupy the same position in the table.

Our options are to relocate one of the items to another position in the table (linear probing) or to redesign the table so that each index stores a collection of the keys that collide there (chaining with separate lists).

Linear Probing

The hash table is an array of elements associated with a hash function. To add an item:

Initially, tag each entry in the table as empty.
Apply the hash function to the key and take the remainder after dividing the value by the table size to obtain the index.
If the entry is empty, insert the item. Otherwise, start at the index after the hash index and scan successive indices. Insertion takes place at the first open location.

Designing Hash Functions (concluded)


The following are hash code values for three different strings. The value for string strB is a negative number due to integer overflow.

String strA = "and", strB = "uncharacteristically", strC = "algorithm";
hashValue = strA.hashCode();   // hashValue = 96727
hashValue = strB.hashCode();   // hashValue = -2112884372
hashValue = strC.hashCode();   // hashValue = 225490031

In general, a hash function may result in integer overflow and return a negative number. The following calculation ensures that the table index is nonnegative.

tableIndex = (hashValue & Integer.MAX_VALUE) % tableSize
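
The effect of the mask can be checked with a small sketch. The class name NonNegativeIndexDemo, the helper tableIndex(), and the sample hash value -7 are illustrative choices, not taken from the slides. In Java the % operator keeps the sign of its left operand, so a negative hash value would otherwise give a negative index.

```java
// Illustrative sketch: masking off the sign bit before taking the remainder.
// NonNegativeIndexDemo and tableIndex are hypothetical names.
class NonNegativeIndexDemo {
    // clear the sign bit, then reduce modulo the table size
    static int tableIndex(int hashValue, int tableSize) {
        return (hashValue & Integer.MAX_VALUE) % tableSize;
    }

    public static void main(String[] args) {
        int hashValue = -7;      // sample negative hash value (assumed for illustration)
        int tableSize = 11;
        System.out.println("plain remainder:  " + (hashValue % tableSize));
        System.out.println("masked remainder: " + tableIndex(hashValue, tableSize));
    }
}
```

The mask is preferable to Math.abs(), which still returns a negative number for Integer.MIN_VALUE.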

Linear Probing (continued)

If the search returns to the original hash location without finding an open slot, the table is full and the linear probing algorithm throws an exception.

tableIndex = x % 11

Linear Probing (concluded)

If the table size is large relative to the number of items, linear probing works well, because a good hash function generates indices that are distributed across the whole table and collisions are minimal. As the ratio of table size to number of items approaches 1, performance deteriorates, and in the worst case the algorithm is no better than a sequential search.

Linear Probing (continued)


// compute hash index of item for a table of size n
int index = (item.hashCode() & Integer.MAX_VALUE) % n, origIndex;

// save the original hash index
origIndex = index;

// cycle through the table looking for an empty slot, a
// match or a table full condition (origIndex == index).
do
{
   // test whether the table slot is empty or the key matches
   // the data field of the table entry
   if table[index] is empty
      insert item in table at table[index] and return
   else if table[index] matches item
      return

   // begin a probe starting at the next table location
   index = (index+1) % n;
} while (index != origIndex);

// we have gone around table without finding match or open slot
throw new BufferOverflowException();
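
The pseudocode can be turned into a small runnable class along the following lines. This is a sketch under stated assumptions: the class name LinearProbeTable is hypothetical, the table stores Integer items, and a null slot stands for "empty". The sample values are the eight integers used in the chaining example of this chapter, with table size 11.

```java
import java.nio.BufferOverflowException;

// Illustrative sketch of linear probing; LinearProbeTable is a hypothetical name.
class LinearProbeTable {
    private final Integer[] table;   // a null slot marks an empty entry (assumption)

    LinearProbeTable(int n) {
        table = new Integer[n];
    }

    // insert item if absent; return the index where item is stored
    int add(Integer item) {
        int n = table.length;
        int index = (item.hashCode() & Integer.MAX_VALUE) % n;
        int origIndex = index;
        do {
            if (table[index] == null) {          // open slot: insert here
                table[index] = item;
                return index;
            } else if (table[index].equals(item)) {
                return index;                    // already present
            }
            index = (index + 1) % n;             // probe the next slot, wrapping around
        } while (index != origIndex);
        // went around the whole table without finding a match or an open slot
        throw new BufferOverflowException();
    }

    public static void main(String[] args) {
        LinearProbeTable t = new LinearProbeTable(11);
        int[] items = {54, 77, 94, 89, 14, 45, 35, 76};
        for (int x : items)
            System.out.println(x + " stored at index " + t.add(x));
    }
}
```

Items 45, 35, and 76 each collide with an occupied slot and are pushed to later positions, which shows how clusters build up as the table fills.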

HashCode

The order of the key/value pairs stored in a HashMap table depends on the capacity of the table and on the hash code values of the objects.

Chaining with Separate Lists

Chaining with separate lists defines the hash table as an indexed sequence of linked lists. Each list, called a bucket, holds the set of items that hash to the same table location.

Chaining with Separate Lists (continued)

A bucket is a singly linked list. Each entry of the array is the first node in the sequence of items that hash to that table index. A node has two fields, one for the value and one for the reference to the next node.

Chaining with Separate Lists (continued)

To add an object item, use the hash function to identify the index of the appropriate bucket in the array (table).

If table[i] is null, add item as the first entry in the list. Otherwise begin with the first node, entry = table[i], and compare item with entry.nodeValue. If there is no match, continue the scan with node entry.next, and so forth. If item is not in the list, add it to the

Chaining with Separate Lists (continued)


Consider the following sequence of eight elements {54, 77, 94, 89, 14, 45, 35, 76} with the identity hash function and tableSize = 11. The figure displays the lists. Each entry in a table includes the number of probes to add the element.

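
Since the figure is not reproduced here, the bucket contents for this sequence can be rebuilt with a short sketch. The class name ChainingDemo and the method build() are illustrative, and java.util.LinkedList stands in for the hand-built singly linked lists. New items are appended at the rear of a bucket, which is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Illustrative sketch of chaining with separate lists; ChainingDemo is a hypothetical name.
class ChainingDemo {
    // build a table of buckets using the identity hash function
    static List<LinkedList<Integer>> build(int[] items, int tableSize) {
        List<LinkedList<Integer>> table = new ArrayList<>();
        for (int i = 0; i < tableSize; i++)
            table.add(new LinkedList<>());
        for (int item : items) {
            LinkedList<Integer> bucket = table.get(item % tableSize);
            if (!bucket.contains(item))     // scan only the items in this bucket
                bucket.addLast(item);       // rear insertion (assumption)
        }
        return table;
    }

    public static void main(String[] args) {
        int[] items = {54, 77, 94, 89, 14, 45, 35, 76};
        List<LinkedList<Integer>> table = build(items, 11);
        for (int i = 0; i < table.size(); i++)
            System.out.println("bucket " + i + ": " + table.get(i));
    }
}
```

Only buckets 1 and 10 hold two items; every other item is found on the first comparison.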

Chaining with Separate Lists (concluded)

Chaining with separate lists is generally faster than linear probing because chaining searches only the items that hash to the same table location. With linear probing, the number of table entries is limited to the table size, whereas the linked lists used in chaining grow as needed. To delete an element, simply remove it from the associated list.

Rehashing

As the number of entries in the hash table increases, search performance deteriorates. Rehashing increases the hash table size when the number of entries in the table is a specified percentage of its size.

A Hash Table as a Collection

The generic class Hash stores elements in a hash table using chaining with separate lists and implements the Collection interface.

hashCode() must be provided by the generic type. The constructor creates a hash table with initial size 17. The table grows as rehashing occurs. The method toString() returns a comma-separated list that, by the nature of hashing, is not ordered.

Hash Class Implementation

The hash table is an array whose elements are the first node in a singly linked list.

Define an inner class Entry with an integer field hashValue that stores the hash code value and avoids recomputing the hash function during rehashing.

hashValue = item.hashCode() & Integer.MAX_VALUE;


A Hash Table as a Collection (concluded)


Entry Inner Class


private static class Entry<T>
{
   // value in the hash table
   T value;
   // save value.hashCode() & Integer.MAX_VALUE
   int hashValue;
   // next entry in the linked list
   // of colliding values
   Entry<T> next;

   // entry with given data and node value
   Entry(T value, int hashValue, Entry<T> next)
   {
      this.value = value;
      this.hashValue = hashValue;
      this.next = next;
   }
}

Hash Class Instance Variables


The Entry array, table, defines the singly-linked lists that store the elements. The integer variable hashTableSize specifies the number of entries in the table. The variable tableThreshold has the value
(int)(table.length * MAX_LOAD_FACTOR)

where the double constant MAX_LOAD_FACTOR specifies the maximum allowed ratio of the number of elements in the table to the table size.

Hash Class Instance Variables (concluded)

MAX_LOAD_FACTOR = 0.75 (number of hash table entries is 75% of the table size) is generally a good value. When the number of elements in the table equals tableThreshold, a rehash occurs. The variable modCount is used by iterators to determine whether external updates may have invalidated the scan.

Hash Class Constructor

The Hash class constructor creates the 17-element array table with 17 empty lists. A rehash will first occur when the hash collection size equals 12.

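
The threshold arithmetic can be checked with a short sketch. The class name RehashSchedule and its two helper methods are illustrative names; the formulas are the threshold expression (int)(table.length * MAX_LOAD_FACTOR) and the growth rule 2*table.length + 1 that the Hash class applies when it rehashes.

```java
// Illustrative sketch of the threshold and growth arithmetic.
// RehashSchedule, threshold, and nextSize are hypothetical names.
class RehashSchedule {
    static final double MAX_LOAD_FACTOR = 0.75;

    // number of elements at which a table of the given length is rehashed
    static int threshold(int tableLength) {
        return (int) (tableLength * MAX_LOAD_FACTOR);
    }

    // size of the table after one rehash
    static int nextSize(int tableLength) {
        return 2 * tableLength + 1;
    }

    public static void main(String[] args) {
        int length = 17;
        for (int step = 0; step < 3; step++) {
            System.out.println("table length " + length
                    + ", rehash when size reaches " + threshold(length));
            length = nextSize(length);
        }
    }
}
```

For the initial table, 17 * 0.75 = 12.75 is truncated to 12, which matches the statement that the first rehash occurs at size 12.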

Hash Class Outline


public class Hash<T> implements Collection<T>
{
   // the hash table
   private Entry[] table;
   private int hashTableSize;
   private final double MAX_LOAD_FACTOR = .75;
   private int tableThreshold;
   // for iterator consistency checks
   private int modCount = 0;

   // construct an empty hash table with 17 buckets
   public Hash()
   {
      table = new Entry[17];
      hashTableSize = 0;
      tableThreshold = (int)(table.length * MAX_LOAD_FACTOR);
   }
   . . .
}

Hash Class add()

The algorithm for add():

Compute the hash index for the parameter item and scan the list to see if item is currently in the hash table. If so, return false. Create a new Entry with value item and insert it at the front of the list.

hashValue is assigned to the entry so it will not have to be computed when rehashing occurs.


Hash add() (continued)


// add item to the hash table if it is not
// already present and return true; otherwise,
// return false
public boolean add(T item)
{
   // compute the hash table index
   int hashValue = item.hashCode() & Integer.MAX_VALUE,
       index = hashValue % table.length;
   Entry<T> entry;

   // entry references the front of a linked
   // list of colliding values
   entry = table[index];

Hash add() (continued)


   // scan the linked list and return false
   // if item is in list
   while (entry != null)
   {
      if (entry.value.equals(item))
         return false;
      entry = entry.next;
   }

   // we will add item, so increment modCount
   modCount++;

   // create the new table entry so its successor
   // is the current head of the list
   entry = new Entry<T>(item, hashValue, (Entry<T>)table[index]);

Hash add() (concluded)


   // add it at the front of the linked list
   // and increment the size of the hash table
   table[index] = entry;
   hashTableSize++;

   if (hashTableSize >= tableThreshold)
      rehash(2*table.length + 1);

   // a new entry is added
   return true;
}

Hash Class add() (continued)

Increment hashTableSize and modCount. If hashTableSize >= tableThreshold, call rehash(). The size of the new table is

2*table.length + 1

Hash Class rehash()

The method rehash() takes the size of the new hash table as an argument and performs rehashing.

Create a new table with the specified size and cycle through the nodes in the original table. For each node, use the hashValue field modulo the new table size to hash to the new index. Insert the node at the front of the linked list.

Hash Class rehash() (continued)


private void rehash(int newTableSize)
{
   // allocate the new hash table and
   // record a reference to the current
   // one in oldTable
   Entry[] newTable = new Entry[newTableSize],
           oldTable = table;
   Entry<T> entry, nextEntry;
   int index;

   // cycle through the current hash table
   for (int i=0; i < table.length; i++)
   {
      // record the current entry
      entry = table[i];

Hash Class rehash() (continued)


      // see if there is a linked list present
      if (entry != null)
      {
         // have at least one element in a linked list
         do
         {
            // record the next entry in the
            // original linked list
            nextEntry = entry.next;
            // compute the new table index
            index = entry.hashValue % newTableSize;
            // insert entry at the front of the
            // new table's linked list at
            // location index
            entry.next = newTable[index];
            newTable[index] = entry;

Hash Class rehash() (concluded)


            // assign the next entry in the
            // original linked list to entry
            entry = nextEntry;
         } while (entry != null);
      }
   }

   // the table is now newTable
   table = newTable;
   // update the table threshold
   tableThreshold = (int)(table.length * MAX_LOAD_FACTOR);
   // let garbage collection get rid of oldTable
   oldTable = null;
}

Hash remove()

Compute the hash table index. Using variables prev and curr that move through the linked list in tandem, search for item. If not present, return false; otherwise, remove item from the list. If prev == null, this involves updating table[index] to reference the successor to the front of the list. Decrement hashTableSize, increment modCount, and return true.

Hash remove() (continued)


public boolean remove(Object item)
{
   // compute the hash table index
   int index = (item.hashCode() & Integer.MAX_VALUE) % table.length;
   Entry<T> curr, prev;

   // curr references the front of a
   // linked list of colliding values;
   // initialize prev to null
   curr = table[index];
   prev = null;

   // scan the linked list for item
   while (curr != null)
      if (curr.value.equals(item))
      {
         // we have located item and will remove
         // it; increment modCount
         modCount++;

Hash remove() (continued)


         // if prev is not null, curr is not the front
         // of the list; just skip over curr
         if (prev != null)
            prev.next = curr.next;
         else
            // curr is front of the list; the
            // new front of the list is curr.next
            table[index] = curr.next;

         // decrement hash table size and return true
         hashTableSize--;
         return true;
      }

Hash remove() (concluded)


      else
      {
         // move prev and curr forward
         prev = curr;
         curr = curr.next;
      }

   return false;
}
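
Because the listings for add(), rehash(), and remove() are spread over several slides, it can help to see them working together in one compilable unit. The following is a condensed sketch, not the textbook's Hash class: the name MiniHash and the accessors size() and capacity() are illustrative, and the Collection interface, modCount, and iterators are left out.

```java
// Condensed, illustrative sketch of a chained hash table with rehashing.
// MiniHash, size(), and capacity() are hypothetical names for this example.
class MiniHash<T> {
    private static class Entry<T> {
        final T value;
        final int hashValue;   // cached so rehashing need not call hashCode() again
        Entry<T> next;
        Entry(T value, int hashValue, Entry<T> next) {
            this.value = value;
            this.hashValue = hashValue;
            this.next = next;
        }
    }

    private static final double MAX_LOAD_FACTOR = 0.75;
    private Entry<T>[] table;
    private int size = 0;
    private int threshold;

    @SuppressWarnings("unchecked")
    MiniHash() {
        table = (Entry<T>[]) new Entry[17];
        threshold = (int) (table.length * MAX_LOAD_FACTOR);
    }

    // add item at the front of its bucket unless it is already present
    boolean add(T item) {
        int hashValue = item.hashCode() & Integer.MAX_VALUE;
        int index = hashValue % table.length;
        for (Entry<T> e = table[index]; e != null; e = e.next)
            if (e.value.equals(item))
                return false;
        table[index] = new Entry<>(item, hashValue, table[index]);
        if (++size >= threshold)
            rehash(2 * table.length + 1);
        return true;
    }

    // unlink item from its bucket if present
    boolean remove(Object item) {
        int index = (item.hashCode() & Integer.MAX_VALUE) % table.length;
        Entry<T> prev = null;
        for (Entry<T> curr = table[index]; curr != null; prev = curr, curr = curr.next)
            if (curr.value.equals(item)) {
                if (prev != null) prev.next = curr.next;
                else table[index] = curr.next;
                size--;
                return true;
            }
        return false;
    }

    // move every node into a larger table, reusing the cached hash values
    @SuppressWarnings("unchecked")
    private void rehash(int newSize) {
        Entry<T>[] newTable = (Entry<T>[]) new Entry[newSize];
        for (Entry<T> head : table) {
            Entry<T> e = head;
            while (e != null) {
                Entry<T> next = e.next;
                int index = e.hashValue % newSize;
                e.next = newTable[index];
                newTable[index] = e;
                e = next;
            }
        }
        table = newTable;
        threshold = (int) (newSize * MAX_LOAD_FACTOR);
    }

    int size() { return size; }
    int capacity() { return table.length; }
}
```

Adding twelve distinct items to a new MiniHash triggers the first rehash, after which capacity() reports 35.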

Hash Class Iterators

Search the hash table for the first nonempty bucket in the array of linked lists. Once the bucket is located, the iterator traverses all of the elements in the corresponding linked list and then continues the process by looking for the next nonempty bucket. The iterator reaches the end of the table when it reaches the end of the list for the last nonempty bucket.

Hash Class Iterators (continued)

Iterator objects are instances of the inner class IteratorImpl whose variables are:

Integer index that identifies the current bucket (table[index]) scanned by the iterator.
The Entry reference next pointing to the current node in the current bucket.
The variable lastReturned that references the last value returned by next().
The iterator variable expectedModCount used in conjunction with the collection variable modCount.

Hash Class Iterators (continued)


// inner class that implements hash table iterators
private class IteratorImpl implements Iterator<T>
{
   // next entry to return
   Entry<T> next;
   // to check iterator consistency
   int expectedModCount;
   // index of current bucket
   int index;
   // reference to the last value returned by next()
   T lastReturned;
   . . .
}

Hash Class Iterators (continued)

The elements enter the collection in the order (19, 32, 11, 27) using the identity hash function. The iterator visits the elements in the order (11, 32, 27, 19).
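
The figure for this example is not reproduced, so the table size is not known. The sketch below assumes a table of size 10, chosen only because it reproduces the stated visiting order with front insertion; the class name IterationOrderDemo and the method iterationOrder() are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of bucket-by-bucket traversal.
// IterationOrderDemo is a hypothetical name; the table size 10 used in main
// is an assumption, not a value given in the slides.
class IterationOrderDemo {
    private static class Node {
        final int value;
        final Node next;
        Node(int value, Node next) {
            this.value = value;
            this.next = next;
        }
    }

    // insert items with the identity hash function, then visit bucket 0, 1, 2, ...
    static List<Integer> iterationOrder(int[] items, int tableSize) {
        Node[] table = new Node[tableSize];
        for (int item : items) {
            int index = item % tableSize;
            table[index] = new Node(item, table[index]);   // front insertion
        }
        List<Integer> order = new ArrayList<>();
        for (Node head : table)
            for (Node n = head; n != null; n = n.next)
                order.add(n.value);
        return order;
    }

    public static void main(String[] args) {
        int[] items = {19, 32, 11, 27};
        System.out.println(iterationOrder(items, 10));
    }
}
```

The visiting order depends on the bucket indices rather than on insertion order, which is why a hash collection is described as unordered.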

Hash Iterator Constructor

A loop iterates up the list of buckets until it locates the first nonempty bucket. The loop variable i becomes the initial value for index and table[i] references the front of the list. This is the initial value for next.


Hash Iterator Constructor (concluded)


IteratorImpl()
{
   int i = 0;
   Entry<T> n = null;

   // the expected modCount starts at modCount
   expectedModCount = modCount;

   // find the first nonempty bucket
   if (hashTableSize != 0)
      while (i < table.length && ((n = table[i]) == null))
         i++;

   next = n;
   index = i;
   lastReturned = null;
}

Hash Iterator next()

The method next() first determines that the operation is valid by checking that modCount and expectedModCount are equal and that we are not at the end of the hash table. If the iterator is in a consistent state, next() saves entry.value in lastReturned and uses a loop index i and entry to perform the iterator scan for the subsequent element in the hash table. The return value is lastReturned.

Hash Iterator next() (continued)


public T next()
{
   // check for iterator consistency
   if (modCount != expectedModCount)
      throw new ConcurrentModificationException();

   // we will return the value in Entry object next
   Entry<T> entry = next;

   // if entry is null, we are at the end of the table
   if (entry == null)
      throw new NoSuchElementException();

   // capture the value we will return
   lastReturned = entry.value;

   // move to the next entry in the current
   // linked list
   Entry<T> n = entry.next;
   // record the current bucket index
   int i = index;

Hash Iterator next() (concluded)


   if (n == null)
   {
      // we are at the end of a bucket; search for the
      // next nonempty bucket
      i++;
      while (i < table.length && (n = table[i]) == null)
         i++;
   }

   index = i;
   next = n;

   return lastReturned;
}

Hash Iterator remove()

The remove() method first determines that the operation is valid by checking that lastReturned is not null and that modCount and expectedModCount are equal. If all is well, the iterator remove() method calls the Hash class remove() method with lastReturned as the argument. By assigning to expectedModCount the current value of modCount, the iterator remains consistent.

Hash Iterator remove() (concluded)


public void remove()
{
   // check for a missing call to next()
   if (lastReturned == null)
      throw new IllegalStateException(
         "Iterator call to next() " +
         "required before calling remove()");
   if (modCount != expectedModCount)
      throw new ConcurrentModificationException();

   // remove lastReturned by calling remove() in Hash;
   // this call will increment modCount
   Hash.this.remove(lastReturned);
   expectedModCount = modCount;
   lastReturned = null;
}

The HashMap Collection

The design of the HashMap collection is similar to the implementation of TreeMap. A HashMap is not ordered since the position of elements depends on hashing the keys. This affects the method toString() which returns a listing of the elements based on the iterator order.


The HashMap Collection (continued)

The HashMap class stores elements in a hash table containing linked lists of Entry objects. The inner class Entry contains key-value pairs.


The HashMap Collection (continued)

The inner class Entry implements the Map.Entry interface which defines the methods getKey(), getValue() and setValue(). A toString() method returns a representation of an entry in the format "key=value". The constructor has arguments for each field in the node.


Entry Class (partial listing)


static class Entry<K,V> implements Map.Entry<K,V>
{
   K key;
   V value;
   Entry<K,V> next;
   int hashValue;

   // make a new entry with given key, value
   Entry(K key, V value, int hashValue, Entry<K,V> next)
   {
      this.key = key;
      this.value = value;
      this.hashValue = hashValue;
      this.next = next;
   }
   ...
}

Accessing Entries in a HashMap

The methods get() and containsKey() take a key reference argument and must locate a corresponding entry in the map.

This task is performed by the private HashMap method getEntry() which takes a key as an argument, applies the hash function to the key and searches the resulting list for a key-value pair with the same key.

Accessing Entries in a HashMap (continued)


// return a reference to the entry with the specified key
// if there is one in the hash map; otherwise, return null
public Entry<K,V> getEntry(K key)
{
   int index = (key.hashCode() & Integer.MAX_VALUE) % table.length;
   Entry<K,V> entry;

   entry = table[index];

   while (entry != null)
   {
      if (entry.key.equals(key))
         return entry;
      entry = entry.next;
   }

   return null;
}

Accessing Entries in a HashMap (concluded)


// returns the value that corresponds to
// the specified key
public V get(K key)
{
   Entry<K,V> p = getEntry(key);

   if (p == null)
      return null;
   else
      return p.value;
}

Updating Entries in a HashMap

The method put() updates the HashMap.

Construct a table index by applying the hash function for the key and scan the linked list for a match with the key. If a match occurs, apply setValue() and return its result. If key does not occur in the list, insert a new Entry object at the front of the linked list. If the hash map size has reached the table threshold, apply rehashing. Conclude by returning null.

Updating Entries in a HashMap (continued)


// assigns value as the value associated with key
// in this map and returns the previous value
// associated with the key, or null if there
// was no mapping for the key
public V put(K key, V value)
{
   // compute the hash table index
   int hashValue = key.hashCode() & Integer.MAX_VALUE,
       index = hashValue % table.length;
   Entry<K,V> entry;

   // entry references the front of a linked
   // list of colliding values
   entry = table[index];

Updating Entries in a HashMap (continued)


   // scan the linked list. if key matches the key in an
   // entry, return entry.setValue(value). this
   // replaces the value in the entry and returns the
   // previous value
   while (entry != null)
   {
      if (entry.key.equals(key))
         return entry.setValue(value);
      entry = entry.next;
   }

   // we will add item, so increment modCount
   modCount++;

   // create the new table entry so its successor
   // is the current head of the list
   entry = new Entry<K,V>(key, value, hashValue, (Entry<K,V>)table[index]);

Updating Entries in a HashMap (concluded)


   // add it at the front of the linked list
   // and increment the size of the hash map
   table[index] = entry;
   hashMapSize++;

   if (hashMapSize >= tableThreshold)
      rehash(2*table.length + 1);

   // a new entry is inserted
   return null;
}
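
The return-value contract of put() described above (null for a new key, the previous value for an existing key) is the same contract documented for the standard java.util.HashMap, so it can be observed with the library class standing in for the textbook implementation. The class name PutDemo and the helper putTwice() are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the put() return values, using java.util.HashMap
// as a stand-in for the textbook class. PutDemo and putTwice are hypothetical names.
class PutDemo {
    // put two values under the same key and report what each call returned
    static Integer[] putTwice(Map<String, Integer> map, String key, int first, int second) {
        Integer r1 = map.put(key, first);    // key is new: returns null
        Integer r2 = map.put(key, second);   // key exists: returns the previous value
        return new Integer[] {r1, r2};
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        Integer[] results = putTwice(map, "java", 1, 2);
        System.out.println("first put returned:  " + results[0]);
        System.out.println("second put returned: " + results[1]);
        System.out.println("value now stored:    " + map.get("java"));
    }
}
```

The second call replaces the value without adding an entry, so the map still holds a single key.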

Summary of HashMap Design


HashSet Class

The HashSet class uses a HashMap by composition. The class defines a static Object reference called PRESENT. This becomes the value component for each entry in the map. The constant reference serves as a dummy placeholder in an entry pair.

Declare a private instance variable map of type HashMap having T as the type of the set elements and Object as the value type. The constructor instantiates the map collection. This has the effect of creating an empty set.

HashSet Class (continued)


public class HashSet<T> implements Set<T>
{
   // value for each key in the map
   private static final Object PRESENT = new Object();
   // set implemented using a hash map
   private HashMap<T, Object> map;

   // create an empty set object
   public HashSet()
   { map = new HashMap<T,Object>(); }
   . . .
}

HashSet add()

The set methods are implemented with map methods that use the entry <item, PRESENT> as the argument.

add() uses the map method put(). If a duplicate exists, then put() simply updates the value field of the entry to PRESENT which is its current value. The map method returns null if a new element is added, so a return value of null indicates that the add() inserted item.


HashSet add() (concluded)


public boolean add(T item)
{
   return map.put(item, PRESENT) == null;
}

HashSet iterator()

The HashSet iterator must traverse the keys in the map. Implement the method iterator() by returning an iterator for the key set collection view of the map.

// returns an iterator for the elements in the set
public Iterator<T> iterator()
{
   return map.keySet().iterator();
}

HashSet remove()

The HashSet remove() method calls the remove() method for the map. To determine whether an element was removed from the set, verify that the return value from the map remove() call is the reference PRESENT.

public boolean remove(Object obj)
{
   return map.remove(obj) == PRESENT;
}
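
The three set methods can be collected into one compilable sketch of the composition technique. It is built on the standard java.util.HashMap rather than the textbook map class, and the class name SimpleHashSet, together with the extra methods contains() and size(), is illustrative.

```java
import java.util.HashMap;
import java.util.Iterator;

// Illustrative sketch of a set implemented by composition over a map.
// SimpleHashSet is a hypothetical name; java.util.HashMap stands in for
// the textbook HashMap class.
class SimpleHashSet<T> implements Iterable<T> {
    // dummy value stored with every key
    private static final Object PRESENT = new Object();
    private final HashMap<T, Object> map = new HashMap<>();

    // put() returns null only when the key was not yet in the map
    public boolean add(T item) {
        return map.put(item, PRESENT) == null;
    }

    // remove() returns the stored value, PRESENT, only when the key existed
    public boolean remove(Object obj) {
        return map.remove(obj) == PRESENT;
    }

    public boolean contains(Object obj) {
        return map.containsKey(obj);
    }

    public int size() {
        return map.size();
    }

    // iterate over the key set view of the map
    public Iterator<T> iterator() {
        return map.keySet().iterator();
    }
}
```

All hashing work is delegated to the map; the set only interprets the map's return values.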

Hash Table Performance

A good hash function provides a uniform distribution of hash values. Hash table performance is measured by using the load factor $\lambda = n/m$, where n is the number of elements in the hash table and m is the number of buckets.

For linear probe, $0 \le \lambda \le 1$. For chaining with separate lists, it is possible that $\lambda > 1$.

Hash Table Performance (continued)

The worst case for linear probe or chaining with separate lists occurs when all data items hash to the same table location. If the table contains n elements, the search time is O(n), no better than that for the sequential search.

Hash Table Performance (continued)

Assume that the hash function uniformly distributes indices around the hash table.

We can expect $\lambda = n/m$ elements in each bucket.

On the average, an unsuccessful search makes $\lambda$ comparisons before arriving at the end of a list and returning failure. Mathematical analysis shows that the average number of probes for a successful search is approximately $1 + \lambda/2$.

Hash Table Performance (concluded)

Assume the number of elements n in the hash table is bounded by some amount, say, R*m, where m is the table size.

In this case, $\lambda = n/m \le (R \cdot m)/m = R$, and the following relationships hold for the average cases, so the average running time is O(1)!

Successful search: $S \approx 1 + \lambda/2 \le 1 + R/2$
Unsuccessful search: $U = \lambda \le R$
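
To get a feel for these estimates, the two average-case formulas can be evaluated for a few load factors. The class name ProbeEstimate and its methods are illustrative, and the sample load factors are arbitrary choices.

```java
// Illustrative sketch: evaluating the average-case estimates for chaining.
// ProbeEstimate, successful, and unsuccessful are hypothetical names.
class ProbeEstimate {
    // approximate average probes for a successful search: 1 + lambda/2
    static double successful(double lambda) {
        return 1 + lambda / 2;
    }

    // average comparisons for an unsuccessful search: lambda
    static double unsuccessful(double lambda) {
        return lambda;
    }

    public static void main(String[] args) {
        double[] loadFactors = {0.25, 0.5, 0.75, 1.0};   // sample values (assumed)
        for (double lambda : loadFactors)
            System.out.println("lambda = " + lambda
                    + "  successful ~ " + successful(lambda)
                    + "  unsuccessful = " + unsuccessful(lambda));
    }
}
```

Both estimates depend only on the load factor, not on n itself, which is why keeping the load factor bounded by rehashing keeps the average search time constant.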

Evaluating Ordered and Unordered Sets and Maps

Use an ordered set or map if an iteration should return elements in order (average search time O(log2 n)). Use an unordered set or map when fast access and updates are needed without any concern for the ordering of elements (average search time O(1)).

Timing Example

Program SearchComp.java:

Reads a file of 25025 randomly ordered words and inserts each word into a TreeSet and into a HashSet.
Determines the amount of time required to build both of the data structures.
Shuffles the input from the file and times a search of the TreeSet and HashSet for each word in the shuffled input.
Displays the time required for each search technique.

Timing Example (concluded)


Run:

Number of words is 25025
Built TreeSet in 0.078 seconds
Built HashSet in 0.047 seconds
TreeSet search time is 0.078 seconds
HashSet search time is 0.016 seconds

Note that the HashSet search time is considerably better than that for a TreeSet.
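
The source of SearchComp.java is not shown, so the following is only a rough sketch of the same kind of experiment using the standard java.util.TreeSet and java.util.HashSet. The class name SearchCompSketch and its helper methods are illustrative, and because the word file is not available the words are generated ("w0", "w1", ...) and shuffled with a fixed seed, which is an assumption of this example. Timings depend on the machine and JVM, so none are quoted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.TreeSet;

// Illustrative timing sketch; SearchCompSketch is a hypothetical name and
// the generated words replace the word file used by the original program.
class SearchCompSketch {
    // generate count distinct words in random order
    static List<String> makeWords(int count, long seed) {
        List<String> words = new ArrayList<>();
        for (int i = 0; i < count; i++)
            words.add("w" + i);
        Collections.shuffle(words, new Random(seed));
        return words;
    }

    // insert all words into set and return the elapsed time in seconds
    static double buildSeconds(Set<String> set, List<String> words) {
        long start = System.nanoTime();
        set.addAll(words);
        return (System.nanoTime() - start) / 1e9;
    }

    // look up every word and return how many were found
    static int countFound(Set<String> set, List<String> words) {
        int found = 0;
        for (String w : words)
            if (set.contains(w))
                found++;
        return found;
    }

    public static void main(String[] args) {
        List<String> words = makeWords(25025, 1L);
        Set<String> tree = new TreeSet<>();
        Set<String> hash = new HashSet<>();
        System.out.println("Built TreeSet in " + buildSeconds(tree, words) + " seconds");
        System.out.println("Built HashSet in " + buildSeconds(hash, words) + " seconds");

        Collections.shuffle(words, new Random(2L));
        long start = System.nanoTime();
        countFound(tree, words);
        System.out.println("TreeSet search time is " + (System.nanoTime() - start) / 1e9 + " seconds");
        start = System.nanoTime();
        countFound(hash, words);
        System.out.println("HashSet search time is " + (System.nanoTime() - start) / 1e9 + " seconds");
    }
}
```

A single run like this is only indicative; JIT warm-up and garbage collection can distort short measurements.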
