Intro to Caching, Caching Algorithms and Caching Frameworks, Part 1
Introduction:
Many of us have heard the word "cache", and when you ask people about caching they give you a perfect answer, but they don't know how a cache is built, or on which criteria to favor one caching framework over another. In this article we are going to talk about caching, caching algorithms and caching frameworks, and which is better than the other.

The Interview:
"Caching is a temporary location where I store data that I need frequently, because the original data is expensive to fetch, so I can retrieve it faster."

That is what Programmer 1 answered in the interview. (One month earlier he had submitted his resume to a company that wanted a Java programmer with strong experience in caching, caching frameworks and extensive data manipulation.)

Programmer 1 had made his own cache implementation using a hashtable, and that is all he knows about caching. His hashtable contains about 150 entries, which he considers extensive data (caching = hashtable; load the lookups into a hashtable and everything will be fine, nothing else). So let's see how the interview goes.

Interviewer: Nice. And based on what criteria do you choose your caching solution?
Programmer 1: Huh, (thinking for 5 minutes) mmm, based on, on, on the data (coughing)
Interviewer: Excuse me! Could you repeat what you just said?
Programmer 1: Data?!
Interviewer: Oh, I see. OK, list some caching algorithms and tell me which is used for what.
Programmer 1: (staring at the interviewer and making strange expressions with his face, expressions that no one knew a human face could make :D)
Interviewer: OK, let me ask it another way: how will a cache behave if it reaches its capacity?
Programmer 1: Capacity?
Mmm (thinking: a hashtable is not limited by capacity, I can add what I want and it will extend its capacity). (That was in Programmer 1's mind; he didn't say it.)

The interviewer thanked Programmer 1 (the interview lasted only 10 minutes). After that a woman came and said: "Oh, thanks for your time, we will call you back. Have a nice day."

This was the worst interview Programmer 1 ever had. (He hadn't read the part of the job description stating that the candidate should have a strong caching background; in fact he only saw the line talking about an excellent package ;) )

Talk the talk and then walk the walk
After Programmer 1 left, he wanted to know what the interviewer had been talking about and what the answers to his questions were, so he started to surf the net. Programmer 1 didn't know anything about caching except: when I need a cache, I use a hashtable. Using his favorite search engine he found a nice caching article and started to read.

Why do we need cache?
A long time ago, before the age of caching, a user would request an object and this object was fetched from a storage place. As the objects grew bigger and bigger, the user had to spend more time waiting for the request to be fulfilled. It also made the storage place suffer a lot, because it had to work the whole time. This made both the user and the database angry, and there were two possible outcomes:

1- The user gets upset, complains and won't use the application again (that was always the case).
2- The storage place packs up its bags and leaves your application, which causes big problems: no place to store data (this happened in rare situations).

Caching is a godsend:
A few years later, researchers at IBM (in the 60s) introduced a new concept and named it "cache".

What is Cache?
Caching is a temporary location where I store data that I need frequently, because the original data is expensive to fetch, so I can retrieve it faster.
A cache is made of a pool of entries. Each entry is a copy of real data that lives in storage (a database, for example), and it is tagged with a tag (a key identifier) used for retrieval. Great, so Programmer 1 already knows this, but what he doesn't know is the caching terminology, which is as follows:

Cache Hit:
When the client invokes a request (let's say he wants to view product information), our application needs to access the product data in our storage (database), but it first checks the cache. If an entry can be found with a tag matching that of the desired data (say, the product id), the entry is used instead. This is known as a cache hit (the cache hit is the primary measurement of caching effectiveness; we will discuss that later on). The percentage of accesses that result in cache hits is known as the hit rate or hit ratio of the cache.

Cache Miss:
On the contrary, when the tag isn't found in the cache (no match was found), this is known as a cache miss. A request to the back storage is made, and the data is fetched and placed in the cache, so that future requests will find it and produce a cache hit.

When we encounter a cache miss, one of two scenarios applies:

First scenario: there is free space in the cache (the cache hasn't reached its limit), so the object that caused the cache miss is retrieved from our storage and inserted into the cache.

Second scenario: there is no free space in the cache (the cache has reached its capacity), so the object that caused the cache miss is fetched from storage, and then we have to decide which object in the cache to remove in order to make room for the newly retrieved one. This is done by the replacement policy (caching algorithm), which decides which entry will be removed to make room; replacement policies are discussed below.
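To make the hit and miss flow concrete, here is a minimal sketch. It is not taken from any framework: the class name ReadThroughCache, the function that stands in for the database and the counters are all assumptions made for illustration. It ignores capacity entirely, so it only covers the first scenario (free space in the cache).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical read-through cache; names are illustrative only.
class ReadThroughCache<K, V> {
    private final Map<K, V> entries = new HashMap<>();
    private final Function<K, V> backingStore; // stands in for the database
    private long hits;
    private long misses;

    ReadThroughCache(Function<K, V> backingStore) {
        this.backingStore = backingStore;
    }

    V get(K key) {
        V value = entries.get(key);
        if (value != null) {
            hits++;                      // cache hit: served from memory
            return value;
        }
        misses++;                        // cache miss: go to the back storage
        value = backingStore.apply(key);
        if (value != null) {
            entries.put(key, value);     // future requests for this key will hit
        }
        return value;
    }

    long hits() { return hits; }

    long misses() { return misses; }

    // Fraction of all accesses that were served from the cache.
    double hitRatio() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }
}
```

Asking twice for the same product id gives one miss followed by one hit, so the hit ratio after those two requests is 0.5.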
Storage Cost:
When a cache miss occurs, data is fetched from the back storage, loaded and placed in the cache. But how much space does the data we just fetched take in the cache memory? This is known as the storage cost.

Retrieval Cost:
When we need to load the data, we need to know how long it takes to load it. This is known as the retrieval cost.

Invalidation:
When an object that resides in the cache is updated in the back storage, the cached copy needs to be updated as well. Keeping the cache up to date this way is known as invalidation: the entry is invalidated in the cache and fetched again from the back storage to get the updated version.

Replacement Policy:
When a cache miss happens, the cache ejects some other entry in order to make room for the previously uncached data (in case we don't have enough room). The heuristic used to select the entry to eject is known as the replacement policy.

Optimal Replacement Policy:
The theoretically optimal page replacement algorithm (also known as OPT or Belady's optimal page replacement policy) is an algorithm that tries to achieve the following: when a new object needs to be placed in the cache, the cache algorithm should replace the entry that will not be used for the longest period of time.

For example, a cache entry that is not going to be used for the next 10 seconds will be replaced by an entry that is going to be used within the next 2 seconds.

The optimal replacement policy is impossible to achieve in practice, because it requires knowing future requests, but some algorithms come close to it based on heuristics. So everything is based on heuristics; what then makes one algorithm better than another? And what do they use for their heuristics?

Nightmare at Java Street:
While reading the article, Programmer 1 fell asleep and had a nightmare (the scariest nightmare one can ever have).

Programmer 1: Nihahha, I will invalidate you. (talking in a mad way)
Cached Object: No, no, please let me live, they still need me, I have children.
Programmer 1: All cached entries say that before they are invalidated, and since when do you have children? Never mind, now vanish forever.

"Buhaaahaha," laughed Programmer 1 in a scary way. Silence took over the place for a few minutes, and then a police siren broke the silence. The police caught Programmer 1; he was accused of invalidating an entry that was still needed by a cache client, and he was sent to jail.

Programmer 1 woke up really scared. He looked around, realized that it was just a dream, then continued reading about caching and tried to get rid of his fears.

Caching Algorithms:
No one can talk about caching algorithms better than the caching algorithms themselves.

Least Frequently Used (LFU):
I am Least Frequently Used; I count how often an entry is needed by incrementing a counter associated with each entry. I remove the entry with the lowest counter first. I am not that fast and I am not that good at adapting (adapting means keeping the entries that are really needed and discarding the ones that haven't been needed for the longest period, based on the access pattern, or in other words the request pattern).

Least Recently Used (LRU):
I am the Least Recently Used cache algorithm; I remove the least recently used items first, the ones that haven't been used for the longest time. I require keeping track of what was used when, which is expensive if one wants to make sure that I always discard the least recently used item. Web browsers use me for caching. New items are placed at the top of the cache. When the cache exceeds its size limit, I discard items from the bottom. The trick is that whenever an item is accessed, I move it to the top, so items that are frequently accessed tend to stay in the cache. There are two ways to implement me: either an array or a linked list (which has the least recently used entry at the back and the most recently used at the front).
I am fast and I am adaptive; in other words, I can adapt to the data access pattern. I have a large family that completes me, and they are even better than me (I do feel jealous sometimes, but it is OK). Some of my family members are LRU2 and 2Q (they were implemented in order to improve LRU caching).

Least Recently Used 2 (LRU2):
I am Least Recently Used 2; some people call me Least Recently Used Twice, which I like more. I add entries to the cache the second time they are accessed (it takes two requests to place an entry in the cache). When the cache is full, I remove the entry whose second most recent access is the oldest. Because of the need to track the two most recent accesses, access overhead increases with cache size, which can be a disadvantage if I am applied to a big cache. In addition, I have to keep track of some items that are not yet in the cache (they haven't been requested twice yet). I am better than LRU and I am also adaptive to access patterns.

-Two Queues:
I am Two Queues; I add entries to an LRU cache as they are accessed. If an entry is accessed again, I move it to a second, larger LRU cache. I remove entries so as to keep the first cache at about 1/3 the size of the second. I provide the advantages of LRU2 while keeping cache access overhead constant, rather than having it increase with cache size, which makes me better than LRU2. Like the rest of my family, I am adaptive to access patterns.

Adaptive Replacement Cache (ARC):
I am Adaptive Replacement Cache; some people say that I balance between LRU and LFU to improve the combined result. Well, that's not 100% true: actually I am made of 2 LRU lists. One list, say L1, contains entries that have been seen only once recently, while the other list, say L2, contains entries that have been seen at least twice recently. The items that have been seen twice within a short time have a low inter-arrival time, and hence are thought of as high-frequency.
Hence, we think of L1 as capturing recency and L2 as capturing frequency, so most people think I am a balance between LRU and LFU, but that is OK, I am not angry about that. I am considered one of the best-performing replacement algorithms: a self-tuning, low-overhead replacement cache. I also keep a history of entries equal to the size of the cache; this is to remember the entries that were removed, and it allows me to see whether a removed entry should have stayed and we should have chosen another one to remove. (I really have a bad memory.) And yes, I am fast and adaptive.

Most Recently Used (MRU):
I am Most Recently Used; in contrast to LRU, I remove the most recently used items first. You will surely ask me why. Well, let me tell you something: when access is unpredictable, and determining the least recently used entry in the cache system is a high time complexity operation, I am the best choice. I am common in database memory caches. Whenever a cached record is used, I place it on top of the stack. And when there is no room, guess what? I replace the topmost entry with the new entry.

First In First Out (FIFO):
I am First In First Out; I am a low-overhead algorithm and I require little effort to manage the cache entries. The idea is that I keep track of all the cache entries in a queue, with the most recent entry at the back and the earliest entry at the front. When there is no room and an entry needs to be replaced, I remove the entry at the front of the queue (the oldest entry) and replace it with the newly fetched entry. I am fast but I am not adaptive.

-Second Chance:
Hello, I am Second Chance. I am a modified form of the FIFO replacement algorithm, known as the Second Chance replacement algorithm, and I am better than FIFO at little cost for the improvement.
I work by looking at the front of the queue as FIFO does, but instead of immediately replacing the cache entry (the oldest one), I check whether its referenced bit is set (a bit that tells me whether this entry has been used or requested before). If it is not set, I replace this entry. Otherwise, I clear the referenced bit and insert the entry at the back of the queue (as if it were a new entry), and I keep repeating this process. You can think of this as a circular queue. The second time I encounter an entry whose bit I cleared before, I replace it, as its referenced bit is now cleared. I am better than FIFO in speed.

-Clock:
I am Clock, and I am a more efficient version of FIFO than Second Chance, because I don't push the cached entries to the back of the list like Second Chance does, but I perform the same general function as Second Chance. I keep a circular list of the cached entries in memory, with the "hand" (something like an iterator) pointing to the oldest entry in the list. When a cache miss occurs and no empty place exists, I consult the R (referenced) bit at the hand's location to know what I should do. If R is 0, I place the new entry at the "hand" position; otherwise I clear the R bit. Then I advance the hand (iterator) and repeat the process until an entry is replaced. I am even faster than Second Chance.

Simple time-based:
I am simple time-based caching; I invalidate entries in the cache based on absolute time periods. I add items to the cache, and they remain in the cache for a specific amount of time. I am fast but not adaptive to access patterns.

Extended time-based expiration:
I am extended time-based expiration; I invalidate the items in the cache based on relative time periods. I add items to the cache and they remain there until I invalidate them at certain points in time, such as every five minutes or each day at 12:00.
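The simple time-based policy can be sketched in a few lines. Treat this as an illustration under assumptions rather than the way any particular framework does it: TtlCache, its injectable clock and the lazy expiry check on get are all hypothetical choices made for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical sketch of simple time-based expiration.
class TtlCache<K, V> {
    // A cached value together with the moment it stops being valid.
    private static final class Timed<T> {
        final T value;
        final long expiresAtMillis;

        Timed(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Timed<V>> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so the behaviour can be tested

    TtlCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    void put(K key, V value) {
        entries.put(key, new Timed<>(value, clock.getAsLong() + ttlMillis));
    }

    V get(K key) {
        Timed<V> timed = entries.get(key);
        if (timed == null) {
            return null;
        }
        if (clock.getAsLong() >= timed.expiresAtMillis) {
            entries.remove(key); // expired: invalidate lazily on access
            return null;
        }
        return timed.value;
    }
}
```

In real use the clock would typically be System::currentTimeMillis. Note that the lifetime is fixed when the entry is added; reading the entry does not extend it.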
Sliding time-based expiration:
I am sliding time-based expiration; I invalidate entries in the cache by specifying the amount of time an item is allowed to be idle in the cache after its last access time. After that time, I invalidate it. I am fast but not adaptive to access patterns.

OK, after listening to some (famous) replacement algorithms talking about themselves, note that some other replacement algorithms take other criteria into consideration, like:

Cost: if items have different costs, keep the items that are expensive to obtain, e.g. those that take a long time to get.
Size: if items have different sizes, the cache may want to discard a large item to store several smaller ones.
Time: some caches keep information that expires (e.g. a news cache, a DNS cache, or a web browser cache). The computer may discard items because they are expired. Depending on the size of the cache, no further caching algorithm to discard items may be necessary.

The E-mail!
After Programmer 1 read the article, he thought for a while and decided to send an e-mail to the author of the caching article. He felt like he had heard the author's name before, but he couldn't remember who this person was. Anyway, he sent him an e-mail asking: what if I have a distributed environment? How will the cache behave?

The author of the caching article got his e-mail, and ironically it was the man who had interviewed Programmer 1 :D. The author replied:

Distributed caching:
*The cached data can be stored in a memory area separate from the caching directory itself (which handles the cache entries and so on); it can be across the network or on disk, for example.
*Distributing the cache allows the cache size to increase.
*In this case the retrieval cost will also increase, due to network request time.
*This will also lead to a higher hit ratio, due to the larger size of the cache.

But how will this work?
Let's assume that we have 3 servers: 2 of them will handle the distributed caching (they hold the cache entries), and the 3rd one will handle all the incoming requests (which ask for cached entries):

Step 1: the application requests the keys entry1, entry2 and entry3. The hash values of these keys are resolved, and based on the hash value it is decided which server each request is forwarded to.
Step 2: the main node sends parallel requests to all relevant servers (the ones that hold the cache entries we are looking for).
Step 3: the servers send responses to the main node (which sent the request in the first place, asking for the cached entry).
Step 4: the main node sends the responses to the application (the cache client).

*In case the cache entry is not found: the hash value for the entry is still computed and directs the request to either server 1 or server 2. If, for example, it directs to server 1 and our entry isn't found there, the entry is fetched from the DB and added to server 1's cache.
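The routing in the steps above can be sketched as follows. This is only an illustration under loud assumptions: plain in-memory maps stand in for the two cache servers, a function stands in for the DB, there is no real network or parallelism, and DistributedCacheSketch and its methods are hypothetical names. The owning server is picked with a simple hash modulo the server count.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: in-memory maps stand in for the cache servers.
class DistributedCacheSketch {
    private final List<Map<String, String>> servers = new ArrayList<>();
    private final Function<String, String> database; // stands in for the DB

    DistributedCacheSketch(int serverCount, Function<String, String> database) {
        for (int i = 0; i < serverCount; i++) {
            servers.add(new HashMap<>());
        }
        this.database = database;
    }

    // The key's hash decides which server owns the entry.
    int serverFor(String key) {
        return Math.floorMod(key.hashCode(), servers.size());
    }

    String get(String key) {
        Map<String, String> server = servers.get(serverFor(key));
        String value = server.get(key);
        if (value == null) {                 // miss on the owning server
            value = database.apply(key);     // fetch from the back storage
            if (value != null) {
                server.put(key, value);      // cache it on that same server
            }
        }
        return value;
    }

    boolean isCachedOn(int serverIndex, String key) {
        return servers.get(serverIndex).containsKey(key);
    }
}
```

Because the same key always hashes to the same index, a given entry lives on exactly one server. One design caveat: with plain modulo hashing, changing the number of servers moves most keys to a different server, which is why consistent hashing is a common alternative.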
Measuring Cache:
Most caches can be evaluated by measuring the hit ratio and comparing it to the theoretical optimum. This is usually done by generating a list of cache keys with no real data. But here the hit ratio measurement assumes that all entries have the same retrieval cost, which is not true: for example, in web caching the number of bytes the cache can serve is more important than the hit ratio (for example, I can replace one big entry with 10 small entries, which is more effective on the web).

Conclusion:
We have seen some of the popular algorithms used in caching; some are based on time or cache object size, and some are based on frequency of usage. In the next part we are going to talk about caching frameworks and how they make use of these caching algorithms, so stay tuned ;)

Posted by Ahmed Ali at 1:09 PM

Intro to Caching, Caching Algorithms and Caching Frameworks, Part 2

Introduction:
In this part we are going to show how to implement some of the famous replacement algorithms mentioned in part 1. The code in this article is for demonstration purposes only, which means you will have to do some extra work if you want to use it in your application (if you are going to build your own implementation and won't use any caching framework).

The Leftover policy:
After Programmer 1 read the article, he proceeded to review its comments. One of these comments talked about a leftover policy named Random Cache.

Random Cache:
I am Random Cache; I replace any cache entry I want, I just do that and no one can complain about it. You can call the victim the unlucky entry. By doing this I remove any overhead of tracking references and so on. I am better than the FIFO policy, and in some cases I perform even better than LRU, but in general
LRU is better than me.

It is comment time:
While Programmer 1 was reading the rest of the comments, he found a very interesting comment about the implementation of some of the famous replacement policies. Actually it was a link to the commenter's site, which has the actual implementation, so Programmer 1 clicked the link, and here is what he got:

Meet the Cache Element:

    public class CacheElement {
        private Object objectValue;
        private Object objectKey;
        private int index;
        private int hitCount;
        .
        . // getters and setters
        .
    }

This is the cache entry that we will use to hold the key and the value; it will be used in all the cache algorithm implementations.

Common Code for All Caches:

    public final synchronized void addElement(Object key, Object value) {
        int index;
        Object obj;

        // Get the entry from the table.
        obj = table.get(key);

        // If we already have the entry in our table, then replace only its value.
        if (obj != null) {
            CacheElement element;
            element = (CacheElement) obj;
            element.setObjectValue(value);
            element.setObjectKey(key);
            return;
        }
    }

The above code is common to all our implementations; it checks whether the CacheElement already exists in our cache. If so, we just need to replace its value and nothing else. But what if we didn't find it? Then we will have to dig deeper and see what happens below.

The Talk Show:
Today's episode is a special episode; we have special guests, who are in fact competitors. We are going to hear what everyone has to say, but first let's introduce our guests: Random Cache and FIFO Cache. Let's start with Random Cache.

Meet Random Cache implementation:

    public final synchronized void addElement(Object key, Object value) {
        int index;
        Object obj;

        obj = table.get(key);

        if (obj != null) {
            CacheElement element;
            // Just replace the value.
            element = (CacheElement) obj;
            element.setObjectValue(value);
            element.setObjectKey(key);
            return;
        }

        // If we haven't filled the cache yet, put it at the end.
        if (!isFull()) {
            index = numEntries;
            ++numEntries;
        } else {
            // Otherwise, replace a random entry.
            index = (int) (cache.length * random.nextFloat());
            table.remove(cache[index].getObjectKey());
        }

        cache[index].setObjectValue(value);
        cache[index].setObjectKey(key);
        table.put(key, cache[index]);
    }

Analyzing Random Cache Code (Talk show):
In today's show, Random Cache is going to explain the code line by line, and here we go.

I will go straight to the main point: if I am not full, I place the new entry that the client requested at the end of the cache (in case there is a cache miss). I do this by taking the number of entries that reside in the cache and assigning it to index (which will be the index of the entry the client is adding); after that I increment the number of entries.

    if (!isFull()) {
        index = numEntries;
        ++numEntries;
    }

If I don't have enough room for the current entry, I have to kick out a random entry (totally random, bribing isn't allowed). To pick the random entry, I use the random utility shipped with Java to generate a random index, and I ask the cache to remove the entry whose index equals the generated index.

    else {
        // Otherwise, replace a random entry.
        index = (int) (cache.length * random.nextFloat());
        table.remove(cache[index].getObjectKey());
    }

At the end I just place the entry in the cache, whether the cache was full or not.

    cache[index].setObjectValue(value);
    cache[index].setObjectKey(key);
    table.put(key, cache[index]);

Magnifying the Code:
It is said that when you look at things up close you understand them better, so that's why we have a magnifying glass, and we are going to magnify the code to get closer to it (and maybe understand it more).

Cache entries, in one voice: Hi ho, hi ho, into cache we go.
New cache entry: Excuse me, I have a question! (asking a singing old cache entry near him)
Old cache entry: Go ahead.
New cache entry: I am new here and I don't understand my role exactly. How will the algorithm handle us?
Old cache entry: Cache! (instead of "man!") You remind me of myself when I was new (the first time I was added to the cache). I used to ask questions like that. Let me show you what will happen.

Meet FIFO Cache Implementation:

    public final synchronized void addElement(Object key, Object value) {
        int index;
        Object obj;

        obj = table.get(key);

        if (obj != null) {
            CacheElement element;
            // Just replace the value.
            element = (CacheElement) obj;
            element.setObjectValue(value);
            element.setObjectKey(key);
            return;
        }

        // If we haven't filled the cache yet, put it at the end.
        if (!isFull()) {
            index = numEntries;
            ++numEntries;
        } else {
            // Otherwise, replace the entry at the current pointer with the new one.
            index = current;
            // Wrap around to make the FIFO circular.
            if (++current >= cache.length)
                current = 0;
            table.remove(cache[index].getObjectKey());
        }

        cache[index].setObjectValue(value);
        cache[index].setObjectKey(key);
        table.put(key, cache[index]);
    }

Analyzing FIFO Cache Code (Talk show):
After Random Cache's talk, the audience went crazy for Random Cache, which made FIFO a little bit jealous, so FIFO started talking and said:

When there is no more room for the new cache entry, I have to kick out the entry at the front (the one that came first). As I work in a circular-queue-like manner, the current position by default points to the beginning of the queue. I assign the value of current to index (the index of the entry to replace) and then check whether the incremented current is greater than or equal to the cache length (because I want to reset the current position to the beginning of the queue); if so, I set current to zero again. After that I just kick out the entry at the index position (which is the first entry in the queue now) and place the new entry.

    else {
        // Otherwise, replace the current pointer, which takes care of
        // FIFO in a circular fashion.
        index = current;
        if (++current >= cache.length)
            current = 0;
        table.remove(cache[index].getObjectKey());
    }

    cache[index].setObjectValue(value);
    cache[index].setObjectKey(key);
    table.put(key, cache[index]);

Magnifying the Code:
Back to our magnifying glass, we can observe the following actions happening to our entries.

Conclusion:
We have seen in this article how to implement the FIFO and Random replacement policies. In the upcoming articles we will take our magnifying glass and magnify the LFU and LRU replacement policies. Till then, stay tuned ;)

Posted by Ahmed Ali at 11:05 PM

Intro to Caching, Caching Algorithms and Caching Frameworks, Part 3

Introduction:
In part 1 we talked about the basics and terminology of caching and we also presented replacement policies. In part 2 we implemented some of these famous replacement policies, and now in this part we continue with the implementation of two famous algorithms, LFU and LRU. Again, the implementation in this article is for the sake of demonstration (we concentrate on the replacement algorithm and skip other things like loading data and so on); in order to use it you will have to do some extra work, but you can base your implementation on it.

Meet LFU Cache Implementation:

    public synchronized Object getElement(Object key) {
        Object obj;

        obj = table.get(key);

        if (obj != null) {
            CacheElement element = (CacheElement) obj;
            // Count the hit before returning the value.
            element.setHitCount(element.getHitCount() + 1);
            return element.getObjectValue();
        }
        return null;
    }

    public final synchronized void addElement(Object key, Object value) {
        int index;
        Object obj;

        obj = table.get(key);

        if (obj != null) {
            CacheElement element;
            // Just replace the value.
            element = (CacheElement) obj;
            element.setObjectValue(value);
            element.setObjectKey(key);
            return;
        }

        if (!isFull()) {
            index = numEntries;
            ++numEntries;
        } else {
            // Cache is full: reuse the slot of the least frequently used entry.
            CacheElement element = removeLfuElement();
            index = element.getIndex();
            table.remove(element.getObjectKey());
        }

        cache[index].setObjectValue(value);
        cache[index].setObjectKey(key);
        cache[index].setIndex(index);
        table.put(key, cache[index]);
    }

    public CacheElement removeLfuElement() {
        CacheElement[] elements = getElementsFromTable();
        CacheElement leastElement = leastHit(elements);
        return leastElement;
    }

    public static CacheElement leastHit(CacheElement[] elements) {
        CacheElement lowestElement = null;
        for (int i = 0; i < elements.length; i++) {
            CacheElement element = elements[i];
            if (lowestElement == null) {
                lowestElement = element;
            } else {
                if (element.getHitCount() < lowestElement.getHitCount()) {
                    lowestElement = element;
                }
            }
        }
        return lowestElement;
    }

Analyzing LFU Cache Code (Talk Show):
Presenter: It is getting hotter and hotter now. Our next contestant is LFU Cache, please make some noise for it.

The audience began to scream for LFU, which made LFU hesitate.

Hello, I am LFU. When the cache client wants to add a new element and the cache is full (not enough room for the new entry), I have to kick out the least frequently used entry. I do this with the help of the removeLfuElement method, which gets me the least frequently used element; after I get it, I remove this entry and place the new entry.

    else {
        CacheElement element = removeLfuElement();
        index = element.getIndex();
        table.remove(element.getObjectKey());
    }

If we dive into this method... I am saying, if we dive into this method... (still nothing happened) LFU tried pressing the next button on the presentation remote control (to get the next presentation slide) but it didn't work.
Ahh, now we are talking. OK, if we dive into this method, we will see that it just gets all the elements in the cache by calling the getElementsFromTable method and then returns the element with the fewest hits.

    public CacheElement removeLfuElement() {
        CacheElement[] elements = getElementsFromTable();
        CacheElement leastElement = leastHit(elements);
        return leastElement;
    }

It does so by calling the leastHit method, which loops over the cache elements and checks whether the current element has the fewest hits; if so, I make it my lowestElement, which is the one I am going to replace with the new entry.

    public static CacheElement leastHit(CacheElement[] elements) {
        CacheElement lowestElement = null;
        for (int i = 0; i < elements.length; i++) {
            CacheElement element = elements[i];
            if (lowestElement == null) {
                lowestElement = element;
            } else {
                if (element.getHitCount() < lowestElement.getHitCount()) {
                    lowestElement = element;
                }
            }
        }
        return lowestElement;
    }

LFU stopped talking and waited for a reaction from the audience, and the only reaction it got was head scratching (the audience didn't get some of it).

One of the production team whispered to LFU Cache: you didn't mention how the lowest element is distinguished from the other elements.

Then LFU Cache started talking again and said: By default, when you add an element to the cache, its hitCount will be the same as the previous element's, so how do we handle the hit count thing? Every time I encounter a cache hit, I increment the hit count of the entry and then return the entry the cache client asked for, which looks something like this:

    public synchronized Object getElement(Object key) {
        Object obj;

        obj = table.get(key);

        if (obj != null) {
            CacheElement element = (CacheElement) obj;
            element.setHitCount(element.getHitCount() + 1);
            return element.getObjectValue();
        }
        return null;
    }

Magnifying the Code:
Did anyone say magnification?
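As a stand-in for a magnified view, here is a small self-contained sketch of the same idea that can be run on its own. It is a simplification and not the article's classes: LfuSketch and its two maps are hypothetical, new entries start with zero hits, and the victim is found with the same kind of linear scan as leastHit.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified LFU; a linear scan finds the victim.
class LfuSketch<K, V> {
    private final int capacity;
    private final Map<K, V> values = new HashMap<>();
    private final Map<K, Integer> hitCounts = new HashMap<>();

    LfuSketch(int capacity) {
        this.capacity = capacity;
    }

    V get(K key) {
        V value = values.get(key);
        if (value != null) {
            hitCounts.merge(key, 1, Integer::sum); // count the cache hit
        }
        return value;
    }

    void put(K key, V value) {
        if (!values.containsKey(key) && values.size() >= capacity) {
            evictLeastFrequentlyUsed();            // full: make room first
        }
        values.put(key, value);
        hitCounts.putIfAbsent(key, 0);             // new entries start with zero hits
    }

    boolean contains(K key) {
        return values.containsKey(key);
    }

    private void evictLeastFrequentlyUsed() {
        K victim = null;
        int lowest = Integer.MAX_VALUE;
        for (Map.Entry<K, Integer> e : hitCounts.entrySet()) {
            if (e.getValue() < lowest) {
                lowest = e.getValue();
                victim = e.getKey();
            }
        }
        values.remove(victim);
        hitCounts.remove(victim);
    }
}
```

With a capacity of 2, adding a and b, reading a twice and b once, and then adding c kicks out b, because b has the lowest hit count at that moment.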
Meet LRU Cache Implementation:

    private void moveToFront(int index) {
        int nextIndex, prevIndex;

        if (head != index) {
            nextIndex = next[index];
            prevIndex = prev[index];

            // Only the head has a prev entry that is an invalid index so
            // we don't check.
            next[prevIndex] = nextIndex;

            // Make sure index is valid. If it isn't, we're at the tail
            // and don't set prev[next].
            if (nextIndex >= 0)
                prev[nextIndex] = prevIndex;
            else
                tail = prevIndex;

            prev[index] = -1;
            next[index] = head;
            prev[head] = index;
            head = index;
        }
    }

    public final synchronized void addElement(Object key, Object value) {
        int index;
        Object obj;

        obj = table.get(key);

        if (obj != null) {
            CacheElement entry;
            // Just replace the value, but move it to the front.
            entry = (CacheElement) obj;
            entry.setObjectValue(value);
            entry.setObjectKey(key);
            moveToFront(entry.getIndex());
            return;
        }

        // If we haven't filled the cache yet, place in next available spot
        // and move to front.
        if (!isFull()) {
            if (numEntries > 0) {
                prev[numEntries] = tail;
                next[numEntries] = -1;
                moveToFront(numEntries);
            }
            ++numEntries;
        } else {
            // We replace the tail of the list.
            table.remove(cache[tail].getObjectKey());
            moveToFront(tail);
        }

        cache[head].setObjectValue(value);
        cache[head].setObjectKey(key);
        // Record the slot so a later hit on this entry can be moved to the front.
        cache[head].setIndex(head);
        table.put(key, cache[head]);
    }

Analyzing LRU Cache Code (Talk show):
After LFU finished talking, there was not much screaming; the audience didn't like the presentation, and LFU had been hesitant while talking. This gave a big push to LRU, which started by saying:

This time I will also consider the case when the cache is not full. I am a little more complex than the other algorithms. When the cache isn't full and it is the first entry, I just increment numEntries, which represents the number of entries in the cache. After adding a second entry I need to move it to the front by calling the moveToFront method (we will talk about it soon). I didn't do this for the first entry because it is certainly the first element. So let's see some action.
As you can see, I set the previous pointer of the new entry to the tail and its next pointer to -1 (undefined, in other words); these are just initial values. After adding the new entry (which isn't the first entry) I move it to the front.

    if (!isFull()) {
        if (numEntries > 0) {
            prev[numEntries] = tail;
            next[numEntries] = -1;
            moveToFront(numEntries);
        }
        ++numEntries;
    }

The moveToFront method moves an entry to the head of the list, so that the least recently used elements end up at the bottom. Before I do any move, I check that the head is not equal to the current index (this check fails when we only have 1 entry). If it is not, I assign the next of the current entry (a pointer to the next entry, as in a linked list) to nextIndex, and the previous of the current entry (a pointer to the previous entry, as in a linked list) to prevIndex.

    int nextIndex, prevIndex;
    if (head != index) {
        nextIndex = next[index];
        prevIndex = prev[index];

Then I assign nextIndex to the next of the previous entry.

        // Only the head has a prev entry that is an invalid index so
        // we don't check.
        next[prevIndex] = nextIndex;

After that I check nextIndex: if it is greater than or equal to 0, the previous of the next entry gets the value of prevIndex; otherwise the current entry was the tail, so the tail becomes prevIndex.

        // Make sure index is valid. If it isn't, we're at the tail
        // and don't set prev[next].
        if (nextIndex >= 0)
            prev[nextIndex] = prevIndex;
        else
            tail = prevIndex;

And because I moved this entry to the front, there won't be any previous entry for it, so I assign -1 to its prev. The next of the current entry (the new top one) is the old head, the prev of the old head gets the index of the current entry, and then head is assigned the current index.

        prev[index] = -1;
        next[index] = head;
        prev[head] = index;
        head = index;

Magnifying the Code: It is magnifying time! Get your magnifying glass, we are going to see some interesting stuff here.

It is Confession Time!: LRU didn't mention that it is possible to implement the LRU algorithm in a simpler way. Our previous implementation is based on arrays; the other implementation, which LRU cache didn't mention, uses LinkedHashMap, which was introduced in JDK 1.4.

    import java.util.LinkedHashMap;
    import java.util.Map;

    public class LRUCache2 extends LinkedHashMap {
        private static final int MAX_ENTRIES = 3;

        public LRUCache2() {
            // The third argument (accessOrder = true) keeps entries ordered
            // from least recently to most recently accessed.
            super(MAX_ENTRIES + 1, .75F, true);
        }

        // This method is invoked by put and putAll after inserting a new entry into
        // the map. It allows the map to have up to 3 entries and then
        // deletes the oldest entry each time a new entry is added.
        protected boolean removeEldestEntry(Map.Entry eldest) {
            return this.size() > MAX_ENTRIES;
        }
    }

For sure, the LinkedHashMap solution takes less time to write than the array solution, and it is less work because you leave the handling of deletion and so on to the JDK itself, so you won't have to bother implementing such stuff. OSCache uses such an implementation in its LRU caching implementation.

Conclusion: We have seen how to implement the LFU and LRU algorithms and the two ways to implement LRU. It is up to you to choose which way to use, arrays or LinkedHashMap; for me, I would recommend arrays for a small number of entries and LinkedHashMap for a large number of entries.
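For readers who want to try the LinkedHashMap approach from the confession above, here is a small generic variant. It is only a sketch: the class name LruCacheSketch and the configurable maxEntries field are assumptions added for illustration, and the class is not thread-safe on its own (wrapping it with Collections.synchronizedMap is one option).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical generic take on the LinkedHashMap-based LRU cache (illustration only).
class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruCacheSketch(int maxEntries) {
        // accessOrder = true: get() and put() move the touched entry to the
        // "most recently used" end of the iteration order.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by put/putAll after an insert; returning true drops the eldest
    // (least recently used) entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

With a limit of 2, putting "a" and "b", reading "a" and then putting "c" should evict "b", since "a" was touched more recently.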
In the next part we will be talking about caching frameworks, a comparison between them, and which caching algorithm is employed by which caching framework. Stay tuned till then ;)

Posted by Ahmed Ali at 10:55 PM

Intro to Caching,Caching algorithms and caching frameworks part 4

Introduction: In part 1 we talked about the introduction to caching and some caching terminology, and in part 2 and part 3 we saw implementations of the famous cache replacement algorithms. Now, in this part, we will see a comparison between open source Java caching frameworks, as I am not rich enough to buy commercial frameworks :D. In this part we will be talking about OSCache, Ehcache, JCS and Cache4J, and we are going to concentrate on memory caching only. There will be a performance comparison based on in-memory caching, using the JBoss cache benchmark framework and other cache test cases.

The Task: Programming Mania is a famous programming magazine from geeks to geeks. In every release of the magazine there is a section specialized in framework comparisons, like MVC, ORM and so on. This month they decided to make a comparison of caching frameworks. And as we know, the editors have a programming background; in fact they are real programmers (not fake ones).

Head of Editors: This time we want to make our comparison article about caching frameworks, so we need to investigate the caching frameworks already in use, and I don't need to remind you that the economic crisis affected us as well, so we will just care about open source frameworks.
Programmer 1: Oh, okay, no problem with that.
Head of Editors: Excellent. Oh, and by the way, we will make it in two parts, so try to get as much information as you can.
Programmer 1: OK, no problem.
Head of Editors: Oh yeah, one more thing: I am expecting you to be done by the day after tomorrow, as we are going to release the article this week.
Programmer 1: !!! (Shocked)

First few lines!
In order for programmer 1 to make the right comparison, he needs to know what types of objects caching frameworks cache. Some caching frameworks cache just normal POJOs while others cache portions of JSPs and so on. Below is a list of the common things that caching frameworks cache:
1-POJO Caching
2-HTTP Response Caching
3-JSP Caching
4-ORM Data Access Caching

The Checklist: After Programmer 1 read a lot about caching, he made a checklist that enables him to compare the different frameworks; he will validate each item of the checklist against all the caching frameworks. The checklist is as follows:

Programmer 1 decided to list the famous caching frameworks he is going to compare, so he selected the following frameworks:
Java Caching System (JCS)
Ehcache
OSCache
Cache4J
ShiftOne
WhirlyCache
SwarmCache
JBoss Cache

As soon as he finished listing the frameworks, he started to write the first few lines of the 1st part.

Java Caching System (JCS): JCS is a distributed caching system written in Java for server-side Java applications. It is intended to speed up dynamic web applications by providing a means to manage cached data of various dynamic natures. Like any caching system, JCS is most useful for high read, low put applications. The foundation of JCS is the Composite Cache, which is the pluggable controller for a cache region. Four types of caches can be plugged into the Composite Cache for any given region: Memory, Disk, Lateral, and Remote. The JCS jar provides production-ready implementations of each of the four types of caches. In addition to the core four, JCS also provides additional plug-ins of each type.
JCS provides a framework with no point of failure, allowing for full session failover (in clustered environments), including session data across up to 256 servers. JCS has quick nested categorical removal, data expiration (idle time and max life), an extensible framework, fully configurable runtime parameters, remote synchronization, remote store recovery, and a non-blocking "zombie" (balking facade) pattern. In the balking pattern, if a method is invoked on an object and that object is not in an appropriate state to execute that method, the method returns without doing anything, or it may even throw an exception, for example IllegalStateException. The configuration of JCS is set in a properties file, named cache.ccf by default.

-Memory Cache: JCS supports LRU and MRU. The LRU Memory Cache is an extremely fast, highly configurable memory cache. It uses a Least Recently Used algorithm to manage the number of items that can be stored in memory. The LRU Memory Cache uses its own LRU Map implementation that is significantly faster than both the commons LRUMap implementation and the LinkedHashMap that is provided with JDK 1.4 and up. (At least that is what JCS claims, which we will check below.)

-Disk Cache: The Indexed Disk Cache is a fast, reliable, and highly configurable swap for cached data. The indexed disk cache follows the fastest pattern for disk swapping.

-Lateral Cache: The TCP Lateral Cache provides an easy way to distribute cached data to multiple servers. It comes with a UDP discovery mechanism, so you can add nodes without having to reconfigure the entire farm. The TCP Lateral is highly configurable.

-Remote Cache: JCS also provides an RMI based Remote Cache Server. Rather than having each node connect to every other node, you can use the remote cache server as the connection point.
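The balking pattern mentioned in the JCS feature list above can be pictured with a small sketch. This is not JCS code: the class BalkingStoreSketch, its alive flag and its methods are hypothetical names, used only to show a method that quietly returns (or, in the strict variant, throws IllegalStateException) when the object is not in a state to do the work.

```java
// Hypothetical illustration of the balking pattern (not taken from JCS).
class BalkingStoreSketch {
    private boolean alive = true; // false once the backing store is considered dead
    private int writes = 0;

    // Marks the store as unusable, e.g. after a remote connection was lost.
    synchronized void markDead() {
        alive = false;
    }

    // Balks: if the store is not alive, return immediately without doing anything.
    synchronized boolean put(String key, String value) {
        if (!alive) {
            return false; // nothing was done, and the caller is not blocked
        }
        writes++; // stands in for the real write
        return true;
    }

    // Stricter variant: signal the wrong state with an exception instead.
    synchronized void putStrict(String key, String value) {
        if (!alive) {
            throw new IllegalStateException("store is not available");
        }
        writes++;
    }

    synchronized int writeCount() {
        return writes;
    }
}
```

The design choice is that callers never wait on a dead resource: a put on a "zombie" store simply does nothing, which keeps the rest of the cache responsive.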
JCS and Check List:

JCS in Action: Our programmer 1 was checking the JCS site, where they claim that its LRU Map caching algorithm is faster than the LinkedHashMap that is shipped with JDK 1.4 and up. So our newbie ran the following test against JCS (1.3) and LinkedHashMap on JDK 1.4 and 1.6.
[Figure: PC specification used for the tests]
In order to check what JCS claims, we used their own test case from the JCS site (I will be using this test case for the rest of our framework testing). The following configuration file was used during the test:
[Figure: JCS configuration file]
After using this test case for LinkedHashMap and JCS we got the following results:
[Figure: JCS vs. LinkedHashMap]

Ehcache: Ehcache is a Java distributed cache for general purpose caching, J2EE and light-weight containers, tuned for large cache objects. It features memory and disk stores, replication by copy and invalidation, listeners, and a gzip caching servlet filter, and it is fast and simple. Ehcache acts as a pluggable cache for Hibernate 2.1, with a small footprint and minimal dependencies; it is fully documented and production tested. It is used in a lot of Java frameworks such as Alfresco, Cocoon, Hibernate, Spring, JPOX, Jofti, Acegi, Kosmos, Tudu Lists and Lutece. One of its features is caching domain objects that map to database entities. As domain objects that map to database entities are the core of any ORM system, that is why Ehcache is the default cache for Hibernate. With Ehcache you can cache both Serializable and non-serializable objects. Non-serializable objects can use all parts of Ehcache except for the Disk Store and replication. If an attempt is made to persist or replicate them, they are discarded and a WARNING level log message is emitted. Another feature of Ehcache is that an admin can monitor the cache statistics, change the configuration and manage the cache through JMX, as Ehcache supports it (which is a really nice feature). The configuration of Ehcache is set in an XML file named ehcache.xml.
-Memory Cache: Ehcache supports LRU, LFU and FIFO.

-Disk Cache: Ehcache can store up to 100 GB of data on disk and access it quickly.

Ehcache and Check List:

OSCache: OSCache is a caching solution that includes a JSP tag library and a set of classes to perform fine-grained dynamic caching of JSP content, servlet responses or arbitrary objects. It provides both in-memory and persistent on-disk caches, and can allow your site to continue functioning normally even if the data source is down (for example, if an error occurs such as your DB going down, you can serve the cached content so people can still surf the site). When dealing with static HTML pages, the page response can be cached indefinitely in memory, thus avoiding reprocessing of the page. OSCache does so by using the URI and query parameters to form a unique key. This key is used to store the page content. HttpResponse caching is implemented as a ServletFilter. Thus, the cache filter abstracts the API usage from the client. By default, the Cache Filter holds the page response in 'Application' scope and refreshes the cache every hour. These default values can be changed. In the case of dynamic pages (JSPs), OSCache provides tags that surround the static part of the page. Thus, only the static part of the page is cached. OSCache can be configured as a persistent cache. When the memory capacity is reached, objects are evicted from memory and stored on a hard disk. Objects are evicted from memory based on the configured cache algorithm. For other storage places (a DB, for example) you can also implement your own custom PersistenceListener (to persist to any place you want). OSCache supports distributed caching. When an application is deployed in a cluster of application servers, the local cache is kept in sync by communication amongst all the caches in the cluster; this is achieved either by JMS or by JGroups. Multiple caches can be created, each with its own unique configuration.
Another feature of OSCache is that an admin can monitor the cache statistics, change the configuration and manage the cache through JMX, but this is only available via the Spring framework (while Ehcache supports this feature without the need for any other framework). OSCache is also used by many projects, such as Jofti, Spring and Hibernate, and by many sites, such as TheServerSide, JRoller and JavaLobby. The configuration of OSCache is set in a properties file named oscache.properties.

-Memory Cache: OSCache supports LRU and FIFO, and any other custom replacement algorithm.

-Disk Cache: OSCache supports a disk cache. When using memory and disk together, an item is removed from memory when capacity is reached, but not from disk. Therefore, if that item is needed again, it will be found on disk and brought back into memory. You get a behavior similar to a browser cache. However, you still need to do some administrative tasks to clean the disk cache periodically, since this has not been implemented in OSCache.

OSCache and Check List:

Cache4J: Cache4j is a cache for Java objects that stores objects only in memory (suitable for Russian speaking guys only, as there is no documentation in English and the JavaDoc is in Russian too :D). It is mainly useful for caching POJO objects only. In the wish list they stated that they want to support disk caching and distributed handling as well, but that was a long time ago, in 2006, and nothing has happened since. It supports the LRU, LFU, and FIFO caching algorithms. For storing objects in its cache, cache4j offers hard and soft references (a common practice for caching frameworks is to use weak or soft references, because when the JVM needs to garbage collect some objects to make room in memory, the cached objects can be among the first to be removed). Cache4j is implemented in a way that lets multiple application threads access the cache simultaneously.
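To make the soft reference idea above a bit more concrete, here is a minimal sketch of a cache whose values are held through java.lang.ref.SoftReference. It is not Cache4j code; SoftValueCacheSketch and its methods are hypothetical names, and the sketch assumes a ConcurrentHashMap so that several threads can use it at once.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical soft-reference cache (illustration only, not the Cache4j API).
class SoftValueCacheSketch<K, V> {
    private final Map<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    // The value is only softly reachable through the cache, so the garbage
    // collector may clear it when memory runs low.
    void put(K key, V value) {
        map.put(key, new SoftReference<>(value));
    }

    // Returns null on a miss, including the case where the value was collected.
    V get(K key) {
        SoftReference<V> ref = map.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            map.remove(key, ref); // drop the stale entry left behind by the GC
        }
        return value;
    }

    int size() {
        return map.size();
    }
}
```

A "hard" (strong) reference cache would store the value directly in the map instead, which keeps it alive until the cache itself evicts it.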
Cache4j also provides easy-to-use programming APIs.

-Memory Cache: Cache4J supports LRU, LFU and FIFO.

Cache4J Check List:

Performance in action: OK, now it is show time. For this performance testing programmer 1 used 3 different test cases, which are as follows:
1-Test Case from JCS site (applied on all caching frameworks)
2-JBoss Cache benchmark framework (which is really a very nice cache benchmark framework)
3-Test Case from Cache4J site (applied on all caching frameworks)
The 1st and 3rd test cases are just simple tests of populating the cache and retrieving from it, while the JBoss cache benchmark ships with a lot of test cases, from replication to distributed and clustering testing. All the tests here were performed on a single machine (no distributed testing was performed) and all of them were performed in memory.

The versions of the frameworks we are going to test now are as follows:
OSCache: 2.4.1
Ehcache: 1.6.0
JCS: 1.3
Cache4j: 0.4

Configurations Used:
[Figures: OSCache, Ehcache and JCS configurations]
Cache4J:

    SynchronizedCache cache = new SynchronizedCache();
    cache.setCacheConfig(new CacheConfigImpl("cacheId", null, 0, 0, 0, 1000000, null, "lru", "strong"));

JBoss cache benchmark: We can see here that nearly 8 million get operations were invoked on the different cache frameworks; JCS took the least time while OSCache took the most. We also see that nearly 2 million put operations were invoked on the different cache frameworks; cache4j took the least time while OSCache took the most. The cache test performed here was an in-memory test with 25 threads accessing the cache, but we will not depend on this alone and will just continue with our testing.

JCS Test Case:
[Figures: OSCache vs. Ehcache, OSCache vs. JCS, OSCache vs. Cache4J, Ehcache vs. JCS, Ehcache vs. Cache4J, JCS vs. Cache4J]
The winner in this test is Ehcache, which achieved outstanding results against all the other frameworks. This test just adds 50,000 items to the cache, then retrieves them, and measures the time taken for adding the items to and getting them from the cache.

Cache4j Test Case:

    ---------------------------------------------------------------
    java.version=1.6.0_10
    java.vm.name=Java HotSpot(TM) Client VM
    java.vm.version=11.0-b15
    java.vm.info=mixed mode, sharing
    java.vm.vendor=Sun Microsystems Inc.
    os.name=Windows XP
    os.version=5.1
    os.arch=x86
    ---------------------------------------------------------------
    This test can take about 5-10 minutes. Please wait ...
    ---------------------------------------------------------------
                     GetPutRemoveT   GetPutRemove   Get
    ---------------------------------------------------------------
    cache4j 0.4      2250            2125           1703
    oscache 2.4.1    4032            4828           1204
    ehcache 1.6      1860            1109           703
    jcs 1.3          2109            1672           766
    ---------------------------------------------------------------

As we can see, OSCache again took the most time overall while Ehcache took the least. This test also only adds and retrieves cache items, which means there is no cache miss (like the test cases in the JBoss cache benchmark).

And the gold medal goes to! Our candidate framework in this part is Ehcache, which achieved the best time in most of the tests and the best performance for cache misses and cache hits, and not only that, but it also provides very good features, from monitoring statistics to distributed functionality.
2nd place goes to JCS and OSCache. JCS is really a great caching framework; it won't serve the need for caching responses and JSP portions, but it is a great choice for caching POJOs. OSCache has nice features, but unfortunately its performance is not that good. That is because an exception is thrown when there is a cache miss, which affects performance; most of the cache frameworks introduced here just return null when a cache miss is encountered. Finally, in last place comes Cache4j, which really did a great job in caching but isn't feature rich, and it is documented in Russian, so the documentation won't be helpful when you face a problem with it :D. It still achieved outstanding results.

Conclusion: In this part we have seen different cache frameworks and compared them, but that's not the end; we still have more open source caching frameworks to check, so stay tuned ;)

Posted by Ahmed Ali at 10:55 PM
Labels: Algorithms, Framework

Intro to Caching,Caching algorithms and caching frameworks part 5

Introduction: In part 1 we talked about the introduction to caching and some caching terminology, in part 2 and part 3 we saw implementations of the famous cache replacement algorithms, and in part 4 we saw comparisons between some famous caching frameworks. In this part we are going to continue what we started in part 4, and as in part 4 we will concentrate only on memory caching.

The Task: After programmer 1 released the caching article in Programming Mania, the geek to geek magazine, he got a lot of threatening mails and terrible messages from caching geeks defending their beloved caching frameworks and warning him that if he didn't make their beloved caching framework win the contest, he would regret the day he became a programmer. That didn't scare our programmer, and he went on to complete the second part of the comparison.
The frameworks covered in this part are:
ShiftOne
WhirlyCache
SwarmCache
JBoss Cache

ShiftOne: ShiftOne, or JOCache as they call it, is a lightweight caching framework that implements several strict object caching policies and comes with a set of cache algorithm implementations that support in-memory caching. ShiftOne cache enforces two rules for every cache:
Max Size - each cache has a hard limit on the number of elements it will contain. When this limit is exceeded, the least valuable element is evicted. This happens immediately, on the same thread. This prevents the cache from growing uncontrollably.
Element Timeout - each cache has a maximum time that its elements are considered valid. No element will ever be returned that exceeds this time limit. This ensures a predictable data freshness.
ShiftOne uses the decorator pattern to make it more flexible for the user to use any underlying caching product to maintain the cache. The following caching products can be plugged into ShiftOne:
EHCache
SwarmCache
JCS Cache
Oro Cache
ShiftOne enables the client to gather statistics (hit/miss) about the cache by using JMX, and not only that, it also enables integration with Hibernate ORM through adaptors. When it comes to in-memory caching (which is the only thing JOCache supports), JOCache uses soft references for the cache entries. JOCache was originally implemented as part of the ExQ project to support ResultSet caching. It was later split out for use by other projects. It was designed to cache large, expensive database query results.

-Memory Cache: ShiftOne cache supports LRU, LFU, FIFO, Single, Zero

ShiftOne and Check List:

WhirlyCache: WhirlyCache is a fast, configurable in-memory object cache for Java. It can be used to speed up a website or an application by caching objects that would otherwise have to be created by querying a database or by another expensive procedure.
WhirlyCache runs a separate thread to prune the cache; in other words, cache maintenance is not done on the application thread that the client uses. Thus, there is less burden on the application thread. Whirlycache is built around several design principles that differ from other cache implementations:
- Require synchronization as infrequently as possible
- Do as little as possible in the insertion and retrieval operations
- Soft limits are acceptable for many applications
- Disk overflow becomes a bad idea very quickly
Many attributes of Whirlycache are configurable in an XML file, but the most important components of the cache are the Backend, the Tuner, and the Policy. WhirlyCache supports pluggable backend implementations that need to implement the ManagedCache interface (which is a subinterface of java.util.Map, although not all the methods of Map need to be implemented). WhirlyCache currently supports two backends: ConcurrentHashMap and FastHashMap. You can even implement your own backend by implementing the ManagedCache interface. The Tuner is a background thread that performs the cache maintenance activities specified in the configured Policy implementation. One Tuner thread per cache is created and it is configured to run every n seconds. It depends on your application, but you definitely don't want to run the Tuner too often, since it will only serve to burden the system unnecessarily.

-Memory Cache: Currently, WhirlyCache offers FIFO, LFU and LRU. You can specify a different Policy implementation per named cache in the whirlycache.xml configuration file.

WhirlyCache and Check List:

SwarmCache: SwarmCache is an in-memory cache intended more for caching domain objects on the data access layer. It offers support for a distributed cache in a clustered environment. SwarmCache supports the LRU caching algorithm. However, SwarmCache is essentially an in-memory cache.
When LRU is set as the caching algorithm and the memory capacity is reached, SwarmCache evicts objects from its memory as per LRU logic. SwarmCache uses soft references to the cached objects. So, if LRU is not set as the caching algorithm, it relies on the garbage collector to sweep through its memory and clean up objects that are least frequently accessed. However, SwarmCache recommends setting a combination of the above two as the caching algorithm. SwarmCache provides a wrapper in order to be used with Hibernate ORM and DataNucleus.

When used in a clustered environment, each server instantiates its own manager. For each type of object that the server wishes to cache, it instantiates a cache and adds it to the manager. The manager joins a multicast group and communicates with other managers in the group. Whenever an object is removed from a cache, the manager notifies all other managers in the group. Those managers then ensure that the object is removed from their respective caches. The result is that a server will not have in its cache a stale version of an object that has been updated or deleted on another server. Note that the managers only need to communicate when an object is removed from a cache. This only happens when an object is updated or deleted. The managers do not co-operate beyond this. This means that the amount of inter-server communication is proportional to the amount of updates/deletes of the application. Also notice that there is no "server"; all hosts are equal peers and they can come and go from the cache group as they please without affecting other group members. Thus the operation of the distributed cache is very robust.

-Memory Cache: LRU, Timeout, Automatic and Hybrid

SwarmCache and Check List:

JBoss Cache: JBoss offers two kinds of cache flavors, namely CoreCache and PojoCache. JBoss Core Cache is a tree-structured, clustered, transactional cache.
It can be used in a standalone, non-clustered environment, to cache frequently accessed data in memory, thereby removing data retrieval or calculation bottlenecks while providing "enterprise" features such as JTA compatibility, eviction and persistence. JBoss Cache is also a clustered cache, and can be used in a cluster to replicate state, providing a high degree of failover. A variety of replication modes are supported, including invalidation and buddy replication, and network communications can either be synchronous or asynchronous. JBoss Cache can be, and often is, used outside of JBoss AS, in other Java EE environments such as Spring, Tomcat, Glassfish, BEA WebLogic, IBM WebSphere, and even in standalone Java programs, thanks to its minimal dependency set.

POJO Cache is an extension of the core JBoss Cache API. POJO Cache offers additional functionality such as:
- maintaining object references even after replication or persistence.
- fine grained replication, where only modified object fields are replicated.
- "API-less" clustering model where POJOs are simply annotated as being clustered.

In addition, JBoss Cache offers a rich set of enterprise-class features:
- being able to participate in JTA transactions (works with most Java EE compliant transaction managers).
- Attach to JMX consoles and provide runtime statistics on the state of the cache.
- Allow client code to attach listeners and receive notifications on cache events.
- Allow grouping of cache operations into batches, for efficient replication.

The cache is organized as a tree, with a single root. Each node in the tree essentially contains a map, which acts as a store for key/value pairs. The only requirement placed on objects that are cached is that they implement java.io.Serializable. JBoss Cache works out of the box with most popular transaction managers, and even provides an API where custom transaction manager lookups can be written. The cache is completely thread-safe.
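The tree organization described above (a single root, with every node holding a map of key/value pairs) can be pictured with the following sketch. This is not the JBoss Cache API: TreeNodeSketch, its slash-separated paths and its methods are hypothetical, and replication, transactions and eviction are left out entirely.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical tree-structured cache node (illustration only, not the JBoss Cache API).
class TreeNodeSketch {
    private final Map<String, TreeNodeSketch> children = new ConcurrentHashMap<>();
    private final Map<String, Object> data = new ConcurrentHashMap<>();

    // Walks a path such as "/users/42" starting from this node.
    // Missing nodes are created when create is true; otherwise null is returned.
    TreeNodeSketch nodeAt(String path, boolean create) {
        TreeNodeSketch node = this;
        for (String part : path.split("/")) {
            if (part.isEmpty()) continue; // skip the leading slash
            TreeNodeSketch child = create
                    ? node.children.computeIfAbsent(part, k -> new TreeNodeSketch())
                    : node.children.get(part);
            if (child == null) return null;
            node = child;
        }
        return node;
    }

    // Stores a key/value pair in the map of the node found at the given path.
    void put(String path, String key, Object value) {
        nodeAt(path, true).data.put(key, value);
    }

    Object get(String path, String key) {
        TreeNodeSketch node = nodeAt(path, false);
        return node == null ? null : node.data.get(key);
    }

    // Removing a node also drops its whole subtree, which is one reason
    // a remove on a tree cache can cost more than on a flat map.
    boolean removeNode(String path) {
        int cut = path.lastIndexOf('/');
        String parentPath = cut <= 0 ? "" : path.substring(0, cut);
        String name = path.substring(cut + 1);
        TreeNodeSketch parent = nodeAt(parentPath, false);
        return parent != null && !name.isEmpty() && parent.children.remove(name) != null;
    }
}
```

For example, after root.put("/users/42", "name", "Ann"), a call to root.removeNode("/users") makes the "name" entry under /users/42 unreachable as well.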
JBoss Cache employs multi-versioned concurrency control (MVCC) to ensure thread safety between readers and writers, while maintaining a high degree of concurrency. The specific MVCC implementation used in JBoss Cache allows reader threads to be completely free of locks and synchronized blocks, ensuring a very high degree of performance for read-heavy applications. It also uses custom, highly performant lock implementations that employ modern compare-and-swap techniques for writer threads, which are tuned to multi-core CPU architectures. Multi-versioned concurrency control (MVCC) is the default locking scheme since JBoss Cache 3.x.

-Memory Cache: JBoss cache supports LRU, LFU, MRU, Expiration, ElementSize and FIFO

JBoss Check List:

Performance in action: OK, now it is show time. For this performance testing programmer 1 used 3 different test cases, which are as follows:
1-Test Case from JCS site (applied on all caching frameworks)
2-JBoss Cache benchmark framework (which is really a very nice cache benchmark framework)
3-Test Case from Cache4J site (applied on all caching frameworks)
The 1st and 3rd test cases are just simple tests of populating the cache and retrieving from it, while the JBoss cache benchmark ships with a lot of test cases, from replication to distributed and clustering testing. All the tests here were performed on a single machine (no distributed testing was performed) and all of them were performed in memory.

The versions of the frameworks we are going to test now are as follows:
OSCache: 2.4.1
Ehcache: 1.6.0
JCS: 1.3
Cache4j: 0.4
JBoss: 3.0.0
Whirly: 1.0.1
Swarm: 1.0
ShiftOne: 2.0b

JBoss cache benchmark: We can see here that nearly 8 million get operations were invoked on the different cache frameworks; WhirlyCache took the least time (followed by JBoss Cache) while OSCache took the most.
We see here that nearly 2 million put operations were invoked on the different cache frameworks; WhirlyCache took the least time while OSCache took the most. The cache test performed here was an in-memory test with 25 threads accessing the cache.

JCS Test Case:
[Figures: Cache4j vs. JBoss, EhCache vs. JBoss, JCS vs. JBoss, OSCache vs. JBoss, ShiftOne vs. Cache4J, ShiftOne vs. EhCache, ShiftOne vs. JCS, ShiftOne vs. OSCache, ShiftOne vs. Swarm, ShiftOne vs. JBoss, Swarm vs. Cache4J, Swarm vs. EhCache, Swarm vs. JBoss, Swarm vs. JCS, Swarm vs. OSCache, Whirly vs. Cache4J, Whirly vs. EhCache, Whirly vs. JBoss, Whirly vs. JCS, Whirly vs. OSCache, Whirly vs. ShiftOne, Whirly vs. Swarm]
The winner in this test is Ehcache, which achieved outstanding results against all the other frameworks; in 2nd place comes WhirlyCache and in 3rd place comes JBoss cache.

Cache4j Test Case:
[Figure: Cache4J Test With Remove]
As we can see, SwarmCache took the most time while Ehcache and WhirlyCache took the least. This test also only adds and retrieves cache items, which means there is no cache miss (like the test cases in the JBoss cache benchmark). But there is an extra step in this test, which is removing cache entries from the cache, and if we omit this operation (and just concentrate on the put and get operations) we get the following results:
[Figure: Cache4J Test Without Remove]
As we can see, the JBoss and Swarm times are heavily reduced. This means that the remove operation takes a lot of time in these two cache frameworks, but let's not forget that JBoss is not a flat cache (it is a structured cache), which might be the reason for the delay, and it also uses a transaction-like mechanism for caching, which would affect its performance as well but is still a great feature (and for sure we won't invoke the remove method that often).

And the gold medal goes to!
Our candidate frameworks in this part are WhirlyCache and JBoss cache. Both of them achieved very good performance for cache hits and misses, but let's not forget that Whirly is not a distributed cache, which is a bad thing. Besides that, JBoss offers a structured cache, as we discussed before, along with the transaction mechanism it offers. WhirlyCache is really nice for in-memory caching, in either single-threaded or multi-threaded applications. On the contrary, SwarmCache performance is really bad in multi-threaded applications; it threw an out-of-memory exception more than once while it was being tested. Second place goes to ShiftOne, which is really nice but suffers from a lack of support, documentation and even configuration. If we include the caches we introduced in the previous part, we have the following order:
First place: EhCache (still the best) along with Whirly and JBoss
Second place: ShiftOne and JCS
Third place: Cache4J and OSCache
The worst performance was achieved by Swarm cache (I guess it would be faster not to cache your objects than to cache them with Swarm cache :D)

Conclusion: In this part we have seen the comparison of different open source cache frameworks and concluded that EhCache is one of the best choices (besides JBoss and Whirly cache) while Swarm is one of the poorest choices you will ever make.

Posted by Ahmed Ali at 9:34 PM
Labels: Framework