
Interpret the results

In order to interpret the results of a stress test, it is important to understand some basic elements of statistics:

(1) The mean value (μ)

The following equation shows how the mean value (μ) is calculated:

μ = (1/n) * Σ(i=1..n) xi

The mean value of a given measure is what is commonly referred to as the average value of this measure. An important thing to understand is that the mean value can be very misleading: it does not show you how close to (or far from) the average your values are. An example is always better than a long explanation. Let's assume that we are measuring response times in milliseconds in 2 different stress tests:

Stress Test 1: x1=100, x2=110, x3=90, x4=900, x5=890, x6=910 gives

μ = 1/6 * (100 + 110 + 90 + 900 + 890 + 910) = 500 ms

Stress Test 2: x1=490, x2=510, x3=535, x4=465, x5=590, x6=410 gives

μ = 1/6 * (490 + 510 + 535 + 465 + 590 + 410) = 500 ms

In both cases the mean value (μ) is the same. However, if you observe the values taken by the response times closely, you will see that in the first case the values are far from the mean value, whereas in the second case they are close to it. It is quite obvious with this example that a measure of the distance to the mean value is needed in order to draw any kind of conclusion based on the mean value.
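The identical means are easy to verify with a short Python sketch (illustrative only, not part of the original article; the two lists simply restate the samples above):

import statistics

# Response-time samples (ms) from the two hypothetical stress tests above.
test1 = [100, 110, 90, 900, 890, 910]
test2 = [490, 510, 535, 465, 590, 410]

# mean = (1/n) * sum(x_i)
print(statistics.mean(test1))  # 500 -- same mean,
print(statistics.mean(test2))  # 500 -- very different spread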

(2) The standard deviation (σ)

The following equation shows how the standard deviation (σ) is calculated:

σ = sqrt( (1/n) * Σ(i=1..n) (xi - μ)² )

The standard deviation (σ) measures the mean distance of the values to their average (μ). In other words, it gives us a good idea of the dispersion or variability of the measures around their mean value. Let's go back to our example and calculate the standard deviation of each of our theoretical stress tests:

Stress Test 1:

σ = sqrt( 1/6 * ((100-500)² + (110-500)² + (90-500)² + (900-500)² + (890-500)² + (910-500)²) ) ≈ 400 ms

Stress Test 2:

σ = sqrt( 1/6 * ((490-500)² + (510-500)² + (535-500)² + (465-500)² + (590-500)² + (410-500)²) ) ≈ 56 ms

The 2 values of the standard deviation calculated above are very different:

* In the first case, the standard deviation is high compared to the mean value, which shows us that our measures are very variable (or mostly far from the mean value) and that the mean value is not very significant.
* In the second case, the standard deviation is low compared to the mean value, which shows us that our measures are not dispersed (or mostly close to the mean value) and that the mean value is significant.

(3) The sampling size and the quality of the measure

Another interesting question is whether our calculated mean value is a good estimation of the real mean value. In other words, when calculating the mean value of the response time during a test case, do we have a good estimation of the real mean response time of the same scenario repeated indefinitely? In probability theory, the Central Limit Theorem states conditions under which the mean of a sufficiently large number of independent random variables, each with finite mean and variance, will be approximately normally distributed.

The measures of response times and throughput obtained during stress tests comply with the Central Limit Theorem, as we usually have a large number of independent and random measures which have a finite (calculated by JMeter) mean value and standard deviation. We can thus assume that the mean values of the response time and the throughput are approximately normally distributed.

This allows us to calculate a Confidence Interval for these mean values. The Confidence Interval gives us a measure of the quality of our mean values, as it allows us to calculate the variability of our mean value (an interval) with a predefined probability. You can, for example, decide to calculate your Confidence Interval at 95%, which will tell you that the probability that the mean value lies within the calculated interval is 95%. Conversely, you can decide to calculate the probability that the mean value lies within a given interval (see the examples below).

The following equation shows how the Confidence Interval (CI) is calculated:

CI = [ μ - Z*σ/sqrt(n), μ + Z*σ/sqrt(n) ]

where:
* μ is the calculated mean value of our sample,
* σ is the calculated standard deviation of our sample, and
* Z is the value for which the area under the bell-shaped curve of the standard normal distribution between -Z and Z equals the chosen confidence C.

The following table gives values of Z for various values of the confidence C:

C        Z
0.80     1.281551565545
0.90     1.644853626951
0.95     1.959963984540
0.98     2.326347874041
0.99     2.575829303549
0.995    2.807033768344
0.998    3.090232306168
0.999    3.290526731492
0.9999   3.890591886413
0.99999  4.417173413469

Source: http://en.wikipedia.org/wiki/Normal_distribution
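The worked examples that follow can be reproduced with a short Python sketch (illustrative only; the function and variable names are ours, not from the original article):

import math

# Response-time samples (ms) from the two hypothetical stress tests above.
test1 = [100, 110, 90, 900, 890, 910]
test2 = [490, 510, 535, 465, 590, 410]

def mean(xs):
    return sum(xs) / len(xs)

def stddev(xs):
    """Population standard deviation: sqrt((1/n) * sum((x_i - mu)^2))."""
    mu = mean(xs)
    return math.sqrt(sum((x - mu) ** 2 for x in xs) / len(xs))

def confidence_interval(xs, z=1.959963984540):
    """CI = [mu - z*sigma/sqrt(n), mu + z*sigma/sqrt(n)]; default z is for C = 0.95."""
    mu, half = mean(xs), z * stddev(xs) / math.sqrt(len(xs))
    return mu - half, mu + half

def confidence_for_interval(xs, half_width):
    """Probability that the mean lies within mu +/- half_width:
    C = 2*Phi(Z) - 1 = erf(Z / sqrt(2)) for the standard normal distribution."""
    z = half_width * math.sqrt(len(xs)) / stddev(xs)
    return math.erf(z / math.sqrt(2))

for name, xs in (("Stress Test 1", test1), ("Stress Test 2", test2)):
    lo, hi = confidence_interval(xs)
    print(f"{name}: sigma={stddev(xs):.0f} ms, 95% CI=[{lo:.0f}, {hi:.0f}], "
          f"P(mean within +/-10 ms)={confidence_for_interval(xs, 10):.0%}")
# Stress Test 1: sigma=400 ms, 95% CI=[180, 820], P(mean within +/-10 ms)=5%
# Stress Test 2: sigma=56 ms,  95% CI=[455, 545], P(mean within +/-10 ms)=34%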

If we go back to our previous examples, we can calculate the confidence intervals of our mean values at 95%:

CI1 = [500 - 1.96*400/sqrt(6); 500 + 1.96*400/sqrt(6)] ≈ [180; 820]
CI2 = [500 - 1.96*56/sqrt(6); 500 + 1.96*56/sqrt(6)] ≈ [455; 545]

This means that the probability that the mean response time lies in the calculated confidence interval is 95%.

We can also calculate the probability that the mean value lies in the interval [490, 510]:

10 = Z1 * 400 / sqrt(6) => Z1 = 10 * sqrt(6) / 400 => Z1 ≈ 0.06 => C1 ≈ 5%
10 = Z2 * 56 / sqrt(6) => Z2 = 10 * sqrt(6) / 56 => Z2 ≈ 0.44 => C2 ≈ 34%

Performance Engineering Quick Review Sheet

Goals
* Determine the client's needs and expectations.
* Ensure accountability can be assigned for detected performance issues.

Pre-Signing
1. How many users (human and system) need to be able to use the system concurrently?
   a. What is the total user base?
   b. What is the projected acceptance rate?
   c. How are the users distributed across the day/week/month?
   d. Example: Assuming evenly distributed billing cycles throughout the month, users spending about 15 minutes each viewing and paying their bill, and a site that is generally accessed between 9 AM EST and 6 PM PST (15 hours), calculate concurrent users with the following formula:

   (total monthly users) / (30 days a month * 15 hours a day * 4 {note: 60 min / 15 min per user}) = daily average concurrent user load

   Normally test to 200% of the daily average concurrent user load. If there are 1 million monthly users: 1,000,000/(30*15*4) = 556 concurrent users (2,222 hourly users). Recommend testing up to 1,000 concurrent users (4,000 hourly users).
2. General performance objectives
   a. Can the system service up to a peak of ???? customers an hour without losing customers due to performance issues?
   b. Is the system stable and functional under extreme stress conditions?
3. What are the project boundaries, such as:
   a. Are all bottlenecks resolved? Or...
   b. Best possible performance in ?? months? Or...

   c. Continue tuning until goals are met? Or...
   d. Continue tuning until bottlenecks are deemed "unfixable" for the current release?

Pre-Testing
1. What are the specific/detailed performance requirements?
   a. User Experience (preferred) - e.g. "With 500 concurrent users, a user accessing the site over a LAN connection will experience no more than 5 seconds to display a small or medium bill detail report 95% of the time" (see the sketch after this section for one way to validate such a requirement).
   b. Component Metric (not recommended) - e.g. "Database server will keep memory usage under 75% during all tested user loads"
2. Detailed description of the test and production environments.
   a. Associated risks if they are not identical
   b. How best to mark up performance if they are not identical
3. What is the availability of client system administrators/developers/architects?
4. What is the test schedule?
   a. Can tests be executed during business hours?
   b. Must tests be executed during off hours?
   c. Must tests be executed during weekends?
5. Are system monitoring tools already installed/being used on the systems under test?
   a. Perfmeter, Perfmon, Top
   b. Who will monitor/evaluate monitoring tools on client machines?
6. Are any other applications/services running on the systems to be tested?
   a. Think about the risks associated with shared environments, like memory, disk I/O, drive space, etc.
7. What are the types of tests/users/paths desired?
   a. Target 80% of user activity
   b. Always model system-intensive activity
8. Based on the answers, create a Test Strategy Document.

Test Design/Execution
1. Design tests to validate requirements.
   a. Create User Experience tests.
   b. Generate loads and collect Component Metric data while under load.
2. Always benchmark the application in a known environment first.
   a. Architecture need not match.
   b. Benchmark tests do not need to follow the user community model exactly.
   c. Benchmark tests should represent about 15% of the expected peak load.
   d. Look for "low hanging fruit" problems.
   e. Do not focus on maximizing User Experience; just look for show-stopping bottlenecks.
3. Benchmark in the production environment if possible.
   a. Look for problems with code/implementation.
   b. Look for problems out of scope that the client must fix to meet performance goals (i.e. network, architecture, security, firewalls).
   c. This benchmark must be identical to the benchmark conducted in the known environment (taking into account differences in architecture).
   d. Do not focus on maximizing User Experience; just look for show-stopping bottlenecks.
4. Load Test
   a. Iteratively test/tune/increase load.
   b. Start with about the same load as the benchmark, but accurately depicting the user community.
   c. Do not tune until the critical (primary) bottleneck cause is found - do not tune symptoms, and do not tune components that "could be faster" unless you can prove they are the primary bottleneck.
   d. Re-test after each change to validate that it helped - if it didn't, change it back and try the next most likely cause of the bottleneck. Make a note of the change in case it becomes the primary bottleneck later.
   e. If bottlenecks are client-environment related (i.e. out of scope), document the bottleneck and present it to the PM for guidance. Stop testing until the client/PM agree on an approach.
   f. *Note* If you need performance improved by 200% to reach the goal, tuning a method to execute in 0.2 seconds instead of 0.25 seconds will not fix the problem.

Results Reporting
1. Report test results related to stated requirements only.
   a. Show summarized data validating requirements.
   b. If requirements are not met, show data as to why, what needs to be fixed, and who needs to fix it by when.
   c. Show areas of potential future improvement that are out of scope.
   d. Show a summary of tuning/settings/configurations if it adds value to the client.
   e. Be prepared to deliver formatted raw data as back-up.
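As a companion to the User Experience requirement example in Pre-Testing above, here is a minimal Python sketch of a percentile-based pass/fail check (the 5-second threshold and 95th percentile come from the example requirement; the sample data and function names are hypothetical):

import math

def percentile(samples, pct):
    """Nearest-rank percentile of a list of response times."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical measured response times in seconds for one transaction type.
response_times_s = [1.2, 0.8, 3.1, 4.9, 2.2, 4.3, 1.7, 0.9, 2.5, 3.8]

# Requirement: 95% of requests display the report within 5 seconds.
p95 = percentile(response_times_s, 95)
print("PASS" if p95 <= 5.0 else "FAIL", "- 95th percentile =", p95, "s")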
