Performance blog: How to determine two performance runs are statistically different?

Performance tests are good examples of Normal Distribution. A normal distribution of data means that most of the examples in a set of data are close to the "average," while relatively few examples tend to one extreme or the other.

According to statisticians, Two tests are considered statistically different if it is unlikely to have occurred by chance. we can be 99% sure that averages from two runs are really different only when the other average is more than 2.57 standard deviations away.

The same principle can also be applied to performance tests to determine results obtained are statiscally different.

For example, If the Average response time for transaction A is 1 sec during the first run with the standard deviation of 0.2 and for the second run average response time for the same transaction A is 1.2 sec. The difference between the two transaction response time is 1.2 – 1.0 =0.2, which is 1 standard deviations away (Standard deviation observed during the first run is 0.2) on the average. If the difference is 2.57 standard deviations away(we are 99% sure) then the results are statistically different.
Conversely we can even calculate the statistical limits using the formula below
If R1 is average response time and SD is the standard deviation for a performance run then the average response time for the second run should not exceed R1 + (SD* 2.33) or should not be less than R1 - (SD* 2.33). For the above example response time for the second run should not exceed 1 + (0.2* 2.33) =1.466 on the positive side of the average and 1 –(0.2*2.33) = 0.534 on the negative side of the average

Thursday, September 24, 2009

How to determine two performance runs are statistically different?

No comments: