The MAX function returns the largest value of the dataset C5:C13, and the MIN function returns the smallest value of that range. Lastly, by subtracting the minimum value from the maximum one, we get the Range of the Steel-cut oats.

Finally, drag down the Fill Handle (+) tool to copy all the formulas and calculate the Mean, Standard Deviation, CV, and Range of the Rolled oats data set.

Statistical Comparison Between Data Sets in Excel

Let's compare the data sets based on the results of the above calculation.

Mean: The mean is the arithmetic average of a dataset. From the above calculation, we can see that the mean sales of Rolled oats is greater than that of Steel-cut oats. That means that, on average over the period, the sales of Rolled oats are greater than those of Steel-cut oats.

Standard Deviation: The standard deviation is a measure of the amount of variation of data points or values relative to their average or mean. For example, a low standard deviation tells us that the values tend to be close to the mean of the dataset. On the other hand, a high standard deviation means that the values are spread out over a wider range. Here, the standard deviation is greater for Rolled oats, which indicates that the sales values of Rolled oats are spread out over a wider range than those of Steel-cut oats.

CV: The coefficient of variation (CV) is a relative measure of variability that expresses the size of a standard deviation relative to its mean. From our above calculation, we can see that the CV of Steel-cut oats is slightly higher than that of Rolled oats. Consequently, relative to their mean, the sales values of Rolled oats are more consistent than those of Steel-cut oats.

Range: In statistics, the range of a set of data is the difference between the largest and smallest values. It is evident from the datasets that Rolled oats have a higher range. This result indicates that, for some months, the fluctuation in sales of Rolled oats is higher than that of Steel-cut oats.

In the above article, I have tried to discuss the statistical comparison method in detail. Hopefully, this method and explanation will be enough to solve your problems. Please let me know if you have any queries.

I have 2 columns and multiple rows of data in Excel. Each column represents an algorithm, and the values in the rows are the results of these algorithms with different parameters. I want to run a statistical significance test on these two algorithms with Excel. Can anyone suggest a function?

As a result, it would be nice to state something like "Algorithm A performs 8% better than Algorithm B with 0.9 probability (or 95% confidence interval)". The Wikipedia article explains accurately what I need. It seems like a very easy task, but I failed to find a scientific measurement function.

Any advice on a built-in Excel function, or function snippets, is appreciated.

After tharkun's comments, I realized I should clarify some points: The results are merely real numbers between 1 and 100 (they are percentage values). As each row represents a different parameter, the values in a row represent each algorithm's result for this parameter. When I take the average of all values for Algorithm A and Algorithm B, I see that the mean of all results that Algorithm A produced is 10% higher than Algorithm B's. But I don't know if this is statistically significant or not.
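To make the quantities discussed here concrete, the following is a minimal Python sketch, not the article's or the asker's own method. It computes the mean, standard deviation, CV, and range for each column, and then checks whether the row-by-row difference between two columns is statistically significant. Several things are assumptions for illustration: the sample data and the names `summarize` and `paired_sign_flip_test` are hypothetical, the sample (n - 1) standard deviation is used because the text does not say which variant it applies, and an exact sign-flip permutation test stands in for a paired t-test so that the example needs only the standard library. In Excel itself, the built-in `T.TEST(array1, array2, tails, type)` function with `type` set to 1 runs a paired t-test and returns the p-value directly.

```python
from itertools import product
from statistics import mean, stdev


def summarize(values):
    """Return mean, sample standard deviation, CV (%), and range."""
    m = mean(values)
    s = stdev(values)  # assumption: sample standard deviation (n - 1)
    return {
        "mean": m,
        "stdev": s,
        "cv_percent": s / m * 100,
        "range": max(values) - min(values),
    }


def paired_sign_flip_test(a, b):
    """Exact two-sided sign-flip permutation test on paired differences.

    Returns (mean difference, p-value). Under the null hypothesis each
    row's difference is equally likely to be positive or negative.
    Enumerates 2**n sign patterns, so it suits small n (roughly n <= 20).
    """
    if len(a) != len(b):
        raise ValueError("columns must have the same number of rows")
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    hits = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance guards against floating-point rounding.
        if flipped >= observed - 1e-12:
            hits += 1
    return mean(diffs), hits / total


# Hypothetical results (percentages), one row per parameter setting.
algo_a = [72.0, 65.5, 80.1, 58.3, 90.2, 77.4, 69.9, 84.0]
algo_b = [68.0, 60.0, 79.0, 50.1, 85.0, 70.2, 66.3, 80.5]

print("A:", summarize(algo_a))
print("B:", summarize(algo_b))
diff, p_value = paired_sign_flip_test(algo_a, algo_b)
print(f"mean difference = {diff:.2f}, p-value = {p_value:.4f}")
```

A p-value below 0.05 is the conventional threshold for significance at the 95% level. Note that this kind of test only indicates whether the paired difference is distinguishable from zero; it does not by itself support a statement such as "8% better with 0.9 probability", which would need a confidence interval on the mean difference.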