# The Effect Size and the change in Means can give different answers as to which method of teaching works best

The Effect Size is the change in means divided by the standard deviation.
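Written as a formula, the definition above looks like this. The symbols are introduced here for illustration and are not from the original post; in particular, the post does not say which standard deviation $s$ is meant (before, after, or pooled), so $s$ simply stands for whichever one is used.

$$
\text{Effect Size} = \frac{\bar{x}_{\text{after}} - \bar{x}_{\text{before}}}{s}
$$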

Example

Take two Maths classes filled with children of equal ability. We are going to give them a test on a topic, teach them using two different methods, test them again and then calculate the Effect Size.

Class 1 gets an average mark of 50% at the start and an average mark of 60% at the end.

Class 1 gains 10 percentage points.

Class 2 gets an average mark of 50% at the start and an average mark of 70% at the end.

Class 2 gains 20 percentage points.

Stop at this point and ask yourself: which method of teaching do you think is better?

Now let’s calculate the Effect Sizes.

Class 1 has a standard deviation of 5%, so its Effect Size is 10/5 = 2.

Class 2 has a standard deviation of 20%, so its Effect Size is 20/20 = 1.

The Effect Size says that method 1 is better.

The Effect Size and the change in Means give different answers as to which method of teaching works best.
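The arithmetic above can be checked with a few lines of Python. This is only a sketch of the example as stated: the function name `effect_size` and the dictionary layout are made up for illustration, and the standard deviations are taken as given in the post rather than computed from raw marks.

```python
def effect_size(mean_before, mean_after, sd):
    """Change in means divided by the standard deviation (the post's definition)."""
    return (mean_after - mean_before) / sd


# Figures from the example; marks are percentages.
classes = {
    "Class 1": {"before": 50, "after": 60, "sd": 5},
    "Class 2": {"before": 50, "after": 70, "sd": 20},
}

results = {}
for name, c in classes.items():
    gain = c["after"] - c["before"]
    es = effect_size(c["before"], c["after"], c["sd"])
    results[name] = (gain, es)
    print(f"{name}: gain = {gain} points, effect size = {es}")

# Rank the classes by each measure.
best_by_gain = max(results, key=lambda k: results[k][0])
best_by_effect_size = max(results, key=lambda k: results[k][1])
print(f"Larger gain: {best_by_gain}; larger effect size: {best_by_effect_size}")
```

The two rankings disagree: Class 2 has the larger gain, while Class 1 has the larger Effect Size.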

## 3 thoughts on “The Effect Size and the change in Means can give different answers as to which method of teaching works best”

1. Jan Tishauser

I’m afraid that your way of calculating effect sizes is rather flawed. You should either use the pooled standard deviation or the standard deviation of class 2, if that’s the class that you thought would score higher when you made your hypothesis. A correct calculation would be to subtract the average score of class 1 from that of class 2 and then divide the outcome either by the pooled standard deviation or by the standard deviation of class 2. Furthermore, you don’t report how you calculated your standard deviation. It should be a standard deviation for samples (as opposed to a standard deviation for the population). Given your example, I would calculate an effect size of 0.8 for the teaching method in class 2 in comparison to the method in class 1.

What you forgot in your method of calculating the effect sizes, is that research is always about comparing “this” with “that”, as you did when you compared the means. Comparing “this” with “that” only works with a common SD.

P.S. The difference in SD between the two groups is quite uncommon. In the real world, differences in SD between two classes are not very large. It seems you chose your SDs to help you make your point.

• I’m using the method Hattie uses to calculate his ‘Effect Size’, as explained on Page 8 of Visible Learning.

The pooled standard deviation Hattie talks about on Page 8 refers to the pooled before and after results of the same class, not pooled results between two classes.

I think you’re making my point for me. Using the ‘Effect Size’ only works for comparing two things with the same standard deviation. Any two groups will have different standard deviations, therefore they can’t be compared using the ‘Effect Size’.

2. Dylan Wiliam

If you want the effect size to be interpretable at the population level, then you should really use the population standard deviation. However, because you never know that, the pooled standard deviation of the control and experimental groups is often recommended, because it is assumed to be a better estimate of the population standard deviation than that of either the control or the treatment group. But because some educational interventions are explicitly designed to reduce the “achievement gap”, using the pooled standard deviation may actually bias your results (and in my experience finding a much smaller SD in the treatment group is not at all uncommon). So when I teach this, I tell my students that they have to defend their choice of which approach they use.
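As an aside to the thread above, here is a rough sketch of how much the choice of standard deviation matters when the two classes from the post are compared directly. It rests on assumptions the post does not state: that the quoted SDs (5 and 20) describe the end-of-course marks, and that the classes are the same size, so the pooled SD reduces to the root mean square of the two. All variable names are illustrative.

```python
import math

# End-of-course figures from the post (assumed to be the relevant ones).
mean_class_1, sd_class_1 = 60, 5
mean_class_2, sd_class_2 = 70, 20

# Difference between the two classes' final means.
difference = mean_class_2 - mean_class_1

# Candidate denominators. Equal class sizes are ASSUMED for the pooled SD.
denominators = {
    "class 1 SD": sd_class_1,
    "class 2 SD": sd_class_2,
    "pooled SD": math.sqrt((sd_class_1**2 + sd_class_2**2) / 2),
    "mean of SDs": (sd_class_1 + sd_class_2) / 2,
}

effect_sizes = {label: difference / sd for label, sd in denominators.items()}

for label, es in effect_sizes.items():
    print(f"{label}: effect size = {es:.2f}")
```

Under these assumptions the same 10-point difference gives effect sizes ranging from 0.5 (class 2 SD) to 2 (class 1 SD), with about 0.69 for the pooled SD. Simply averaging the two SDs gives 0.8, which happens to match the figure quoted in the first comment, though that comment does not say how its figure was obtained.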