Can we stop yet?

Prof Coe

If you were not a Mathematician, you might think that all Mathematicians are pretty much the same. However, there are three main strands to the Maths that gets taught at University: Pure, Mechanics and Statistics, a bit like Science splits into Biology, Chemistry and Physics. Pure is Algebra, proofs and very abstract things like that, whereas Statistics is all about analysing data from the real world. Someone who was very accomplished at Pure Maths would nevertheless be a total beginner at Statistics, as the skills and knowledge aren’t really transferable.

What we have here is a classic case of someone who is an expert in their own field switching to a different field, forgetting they are no longer an expert, yet still being supremely confident in their own judgement and opinion. A good analogy would be someone who does a Physics degree up to Quantum Mechanics level and then moves over to Biology. They need to go right back to the beginning and quietly start learning the different parts of a cell. Imagine if they started to loudly disagree with accepted Biological opinion after a week of lessons. Yet Professor Coe does disagree with accepted Statistical opinion. No Mathematician uses the Effect Size.

Professor Coe did a Pure Maths degree which had no statistics in it.

John Hattie did an Arts degree which had no statistics in it.

There’s a very simple reason they advocate the use of statistics you won’t find in any Maths textbook: their degrees contained no statistics.

Nobody in Maths uses the Effect Size.

Can we stop yet?


17 thoughts on “Can we stop yet?”

  1. Can we stop yet? Er, no, even though this really is getting rather tiresome, not least because you are undermining your own credibility as a mathematician by making points that are essentially unmathematical. First, you say that no mathematician uses effect size. The information content of this claim is effectively zero because you do not define what you mean by a mathematician; as Jim Milgram points out in his critique of the Common Core State Standards for Mathematics in the United States, one of the hallmarks of mathematics is careful definition. Even if you had given a definition of what you were prepared to accept as a mathematician, you have not established that there is no-one on the planet who (a) meets your definition of mathematician and (b) uses effect size. Second, you seem to assume that unless someone has a particular specialism identified in their first degree, then they cannot be classed as an expert in the field. You could of course defend this position by stating that this is what you mean by the term expert, but it would not be consistent with how the term appears to be used by others. John Hattie does indeed have a first degree in Fine Arts but he was also Professor of Educational Research Methodology at the University of North Carolina at Greensboro from 1994 to 1998. Third, the argument that effect size is inappropriate because mathematicians don’t use it seems to me to be rather strange. Mathematicians don’t use tunneling electron microscopy either, but that’s because tunneling electron microscopy is used to solve problems that mathematicians don’t want to solve.

    As I have argued elsewhere, I believe that standardized effect size is a deeply flawed metric for combining the results of different educational research studies but I think the debate would be advanced more effectively if the critique was focused on the flaws in the measures rather than saying it must be bad because of who uses it…

  2. But (a) whom do you define as mathematicians; (b) how do you know that not a single mathematician uses it; and (c) how do you address my point about the technique not being useful to them?

    • When I say Mathematicians don’t use it, it’s really shorthand for ‘this is not a recognised technique in the general Mathematical community, e.g. Maths teachers, Maths Professors, people who write Maths textbooks, etc.’
      Maybe a better way to show this is to take a technique that *is* generally recognised, like the Correlation Coefficient. If someone said ‘Mathematicians don’t use it’, there are numerous sources to prove them wrong, which is what you’d expect to see for a correct technique. We don’t see that with the Effect Size, so why not?
      I understand your point about subject-specific techniques that don’t transfer, but that isn’t the case here. There’s nothing specific to Education about the use of the Effect Size. It’s only slightly harder than a basic change of means. If it were correct then everyone would use it.
      It’s not literally about whether or not people use it (obviously people are using it); it’s whether they are correct to do so. They might think they’re finding it useful, but that’s only because they don’t realise that what they’re doing is not an accepted technique and therefore all of their results are invalid.

  3. I know what: Let’s stick to p-values and report statistically significant (p<0.05 or even p<0.01) results with biologically insignificant differences between control and test groups. That'll help. Lots.
    I am assuming that you do not consider statisticians as mathematicians, and since you are a mathematician, you may be better qualified to make that statement than me. However, statisticians do use the effect size. What you also do not appreciate is that most people without a statistics degree have to learn how to do statistics relevant to their discipline either ‘on the job’ or during their PhD or research, usually guided by a statistician at some point.

    To illustrate why I'd like to see more effect size-type analyses reported in science (and elsewhere), try this simple task:
    Type a series of numbers into Excel (e.g. 6,4,5,6,6,4,6,7,6,5,6,4). This is my control group for the year 1 intake. It could represent anything, but for the sake of argument in an education context, let 7=1st, 6=2i, 5=2ii and 4=3rd for the degree classifications of my tutor group in the year 1 intake. Now let the test group (the second-year intake, after some marvellous intervention by me) be exactly the same except for a single grade changed from a 4 to a 5. Compare the mean and SD, and do a t-test (or Mann-Whitney if the data turn out to be non-parametric). Not significantly different (p = approx. 0.3). The effect size (Cohen’s d) is very small. I think you’d argue that my intervention did not appear to work very well. Would you base this judgement on the effect size or the t-test?
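
    If you’d rather not fiddle with Excel, here is a minimal sketch of the same calculation in Python (it assumes it is the final 4 that gets bumped up to a 5; exactly which grade you change only alters the figures slightly):

    ```python
    import numpy as np
    from scipy import stats

    # Year 1 intake (control): classifications coded 7=1st, 6=2i, 5=2ii, 4=3rd
    year1 = np.array([6, 4, 5, 6, 6, 4, 6, 7, 6, 5, 6, 4], dtype=float)
    # Year 2 intake (test): identical except one grade raised from a 4 to a 5
    year2 = np.array([6, 4, 5, 6, 6, 4, 6, 7, 6, 5, 6, 5], dtype=float)

    def cohens_d(a, b):
        """Standardised mean difference using the pooled sample SD."""
        na, nb = len(a), len(b)
        pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
        return (b.mean() - a.mean()) / np.sqrt(pooled_var)

    t_stat, p_value = stats.ttest_ind(year2, year1)     # equal-variance two-sample t-test
    print(f"means: {year1.mean():.2f} vs {year2.mean():.2f}")
    print(f"t = {t_stat:.2f}, p = {p_value:.2f}")        # nowhere near significance
    print(f"Cohen's d = {cohens_d(year1, year2):.2f}")   # a very small effect
    ```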

    Now copy and paste the 12 student grades from the year 1 and year 2 intakes and duplicate them to make n=72 (perhaps representing 6 tutor groups for each year intake). The means and SDs are really no different, but now try doing a t-test (or other more appropriate test). Here we have a significant result (approaching p=0.01). Yes, we have more confidence that the single grade increase per 12 students occurs every time (rather than being a single observation), and this is reflected in the t-test result. However, the magnitude of the intervention is still very small. Just as small as before.
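
    Continuing the sketch above, duplicating each intake six times shows the point directly: the p-value shrinks as n grows while the effect size barely moves (the exact p you get depends on which grade was changed):

    ```python
    # Continues the previous sketch: year1, year2 and cohens_d are defined there.
    # Six tutor groups per year intake: tile the same 12 grades to get n = 72.
    year1_big = np.tile(year1, 6)
    year2_big = np.tile(year2, 6)

    t_stat, p_value = stats.ttest_ind(year2_big, year1_big)
    print(f"t = {t_stat:.2f}, p = {p_value:.3f}")              # p drops purely because n grew
    print(f"Cohen's d = {cohens_d(year1_big, year2_big):.2f}") # still just as small as before
    ```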

    Now try meddling with the numbers in the n=12 group, and applying those changes to the n=72 group, to see what I would have to do to get an effect size of 1.4. Don’t forget the ‘pooled standard deviation’ calculation: the more you meddle, the greater the differences in SD between the groups, and the test no longer works.
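
    A quick way to see how much meddling d = 1.4 would demand is to turn the formula round: the raw difference in means has to be 1.4 times the pooled SD. A sketch, reusing the groups defined above:

    ```python
    # Continues the earlier sketch: year1 and year2 are defined there.
    # d = (difference in means) / pooled SD, so the difference needed is d * pooled SD.
    n1, n2 = len(year1), len(year2)
    pooled_sd = np.sqrt(((n1 - 1) * year1.var(ddof=1) +
                         (n2 - 1) * year2.var(ddof=1)) / (n1 + n2 - 2))
    print(f"pooled SD = {pooled_sd:.2f}")
    print(f"mean difference needed for d = 1.4: {1.4 * pooled_sd:.2f} grade points")
    # i.e. roughly a whole degree classification per student on average - and that is
    # before the meddling itself starts changing the pooled SD.
    ```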

    The one key point that I teach my students about the most commonly applied statistical tests is that they almost encourage the reporting of statistically significant results that are biologically irrelevant. Effect size-type calculations could counter this, especially given the recently reported finding that p-values just below 0.05 are far more frequently reported than p-values just over 0.05!
    Whether Hattie applied effect sizes in an appropriate context in all cases is a completely different question.

    *Note to proper statistics pedants: this is not a particularly appropriate use of a t-test given the discrete limits of the data set. I made up the scenario to fit the numbers. Sorry.

    • I tend to agree that simply reporting the difference between the means is in many (maybe most) cases preferable to reporting a Cohen’s d.

      However, you’re making your case in a very weird way:
      1. The difference between two means IS AN EFFECT SIZE, just an unstandardised one.
      2. When do mathematicians use unstandardised effect sizes?
      3. Even if they did, who cares? Argue the case, don’t make (in this case inaccurate) appeals to authority.

      • 1. Nobody in Maths calls these things effect sizes.
        2. In all these cases Mathematicians would just use the difference in means, provided it had first passed a significance test. Your big mistake is that you’ve mixed up standardising for a z or t test, which is fine, with doing a similar thing here, which isn’t.
        3. My authority is the whole of Maths. Every Professor, every teacher, every textbook. I might be arguing from authority, but that’s some authority.

    • Appeals to authority can work, but only if (1) the authority figure is an authority in an appropriate domain, and (2) the authority figure says what you claim it says. Unfortunately your appeal fails on both counts.

      For what it’s worth, since you like appeals to authority: I teach undergraduate statistics and work in a university mathematics department. I haven’t mixed up standardised effect sizes and t tests.

      A sensible case can be made against standardisation in the reporting of effect sizes. But it’s utterly incoherent to argue that standardised mean differences (Cohen’s d) are fatally flawed while standardised regression coefficients (Pearson’s r) are absolutely fine. That’s just inconsistent nonsense.
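
      To make the point concrete, here is a small sketch (with made-up grade data) showing that a Cohen’s d and the Pearson’s r between a 0/1 group indicator and the outcome are, for equal-sized groups, approximately interconvertible via r ≈ d / √(d² + 4):

      ```python
      import numpy as np

      # Made-up outcome scores for two equal-sized groups (purely illustrative)
      group_a = np.array([6, 4, 5, 6, 6, 4, 6, 7, 6, 5, 6, 4], dtype=float)
      group_b = np.array([7, 5, 6, 6, 7, 5, 6, 7, 6, 6, 7, 5], dtype=float)
      scores = np.concatenate([group_a, group_b])
      membership = np.concatenate([np.zeros(len(group_a)), np.ones(len(group_b))])

      # Cohen's d: standardised difference between the two group means
      n_a, n_b = len(group_a), len(group_b)
      pooled_sd = np.sqrt(((n_a - 1) * group_a.var(ddof=1) +
                           (n_b - 1) * group_b.var(ddof=1)) / (n_a + n_b - 2))
      d = (group_b.mean() - group_a.mean()) / pooled_sd

      # Pearson's r between the 0/1 group indicator and the outcome (point-biserial r)
      r = np.corrcoef(membership, scores)[0, 1]

      print(f"d = {d:.3f}")
      print(f"r = {r:.3f}")
      print(f"d / sqrt(d^2 + 4) = {d / np.sqrt(d**2 + 4):.3f}")  # close to r
      ```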

      • The Effect Size isn’t subject-specific; it’s just measuring the change in means, very basic stuff, so proper Mathematicians are the authority here.
        If it were correct then Mathematicians would use it; they don’t, so it’s wrong.
        I’m not saying you’ve mixed them up; I’m saying you’ve wrongly extrapolated from z tests and t tests into doing a similar thing with the effect size. Again, it’s a very basic leap to make, and if it were correct then Mathematicians would have done it first.
        I assume you are from the world of Education or Psychology, teach their undergraduates basic Statistics, and are an honorary member of the Maths department. However, it might be an idea to ask some proper Statistics professors what they think.

    • I think you’ve misunderstood what I’m saying. I haven’t extrapolated from a t test to anything: I’m sceptical of how useful standardised effect sizes are; it’s just that I’m substantially more sceptical of your argument against them. Especially since you’ve outed yourself as a proponent of the standardised regression coefficient, Pearson’s r.

      Explain to me why any argument that can be made against standardisation in the context of a Cohen’s d doesn’t also apply to a Pearson’s r. And when I say “an argument” I mean an argument with substantive content, not an appeal to (unspecified) authority.

      Incidentally, it’s particularly weird that you make an appeal to authority in this of all contexts (and then dispute the status of anyone who disagrees with you). Mathematicians, famously, reject appeals to authority as being unmathematical, and make jokes about people who use them. See the “Appeal to Expert Opinion” in Dana Angluin’s famous “How to Prove It” article:
      http://www.csupomona.edu/~masrinivas/cs311/how_to_prove.pdf

      • I’ll take that as a Yes that you are really from Education or Psychology rather than Maths.
        I’m not a “proponent” of Pearson’s Correlation Coefficient; it’s a standard Mathematical technique. Every A Level student gets taught it, it’s in every textbook, every Maths teacher knows it; it’s part of the accepted Mathematical canon. The opposite is true of the Effect Size.

      • And I’ll take that as a Yes that you don’t really understand what a Pearson’s r is or why it’s conceptually identical to a Cohen’s d.

      • No, I understand what Pearson’s Correlation Coefficient is and where it comes from; it’s just that Cohen’s d isn’t the same, which is why Mathematicians use Pearson’s Correlation Coefficient and don’t use Cohen’s d. Cohen’s d deals with the difference between 2 means, so the standard deviation isn’t sigma any more. Of course, if you were to standardise with the correct standard deviation you would just be doing a t test, and where’s the fun in doing Maths that everyone else does?
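
        For what it’s worth, the algebra behind that last point is simple: divide the difference in means by the pooled SD and you get Cohen’s d; divide it instead by the standard error (the pooled SD scaled by √(1/n1 + 1/n2)) and you get exactly the equal-variance two-sample t statistic, so t = d × √(n1·n2 / (n1 + n2)). A quick numerical check, reusing the made-up grades from the earlier comment:

        ```python
        import numpy as np
        from scipy import stats

        a = np.array([6, 4, 5, 6, 6, 4, 6, 7, 6, 5, 6, 4], dtype=float)
        b = np.array([6, 4, 5, 6, 6, 4, 6, 7, 6, 5, 6, 5], dtype=float)
        n_a, n_b = len(a), len(b)

        # Cohen's d: mean difference divided by the pooled SD
        pooled_sd = np.sqrt(((n_a - 1) * a.var(ddof=1) +
                             (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2))
        d = (b.mean() - a.mean()) / pooled_sd

        # The same difference divided by the standard error is the t statistic
        t_scipy, _ = stats.ttest_ind(b, a)                # equal-variance t-test
        t_from_d = d * np.sqrt(n_a * n_b / (n_a + n_b))   # same statistic, from d

        print(f"Cohen's d   = {d:.4f}")
        print(f"t via scipy = {t_scipy:.4f}")
        print(f"t via d     = {t_from_d:.4f}")            # identical to the scipy value
        ```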
