I’ve been meaning to write about the Lucy Letby case for a while, but haven’t had the time to follow up on some of the articles written. I still haven’t done enough background reading but a really neat article caught my eye.
It’s a little bit technical in a couple of places, but it’s easy enough to get the gist and it explains why there are some very serious problems with the prosecution’s case. What caught my eye was our old friend medical testing and the familiar linking of sensitivity, specificity, and prevalence.
I’ll say right now that I do not know whether Lucy Letby is innocent. However, if someone can be convicted and condemned on the kind of ‘evidence’ presented by the prosecution, then God help us all.
The case against her is entirely circumstantial and some of it is batshit bonkers.
One of the primary pieces of ‘evidence’ was an attendance chart showing that Letby was working on all the shifts where a suspected murder occurred, and no other nurse was present for them all. When I first saw this chart, back when I hadn’t done much reading on the case at all, my first thought was a humungous WTF?
It’s a partial list. It selects out only those deaths they deemed to be ‘suspicious’ - although the original autopsies conducted showed no evidence of foul play[1] in, I think, 6 out of the 7 deaths she was found guilty of causing.
Someone not versed in basic statistics (for example, a typical jury) would see this chart as being rather conclusive when, of course, it’s nothing of the sort.
What they had done was essentially to ‘draw a target’ around these cases and exclude the other deaths that occurred on the unit. In the year 2013-2014 (June to June) there were 4 deaths. In the following year (2014-2015) there were 3 deaths. In the next year (2015-2016), the ‘Letby’ year, there were 17 deaths - with 7 of those being attributed to Letby.
What the prosecution did in their attendance chart was to focus only on those deaths they wanted to pin on Letby. It’s a bit like this where the target (in red) is drawn afterwards (the grey circle on the LHS is included as an aid to the eye, but not originally there) :
What is nicely shown in the article with an interactive chart is that it’s possible, by excluding data outside of the selected subset of interest, to find any nurse guilty of a selected subset of deaths.
It’s an exercise in target-drawing - you just choose your target and you can make any of the nurses look guilty using this technique.
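To make the target-drawing point concrete, here is a minimal Python sketch. It is not the article's interactive chart and it uses no real data: the roster size, number of shifts, staffing level and the function names (`simulate_roster`, `draw_target`, `attendance`) are all made up for illustration. Deaths are placed on shifts purely at random, so by construction no nurse is responsible for any of them.

```python
import random

def simulate_roster(n_nurses=10, n_shifts=300, staff_per_shift=3,
                    n_deaths=17, seed=0):
    """Hypothetical unit: random staffing, deaths on random shifts."""
    rng = random.Random(seed)
    # roster[s] is the set of nurses working shift s
    roster = [set(rng.sample(range(n_nurses), staff_per_shift))
              for _ in range(n_shifts)]
    # each death lands on a randomly chosen shift, independent of staffing
    death_shifts = [rng.randrange(n_shifts) for _ in range(n_deaths)]
    return roster, death_shifts

def draw_target(roster, death_shifts, nurse):
    """Keep only the deaths that happened while `nurse` was on shift."""
    return [s for s in death_shifts if nurse in roster[s]]

def attendance(roster, shifts, nurse):
    """Fraction of the given shifts on which `nurse` was present."""
    return sum(nurse in roster[s] for s in shifts) / len(shifts)

roster, death_shifts = simulate_roster()
for nurse in range(10):
    target = draw_target(roster, death_shifts, nurse)
    if target:  # a nurse may, by chance, have no deaths on her shifts
        print(nurse, len(target), attendance(roster, target, nurse))
```

Whichever nurse you pick, she was present for every death inside her own target, simply because that is how the target was drawn. The chart looks damning for each of them in turn.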
It’s quite astonishing to me that Letby’s defence team did not present evidence from a statistics expert that would have readily demonstrated the conclusion drawn from this attendance chart to be nonsense; an example of statistical illiteracy.
The pertinent question here is why wasn’t Letby charged with causing the other 10 deaths that year? One can only assume they drew up an attendance chart and found no good correlation for those deaths and so because they were ‘outside the Letby target’ they were not considered at trial.
The determination that the deaths Letby is convicted of were unnatural was not based on findings at the time, but a retroactively applied determination once they had identified their target. It’s all rather circular. My hypothesis at the moment? Faced with a large number of deaths (a jump from 3 to 17)[2] and unwilling to concede negligence or some other hospital failing, the ‘powers that be’ in the hospital sought another reason and hit upon a scapegoat.
It’s a monstrous miscarriage of statistics, let alone a monstrous miscarriage of justice.
How does medical testing fit in with all of this?
Testing for Murder
This comes from a neat application in the article likening the retroactive determination of murder vs. natural death to a medical test. The expert witness who examined the deaths (well after they’d previously been declared to be non-suspicious at autopsy) is doing a kind of test, and we can ask how good his testing methodology is, just as we can ask how good the testing methodology of something like a PCR test for covid is.
The prevalence issue, as it was for covid tests, becomes very critical.
The article points out that in a typical year there will be around 3% of babies born in the UK who die. That number did kind of shock me because it seems quite high, but let’s assume it’s correct. This means that you’ll get around 18,000 deaths each year from the approximately 600,000 babies born.
Death by murder of these babies is exceedingly rare. I won’t say the probability is zero, but it must be very close to zero.
So, if you have an imperfect test (murder vs non-murder) which can give you a false positive, the overwhelming probability upon ‘detecting’ a murder is that this is a result of a false positive.
As I’ve previously written, I think the best way to understand this is by thinking about a communication channel. It’s a perfect analogue of the situation. Imagine you have a channel where a transmitter (Alice) is sending messages to a receiver (Bob). Almost all the messages Alice sends are the symbol ‘0’. But every so often she decides (at random) to send the symbol ‘1’ instead. Let’s suppose that, on average, she’s sending 4 of the ‘1’ symbols for every 10,000 of the ‘0’ messages (this is a ‘prevalence’ of about 7 in 18,000).
Let’s suppose Bob receives 17 symbols and finds that 7 of them are ‘1’.
Which is more likely: that there are errors on the channel, or that all 7 of the messages received as a ‘1’ were, indeed, transmitted as the ‘1’ symbol?
It’s overwhelmingly more likely that what Bob is seeing are false positives (errors on the channel) rather than genuine transmitted symbols.
Of course, assuming a murder ‘prevalence’ of 7 in 18,000 is a gross overestimate so the calculations derived from it are going to lead to an upper bound on the likelihood of murder.
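The channel picture can be sketched in a few lines of Python using Bayes' rule. The rate of genuine ‘1’ symbols (about 4 in 10,000) is the figure from the text; the channel error rates and the name `posterior_sent_one` are hypothetical, chosen only for illustration, and treating the 7 received symbols as independent is a simplifying assumption.

```python
def posterior_sent_one(p_one, error_rate):
    """P(Alice sent '1' | Bob received '1') on a channel that flips
    each symbol with probability `error_rate`."""
    genuine = p_one * (1 - error_rate)   # sent '1', received '1'
    flipped = (1 - p_one) * error_rate   # sent '0', received '1'
    return genuine / (genuine + flipped)

p_one = 4 / 10_000  # from the text: roughly 4 '1's per 10,000 symbols

for error_rate in (0.001, 0.01, 0.05):  # hypothetical error rates
    single = posterior_sent_one(p_one, error_rate)
    all_seven = single ** 7  # assumes the 7 symbols are independent
    print(f"error rate {error_rate}: one symbol {single:.4f}, "
          f"all seven {all_seven:.3g}")
```

When the error rate equals the rate of genuine ‘1’ symbols, the two explanations are equally likely for a single received ‘1’. Any error rate above that tips the balance towards channel error, and requiring all 7 to be genuine makes the gap far wider.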
We’re all used to crime dramas in which the medical examiners find conclusive evidence of foul play. And, let’s face it, if someone is lying on the mortuary slab with a knife through their heart then your medical test of murder vs non-murder is going to be pretty damn good (not much room for any false positives is there?).
But with the Letby ‘murders’ we’re not dealing with anything like this knife in the heart scenario. We’re dealing very much with “might be, might not be” types of evidence.
As far as I know, none of the autopsies showed conclusive proof of foul play - or even a very high probability of such.
In such cases, we must account for the possibility of false positives.
As the analysis in the article shows, the existence of this possibility leads to a catastrophic collapse of the prosecution’s case.
Summary
Once again we see the appalling misuse of statistics to manipulate people into a certain position. The prosecutors have basically applied the same reasoning as PCR positive therefore covid. The jury should not be blamed for being so hoodwinked, but it’s very clear to me that Letby’s defence team were woefully inadequate and unprepared. She did not get anything like a fair trial as a result.
It is remarkable that 50% of the large increase in deaths has been ignored. There are 7 deaths not attributable to Letby (over and above an ‘expected’ 3). Clearly there was something else going on. Now either you have two essentially independent extraordinary events in the same year, or just one. By far and away the most plausible explanation, given the 7 non-Letby deaths, is that there were some serious problems on the unit that year and that the deaths (all 14) had nothing to do with any malicious intent on Letby’s part[3].
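One rough way to put a number on ‘extraordinary’ is a Poisson tail probability. The sketch below assumes that yearly deaths on the unit follow a Poisson distribution with a mean of 3.5, the average of the 4 and 3 deaths in the two earlier years. That model, the baseline and the helper name `poisson_tail` are assumptions made here for illustration, not something taken from the article or the trial.

```python
from math import exp, factorial

def poisson_tail(k, mean):
    """P(X >= k) for X ~ Poisson(mean)."""
    below_k = sum(exp(-mean) * mean**i / factorial(i) for i in range(k))
    return 1 - below_k

baseline = (4 + 3) / 2  # assumed 'normal' year, from the two earlier years

# All 17 deaths in the 2015-2016 year
print("P(17 or more deaths):", poisson_tail(17, baseline))

# The 10 deaths left after removing the 7 attributed to Letby
print("P(10 or more deaths):", poisson_tail(10, baseline))
```

Under these assumptions the 10 remaining deaths on their own would still be a rare year, with a probability well under 1%. So setting the 7 ‘Letby’ deaths aside does not bring the unit back to anything like normal.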
Is she guilty? I don’t know, but I have to say that at this point in time I think it overwhelmingly more likely that she is innocent on the basis of the evidence I’ve so far seen. The conviction has not come remotely close to having been “proven beyond a reasonable doubt”.
[1] These autopsies concluded that death was by ‘natural causes’. I believe that the remaining autopsy did not prove that the cause of death was from unnatural causes, but showed that the death could be consistent with a non-natural cause. I stand to be corrected on this.
[2] The article suggests that 3 of the deaths were explained, or ‘expected’, whereas the other 14 are ‘unexplained’. 7 of those deaths were pinned on Letby, but that leaves us with the same number, 7, of deaths which are not caused by Letby. It must have been an extraordinarily unlucky year at that hospital. You have two extraordinary events. A serial killer of babies (unknown previously in UK hospitals) AND a large unusual jump in deaths not caused by this alleged monster. Gotta get the hospital management to buy my next lottery ticket.
[3] It is possible she made some mistakes (but she won’t be alone in that). I do not see enough evidence, however, to suggest malice aforethought.
You don't even need stats to refute the prosecution (I don't know anything about the case beyond your text) - someone being present at the scene of a crime is not the same as them being the guilty party. You must also prove they did do the deed.
Or, it used to be you had to prove that. Not just argue it was a plausible explanation. I feel lower sentences and abolishing capital punishment have, perversely enough, led to more people being convicted without actual proof - this gels well with what we know of how humans form consensus about reality; the less there's an immediate risk of suffering negative consequences for being in the wrong, and the more a group shares a belief that what they are doing is in the right, the less each member of the group scrutinises what it is they are actually doing.
Asch's experiments showed this some 70 years ago. I've seen it myself when participating in such studies: people will create the reality they feel - instinctively - is the shared experience (or "the experience shared" might flow better?). You don't even need material incentives for this to happen:
One experiment (might have mentioned this?) had seven people seated around a table. Each is given a sealed envelope containing rhomboids, squares, rectangles and triangles of coloured cardboard. The goal is to assemble a square out of all the pieces of "your" colour, which is marked on each envelope.
You may not speak (or use sign language, morse code by tapping, etc) with the others around the table, and you may not simply take the pieces with "your" colour on from their pile.
The experiment is timed and the normal time for completion is between 5 and 10 minutes.
My group had to be recorded as an anomaly. We were done in 14 seconds. Everyone looked at one another, and then shoved all the pieces they didn't need into the centre of the table.
What has this to do with the stats and the case?
At the other tables (and the researcher had done this hundreds of times), people would hold on to "their" pieces, glaring at each other, and only tentatively swap pieces as if it was a hostage exchange taking place:
The shared belief (that a piece of cardboard being "yours" made it hold value) dominated any logic, rationality or intelligence of the participants.
For the prosecution in this case, no matter if she is objectively speaking guilty of anything, the shared belief that she is guilty and therefore must be convicted has led to them creating reality in such a way that she gets convicted.
Which is more or less the opposite of the prosecution investigating and finding out what happened.
She wasn't convicted solely on the basis of this statistical misinterpretation though.