You’ll be aware of the old proverb regarding birds and bushes (you boy! stop sniggering at the back, there). The original meaning may have been subtly different, but these days it’s generally understood as a warning against gambling: that you should not give up something you certainly possess, even if it’s relatively worthless, for the uncertainty of possibly having more or possibly having nothing. Thought about as a proverb this is, like a lot of folk wisdom, not actually all that wise: it may encourage a person not to bet their house on the spin of a roulette wheel, but it also encourages them to keep working a minimum wage job instead of pursuing the chance of a lucrative career. But what if we were to think about it literally, and in terms of probability theory?
If we were to discard the bird in hand and go after the two in the bush, we would be left facing three possible outcomes: we capture none of the birds in the bush, or one of the birds in the bush, or both of the birds in the bush. If we assume, for the sake of argument, that each of these outcomes is equally likely, then we would have a 1-in-3 chance of being worse off than if we had kept hold of the bird in our hand, a 1-in-3 chance of being better off, and a 1-in-3 chance of being neither better nor worse off. That means, if we were to gamble, we would have a two-thirds chance of ending up the same or better off against a one-third chance of ending up worse off. So the lesson here is clear – probabilistically speaking, it’s better to gamble than it is not to gamble, because we are much more likely to end up either the same or better off than we are to end up worse off. This is where faintly smug lectures from mathematicians convinced of the superiority of their methods usually end, but it seems to me there’s a but.
I’m not a mathematician – I have a grade C in GCSE maths, which means I only studied very basic probability, and that over 20 years ago – but one of the things I know about probability is that it only comes into play when a scenario is repeated many times. There may be a 50-50 chance that the toss of a coin will come up heads, but that doesn’t guarantee that if I toss a coin twice I’ll get heads one time and tails the other. Probability only comes into play if I toss the coin many times. Or, to put it another way, all probability can tell me is that, on average, a tossed coin will land on each side an equal number of times. It can tell me nothing about any single coin toss.
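This is the law of large numbers at work, and it’s easy to see in a quick simulation (a Python sketch; the toss counts and the fixed seed are arbitrary choices of mine, not anything from the discussion above):

```python
import random

def proportion_heads(tosses):
    """Toss a fair coin `tosses` times and return the fraction that come up heads."""
    heads = sum(random.random() < 0.5 for _ in range(tosses))
    return heads / tosses

random.seed(42)  # fixed seed so the run is repeatable

# Two tosses can easily come up all-heads or all-tails...
print(proportion_heads(2))
# ...but over a great many tosses the proportion settles close to 0.5.
print(proportion_heads(100_000))
```

The first number can be anything in {0, 0.5, 1}; only the second is pinned near one half by the mathematics.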
The same applies in the bird-bush scenario. The probability calculation tells me that, if I were to face the same situation many times, it would be better to gamble on the birds in the bush. But it tells me nothing about the particular instance, when I stand there with one glum bird gripped in my fist and two tweeting obliviously in the foliage: it doesn’t help me to understand whether, on that specific occasion, I am better off letting go of the bird in my hand to try and grab the other two. And what if I’m hungry as I stand in front of the bush? If I really need (with apologies to vegetarians and lovers of euphemism) to eat bird, I can’t afford the possibility of having no bird to eat, regardless of how small that risk is.
It seems to me this is, at heart, a problem of the concrete versus the abstract. In abstract, it’s better to gamble on the birds in the bush than it is to bank on the bird in the hand, but in the concrete – well, it depends. It depends how many times you’re going to face the bird-bush dilemma, and it depends on the implications of each potential outcome. If one potential outcome is catastrophic it makes perfect rational sense to avoid the gamble, no matter the result of a probability calculation. This is why, after all, an anomalous reading during a pre-flight check will ground a plane, even though it’s much more likely to indicate a fault with the sensors than it is a fault with a critical system – because of the catastrophic consequences if the warning were accurate, it’s best to avoid the gamble.
I’ve been thinking about all this because of an article I read recently in the London Review of Books. The article’s behind a paywall, I’m afraid, but it’s a review of a book called Thinking, Fast and Slow by Daniel Kahneman. I haven’t read the book (though I have added it to my ‘to read’ list), so obviously I can’t comment on it directly, and I’m also not going to comment on the bulk of Glen Newey’s review in the LRB. I’m going to focus specifically on some of Kahneman’s examples of common cognitive failures or biases, which Newey rehearses in his review.
The first of these is essentially a version of the bird-bush scenario:
A famous experiment conducted in the 1970s by Kahneman and his long-time collaborator, the late Amos Tversky, put the ‘sure thing’ preference to the test. Which of the following would you choose: (a) a sure-thing £200; or (b) a coin toss by which you win either nothing or £400? The expected-money outcomes are of course the same, given in each case by the sum of the pay-offs multiplied by their probabilities. But in tests, subjects show a marked preference for the sure thing
If understanding an ‘expected-money outcome’ is not as much a matter of course for you as it is for Newey – it wasn’t for me – then think of it like this. The result of the probability calculation for the sure thing is a net payout of £200, because the potential payout – £200 – multiplied by the probability of receiving it – 100%, or 1.0 as a decimal fraction – is £200 (200×1 = 200). But the result of the probability calculation for the gamble is also £200, because the potential payout – £400 – multiplied by the probability of achieving it – 50%, or 0.5 – is also £200 (400×0.5 = 200).
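The same arithmetic can be written out as a few lines of code – a minimal sketch, using only the payouts and probabilities quoted from the experiment:

```python
def expected_value(outcomes):
    """Sum of payout × probability over every possible outcome."""
    return sum(payout * probability for payout, probability in outcomes)

sure_thing = [(200, 1.0)]            # £200 with certainty
coin_toss = [(400, 0.5), (0, 0.5)]   # £400 or nothing on a fair coin

print(expected_value(sure_thing))   # 200.0
print(expected_value(coin_toss))    # 200.0
```

Both options come out at £200, which is all the ‘expected-money outcome’ claim amounts to.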
The implication here seems clear – that the test subjects were being naïve when they preferred the sure thing. Since the probability calculation indicates that the average result in both cases will be the same, the ‘correct’ answer is to say it makes no difference. The trouble with this conclusion is that Kahneman and Tversky have presented their volunteers not with an abstract probability calculation, but with a concrete scenario in which it may or may not make sense to think probabilistically.
If I had been a participant in the survey, my response would have been to ask how many times the offer was going to be repeated. If it was a one-off then I’d have taken the guaranteed £200, because the probability calculation would have been irrelevant – probability only applies when a scenario is repeated multiple times. On the other hand, if the offer was going to be repeated then I’d have said it made no difference, because when the scenario is repeated multiple times probability applies.
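That distinction between the one-off and the repeated offer is straightforward to demonstrate with a simulation (a sketch; the 100,000 repetitions and the seed are arbitrary numbers of mine). A single play of the gamble is always either £0 or £400 – never £200 – while the long-run average converges on £200:

```python
import random

def play_gamble():
    """One coin toss: £400 on heads, nothing on tails."""
    return 400 if random.random() < 0.5 else 0

random.seed(1)  # fixed seed for repeatability

one_off = play_gamble()  # always 0 or 400, never the 'expected' 200
many = [play_gamble() for _ in range(100_000)]
long_run_average = sum(many) / len(many)  # close to 200

print(one_off, long_run_average)
```

The expected value of £200 is a fact about `long_run_average`; it says nothing at all about `one_off`.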
The point I want to make here is that the supposedly clever, probabilistic decision isn’t necessarily clever. It may be smart to think probabilistically but equally it may not – it depends on whether the scenario is to be repeated enough times for probability to apply, and it depends on how badly you need the money. The presumption that probabilistic thinking is smart thinking is only axiomatic for people who are so busy seeing the world through one narrow paradigm that it hasn’t occurred to them to think about these larger questions. If the offer is only going to be made once, then the smart answer is to take the smaller but certain sum – especially if you’re flat broke, and your shoes are leaking.
There’s a similar problem with another test discussed by Newey, this one relating to a public health emergency:
Take the following scenario. A flu epidemic will kill six hundred people if you do nothing. A tested vaccine will save two hundred people for sure but fail to save the other four hundred; or you can use an untested vaccine, with a one-third chance of saving six hundred people and a two-thirds chance of saving nobody. In experiments, most people (72 per cent) went for the sure thing.
Again, the implication here is that to think about this cleverly is to think probabilistically, and thus to recognise that the decision as to which vaccine to use is irrelevant – the net outcome of both probability calculations is the same (200×1 = 200; 600×0.33recurring = 200), so it makes no difference which vaccine you deploy. In this scenario, though, thinking probabilistically is not just possibly wrong, depending on the circumstances – this time round it’s definitely wrong.
This is because we’re told the new vaccine is ‘untested’. One of the very few things I know for certain about medicine is that new, untested interventions are not used willy-nilly on whole populations, because of the very real risk that they may inadvertently cause harm. Before it reached the point where I, as a policy-maker, was being asked which vaccine to deploy, the ‘new’ vaccine would have gone through a complex testing process culminating in human ‘Phase III clinical trials […] on a large scale of many hundreds of subjects across several sites […] under natural disease conditions’. By the time these trials were complete, of course, we wouldn’t just know the vaccine was safe, we’d also know how effective it was.
In reality this would mean that the scenario would be radically altered – because we would know the real-world effectiveness of both vaccines, the choice of which to use would be made on different grounds. But if, by some extraordinary disruption of normal practice, I was offered the choice of deploying either a tested or an untested vaccine, I would opt for the established one. Not because of a cognitive bias, or because I had failed to carry out the appropriate probability calculation, but because (unlike the designers of the study, it seems) I would know better than to short-circuit the clinical trials process. After all, flu is not 100% fatal, and the vaccine will not only be given to the 600 who would have died, but also to people who would have survived. In these circumstances I cannot order the use of a vaccine when I have no data on the harm it may cause.
For this reason I would, by the terms of the study, be part of the 72% of subjects excoriated for giving the ‘dumb’ answer by ordering the use of the established vaccine, but actually my answer would be smart in a way that the study designers didn’t recognise. As with the money gamble, the presumption that the smart way to think about this scenario is to think about it in probabilistic terms only makes sense to someone who is viewing the world through a very narrow paradigm; it’s actually a pretty stupid way of thinking about it.
It’s not just in the area of probability that this tendency to think in terms of an abstract statistical paradigm causes some rather odd conclusions. For example, Newey uses this little titbit on regression to the mean to round out his article:
Why do extremely intelligent women marry less intelligent men? Ask at a party and people will come up with any number of seemingly plausible, usually psychological explanations. And in a given case some of them may be right. But look at a normal distribution curve: it is, as Kahneman puts it, a ‘mathematical inevitability that highly intelligent women will be married to husbands who are on average less intelligent than they are’. It’s unromantic, it’s short on narrative interest, but there it is.
Well, no, there it isn’t.
If a woman selected her romantic partners at random from amongst a representative sample of the population as a whole then, yes, a woman who is towards the top end of the distribution curve for intelligence would most likely end up with a partner less intelligent than she is – most men are less intelligent than her, so we would expect that, on average, her partner would be less intelligent. But who, apart from Kahneman and Newey, thinks that a woman selects her romantic partners at random, or that she makes her selection from amongst a representative sample of the general population?
The whole point – the one that people who proffer psychological explanations have grasped, and Newey/ Kahneman have not – is that highly intelligent women would be likely to mix in a social milieu that contains a disproportionately large number of highly intelligent men, and relatively few unintelligent ones. It’s precisely because she is of above average intelligence – and therefore more likely than average to have an advanced degree, or a high-flying career, or both – that her work colleagues, the people she went to college with, and so on, are also likely to be of above-average intelligence, as are the friends-of-friends to whom she might be introduced. What this means, of course, is that the mid-point of the distribution curve for the men she is likely to meet and fall in love with is further up the intelligence scale than the mid-point of the distribution curve for men in general. It’s for this reason that one would expect, on average, that highly intelligent women would have partners who were also highly intelligent.
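A toy simulation makes the selection effect concrete. Everything here is invented for illustration: scores are drawn from a normal distribution on the conventional IQ scaling (mean 100, standard deviation 15), and the milieu effect is modelled, entirely arbitrarily, as a pool shifted one standard deviation upwards:

```python
import random

random.seed(7)
MEAN, SD = 100, 15  # conventional IQ scaling

def average(scores):
    return sum(scores) / len(scores)

# A partner picked at random from the general male population...
general_pool = [random.gauss(MEAN, SD) for _ in range(100_000)]

# ...versus one picked from a milieu of colleagues and fellow
# graduates, modelled (arbitrarily) as shifted one SD upwards.
milieu_pool = [random.gauss(MEAN + SD, SD) for _ in range(100_000)]

print(average(general_pool))  # close to 100
print(average(milieu_pool))   # close to 115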
If the partners of highly intelligent women tend to be unintelligent – and I have no idea if this is a real phenomenon, or if it’s just idle chatter about a couple of anecdotes – then that is striking, and not adequately accounted for just by mouthing the words ‘regression to the mean’ and leaning back with a smug sense of self-satisfaction. Trying to explain this may involve a lot of guesswork and speculation – my guess is it’s likely to be psychosocial rather than purely psychological, a consequence of intelligence not being the most important of the attributes women are inculcated to find attractive in a potential mate – but it’s at least a first attempt to find an answer. Kahneman and Newey haven’t even got as far as properly understanding the question.
Sorry, I got a little heated there. If there’s one thing that winds me up it’s people assuming they’re smarter than everyone else, when actually they only think they’re smarter because they haven’t understood something that everyone else has – often subconsciously. It may well be that, in my irritation, I’ve missed something significant, and that Newey/ Kahneman aren’t mistaken in the way I believe them to be. If so, please do let me know in the comments – I’m always glad to be corrected.
Actually, although I’m reasonably confident that I haven’t made a crashing error in anything I’ve said, I do rather anticipate some of you reading this will be thinking something along the lines of “Well, yes, I can see what Aethelread means, but he’s still really missing the point, isn’t he? These are probability scenarios, so of course you think about the probability, not the extraneous details that obscure the calculations”. My response to that would be that, when you take decision-making out of the abstract and into the real world you’re virtually guaranteed to come up against complicating factors that make thinking probabilistically impossible or irrelevant. But I do quite often have the sense – particularly at times like this, when I’m gradually threading my mind back together after a bout of galloping paranoia and mild hallucinatory weirdness – that my mind works differently to most people’s. That I see problems and difficulties where others don’t see them, and also see connections and solutions where other people don’t spot them: that, in effect, I live in a world that’s cognitively different to most people’s worlds.
Take this, for example. Back in April, Neuroskeptic posted about a study which suggested that electrical brain stimulation boosted problem-solving skills. As part of the post he reproduced the ‘difficult puzzle’ that had been used to test the cognitive ability of study participants before and after their brains had undergone electrical stimulation (I checked; neither of the study authors was called Dr Frankenstein…). The puzzle was a picture of nine dots arranged in a symmetrical square of three dots by three dots, together with the instruction to connect all nine dots together by drawing exactly four straight lines, and without re-tracing any line or lifting the pen from the paper. Neuroskeptic included the ‘official’ solution, which depends on drawing a mixture of vertical, horizontal and diagonal lines that travel beyond the edges of the square of dots: the idea is that the puzzle is ‘difficult’ because most people assume the lines must remain within the square, and only people who are literally able to ‘think outside the box’ will find the solution.
I posted a comment after that post saying that I hadn’t found the puzzle particularly difficult, and pointing out an additional four solutions, all of which involved actions that were neither explicitly endorsed nor forbidden in the instructions – in exactly the same way that the solution of drawing longer lines was neither explicitly endorsed nor forbidden. I added a semi-joking line at the end of my comment indicating that there’s such a thing as thinking really outside the box, and Neuroskeptic responded with typical good humour by saying that what I had done involved ‘kicking the box to bits’, while another commenter accused me of cheating.
I wasn’t being entirely serious in that comment, but it did (and does) intrigue me that thinking outside the box in a predictable way is regarded as a mark of great intelligence but thinking outside the box in a more unexpected way is regarded as cheating, or at best as something not to be taken seriously. Given that the ‘good’ solution and my ‘bad’ solutions both involved the same creative interpretation of the instructions, I genuinely don’t understand the basis on which one is applauded and the other derided. This is what I mean about feeling that I live in a different cognitive world: it’s clear from their reactions that most people do see an obvious difference, but I can’t for the life of me understand what it is.
It works the other way round, too, in that I sometimes see difficulties and obstacles where other people don’t. A while ago I was idly dawdling through an online IQ test. (I know IQ tests are pretty much bunk – since people get better at them with practice, it’s obvious they measure learnable skills not raw intelligence – but I was bored.) At one point I came across a question that, from memory, went something like this:
Two cars start off facing in opposite directions on a completely straight road. They each accelerate to 60mph and stay at that speed for 40 minutes, then each car turns off onto a side road and drives at 40mph for another 20 minutes. What’s the distance between them at the end of the hour?
The questions were in a multiple-choice format, and I was flummoxed because none of the options was ‘don’t know’ or ‘insufficient data to calculate’, or anything similar.
You see, the temptation is to think that this is a relatively straightforward maths question: 40 mins at 60mph = 40 miles; 20 mins at 40mph = 13.33recurring miles; if each car travels 53.33recurring miles from its point of origin, the combined distance between the cars is 106.66recurring miles. But despite appearances this isn’t actually a straightforward question, because it’s not actually possible from the information given to calculate the distance between the two cars. Why not?
Well, to begin with it’s not clear whether we’re being asked to calculate the distance by road between the two cars (the combined distance travelled) or the distance between the two cars as the crow flies (the direct distance). If the two side roads are on opposite sides of the first road (something else we’re not told), then the direct distance would be a diagonal route between the two cars which would be shorter than the combined distance travelled: in that case we’d have to use trigonometry to work out the direct distance. On the other hand, if the two side roads were on the same side of the first road then we don’t need trigonometry, but we do need to be aware that the direct distance between the two cars would be exactly the same as the distance between the points at which the cars turned off: the distance travelled along the side roads is only relevant if we’re calculating combined distance travelled. Oh, and don’t forget as well that the direct distance will also vary depending on whether or not the side roads twist and turn. We’re told the first road is straight, but we don’t know that the side roads are.
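To see how much the answer swings on those unstated assumptions, here is the calculation for one possible layout – side roads that are straight, perpendicular to the main road, and on opposite sides of it (and instantaneous acceleration). Every geometric assumption here is mine, not the question’s:

```python
import math

main_leg = 60 * (40 / 60)  # 40 miles along the straight road, per car
side_leg = 40 * (20 / 60)  # 13.33... miles along the side road, per car

# Distance by road: both cars' journeys added together.
combined = 2 * (main_leg + side_leg)

# Straight-line distance, assuming perpendicular side roads on opposite
# sides of the main road: the cars finish 2 × main_leg apart along the
# road's direction and 2 × side_leg apart across it.
direct = math.hypot(2 * main_leg, 2 * side_leg)

print(round(combined, 2))  # 106.67
print(round(direct, 2))    # 84.33
```

Change any one assumption – side roads on the same side, side roads that bend, a sluggish rate of acceleration – and both numbers change, which is exactly the problem.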
Then, too, there’s the whole issue of acceleration and deceleration, about which we’re given no data. We’re told in the problem that the cars start off back-to-back and accelerate to 60mph, rather than passing each other at a start line having already accelerated to that speed. This means, of course, that if we want to calculate the total distance travelled we have to know their rate of acceleration, in order that we can factor that in to our increasingly complex calculations. Oh, and don’t forget that the acceleration calculation has to be run backwards with different values (to take account of the deceleration as the cars approach their side-turnings), and then forwards again with another different set of values to take account of the acceleration back to 40mph. (You might be tempted to argue that this is irrelevant, given that we’re working in units of distance as large as miles. But remember that the correct answer is actually a recurring decimal, so we really do need to be this precise. Plus, don’t forget, we know nothing about the cars’ rate of acceleration; we might assume it’s pretty standard (i.e., 0-60 in a matter of seconds), but from the data in the question it’s entirely possible to infer that the cars had an acceleration of 0-60 in anything up to 40 minutes.)
I suspect that many of you will see this as nit-picking, and perhaps even as showing off (I have prior experience of the way people – including teachers – respond when you raise things like this), but these are real difficulties, for all that. They do prevent us from answering the question correctly, and they did genuinely occur to me immediately I began thinking about the problem – this is not something I do to be awkward, or spend hours thinking about in order to make myself look interesting, it’s just the way my mind works. I realise other people just don’t seem to see these kinds of problems where I do, and I think it may be another instance of the abstract/ concrete issue. Because this has been presented as a real-world problem I can’t help but think about it in real-world terms, with all the concomitant complexities that implies, while everyone else thinks of it in abstract. I suspect, as well, other people don’t find a spontaneous aerial view of the relationship between the two cars forming in their minds, and it’s that diagrammatic way of thinking which makes the uncertainty about the direction and straightness of the side roads and questions of direct distance versus distance travelled obvious to me.
I don’t really have a conclusion to this second part of this post – it’s just a collection of anecdotes that I’m quite possibly over-interpreting. But I am always faintly aware – and at times like this, when I’ve become very conscious of the way my thoughts can become odd and distorted, more sharply aware – of the fact that the mental universe I live in doesn’t seem to be quite the same as most people’s. It seems to be a universe where some things are much easier than most people imagine them to be, and other things are much more complex. And some things that most people find completely simple seem to me to be almost infinitely complex – like working out what to say and when to say it in a conversation, which still sometimes defeats me after decades of practice, even though other people seem to find it as natural as breathing.
Anyway, I’m very conscious of all of this at the moment, because I’m hyper-vigilant of my own thoughts in the way I have to be when I’ve been trying to work out which voices were real and which weren’t, and which thoughts were legitimate and which were the consequence of batshit-crazy paranoia. Things will fade to normal in a few more days, and I’ll be back to fitting so comfortably into my usual modes of thought that I won’t even notice the way my interactions with the world involve a kind of translation between the way things seem to me and the way I have to talk about them if they’re to make sense to other people.
Or, to put it another way, I promise there won’t be too much of this kind of solipsistic narcissism.