11th April 2015 by The Value Perspective
By Kevin Murphy
Touched on most recently in Aversion therapy and Polls apart, the overconfidence of experts, their unwillingness to keep things simple and their disinclination to learn from past mistakes are favourite themes of The Value Perspective. To an ever-growing list of targets that has included economists, weather forecasters, sports pundits and political commentators, we will now add clinical psychologists.
We do so courtesy of an article called ‘Simple models or simple processes? Some research on clinical judgements’, which first appeared in American Psychologist back in 1968. Written by Lewis Goldberg, himself a professor of psychology, it brings together all the elements mentioned above to suggest training and experience do not affect judgemental accuracy very much but can affect confidence – effectively leading to overconfidence.
Goldberg begins by noting how past studies of the accuracy of different sorts of professional judgement have “yielded rather discouraging conclusions”. Picking out one example, he says: “One surprising finding – that the amount of professional training and experience of the judge does not relate to his judgemental accuracy – has appeared in a number of studies.”
In this context, the judge is whoever is making the diagnosis, prediction or other professional judgement. Goldberg continues: “Equally disheartening, there is now a host of studies demonstrating that the amount of information available to the judge is not related to the accuracy of his resulting inferences.”
One such study involved giving clinical psychologists a quarter of the available information on a case, along with a number of questions about the patient and possible behaviours. The psychologists were then asked how accurately they thought they had answered those questions and then the same process was repeated after they received half, three-quarters and, finally, all of the information in the file.
As you might imagine, the more information they received about the patient’s case, the more confident the psychologists grew about how accurately they would answer the associated questions. Thus, after receiving a quarter of the available information, they reckoned they would be right on a third of the questions, which was slightly overconfident because they actually turned out to be right on 26%.
After seeing all the available information, however, the psychologists decided they would be right on just over half the questions and yet they were right only 28% of the time – a marginal improvement but hardly in line with their increased confidence. “Such findings relative to the validity of clinical judgements obviously raise questions as to their reliability,” Goldberg comments drily.
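To put rough numbers on that widening gap, here is a minimal sketch using only the figures quoted above. One assumption to flag: “just over half” is entered as 0.50, which is a lower bound rather than the study's exact figure, and the names `stages` and `overconfidence_gap` are ours, purely for illustration.

```python
# Figures as quoted in the article. "Just over half" is taken as 0.50,
# a lower-bound assumption rather than the study's exact number.
stages = {
    "quarter of the file": {"expected": 1 / 3, "actual": 0.26},
    "whole file": {"expected": 0.50, "actual": 0.28},
}


def overconfidence_gap(expected, actual):
    """Expected accuracy minus achieved accuracy, in percentage points."""
    return round((expected - actual) * 100, 1)


gaps = {name: overconfidence_gap(**figures) for name, figures in stages.items()}

for name, gap in gaps.items():
    print(f"{name}: overconfident by about {gap} percentage points")
```

On these inputs the gap roughly triples, from about seven percentage points to at least 22, while accuracy itself moves by only two.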
Later in his article, Goldberg highlights an experiment of his own, which set out to understand the degree to which clinical judgements might be improved through training. Judges of three different levels of experience – essentially experts, psychology graduates and novices – were trained to make a particular psychological diagnosis and tested on their progress.
Goldberg found the novices showed some improvement as a result – improving from an average accuracy level of 52% at the start to 58% at the end of the 17-week experiment. The middle and expert judges, however, were “virtually indistinguishable”, with both groups achieving an average accuracy percentage of around 65% at both the start and the end of their training.
That process involved assessing thousands of case profiles and yet Goldberg found that, for those expert and middle judges, their training turned out to be “almost completely sample specific” – in other words, of no general benefit whatsoever, with no increased accuracy despite the training.
Faced with these “startling findings”, Goldberg then assigned a number of the judges to five individual groups, with each group undergoing a different kind of training. These included one group that was helped to make a diagnosis by being given the variables of a formula that had 70% accuracy – much like having the ingredients of a recipe but not the proportions – and another group that received not only the variables of the formula but also the right values to use with it – in other words, both a recipe’s ingredients and the right proportions.
Thousands more judgements later, the only groups of the five that showed any increase in the accuracy of their judgements were the pair with access to the formula. However, while the group that were simply given the variables of the formula initially showed a “rapid increase in diagnostic accuracy”, Goldberg found this effect “gradually wore away over time”.
The only group, therefore, that showed any stable increase in diagnostic accuracy was the one given both the formula and the right values. But even then, despite effectively being spoon-fed the right answer 70% of the time, their accuracy levels, says Goldberg, “were not as high as would have been achieved by simply using the formula itself”.
Furthermore, having been encouraged to look for ways in which they might improve on the formula decision, the judges duly tried to do so – with the eventual result that any initial pick-up in their accuracy dulled over time as they became more confident in their own methods and answers rather than those offered by the formula.
Even 50-odd years ago then, there was already a growing body of scientific literature pointing out that a simple actuarial formula or algorithm would significantly beat an expert with all their incremental data points. In a stockmarket context, as we suggested in Aversion therapy, value’s approach of buying the cheapest stocks over the longer run could be seen as a simple algorithm – and yet, despite its history of outperformance, many investors are so confident in themselves that they choose not to follow the strategy.