Posted 5/9/12 on The Doctor Weighs In
I have frequently been asked to render judgment on another doctor’s diagnosis or treatment plan. Other times I am asked anxiously: “Should I get a second opinion?” The implicit assumption in this sort of question is that “two heads are better than one.” Or, stated more broadly, we put our faith in the “wisdom of the crowd,” whether the “crowd” is made up of two or two thousand individuals.
I have to admit I’ve had some nagging doubts about this all-encompassing wisdom. For instance, wisdom of the crowd has been amply documented in estimation tasks (“how many people in this crowd? What is your estimate of the completion date of the project?”). The reason this works is that it exploits the benefit of error cancellation: the outlier estimates on either side cancel each other out, and we end up with a consensus estimate that is closest to the truth. But how do you decide when the issue is not quantitative? Think of the virtually unanimous opinion of the White House crowd to go to war in Iraq. Where was the “wisdom” there? More interesting, we could drill deeper and ask why the crowd reached such a wrong decision. Wisdom of the crowd has been hailed as a source of near-magical creativity, unparalleled wisdom, and forecast accuracy. Some of these attributions have proved to be unfounded. For instance, with respect to creative potential, groups that engage in brainstorming lag hopelessly behind the same number of individuals working alone. The key to benefiting from other minds is to know when to rely on the group and when to walk alone. Wouldn’t it be nice if we had some sort of algorithm to guide us in making this decision?
In the article “Optimally Interacting Minds,” Bahrami et al. compared individual and dyadic (a group of two) performance in a simple visual task. They asked, “How can signals from the same sensory modality (vision) in the brains of two different individuals be combined through social interaction?” In their experiments, participants judged which of two briefly presented stimuli contained an oddball target. Participants worked in dyads; they first made their decisions individually, then shared them, and if they disagreed, they discussed the matter until they reached a joint decision. The results led to the conclusion that “for two observers of nearly equal visual sensitivity, two heads were definitely better than one provided they were given the opportunity to communicate freely.” In discussing the mechanism responsible for the “two-heads-better-than-one” effect, the authors assumed that each individual can monitor the accuracy of his or her performance and can communicate his or her confidence accurately to the other member. So whoever communicated more confidence prevailed.
Asher Koriat, of Haifa University, went a step further. He asked: What if we subjectively judged the level of confidence of each individual without any communication between them, and then made the decision based on that, rather than on the track record of the highest performing member? In other words, forget how smart the guy is; go with the most confident one.
Surprise, surprise! Selecting the most confident-sounding opinion, rather than the opinion of the member with the best track record, yielded significantly higher accuracy (that is, decisions closer to the truth). It also means that there is no need for communication between members of the crowd; selecting the most confident opinion is most likely to yield the best decision. Furthermore, when Koriat increased the group size to 3, the accuracy of decision-by-confidence increased even further.
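For readers who like to see such a rule spelled out, here is a minimal sketch of decision-by-confidence as described above. It is only an illustration, not Koriat's actual procedure: the function name `decide_by_confidence`, the 0-to-1 confidence scale, and the sample answers and numbers are all my own assumptions, invented for the example.

```python
from typing import List, Tuple

# Each opinion is (answer, confidence). The 0-to-1 confidence scale is an
# assumption made for this sketch; the article does not specify one.
Opinion = Tuple[str, float]


def decide_by_confidence(opinions: List[Opinion]) -> str:
    """Return the answer of the most confident member.

    No discussion between members is modeled: each opinion is formed
    independently, and the group simply adopts the most confident one.
    On a tie, the opinion listed first wins.
    """
    if not opinions:
        raise ValueError("need at least one opinion")
    answer, _confidence = max(opinions, key=lambda pair: pair[1])
    return answer


# Hypothetical dyad: two observers who disagree about which interval
# contained the oddball target. The numbers are invented.
dyad = [("first interval", 0.55), ("second interval", 0.80)]
print(decide_by_confidence(dyad))

# Hypothetical group of three: adding a member changes the outcome only
# if the newcomer is more confident than everyone else.
triad = dyad + [("first interval", 0.70)]
print(decide_by_confidence(triad))
```

Note that the rule ignores who is in the majority and who has the better track record; it looks at confidence alone, which is exactly why it can go wrong when confidence and accuracy come apart.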
Circling back to our medical example, the answer is still not clear. In real life we don’t have the controlled conditions of a laboratory experiment. Some environments are geared to mislead and, as Koriat has demonstrated, in that case wisdom-by-confidence will fail. Example: T.R. Reid’s book The Healing of America: A Global Quest for Better, Cheaper, and Fairer Health Care describes his journey through the industrialized world, looking for treatment of an injured shoulder. In every country the orthopedic surgeon examined the shoulder, did some imaging studies, and recommended anti-inflammatory drugs, no weight-bearing exercise of the affected shoulder, and patience. This is an example of crowd wisdom where there was no communication between the members. In the U.S. the surgeon did not examine the shoulder and ordered no imaging and no drugs. He confidently diagnosed a torn rotator cuff tendon and proceeded to schedule him for surgery. The difference between the U.S. orthopedist and those in the rest of the industrialized world was not in their level of confidence, but in the environment. Outside the U.S. the medical environment is cooperative; ours is competitive. In such an environment, groupthink judgment is tainted with ulterior motives and is therefore suspect.
Behavioral economists have discovered the same phenomenon. Wisdom of the crowd on investment decisions is formed by “opinion makers” who “talk their book,” trying to maximize their own economic rewards. Objective economic opinions are expressed by academics who may know more, but sound less “confident.” Beware the “confidence man.”
So back to our second opinion dilemma. How do you decide whom to believe? No easy answers, yet. Ask for data, check the credentials, and then, and only then, go for the confident one.