The Nemov Case

On 23 August 2004 in Athens, Russia's Aleksei Nemov delivered a crowd-pleasing routine in the Olympic Men's High Bar competition. The judges rewarded his effort with a modest score. The crowd disagreed with the lower-than-expected score and made its feelings known by booing for several minutes. After a discussion, the judges revised Aleksei's score slightly upward, but not enough to put him in medal contention. While the original score may have been lower than Aleksei deserved, appearing to bend to the crowd's wishes did not increase our confidence in the accuracy of the score.
What can we learn from Aleksei Nemov's experience in Athens? There are at least six things we can take away from the Nemov case.
1. Preconceived Notions - Bias
The judges and the crowd both had preconceived notions about the gymnast. These preconceptions might have included factors such as: (a) the gymnast's previous performance or track record (halo effect); (b) the gymnast's nationality; and (c) the gymnast's style, just to name a few. Although Aleksei won 12 Olympic medals in Atlanta and Sydney combined, the "halo effect" did not seem to be much of a factor in this case.
The judges also have other pre-existing factors influencing their evaluation. These might include: (a) the evaluation criteria; (b) training in the evaluation criteria; (c) nationality; (d) experience judging gymnastics; and (e) previous scoring errors or mistakes, just to name a few. Not all factors or influences are bad. Training in the criteria is one example. The criteria themselves are another: in gymnastics the judges rate two dimensions, the difficulty of the routine and its execution or performance, and consensus among the judges on the meaning of these criteria is a positive factor.
It is quite possible that the mistakes made by judges during the Paul Hamm Men's All Around contest, and the subsequent negative media attention, were in the back of the judges' minds when they faced the crowd's negative reaction. Did the previous mistakes and negative press put more pressure on them to change the score? We simply don't know, but it clearly was a “potential” factor.
These factors or influencers are particularly important in a qualitative judging process and in qualitative research. It is not uncommon for the researcher to have some preconceived notions about the organization or the people involved in the research. In fact, sometimes the researcher is even a member of the organization. The key is to put in place methods to mitigate the negative impact of these biases on the research. The real dilemma here is that the judge or the researcher may never really know what subconscious factors are influencing their evaluation.
2. Observation – Data Collection
“Real-time” observation is the method used to collect data on a gymnast’s performance. While this is an imperfect approach, the negatives in this case are mitigated by the use of multiple judges. In research this is one way to triangulate that will help increase validity and reduce bias.
Other ways to triangulate include using multiple sources of data and multiple data collection methods. The Olympics are a bit limited in their ability to use these other methods (e.g., video review), but the researcher will often use other techniques to improve the credibility of the study. There are no free lunches: each additional triangulation technique raises the cost of data collection and analysis.
Researchers often use observation to collect data on a variety of phenomena. Since even the best researchers miss things, especially when observing in real time, multiple researchers are sometimes used to help provide a more complete picture. For example, when conducting an organization assessment, the examiners will often interview in pairs so that they have two perspectives on what was heard and two chances to capture the most salient points.
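The idea that multiple observers limit the effect of any one observer's error can be illustrated with a small sketch. The panel scores below are hypothetical, and the rule of dropping the highest and lowest score before averaging is an assumption made for illustration, not a statement of the official Olympic procedure.

```python
from statistics import mean


def panel_score(scores, trim=1):
    """Average a panel's scores after dropping the `trim` highest and lowest.

    Illustrative aggregation rule only (an assumption, not the official one).
    """
    if len(scores) <= 2 * trim:
        raise ValueError("need more scores than are trimmed")
    ordered = sorted(scores)
    kept = ordered[trim:len(ordered) - trim]
    return mean(kept)


# Hypothetical panel: five judges agree closely, one is an outlier.
panel = [9.70, 9.75, 9.70, 9.75, 9.70, 9.30]

plain = mean(panel)           # every judge counts, including the outlier
trimmed = panel_score(panel)  # the outlier (and the top score) are dropped

print(f"plain mean:   {plain:.4f}")
print(f"trimmed mean: {trimmed:.4f}")
```

With a single judge, the outlying 9.30 could have been the whole result. With a panel, it pulls the plain mean down only modestly and disappears entirely from the trimmed mean. The same logic is why pairs of examiners, or multiple coders in qualitative research, improve credibility.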
3. Judges' Evaluation – Analysis and Conclusions
Influenced by their preconceived notions and other factors, the judges evaluate the performance or execution of the routine. The evaluation includes converting or translating the qualitative data and evaluation into a numeric or quantitative score. Researchers often do the same, converting qualitative evaluations into quantitative measures so that statistical methods can be used for further analysis. The results are then posted or, in the case of the researcher, published for all to see. This process of analyzing and drawing conclusions occurs in the judge's mind and is thus hidden from examination. Qualitative research has the same dilemma, and consequently qualitative researchers are encouraged to make their thinking and analysis explicit so that others can follow the path to their conclusions.
4. The Kibitzing Crowd – "Free Should the Scholar Be…"
In the Nemov case the crowd compared the score on the scoreboard (overall score 9.725) with their own (albeit unqualified) evaluation and found the judges' score to be lower than expected. The crowd then communicated their displeasure by booing for several minutes. The problem with this unsolicited feedback is that the crowd is not qualified to judge gymnastics. While there might be a few individuals in the crowd who are qualified to judge, the vast majority of the crowd: (a) do not know the criteria; (b) have not been trained in applying the criteria; (c) have little to no experience in judging gymnastics; and (d) are often biased by their own nationality.
While researchers seldom have “booing crowds” to deal with, they sometimes have third parties that will try to influence their analysis and conclusions. These could include the organization, key executives in the organization, company lawyers, peers, etc. While the crowd might not have had anything to gain from the score, executives in organizations often do have something to gain or lose from favorable or unfavorable descriptions and findings in a published research report. Two key issues here for the researcher are (1) the researcher cannot allow another party to have editorial or approval power over the research and (2) the researcher is obligated to design the research methodology so that it will "do no harm."
On the first issue, it is still acceptable to get feedback from organization members on your data and analysis to help verify that you haven't missed anything. On the second issue, "do no harm" doesn't mean that the researcher shies away from findings and conclusions that are unpleasant, but it does mean that individuals and organizations need to be protected (anonymity, etc.). Bottom line: in order to produce credible research, the researcher must be free from outside influences such as booing crowds.
“In self-trust all the virtues are comprehended. Free should the scholar be – free and brave.” Ralph Waldo Emerson – The American Scholar
5. Pressure on the Judges – Giving in to Outside Pressure
We have to speculate a bit here, but it seems that the booing crowd put enough pressure on the judges to instigate a judges' meeting and discussion. The judges' discussion resulted in revisions to some of the scores. Since no explanation was given, we can only guess as to why they changed the scores. While the sequence of events might suggest that the pressure from the crowd influenced their decision, there are rival hypotheses. For example, the judges might have noticed their own inconsistency and discussed the situation, which would have been an effective and appropriate use of multiple judges (triangulation). In other words, maybe the system worked the way that it is supposed to work.
In a previous judging problem earlier in the week there was pressure on the judges to change the scores after the event was over. According to the NBC Olympics website during the Olympics (specific pages no longer available), "There are several important reasons for not going back and changing results the way the South Korean delegation thinks the international gymnastics federation should [In the case of the Men's All Around]. Yes, there was an error on Yang's start value, but there are two sets of judges. One set comes up with the starting score; they add up the entire bonus. Then there is another set of judges, six of them that come up with the deductions in the exercise." The question then becomes: if they go back and revise the start value, should they not also go back and revise the deductions, which it appears might have been greater with video review? This is a "can of worms" that the gymnastics community doesn't seem to want to open.
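The "can of worms" can be made concrete with a little arithmetic. The sketch below assumes, for illustration only, that a final score is the start value minus the average of the six judges' deductions. All numbers are hypothetical and are not the actual values from the Men's All Around.

```python
def final_score(start_value, deductions):
    """Start value minus the mean deduction (an assumed, simplified rule)."""
    return start_value - sum(deductions) / len(deductions)


# Hypothetical values for illustration only.
posted_start = 9.9             # start value as originally (mis)judged
corrected_start = 10.0         # start value after correcting the error
posted_deductions = [0.2] * 6  # six judges' real-time deductions
video_deductions = [0.3] * 6   # deductions if video review finds more faults

as_posted = final_score(posted_start, posted_deductions)
start_only = final_score(corrected_start, posted_deductions)
both_revised = final_score(corrected_start, video_deductions)

print(f"as posted:            {as_posted:.3f}")
print(f"start value revised:  {start_only:.3f}")
print(f"both parts revised:   {both_revised:.3f}")
```

Under these assumed numbers, correcting only the start value raises the score, while correcting both components leaves it where it started. Revising one part of an evaluation after the fact, without re-examining the others, can therefore produce a result that is no more accurate than the original.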
The bottom line here is that in order for the research conclusions to be credible they have to be free from outside pressure.
“We will walk on our own feet; we will work with our own hands; we will speak our own minds.” Ralph Waldo Emerson – The American Scholar
6. Revised Score – Tainted Findings?
The end result in the Nemov case was that the judges did revise their scores after the discussion. Again, the reason for the revised scores was not explained, so we are left to speculate as to the rationale for such an action. It is not at all clear whether the revisions resulted in a more accurate or a less accurate assessment of the performance. Without additional information, we are left to conclude that the accuracy of the score is questionable.
For researchers, the credibility of the published study depends on it being free from outside influences.
“Inaction is cowardice, but there can be no scholar without the heroic mind.” Ralph Waldo Emerson – The American Scholar
john latham (c) 2000 - 2012 all rights reserved