Debunking the "Fatal Flaw" Myth of 360 Surveys

April 2, 2024 Bruce Bennett

Is There Really A Flaw?

In his HBR Blog post, "The Fatal Flaw with 360 Surveys", Marcus Buckingham argues that 360 surveys are "at best, a waste of everyone's time, and at worst actively damaging to both the individual and the organization." Yet 77% of organizations surveyed by T&D magazine have used and continue to use 360s. Something does not add up. It turns out that the 'Fatal Flaw' is a faux flaw found only in poorly constructed 360 surveys.

Marcus has had some bad experiences with 360 feedback, and those experiences color his perspective. Marcus describes a 'fatal flaw' in 360 surveys equivalent to a broken engine in a badly used car. Condemn all cars because you bought a lemon, and you miss the value of every other vehicle on the road.

The flaw Marcus has discovered occurs because raters score a leader subjectively. Raters, he suggests, mentally compare the leader to themselves and score accordingly, making the data worthless. But subjective responses are not a flaw, just a reality of collecting survey data. Subjectivity becomes a flaw only when the survey's questions are poorly written.

The Quality Of The Questions Counts

Write the questions well, and the ‘flaw’ discovered by Mr. Buckingham does not exist.

A specific, observable behavior is among the first requirements for a good survey question. If the behavioral description is clear, anyone and everyone observing a leader can recognize when the leader uses the behavior. In statistical parlance, that agreement among observers is called inter-rater reliability.

Such observations are not only the basis of good 360 questions; good questions are the basis of all scientific research. Whether it's a biologist counting bacteria in a petri dish or a naturalist studying tree frogs in the Amazon, the quality of their data depends on how well they have defined what they are hoping to observe.

Excellent 360 survey questions consist of the behavioral description and the response scale; both have to be clear for the question to work. Choose the wrong scale, and you destroy the question. Agreement scales are very risky because you are trying to measure what the person is thinking. You can't see 'thinking.' A frequency scale, combined with a clear behavioral description, creates the condition for valid observations. How many bacteria are in the petri dish? How many times did this leader demonstrate <specific skill>?

The flaw Mr. Buckingham describes is present in a question like, "Do you think this person has a clear vision for the future?" No answer can be independently confirmed because no rater can see the leader's thinking. Change the question to "How often does this person discuss her vision with our team?" and you will get data that any team member can confirm.
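To make the idea of rater agreement concrete, here is a minimal Python sketch. It is not a formal inter-rater reliability statistic, just the share of rater pairs whose scores land within one point of each other on a 1-to-5 frequency scale. The function name, the tolerance, and both sets of ratings are hypothetical, invented purely to illustrate the contrast between an observable question and a mind-reading one.

```python
from itertools import combinations


def pairwise_agreement(scores, tolerance=1):
    """Share of rater pairs whose scores differ by at most `tolerance` points.

    A simple illustration of agreement, not a formal reliability coefficient.
    """
    pairs = list(combinations(scores, 2))
    if not pairs:
        raise ValueError("need at least two raters")
    close = sum(1 for a, b in pairs if abs(a - b) <= tolerance)
    return close / len(pairs)


# Hypothetical ratings from five direct reports (1 = never, 5 = always).
# Invented numbers for illustration only, not real survey data.
observable_item = [4, 4, 5, 4, 4]    # "How often does this person discuss her vision with our team?"
mind_reading_item = [1, 5, 3, 2, 5]  # "Do you think this person has a clear vision for the future?"

print(pairwise_agreement(observable_item))
print(pairwise_agreement(mind_reading_item))
```

In this made-up data, raters answering the observable question cluster tightly, while raters answering the mind-reading question scatter across the scale.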

You Can’t Get A Better Sample Size

How many people have to agree before their observations are worthwhile? Researchers pull a representative sample when it's too challenging to poll the whole population. Random samples are critical with large populations. With 360 surveys, 100% of the possible respondents are invited to provide feedback. A 100% sample can't be a skewed sample.

If a leader invites 100% of her direct reports to answer the question, "How often does the leader do behavior X when working with her direct reports?" you can't get a better sample.

If all research had the luxury of polling 100% of the research population, the Chicago Tribune would not have declared Dewey the President on election night in 1948, and no one would have to wait up for the election results today.

Gaps In Perception Are Critical To Improvement

An excellent 360 does not determine IF a leader possesses a particular set of skills. Instead, it measures whether the leader demonstrates a skill or chooses to use the skill in certain situations.

An excellent 360 is not just about self-awareness; it's about illuminating the environment in which the leader functions. A difference or gap in respondent group scores does not mean one respondent group is right and the other wrong. It means there is an essential difference in how others perceive the leader's use of the skill, and that difference can inhibit the leader's ability to lead.

Bridging the gap may not be a skill issue. It may be a motivation issue, a system issue, or a lack of information. It may mean that the leader has the skill but only chooses to use it with particular groups.

360 feedback is a powerful diagnostic tool: it highlights issues, then identifies which group the leader can talk with to understand those issues better and determine how to close the gap.

Not all gaps are bad. In many high-performing organizations, the opposite of 'benevolent distortion' occurs. Leaders, driven by the sense that they can never be good enough, rate themselves lower than their actual performance. When others rate a leader higher than the leader rates herself, it allows her to build on an unrecognized strength.

Gaps can identify an underutilized asset in the leader's toolbox. Understanding how valuable others consider a specific behavior informs the leader how to build on her strengths and use those underutilized, highly valued behaviors more often.
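The gaps described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the respondent group labels, and every score are hypothetical, and a real 360 report would look at each behavior separately. The sketch takes the leader's self-rating on one behavior and reports how far each respondent group's average sits above or below it.

```python
from statistics import mean


def group_gaps(self_score, group_scores):
    """Return each group's mean rating minus the leader's self-rating.

    Positive values: the group sees the behavior more often than the leader does.
    Negative values: the group sees it less often. Empty groups are skipped.
    """
    return {
        group: round(mean(scores) - self_score, 2)
        for group, scores in group_scores.items()
        if scores
    }


# Hypothetical frequency ratings (1 = never, 5 = always) for one behavior.
# Invented numbers for illustration only, not real survey data.
gaps = group_gaps(
    self_score=3,
    group_scores={
        "direct_reports": [4, 5, 4, 5],
        "peers": [3, 3, 2, 3],
        "manager": [3],
    },
)
print(gaps)
```

A positive gap for one group and a small negative gap for another does not say who is right. It tells the leader which group to talk with first.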

Good 360 Surveys Work, And Have For Decades

360 feedback is hard to do well. Observing human behavior is not as easy or exact as counting bacteria in a petri dish, but clear behavioral descriptions and the correct scale produce valid data.

Extraordinary coaches don't use flawed data and know what to do with good data. 360 surveys are a powerful leadership development tool. Even Mr. Buckingham admits he's "…seen some extraordinary coaches use 360 results as the jumping-off point for insightful and practical feedback sessions."

If you happen to buy a lemon with a broken engine, don't let it prevent you from experiencing the thrill and performance of a well-designed automobile.