Most organizations rely on annual performance reviews to evaluate contribution, allocate rewards, and create accountability. The logic feels straightforward: measure results, rate people, and recognize the strongest performers. For decades, this ritual has been treated as a basic tool of management.
But what if the very practice meant to improve performance quietly prevents real improvement from happening?
W. Edwards Deming believed annual performance reviews were not merely ineffective. He argued they were one of the most damaging management practices in modern organizations because they direct leadership attention toward judging individuals instead of improving the system that produces results.
To understand why, we have to rethink what actually creates performance in the first place.
Why Deming challenged performance appraisals
Leaders want to understand how well their organizations are performing. That instinct is healthy; good leadership requires visibility into results and a clear understanding of where improvement is needed.
Annual performance reviews promise a structured way to do this. They compress a year of work into scores, ratings, and rankings that guide compensation, promotion, and recognition.
But Deming argued that this approach misunderstands how organizations actually produce results. He wrote: “Basically, what is wrong is that the performance appraisal or merit rating focuses on the end product, at the end of the stream, not on leadership to help people.”
In other words, reviews judge outcomes after the work is finished rather than improving the conditions that produce those outcomes in the first place. This difference—between judging results and improving the system that creates them—sits at the heart of Deming’s philosophy of management.
To see how this dynamic unfolds in practice, consider the experience of a school district wrestling with teacher evaluations.
A school district confronts the problem
In the Brookfield School District, evaluation season arrived every spring with predictable tension.
Teachers prepared documentation of their work while principals conducted classroom observations. District administrators compared performance scores across schools, and those numbers shaped pay increases, promotions, and professional reputations.
Marcus Lee, principal of Brookfield Middle School, had participated in the process for years, and each cycle followed the same pattern. Teachers worried about their scores, principals debated ratings, and district leaders reviewed charts comparing one school to another.
Yet the classrooms themselves seemed to change very little.
During a district leadership meeting, Marcus raised the concern with Superintendent Elena Ramirez.
“We keep having the same conversations,” he explained. “We review the ratings, we talk about who did well and who didn’t. But the classrooms themselves aren’t improving much.”
Ramirez understood the frustration, but she also saw the system as necessary.
“The reviews help us identify our strongest teachers,” she said. “Without them, how do we know who is performing well?”
Marcus paused before answering.
“That’s the problem,” he replied. “We think the scores explain performance. But most of the time they reflect the conditions teachers are working in.”
He pointed to several examples. Some teachers had consistent collaboration time with colleagues, while others rarely had time to work together. Some classrooms served students with far more complex needs, and some had significantly more curriculum support than others.
The more Marcus studied the situation, the more he saw a pattern emerging.
As evaluation season approached, teachers became cautious. Collaboration slowed, and fewer people experimented with new lesson ideas because trying something new carried personal risk when results were judged individually.
The system was doing exactly what it was designed to do: judge individuals.
But something else was happening as well. Teachers began protecting their own standing rather than sharing openly, and leaders spent hours debating scores instead of studying the conditions shaping learning—curriculum support, scheduling, classroom composition, and collaboration time.
Slowly, the conversation shifted away from improving teaching and toward explaining ratings.
Deming warned about this dynamic decades ago: “Merit rating rewards people that do well in the system. It does not reward attempts to improve the system.”
Marcus’s comment stayed with Ramirez after the meeting. If the ratings were not revealing true performance, what should leadership be studying instead?
The answer emerged as district leaders began examining the system around teachers rather than the teachers themselves. They studied scheduling practices, collaboration time, curriculum support, and class composition. As these conditions became visible, many apparent differences in “performance” began to make sense.
The district gradually shifted its approach. Instead of relying on a single high‑stakes annual score, principals began holding ongoing coaching conversations with teachers, and teams studied classroom practices together while sharing what they were learning.
Over time the tone inside the schools began to change. Less energy went into defending ratings, and more attention went toward improving how teaching actually happened.
Why leaders keep returning to ratings
Most leaders adopt evaluation systems for understandable reasons. We want fairness, accountability, and a clear way to recognize strong contribution. Annual reviews appear to offer all three.
But Deming believed the deeper issue lies in how we interpret performance itself.
We often assume differences in results are primarily caused by individual effort or ability. When outcomes vary, we instinctively look for the person responsible, and the evaluation system becomes the mechanism for judging those differences.
Deming argued that most variation in performance within organizations comes from the system people work within: its processes, resources, training, incentives, and leadership practices. When leadership focuses primarily on judging individuals, these systemic influences remain largely invisible.
The result is a familiar management cycle. Leaders react to outcomes rather than studying causes, and teams debate individual ratings instead of examining the conditions producing those results.
Over time, the evaluation process begins shaping behavior. People manage appearances, protect their own standing, and hesitate to take risks that might affect their ratings.
From a systems perspective, these responses are entirely predictable because the structure of the evaluation system quietly teaches people how to behave.
What systems-oriented leadership looks like
If judging individuals rarely produces improvement, where should leaders focus instead?
Deming’s answer was clear: study the system.
Leaders who want better performance begin by understanding the conditions shaping the work.
1. Study the system before judging the individual. When results differ, examine the environment in which people are working. Work design, training, tools, workload, and incentives often explain far more variation than most organizations realize.
2. Move leadership attention upstream. Evaluation systems tend to examine results after the work is finished. Systems-oriented leadership focuses earlier in the process—how work is designed, how knowledge flows, and where obstacles appear.
3. Create environments that support learning. Improvement requires experimentation and honest discussion of problems. Organizations improve faster when people feel safe sharing what they are discovering and where work is difficult.
4. Focus conversations on the work itself. Productive leadership conversations explore how work actually happens—how decisions are made, how teams collaborate, and where friction slows progress.
These discussions reveal opportunities for improvement that ratings alone can never capture.
Where real improvement begins
Improving performance is one of the central responsibilities of leadership.
But Deming believed the path to improvement rarely begins with judging individuals. It begins with curiosity about the system that shapes their work.
When leaders study that system—how the work is designed, how people collaborate, and where obstacles appear—they begin to see opportunities that ratings could never reveal.
Methods improve. Cooperation grows. Learning accelerates.
And something else begins to return to the workplace.
People rediscover pride in what they do.
They rediscover joy in work.
Slowly, the organization becomes a place where improvement is simply part of the work itself—growing naturally, every day.
All anyone asks for is a chance to work with pride.
— W. Edwards Deming














