A good measure of measurement

In which our author discovers that correlation does not imply causation, and how evidence can help when metrics fall short.

Chris Lennon
11 min read · Apr 26, 2022
Image credit: Lt. Colonel Dan Jennings

My goodness! So much of my energy these days as an Agile coach, operating at a team (squad) level and across the organization, is taken up in discussions that center around measurement. From velocity to team maturity, ROIs to KPIs, it's hard to have any kind of discussion without the "but how will we measure this?" question popping up.

How we love to measure things! Initiatives, teams, outcomes, products — you name it, we try to measure it. Now I am not "anti-measurement", but I think that we fall into some avoidable traps in our rush to measure all the things. We quote well-worn slogans like "you can't improve what you don't measure" (not true, of course, as anyone who has learned to ride a bike will tell you), and in our haste to slap numbers on things we miss out, ironically, on some much needed visibility.

So let's try to pick apart this important area. What can be quantified (measured using numbers) and what can't? In this blog I am going to put the emphasis on measuring change, because that is what is most relevant in a business context. We want to know what impact an initiative had (or will have), or whether a team is performing better than last month. And remember, this is about quantifying. Numbers, not words. A conversation is a different thing altogether. OK, let's take this back to first principles.

Physical change can be quantifiably measured

Pisa experiment by Galileo Galilei. Drawn by Theresa Knott

Sometime between 1589 and 1592, the Italian scientist Galileo Galilei reportedly climbed the Leaning Tower of Pisa and dropped two spheres of different weights. His hypothesis was that the balls would fall at the same rate, hitting the ground at the same time. This is indeed what happened, disproving the Aristotelian assertion that objects fall at a speed proportional to their weight.

Increasingly we measure and monitor the physical world around us. From the temperature of the kitchen refrigerator to the revolution speed of a truck engine, more and more sensors are appearing, many of them sending their measurements across the internet and forming the backbone of the Internet of Things.

So the physical world, leaving aside things like black holes, the big bang, the universe and other such complicated matters, can be measured. But that's generally not what we, as knowledge workers, seek to measure. So let's keep going…

Digital change can be quantifiably measured

Increasingly, change happens in the digital arena. Instead of paper notes changing hands, a credit card is swiped and funds transfer silently from one bank to another. Anyone working in digital marketing will be familiar with the vast array of data generated by sending a single email campaign. The campaign volume, email open rates, click-throughs, and any sales conversions that arise directly from the email can all be measured. What happens to this data, how accurate it is, and how insights are gleaned from it varies from company to company. But that is beyond our scope here — suffice to say that digital change, be it new customer sign-ups, subscription renewals, or user viewing habits, is inherently measurable.
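
As a concrete illustration, most of these campaign numbers boil down to simple ratios over event counts. The counts below are invented, and conventions vary between platforms (some compute rates over emails sent rather than delivered), so treat this as a minimal sketch rather than a standard:

```python
# Invented counts from a single, hypothetical email campaign.
sent = 20_000
delivered = 19_400     # emails that did not bounce
opened = 5_820
clicked = 1_150
converted = 96         # sales attributed directly to the email

# One common convention: compute each rate over delivered emails.
print(f"open rate:       {opened / delivered:.1%}")
print(f"click-through:   {clicked / delivered:.1%}")
print(f"click-to-open:   {clicked / opened:.1%}")
print(f"conversion rate: {converted / delivered:.2%}")
```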

After all, we (humans) created the digital world — this new world, including the much-hyped “metaverse” is a product of our labour. And if it’s a world we created, it stands to reason it’s a world we can measure.

Perceptions can be quantifiably measured

There is a well-known phrase: perception is reality. It is not true in a literal sense, but what is evident is that perceptions are very powerful and shape the world around us. We can quantifiably measure perceptions, most commonly by doing a survey, and there can be value in canvassing perceptions. However, there are a number of caveats that go along with using surveys to gauge perception. To call out the most obvious survey anti-patterns:

  • Missed nuances. A question like the classic "on a scale of one to ten, how satisfied are you with…" will yield data, but it does not capture the emotions, or the reasoning, behind the number.
  • Wrong audience. Whose perceptions are you measuring, and how relevant are those perceptions?
  • Survey fatigue. Someone who receives a survey regularly may give responses based on irritation at having to complete the task yet again, or may disengage from the process by simply filling in spurious responses to get it over with as quickly as possible.
  • Bias (generally unconscious) in question authoring.
  • Strategic answering. For example, survey respondents may be fearful of repercussions if they give a low score. Or respondents may give a low score, with the rationale being that a low score will provoke change.
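
One of the most common ways to boil survey perceptions down to a single number is Net Promoter Score (NPS), which we will meet again below. A minimal sketch of the standard calculation, using invented responses, shows just how much gets compressed into one figure:

```python
def net_promoter_score(scores):
    """Standard NPS: % promoters (scores of 9-10) minus % detractors (0-6),
    giving a value between -100 and +100. Passives (7-8) only add to the total."""
    scores = list(scores)
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)

# Invented answers to "How likely are you to recommend us?" (0-10 scale).
responses = [10, 9, 9, 8, 7, 7, 6, 5, 9, 10, 3, 8]
print(f"NPS = {net_promoter_score(responses):+.0f}")   # -> NPS = +17
```

Note how the single number discards exactly the nuance the first bullet warns about: a lukewarm 6 and a furious 0 both count simply as detractors.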

Money can be measured

Image credit: https://andertoons.com/

Not only can the flow of money into and out of an organization be measured, but for almost all organizations this flow must be measured, as a legal requirement. We can be confident, then, that revenue from sales and expenses are being measured within the companies and organizations we work in. In some ways money is a good measure, too. Talk, as they say, is cheap. A survey may give an indication of how favorably a prospect sees a future product, but it is only when the product is developed, and a customer is considering parting with some of their hard-earned funds, that we get to see the actual value of the product as the customer sees it. The product or service is either valuable enough to purchase, or it is not.

Some points to note when considering using money as a metric:

  • Company finance folks don't always see the world as the teams on the ground see it. The finance codes used to report revenue to shareholders are not always the ones that would provide valuable data to teams and stakeholders. Collaboration between the front line and financial controllers is required to get bang for our buck out of money as a metric, and that is not always easy to achieve.
  • Money is to a large extent a lagging indicator, not a leading indicator. To take our product example, it is not until the product is in the market and available to the public that any money flows through the system. For the folks developing a new product, or considering developing one, this data comes far too late.
  • You may be able to tell at a very high level what the money coming in is for — for example, a product may have its own revenue code. However, this reporting is almost always painted in very broad brushstrokes. Money does not tell us why a customer bought a product, only that a sale occurred.
  • Which brings us to the attribution problem. To what cause can we attribute a fluctuation in revenue? More on this later.

Human interactions cannot be quantifiably measured

As I argued in a previous blog, where I hypothesized that value cannot be quantified, some things just cannot be measured in a quantifiable way. In other words, applying a number to them is simply not meaningful. Consider the love you have for your partner, your child, your parents, or even the family dog. Does it make sense to try to quantify these relationships? "My love for my son is a 7.32, but I want to get it up to an 8 by the end of the year." It just doesn't make sense!

Let's take an example from the world of business: team performance. How can the "performance" of a group of human beings be measured? They have conversations, some productive, others less so. They do analysis. They think about problems and how to solve them. They have team meetings. And so on. Some teams do these things well and are productive. Other teams struggle. The key point, however, is that none of this can be quantified, any more than a human conversation can be quantified. Successful teams have high quality ideas, are highly skilled, and have great teamwork and productive conversations. None of it is measurable purely by numbers.

Regardless of your religious views, very few people would assert that we created the world of physical interactions we find ourselves in. Unlike one of personkind's crowning creations, the digital world, the world of human-to-human relationships is not our creation. That makes it so much harder (impossible, to be exact) to measure using numbers in a way that makes sense and is helpful.

Much (some would say most, or even all) of what we do in business is done to achieve an outcome. That outcome could be to gain or retain customers. To complete an infrastructure project or to re-brand. In a sense these things can be “measured” in that they were either achieved or not achieved (or somewhere in between). But beyond this — again because these things fall into the realm of human interactions — business initiatives are simply not measurable. Projects can generate all kinds of numbers, but the bottom line is that initiatives are all about perceptions and expectations. And when we drill down deep enough, the feelings we have.

Wait….

"Ah… but…" I hear you say. "Perhaps we cannot measure the initiatives themselves in a meaningful way, but we can measure the impact of our endeavors." Let's unpack this with an example. As we have already established, we can measure perception. So right after our completed initiative we send out a customer satisfaction survey. Let's say that Net Promoter Score (NPS) is up — hooray, success measured! To really cinch this let's now go to the digital realm (also measurable, as we have already established), and what do you know — new customer sign-ups are up! Double hooray!!

So perception is up, and customer sign-ups are up. Both are correlated with your great new initiative. However, and it is a big however, this is where an important principle kicks in:

Correlation does not imply causation.

So, customer acquisition metrics and NPS are both up. But can you prove that this was because of your highly successful initiative? An understanding of systems thinking tells us that in the complex domain, cause and effect are not linear. You cannot draw a clear line from cause to effect. Or, to put this another way, you cannot draw a line from effect back to cause. What if the product you were selling was video conferencing software, and the launch of your initiative coincided with the time when the Covid-19 pandemic really began to bite? Would you still be so confident that the upturn in sales was because of your initiative? Or could it be that other factors are at play? This is an important point, so let's spend a little time getting our heads around it.

Correlation does not imply causation

Let's dive into the above (fictional, but quite plausible) graph. Quite clearly, shark attacks are correlated with ice-cream consumption. So does this mean that sharks have developed a taste for ice-cream? Or, more specifically, a taste for people who have just eaten ice-cream? With our new-found understanding we can confidently say that just because ice-cream consumption and shark attacks are correlated, that does not mean that eating ice-cream causes these attacks.

A more plausible explanation, as the graphic explains, is that both shark attacks and ice cream consumption are correlated with warmer weather. When summer rolls around, people head to the beach, eat ice-cream, and unfortunately a very few are attacked by sharks, also attracted by the warmer weather.
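
We can see this play out in a toy simulation. In the sketch below all numbers are invented: a single hidden confounder, daily temperature, drives both ice-cream sales and shark sightings. Neither series causes the other, yet the two still come out clearly correlated:

```python
import numpy as np

rng = np.random.default_rng(0)
n_days = 365

# The hidden confounder: daily temperature (invented, degrees Celsius).
temperature = rng.uniform(10, 35, size=n_days)

# Both series depend on temperature plus independent noise;
# neither one causes the other.
ice_cream_sales = 40 * temperature + rng.normal(0, 150, size=n_days)
shark_sightings = 0.4 * temperature + rng.normal(0, 2.0, size=n_days)

# Despite the absence of any causal link, the correlation comes out
# clearly positive (around 0.7 with these settings).
r = np.corrcoef(ice_cream_sales, shark_sightings)[0, 1]
print(f"correlation(ice cream, shark sightings) = {r:.2f}")
```

Remove the dependence on temperature from either line and the correlation collapses towards zero, which is the whole point: the number on its own cannot tell you which of these two worlds you are in.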

Let's take this even further — take a look at the below chart, based on actual data.

Source: http://tylervigen.com/spurious-correlations

Very few people would venture to suggest that a high number of Nicolas Cage films is the cause of a high number of pool drownings. But they are highly correlated. All together now: “Correlation does not imply causation!”

So what then can we do?

I am not arguing in this blog for a "no measurements" approach. We need all the situational awareness we can get. For a company that has the resources, it makes sense to measure digital change. So long as the work is put in to achieve the necessary quality, these measurements and their trends can provide valuable glimpses into the landscape we operate in. And the better we understand our landscape, the better our strategy. Quantifiable measurements of perception — surveys and the like — can also give us valuable windows into our world. Finally, we may find valuable data in the company financial accounts.

But the real gold lies beyond the numbers, in the world of human-to-human conversations. Let me share just one concept that can help us as we try to get some sort of handle on the complexity of the world in which we work.

The value of evidence

Image credit: Kevin Deutsch

When an investigation team arrives at a crime scene, they are not looking for metrics. They are looking for evidence. In a way, we in the business and knowledge world can be compared to investigators. In a complex environment, we look around us and ask "why?" Why is a team getting such great feedback? Why is a project still not delivered? Why was product X so successful? We need to be clear about the evidence we are looking for.

To revisit our team performance example: instead of going down rabbit holes looking for ways to quantify a team's performance, we can ask what evidence we are looking for. What patterns are common to high performing teams? Does a successful team use a roadmap? How do they visualize the road ahead? The world of patterns, not the world of numbers, is a far happier hunting ground for us.

Getting smarter

To wrap this up, my call to action is that we really need to get a lot smarter when it comes to people and the interactions we have with each other. What if, instead of trying to measure teams, we put our energy into helping them achieve their goals? What if, instead of an annual performance review where someone receives a performance grade, the conversation centered on how we can best support that person? What if, instead of asking "how will we measure success?", we reframed the question as "what is the evidence by which we will judge success?"

Understanding and accepting reality is the first step to meaningful change. Once we understand the limits of what we can measure, we can first do any grieving we need to do, and then accept reality, pick ourselves up and start asking the questions that really matter.


Chris Lennon

Agile coach. Ways of Working researcher. I live in beautiful New Zealand and work for Spark. I am also the founder of a start up — voyzu.com. Views are my own.