How we binned big data
We live in an age of ‘big data’. The ability to record, measure and compare everything and anything is upon us. Every day, we create more than 2.5 million terabytes of data, fuelled by an ever-growing array of ‘smart’ devices, prolonged screen time and an increasing number of social media interactions.
At no other time in history have we had the ability to analyse and scrutinise in such a granular way. And for those optimising goods and services, it has given rise to an industry of data scientists, information architects and quantitative analysts, all claiming to add value and bring about change.
For those at the cutting edge, there is an expectation that bigger data creates a better chance of getting things right - the chance to hone and improve.
So some years ago, I was interested to hear a senior engineer at McLaren Racing outline his team’s use of data to improve his Formula 1 car’s performance. He explained that, during the race season, an F1 car is essentially a product permanently in beta. From one race to the next, adjustments and refinements are made to optimise the car's performance. To help the team understand where improvements can be made, almost every component in the car generates data, which places a huge amount of real-time information at the fingertips of engineers.
But there’s a problem. The senior engineer explained that lots of data takes lots of time to analyse, and even longer to turn into meaningful change. As a result, he had observed that his engineers were prone to ignore huge swathes of data because they simply didn't have the time to process it into anything meaningful. Instead, these skilled technicians favoured cherry-picking a small number of key measures that they could practically handle and felt would drive incremental change.
It’s not an uncommon occurrence in high-performance engineering - a focus on what we now label ‘small data’.
And I see similar issues being faced by most business leaders too.
We work with clients who collect extremely large data sets - from information about website traffic to user movements around individual pages - and we are often tasked with analysing the data and making practical optimisation recommendations as a result.
However, a lot of the time, the 'big data' recommendations aren't helpful because they miss out 'small data' factors. These may be blockers to optimisation caused by problems with organisational structure, misaligned team objectives, or one team misunderstanding another's skills or capacity. My point is that big data often misses the human element of business, the bits that are hard to identify and measure in numbers. They remain, however, the factors that actually have the most impact on organisational progress.
The truth we all know is that there is a plethora of easily gatherable and accessible data out there. We know more about our products, our customers, the competitive context and societal factors than ever before. But just because we have loads of data doesn't mean that we can effectively use it all. And that, in essence, is why we’ve binned big data in favour of using the minimum amount of data we need to solve a problem.
There is not 'one tool to rule them all', because of the different ways that data is collected and recorded. What we often find with clients is that they don't know what data they have - particularly larger organisations with multiple stakeholders and cross-functional teams. In these cases, we like to run a 'data amnesty' - a prompt for individuals to surrender all of their data. It's illuminating, and it allows us to synthesise and simplify, boiling the data down into the sets that can actually help us solve a given problem. In the past, this process has helped organisations discover that data they may have been collecting for years holds no tangible value at all! The result is a reduction in organisational time and effort.
The results have been largely positive: an ability to create products, services, tools and campaigns for our clients more quickly; the ability to see improvements, however marginal, from a much smaller capital outlay; and a far more proportionate approach to risk, thanks to more agile testing that gets to results quicker (and cheaper).
But there is also a peculiarity in the world of small data sets. The most prevalent use of small data is now in the field of AI, where machines built on huge data sets are themselves learning to use smaller amounts of information to make their decisions - spotting patterns in smaller amounts of raw data rather than applying huge processing power to bigger data sets. How’s that for irony?
In our case, we’re in a slightly different place. Yes, we’re using smaller data sets to make decisions, but we’re also careful to inform our decision-making by recognising and incorporating those human elements that cannot be sorted and measured in quite such an orderly fashion. As William Bruce Cameron so eloquently put it, “Not everything that can be counted counts. Not everything that counts can be counted.” And we find that being conscious of that takes us to the best place of all.