Yali is co-founder and analytics lead at Snowplow Analytics, a web event data collection platform.
Data makes a difference. You don’t have to look far to find examples of companies that have used specific kinds of data to dominate their industries. In retail, transaction data has been transformative: In 1995, Tesco in the UK used transaction data collected through its Clubcard program to better serve and monetize its customer base, helping it grow into the largest supermarket in the UK. In manufacturing, data on material inputs, intermediate products, and process times and costs enabled the lean manufacturing techniques that Toyota pioneered and used to establish itself as a leader in the automotive industry.
If expertise in the use of transaction data helped distinguish the winners among yesterday’s companies, expertise in the use of event data will distinguish the winners among tomorrow’s companies. So what is event data, and why am I so confident that it will be so valuable?
What is event data?
Event data is simply data that describes what has happened. To stick with our retail example, a man might enter a store, browse a couple of aisles, pick up a chocolate bar, put it down, pick up a banana, proceed to the checkout, pay for it, and leave. Outside the store, he might peel the banana and then eat it.
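To make this concrete, the shopper's visit could be written down as a time-ordered list of records. The Python sketch below is only an illustration: the field names (timestamp, actor, action, target), the one-minute spacing, and the start time are all assumptions made for the example, not a Snowplow schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """One thing that happened: who did what, to what, and when (hypothetical fields)."""
    timestamp: datetime
    actor: str
    action: str
    target: Optional[str] = None


def build_store_visit(start: datetime) -> List[Event]:
    """Return the store visit described above as a time-ordered list of events."""
    steps = [
        ("enter", "store"),
        ("browse", "aisle"),
        ("browse", "aisle"),
        ("pick_up", "chocolate bar"),
        ("put_down", "chocolate bar"),
        ("pick_up", "banana"),
        ("proceed_to", "checkout"),
        ("pay", "banana"),
        ("leave", "store"),
    ]
    # Space the events one minute apart purely for illustration.
    return [
        Event(start + timedelta(minutes=i), "shopper", action, target)
        for i, (action, target) in enumerate(steps)
    ]


visit = build_store_visit(datetime(2014, 1, 1, 12, 0))
for event in visit:
    print(event.timestamp.time(), event.action, event.target)
```

Note that a transaction log would keep only the "pay" record; everything else in the list is information that only event data captures.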
Retailers don’t typically collect the above data in-store. But they do from their websites and mobile apps. Why has event data become interesting? Anytime a company engages with its customers and prospective customers via digital platforms, it can collect a highly detailed stream of event data describing those interactions.
Event data can be generated from any type of connected device.
More and more devices are “connected.” Tomorrow, my smart fridge might record my loading it with food items, and then later, monitor my removing those items so I can cook and eat them. My connected frying pan might identify what combination of ingredients I’m cooking, when I added each to the pan, and how long I cooked them for. My smart garbage can might monitor what I throw away afterwards. And my connected wristband might tell me how many of the calories I ate I have since burnt, and keep tabs on my heart rate, both while I ate and later when I exercised.
I said earlier that retailers don’t typically collect data in-store, but that may not be true for much longer. QR codes, iBeacons, mobile phones, and CCTV are just some of the technologies that might change that. The distinction between the physical and digital worlds has been eroding for some time, opening up the possibility of collecting event data from everywhere.
Why is event data valuable?
There are straightforward reasons why even the data in the trivial examples above can be useful. A fridge that knows what is stored in it (and when it was added) might be able to optimize its temperature setting to increase energy efficiency. It might warn me when items are likely to go bad. It might alert me when I’m about to run out of an item I eat regularly—or even better, order it for me, so I don’t have to.
But there are more general, higher-value uses this data can be applied to.
Event data tells us how people engage with specific products and services.
Often, people use products and services in surprising ways. (I can’t be the only person who emails himself high-priority tasks to complete.) Event data can tell us exactly how people use a service, and it can be a great indicator of how good that experience was for the person involved. (It actually makes possible lean manufacturing approaches in a whole new set of environments.) If your company delivers one or more of its services via a digital channel, that data is invaluable for spotting opportunities to improve your existing services or to create new services.
Event data tells us a lot about the people involved in the events.
Transaction data tells us what people buy. But buying things is just one of potentially millions of different activities that I as an individual can engage in—one that I spend a small fraction of my time doing. How I spend the rest of my time says a lot more about me, from the books I read, the software I write, and the people I work with, to the ways I spend my leisure time. Any company that understands this data is in a much better position to help me—and therefore to offer me products and services that I will want to buy.
Event data is actionable.
We don’t simply sit back and collect streams of event data from the different digital channels. We can use that data to intervene in those customer journeys in real time, to make them better:
How will companies that successfully use event data compare with those that do not?
Companies successfully using event data will:
What will companies need to successfully use event data?
First, companies will need a low-cost way to collect, validate, and warehouse event data from every digital platform where they offer a service or engage with a user. That data needs to be consolidated across all their different channels, so the customer view is as close to complete as possible.
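As a rough illustration of the validate-and-consolidate step, the Python sketch below drops events that are missing required fields and merges the remaining events from several channels into one time-ordered stream. The required fields, the dictionary layout, and the channel names are assumptions made for the example; this is not how Snowplow implements validation.

```python
import heapq

# Hypothetical minimum set of fields every event must carry.
REQUIRED_FIELDS = ("user_id", "timestamp", "action")


def is_valid(event):
    """An event is valid if every required field is present and not None."""
    return all(event.get(field) is not None for field in REQUIRED_FIELDS)


def consolidate(*channels):
    """Merge valid events from all channels into a single time-ordered list."""
    valid_streams = [
        sorted((e for e in channel if is_valid(e)), key=lambda e: e["timestamp"])
        for channel in channels
    ]
    # heapq.merge interleaves the already-sorted streams by timestamp.
    return list(heapq.merge(*valid_streams, key=lambda e: e["timestamp"]))


web = [
    {"user_id": "u1", "timestamp": 1, "action": "page_view"},
    {"user_id": "u1", "timestamp": 4, "action": "add_to_basket"},
]
mobile = [
    {"user_id": "u1", "timestamp": 2, "action": "app_open"},
    {"user_id": None, "timestamp": 3, "action": "tap"},  # invalid: no user
]

unified = consolidate(web, mobile)
print([(e["timestamp"], e["action"]) for e in unified])
```

In practice, rejected events would more likely be routed somewhere for inspection than silently dropped; the sketch discards them only to stay short.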
Our vision for the Snowplow platform is to enable companies to collect, validate, and warehouse event-level data from every digital platform.
Second, companies need a way to use that data to drive insights that benefit both them and their users. Getting value from event data is much harder than from transaction data. For example, it’s relatively easy to infer that an individual buying diapers probably has a young child. In contrast, it’s only by spotting patterns in entire sequences of time-ordered events that provisional hypotheses can be formed about those users—hypotheses that can be tested using other event data or by intervening in the customer journey and tracking the impact of that intervention.
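A small Python sketch of what spotting a pattern in a time-ordered sequence can look like: it scans a shopper's events for items that were picked up and put back but never paid for, which might suggest a hypothesis about price or placement worth testing. The action names and the (action, item) tuple layout are assumptions made for this example.

```python
def abandoned_items(events):
    """Return items picked up and put back but never paid for, in first-seen order.

    `events` is a time-ordered iterable of (action, item) tuples.
    """
    held = set()       # items currently in the shopper's hands
    put_back = []      # items returned to the shelf, in order of first return
    paid = set()       # items that were eventually purchased
    for action, item in events:
        if action == "pick_up":
            held.add(item)
        elif action == "put_down" and item in held:
            held.discard(item)
            if item not in put_back:
                put_back.append(item)
        elif action == "pay":
            paid.add(item)
    return [item for item in put_back if item not in paid]


journey = [
    ("pick_up", "chocolate bar"),
    ("put_down", "chocolate bar"),
    ("pick_up", "banana"),
    ("pay", "banana"),
]
print(abandoned_items(journey))
```

The result depends on the order of events, not just their counts: a "put_down" only registers if the same item was picked up earlier in the sequence.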
We see huge scope for innovation in tools to help companies use event data. One area that has received a lot of attention of late is machine learning, whose techniques hold the promise of helping us spot patterns in the mass of data. However, before we can use those tools to their full potential, we need to figure out how to represent that event data in a format suitable for processing by these types of algorithms in the first place.
If it is not clear how best to represent event stream data for algorithmic processing, it is also not clear how best to visualize that data so humans can spot patterns in it. There is not a single answer to either of these questions: there are countless ways to represent data to enable different types of insight to be drawn from it. The whole field is open to a wealth of new approaches, each of which will enable us to derive new types of insight.
At the moment, some of the most effective techniques for understanding event data involve exploring it at a macro level to identify trends, drilling down to spot particular patterns, and then validating those patterns against the wider data set. The combination of Looker’s capabilities for exploring data and the agile way in which patterns (in the form of dimensions) can be quickly added to the metadata model and applied across the entire data set makes Looker uniquely suited to deriving insight from event data.
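One of many possible representations, sketched in Python under stated assumptions: turn a variable-length sequence of actions into a fixed-length numeric vector by counting each action and each ordered pair of consecutive actions. The action vocabulary is hypothetical, and this particular encoding is just one simple option, not a recommended or standard one.

```python
from collections import Counter


def featurize(actions, vocabulary):
    """Encode a time-ordered list of actions as a fixed-length list of counts.

    The first len(vocabulary) entries count each action; the remaining
    len(vocabulary) ** 2 entries count each ordered pair of consecutive actions.
    """
    counts = Counter(actions)
    transitions = Counter(zip(actions, actions[1:]))
    features = [counts[a] for a in vocabulary]
    features += [transitions[(a, b)] for a in vocabulary for b in vocabulary]
    return features


vocabulary = ["pick_up", "put_down", "pay"]  # hypothetical action names
actions = ["pick_up", "put_down", "pick_up", "pay"]
vector = featurize(actions, vocabulary)
print(vector)
```

The pair counts keep some of the ordering that plain action counts throw away, at the cost of a vector that grows with the square of the vocabulary size.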
Third, it is essential that the same data used for driving insights is also used for actioning those insights. At Snowplow, this philosophy is driving our investment in Amazon Kinesis: Kinesis will enable companies to analyze and act on a unified event stream (i.e., a single source of truth) in close to real time. That same thought is driving the development of Looker from a data exploration and reporting engine into an application server that can push the data not only to people but also to applications and processes that can action the data.
It's an exciting time to be working with event data. We can’t wait to see what our clients do with Snowplow, Looker, and their event data.