How to approach funnel analysis
Sep 25, 2014
Better event analysis is out there
One of the most popular questions I hear is ‘Where do I start with funnel analysis?’ Collecting every interaction on your website and app can seem like a daunting task, and finding real insight is even harder (the classic funnel graph below is a perfect example). The dirty secret is that while everyone is talking about funnel analysis, few are digging deep and making business-changing discoveries.
Can you take any action with this (I can’t)?
With the proliferation of analytics platforms promising deep insights, businesses think high-quality funnel analysis is as simple as plug and play. While tools like Mixpanel, Kissmetrics, Google Analytics, et al. have built wonderful web front ends on top of robust collection frameworks, true insight has been harder to come by. As a result, when companies choose their event analytics stack, selecting an event tracker and selecting an analytics tool are conflated into a single choice.
But the most successful companies understand that event collection and event analysis are two independent tasks. Tools that are great for one can be weak for the other.
Choosing your tools
Event collection is an increasingly crowded space. Google Analytics, one of the oldest names in the space, is extremely popular for web analytics, but newer players like Mixpanel, Kissmetrics, and Heap continue to grow. Often these tools focus on doing analysis through browser-based discovery, but offer API access to the raw data. Cost can be a concern with some tools, as pricing often scales with the number of events: if your event volume doubles as the business grows, your event collection costs can grow with it.
In addition to the tools above, more open analytics tools like Snowplow Analytics and Segment.io offer access to the raw event data, allowing easier plug and play into more robust analytical frameworks. Finally, many businesses build in-house event collection tools that push data directly into internal databases, which offers maximum flexibility but carries the possibility of ongoing maintenance overhead.
Analysis is where your event collection becomes valuable. Unfortunately (we think), most companies limit their analysis toolkit to the event collection framework they select above (i.e., the Google Analytics front end on web). While these collection tools have some extremely slick front ends (for example, the Mixpanel example above), the flexibility of their analysis functionality can be limiting.
For simple time series and ratios, these collection tools provide more than enough analytical muscle, but once analysts want to dig deeper, the limitations quickly become apparent. By pushing data into big, scalable, performant databases like Redshift, the sky's the limit for analysis. And by joining all this event data with your transactional data sets inside your favorite analytical tool, like Looker, event analyses can be seamlessly stitched into larger business datasets.
Why go through the trouble
There are a few reasons why separating event collection from event analysis is beneficial. Below we highlight some crucial examples.
As companies evolve, the product changes over time, and clean event tracking can be an afterthought to releasing product into the wild. Too many times, I have seen events renamed over time from "pageview" to "page_view" to "view". Each time, historical analysis of pageview trends becomes more difficult. This is where an analytics tool can save the day. Where web services may have trouble stitching data together, your custom in-database analysis can handle that with ease via field transformations and quick data unions. So rather than bugging data engineers to clean historical data, analysis is available through a simple data model that stitches "pageview", "page_view", and "view" events together.
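To make the stitching idea concrete, here is a minimal Python sketch of that kind of field transformation. The three spellings come from the example above; the canonical name `page_view`, the mapping table, the helper functions, and the sample rows are all assumptions made up for illustration, not part of any particular tool. In a warehouse the same mapping would more likely live in a view or a data model.

```python
# Hypothetical mapping from historical event names to one canonical name.
# The three spellings come from the renaming example in the text; choosing
# "page_view" as the canonical form is an assumption for illustration.
CANONICAL_EVENT_NAMES = {
    "pageview": "page_view",
    "page_view": "page_view",
    "view": "page_view",
}


def normalize_event_name(raw_name):
    """Return the canonical name for an event, or the cleaned name if unmapped."""
    cleaned = raw_name.strip().lower()
    return CANONICAL_EVENT_NAMES.get(cleaned, cleaned)


def count_events(events):
    """Count events per canonical name across all historical spellings."""
    counts = {}
    for event in events:
        name = normalize_event_name(event["name"])
        counts[name] = counts.get(name, 0) + 1
    return counts


# Illustrative rows, as they might look after three rounds of renaming.
sample_events = [
    {"user_id": 1, "name": "pageview"},
    {"user_id": 2, "name": "page_view"},
    {"user_id": 3, "name": "view"},
    {"user_id": 3, "name": "checkout"},
]

page_view_total = count_events(sample_events)["page_view"]
```

Because the mapping is applied at query time, the raw history stays untouched, and a fourth rename only needs one more entry in the table.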
When examining event data, the most important analysis many undertake is a basic funnel analysis. What is the dropoff rate at each stage of the funnel? Do different traffic sources experience different issues reaching the checkout page? Do users in non-English locales get stuck in the checkout flow? All of these questions form important starting points for event data analysis.
But just as funnel analysis without purchase data would be crazy, funnel analysis without the rest of your transactional data can hide significant trends in the business. When your disparate data sources aren’t being joined into your event datasets, your analytical firepower is limited. For example, many companies collect event data entirely independently of ad spend data, so while each user’s source is tracked in your advertising databases, your event analysis is blind to all of it. Do Facebook users have more trouble on the signup page than Google users? If that data isn’t tagged in your event set, you are out of luck. Meanwhile, bringing this data together inside your Redshift cluster is a snap.
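As a rough sketch of what that join buys you, the Python below attaches an acquisition source to each user and counts how many users reach each funnel step per source. Everything here is assumed for illustration: the step names, the `user_sources` lookup standing in for an advertising table, and the sample rows. It also ignores event ordering, which a real funnel query would want to respect.

```python
FUNNEL_STEPS = ["landing", "signup", "checkout"]  # hypothetical step names


def funnel_by_source(events, user_sources):
    """Count distinct users reaching each funnel step, per acquisition source.

    A user counts for a step only if they also reached every earlier step.
    Event timestamps are ignored in this simplified sketch.
    """
    steps_by_user = {}
    for event in events:
        steps_by_user.setdefault(event["user_id"], set()).add(event["name"])

    result = {}
    for user_id, steps_seen in steps_by_user.items():
        # The "join": look up the source recorded outside the event stream.
        source = user_sources.get(user_id, "unknown")
        counts = result.setdefault(source, [0] * len(FUNNEL_STEPS))
        for index, step in enumerate(FUNNEL_STEPS):
            if step not in steps_seen:
                break
            counts[index] += 1
    return result


def step_conversion(counts):
    """Share of users at each step who reach the next; None if a step is empty."""
    return [
        (later / earlier) if earlier else None
        for earlier, later in zip(counts, counts[1:])
    ]


# Illustrative data: an ad-source lookup plus raw events.
user_sources = {1: "facebook", 2: "facebook", 3: "google", 4: "google", 5: "facebook"}
events = [
    {"user_id": 1, "name": "landing"},
    {"user_id": 1, "name": "signup"},
    {"user_id": 1, "name": "checkout"},
    {"user_id": 2, "name": "landing"},
    {"user_id": 3, "name": "landing"},
    {"user_id": 3, "name": "signup"},
    {"user_id": 4, "name": "landing"},
    {"user_id": 4, "name": "signup"},
    {"user_id": 4, "name": "checkout"},
    {"user_id": 5, "name": "landing"},
    {"user_id": 5, "name": "signup"},
]

funnels = funnel_by_source(events, user_sources)
```

In this made-up sample, one of the three Facebook users never gets past the landing page while both Google users sign up, which is exactly the kind of difference that stays invisible when source data lives apart from event data.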
This is the most important reason for bringing all your analytics together in one place. Rather than limiting queries to the constraints of analytical web tools (slicing funnels by one dimension at a time, time-bounded analysis, fixed session lengths), bringing data into a robust analytical tool opens up any type of analysis an analyst can dream up.
One-day session conversion rate of Facebook and Google paid customers by state? Highest product page fall-off rates over time? Does the referral funnel behave differently for your most loyal customers? Does seeing a sold-out product screen affect customer lifetime value? Are sellers in your two-sided marketplace better customers?
This flexibility can also be crucial for evaluating product changes. Rather than limiting yourself to a 30-day funnel, where users have a month to ‘complete’ a transaction, most businesses want to analyze users in a much tighter funnel (daily or even hourly), especially when evaluating significant product changes. With data in your own back pocket, real-time event analytics is a cinch for product managers.
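A minimal sketch of a funnel with an adjustable window might look like the following in Python. The event names `visit` and `purchase`, the `at` timestamp field, and the sample rows are assumptions for illustration. The point is only that once the raw events are in your hands, the window becomes a parameter rather than a fixed setting.

```python
from datetime import datetime, timedelta


def conversion_rate(events, start_step, end_step, window):
    """Share of users who hit end_step within `window` of their first start_step."""
    # Find each user's earliest start_step timestamp.
    first_start = {}
    for event in events:
        if event["name"] == start_step:
            user = event["user_id"]
            if user not in first_start or event["at"] < first_start[user]:
                first_start[user] = event["at"]

    # A user converts if any end_step falls inside the window after that start.
    converted = set()
    for event in events:
        started = first_start.get(event["user_id"])
        if event["name"] == end_step and started is not None:
            if timedelta(0) <= event["at"] - started <= window:
                converted.add(event["user_id"])

    return len(converted) / len(first_start) if first_start else 0.0


# Illustrative events; names and timestamps are made up.
sample = [
    {"user_id": 1, "name": "visit", "at": datetime(2014, 9, 1, 9, 0)},
    {"user_id": 1, "name": "purchase", "at": datetime(2014, 9, 1, 9, 30)},
    {"user_id": 2, "name": "visit", "at": datetime(2014, 9, 1, 10, 0)},
    {"user_id": 2, "name": "purchase", "at": datetime(2014, 9, 10, 10, 0)},
    {"user_id": 3, "name": "visit", "at": datetime(2014, 9, 2, 8, 0)},
    {"user_id": 4, "name": "visit", "at": datetime(2014, 9, 1, 12, 0)},
    {"user_id": 4, "name": "purchase", "at": datetime(2014, 9, 1, 20, 0)},
]

hourly = conversion_rate(sample, "visit", "purchase", timedelta(hours=1))
daily = conversion_rate(sample, "visit", "purchase", timedelta(days=1))
monthly = conversion_rate(sample, "visit", "purchase", timedelta(days=30))
```

Running the same data through an hourly, a daily, and a 30-day window gives three different conversion rates, and comparing them before and after a release is one way to see whether a product change sped up the path to purchase.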
Want to learn more?
Trying to set up your own event collection machine? Reach out and let us help.