At the launch of a service, you know which direction to take, which features to prioritize, and whether you’re on the right track. As users and features grow, you need more objective information to build your roadmap. For us, this meant building our own analytics system.

Before building a custom solution, we looked for one available as a service. There are multiple big players in this field: Google Analytics, Mixpanel and so on. They work well for a time, but we found some key issues. The main one is that, in the end, everyone stops using those tools. For us, this happened because we couldn’t send all the data we had, either because it was siloed in another tool (Stripe and Intercom) or because the tool didn’t work well with the amount of data we intended to send.

Here is an overview of how we built our analytics system and what we learned along the way. We’ll be looking at the entire system: how events coming from multiple sources are then stored in a single database, and finally get analyzed.

We gather data from multiple sources, store it in a centralized database, and finally analyze it.

Our goal was to build a complete analytics system that lets us make complex queries and quickly create and use dashboards, while requiring minimal maintenance on our side. The last part is vital to us: as a four-person team, we can’t waste time maintaining this tool. Some of the questions we expected to answer were:

  • On what type of user should we focus our efforts to increase conversion?
  • Is this feature used enough? Are frequent users more likely to pay?
  • Are we increasing our application’s usage per user over time?

Let’s dive in!

Collecting data from various sources

Collecting events came down to deciding what we wanted to record and then building the tracking.

1. Deciding what to track

We identified three types of data we need to track.

  • Our users and their attributes (their company, when they signed up, etc.)
  • Major events on the platform (e.g., a user invites a coworker or downloads an important file)
  • The requests made per day in our application and where they got spent

You’ll notice we separated the requests and the events. This distinction is mandatory because requests are very frequent on Email Hunter. Some users end up making thousands every day and logging each of them would only create noise. Our solution was to aggregate the usage into a daily item.
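To make the daily aggregation concrete, here is a minimal sketch of the idea, not our production code. The record fields (`user_id`, `source`, `at`) and the sample data are assumptions made up for the illustration.

```python
from collections import Counter
from datetime import datetime, timezone


def aggregate_daily_usage(requests):
    """Collapse individual requests into one count per (user, day, source)."""
    counts = Counter()
    for req in requests:
        # Bucket each request by its UTC calendar day.
        day = req["at"].astimezone(timezone.utc).date().isoformat()
        counts[(req["user_id"], day, req["source"])] += 1
    return [
        {"user_id": user_id, "day": day, "source": source, "requests": n}
        for (user_id, day, source), n in sorted(counts.items())
    ]


# Hypothetical raw request log: four requests from two users.
raw_requests = [
    {"user_id": "u1", "source": "api",
     "at": datetime(2016, 3, 1, 9, 0, tzinfo=timezone.utc)},
    {"user_id": "u1", "source": "api",
     "at": datetime(2016, 3, 1, 17, 30, tzinfo=timezone.utc)},
    {"user_id": "u1", "source": "chrome_extension",
     "at": datetime(2016, 3, 1, 18, 0, tzinfo=timezone.utc)},
    {"user_id": "u2", "source": "api",
     "at": datetime(2016, 3, 2, 8, 15, tzinfo=timezone.utc)},
]

daily_usage = aggregate_daily_usage(raw_requests)
for item in daily_usage:
    print(item)
```

Each daily item keeps the source of the requests, so we can still tell where they got spent without storing every single one.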

Requests are perfect for understanding the overall usage of our service. Every user makes requests, whether through our Chrome Extension, the API, or elsewhere. Requests therefore provide a unified way of knowing how users are responding to our service, even if the way they use the data is completely different.

2. Building it

Part of our goal was to avoid spending time maintaining this system. Relying on a third party seemed like an obvious choice. We decided to go with Segment. Their service requires minimal work; they have plugins for most programming languages and platforms. You only need to add the tracking to your application!
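As a rough idea of what "adding the tracking" involves, Segment's spec describes an identify call for users and their attributes and a track call for events. The sketch below only builds payloads shaped like those calls using the standard library; it does not send anything, and in a real application Segment's own library would take care of that. The helper names, the event name, and the traits are hypothetical.

```python
import json
from datetime import datetime, timezone


def identify_payload(user_id, traits, now=None):
    """Describe a user and their attributes (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    return {
        "type": "identify",
        "userId": user_id,
        "traits": traits,
        "timestamp": now.isoformat(),
    }


def track_payload(user_id, event, properties=None, now=None):
    """Describe a major event performed by a user (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    return {
        "type": "track",
        "userId": user_id,
        "event": event,
        "properties": properties or {},
        "timestamp": now.isoformat(),
    }


# Fixed timestamp and made-up values, only to keep the example reproducible.
FIXED_NOW = datetime(2016, 3, 1, 12, 0, tzinfo=timezone.utc)
user = identify_payload(
    "u1", {"company": "Acme", "signed_up_at": "2016-02-15"}, now=FIXED_NOW
)
event = track_payload(
    "u1", "Coworker Invited", {"invited_email_domain": "acme.test"}, now=FIXED_NOW
)
print(json.dumps(event, indent=2))
```

Keeping event names and properties consistent from the start matters more than the exact code, since those names become the tables and columns you query later.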

Storing the events

Daily, Segment syncs events and users’ info to your warehouse

A lesson we learned the hard way was not to do it ourselves. Spinning up a database is quite easy, but taking care of maintenance, security, and updates isn’t worth it. You’re creating value somewhere else, so spending time here would be a waste.

A standard solution is to use Amazon’s Redshift, a fully managed, simple, and cost-effective data warehouse. What they offer is ideal for this use case. And if you’re already using Segment, they can fully manage your database for a very reasonable cost. Every day, they load the day’s events. You’re then ready to start querying them!

Analyzing the stored data

Finally, the most interesting part: where we get to make better decisions for our business! Our approach made the analysis straightforward. We can use SQL, and Redshift is already compatible with most analytics software. The great thing is that since you’re storing your own data, you can quickly switch to another provider. For our needs, we decided to use Mode Analytics. Their service is easy to use, and we love how simple it is to get insightful reports.

1. Reports by feature

Those reports always answer the same types of questions:

  • How much is the feature being used?
  • Is it growing?
  • Is it being utilized by a majority of users or only a fraction? How do paying users consider it?

We make sure the date is on the x-axis to measure our progress over time. We also prefer graphs that won’t fluctuate with our overall growth. For example, an increase in our number of active users will lead to more API requests, but this doesn’t mean we improved our API. Instead, we want to measure the change in the number of requests per user.
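Here is a small sketch of that normalization. It uses an in-memory SQLite database as a stand-in for the warehouse, and the `daily_usage` table, its columns, and the numbers are assumptions for the example rather than our actual schema.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_usage (user_id TEXT, day TEXT, requests INTEGER)"
)
conn.executemany(
    "INSERT INTO daily_usage VALUES (?, ?, ?)",
    [
        ("u1", "2016-03-01", 10),
        ("u2", "2016-03-01", 30),
        ("u1", "2016-03-02", 20),
        ("u2", "2016-03-02", 20),
        ("u3", "2016-03-02", 20),
        ("u4", "2016-03-02", 20),
    ],
)

# Total requests grow with the user base; requests per active user do not.
query = """
SELECT day,
       SUM(requests)                                 AS total_requests,
       COUNT(DISTINCT user_id)                       AS active_users,
       1.0 * SUM(requests) / COUNT(DISTINCT user_id) AS requests_per_user
FROM daily_usage
GROUP BY day
ORDER BY day
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In this made-up data, total requests double from one day to the next while requests per user stay flat, which is exactly the situation a raw total would hide.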

2. Overall engagement

Measuring the overall engagement on the service is both an excellent way to know if we’re heading in the right direction and to make sure the user experience is improving. We can measure the effect of marketing and our support’s responsiveness this way.

One of the reports we use to track the overall service usage

As stated earlier, requests are the unified way of knowing how users react to our service. To measure overall engagement, we track request usage and cross the data with multiple other variables. The goal is to get a global understanding of how users are using the service. We can then better allocate resources depending on where our most valuable users are spending their time.
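As an illustration of crossing request usage with another variable, the sketch below joins daily usage with a user attribute and groups by both. It again uses in-memory SQLite as a stand-in, and the `users` and `daily_usage` tables, the `plan` and `source` columns, and the data are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (user_id TEXT PRIMARY KEY, plan TEXT);
    CREATE TABLE daily_usage (
        user_id TEXT, day TEXT, source TEXT, requests INTEGER
    );
    """
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("u1", "paid"), ("u2", "free"), ("u3", "free")],
)
conn.executemany(
    "INSERT INTO daily_usage VALUES (?, ?, ?, ?)",
    [
        ("u1", "2016-03-01", "api", 50),
        ("u1", "2016-03-01", "chrome_extension", 10),
        ("u2", "2016-03-01", "chrome_extension", 5),
        ("u3", "2016-03-01", "chrome_extension", 15),
        ("u3", "2016-03-01", "api", 2),
    ],
)

# Where do paying and free users spend their requests?
rows = conn.execute(
    """
    SELECT u.plan, d.source, SUM(d.requests) AS requests
    FROM daily_usage AS d
    JOIN users AS u ON u.user_id = d.user_id
    GROUP BY u.plan, d.source
    ORDER BY u.plan, d.source
    """
).fetchall()
for row in rows:
    print(row)
```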

3. Answering specific questions

Every few days, we have to make new decisions: should this feature be killed? Is this plan viable for us? And so on. Having the data at hand and being able to query it to get an answer in a matter of minutes is crucial. Not being able to get data quickly will lead the team to quit using the analytics.

When done right, it’s amazing how debates change. Reports are objective. They shift the discussion from guessing what’s happening to deciding what should be done to reach our goals.

Conclusion

Since deploying our analytics, we’ve been amazed at how insightful they can be. At the same time, they require little extra effort: we only have to add events. No doubt we’ll keep improving how we work in the future. We’ll describe significant changes to our configuration if it can help others.

We hope this article gave you some ideas. We would love to hear your thoughts if you think we’ve missed something! 😉


Antoine Finkelstein