In this post, we’re talking to Melisa Kleen, the Head of Business Intelligence and Data Science at Blinkist. Blinkist distills non-fiction books into 15-minute summaries of key insights, aka “Blinks”. We cover how Melisa’s team has built a personalization engine that helps subscribers find what truly speaks to them among thousands of titles and keeps the app interesting for long-term subscribers.
Blinkist distills non-fiction books into 15-minute summaries of key ideas, so-called “Blinks”, that can be read or listened to. Blinkist offers over 3,000 books via their mobile apps, website and Alexa Skill, for over 12 million readers worldwide. Subscriptions come with a free trial and are available as monthly or annual plans, starting at $8 / month.
Blinkist has two streams of data work. The first stream is classic, descriptive reporting, with KPIs and dashboards. The second stream is a set of data products aimed at improving the in-app customer experience.
One of those data products is the Blinkist personalization engine, which provides tailored content recommendations. Personalization, by picking the right content, at the right time, for the right customer, drives up user engagement, consumption and retention for Blinkist.
“People who discover personalized content retain better as they get more value out of the product,” says Melisa Kleen, the Head of Business Intelligence and Data Science at Blinkist.
Who is the end-user of this data product?
The recommendation engine runs on event data collected from the Blinkist apps. Personalization happens at different moments in the customer experience, with three distinct ways people consume recommendations.
- Marketing systems: Blinkist uses Facebook Ads (and other ad networks like Taboola) for customer acquisition. With hundreds of different ad types, ad serving is optimized for different audiences, based on historical conversion and engagement data.
- In-app content discovery: The Blinkist app surfaces recommendations in a “For You” section.
- CRM: A recent project sends a daily personalized push notification with a new recommendation, orchestrated by Blinkist’s customer relationship management system.
More generally, Melisa’s team is infusing data into all thinking and working at Blinkist. “Today, all product managers need to think, ‘how can we leverage data’? CRM managers need to think, ‘how can we leverage the data’? Anyone who is working with the customer needs to think about what we already know about the customer and how we can use that knowledge to make the experience better. Data is the key ingredient.”
What Business Problem Does this Data Product Solve?
As a content company that targets busy individuals who want to fit more learning into their lives, Blinkist needs to be spot on with their recommendations. A personalized experience is therefore important, as it complements Blinkist’s value offering.
Blinkist started with recommendations in the “For You” section in the app, with a simple collaborative filtering algorithm. That was a step forward in terms of uplift compared to a previous black box model implemented with Apache PredictionIO, an open source machine learning server.
Since the launch of the first collaborative filtering algorithm, there have been several improvements to the model, but that’s not the whole story. “With collaborative filtering, you always highlight the popular books. That doesn’t enable the deeper learning that a group of customers wants. They want to go deeper into certain topics, and not stay on the generalist level across many different things”, says Melisa.
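To make the idea concrete, here is a minimal sketch of item-based collaborative filtering. It is not Blinkist’s model: the read history, the book ids, and the functions `item_similarities` and `recommend` are all hypothetical, and cosine similarity over co-reads is just one common way to do it.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical read history: user id -> set of book ids the user finished.
reads = {
    "u1": {"habits", "sapiens", "deep_work"},
    "u2": {"habits", "sapiens"},
    "u3": {"habits", "deep_work"},
    "u4": {"sapiens", "stoicism"},
}

def item_similarities(reads):
    """Cosine similarity between books, based on the users who read both."""
    readers = defaultdict(set)
    for user, books in reads.items():
        for book in books:
            readers[book].add(user)
    sims = defaultdict(dict)
    for a in readers:
        for b in readers:
            if a != b:
                overlap = len(readers[a] & readers[b])
                if overlap:
                    sims[a][b] = overlap / sqrt(len(readers[a]) * len(readers[b]))
    return sims

def recommend(user, reads, top_n=3):
    """Score unread books by their summed similarity to the user's read books."""
    sims = item_similarities(reads)
    seen = reads[user]
    scores = defaultdict(float)
    for book in seen:
        for other, sim in sims[book].items():
            if other not in seen:
                scores[other] += sim
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda b: (-scores[b], b))[:top_n]

print(recommend("u4", reads))
```

In this toy data, the user who read the niche title is steered toward the most widely read book first, which is the popularity bias Melisa describes.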
That’s where the content-based recommendations came in, and Melisa’s team is currently testing many different models and their impact on user engagement and retention. “We want to increase engagement early on and reduce drop-offs from frustration points. We realized one major thing leading to drop-offs is when users are not engaging with the content that we offer, but rather turn to search. So we decided to improve the content that we offer to them, and also the designs, the whole experience, making it easier to discover content.”
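A content-based recommender, by contrast, matches books to a reader by what the books are about rather than by who else read them. The sketch below is only an illustration under assumptions: the topic tags, the catalog, and the function `recommend_by_content` are hypothetical, and Jaccard overlap of tags stands in for whatever features the real models use.

```python
# Hypothetical catalog: book id -> set of topic tags.
catalog = {
    "stoicism": {"philosophy", "mindfulness"},
    "meditations": {"philosophy", "history"},
    "habits": {"productivity", "psychology"},
    "deep_work": {"productivity", "career"},
}

def jaccard(a, b):
    """Share of tags two tag sets have in common (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend_by_content(read_books, catalog, top_n=2):
    """Rank unread books by tag overlap with everything the user has read."""
    # The user profile is the union of tags across the books already read.
    profile = set().union(*(catalog[b] for b in read_books))
    scores = {
        book: jaccard(profile, tags)
        for book, tags in catalog.items()
        if book not in read_books
    }
    ranked = sorted(scores, key=lambda b: (-scores[b], b))
    # Books with no tag overlap are dropped rather than padded in.
    return [b for b in ranked if scores[b] > 0][:top_n]

print(recommend_by_content({"stoicism"}, catalog))
```

Because the score depends only on tags, a rarely read title can rank first as long as it matches the reader’s topics.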
The interesting part about Blinkist is that the 15-min summaries or “Blinks” offer a new path to learning, which differentiates Blinkist from much larger players like Audible.
“Our recommendations, if they fit the customer, unlock books that may have less than a hundred reads. You would never, ever buy that full book. But if it’s interesting, on Blinkist you’ll first get the trailer, and then the 15-min summary of key insights. You learn something new. So in that sense, personalization has a higher role. Personalization turns Blinkist from a book seller into a life-long learning platform.”
What Are the Data Sources & Tech Stack?
Unlike many other companies, Blinkist uses their own in-app event tracking, called “Alchemist”, rather than third-party SDKs in their app.
“We wanted to own that data,” says Melisa. “We want to keep everything at the event level, because that tells us what the customer is doing, regardless of what’s happening in the background.” Blinkist has open-sourced part of “Alchemist” (GitHub repo).
Here are the different stages of the transformation process.
- Data sources: With their own event bus, data comes from the Blinkist apps (the web app, the iOS and Android apps) and from the Alexa Skill. Events also come back from marketing systems (Facebook) and marketing automation / CRM systems (Braze).
- Data ingest: Alchemist consists of AWS Kinesis coupled with Lambda functions, and also CloudWatch for real-time monitoring. Alchemist processes, transforms, enriches and validates all events and writes them into Amazon Redshift via S3.
- Data warehouse: Amazon Redshift, dc2.large nodes. Blinkist also uses Redshift Spectrum, for joining data in S3 with data in Redshift.
- Data modeling / predictions: Blinkist uses Matillion, an “ELT” product, to transform within their Redshift data warehouse. For the recommendation part, Melisa uses Amazon Sagemaker for machine learning. “We push the final calculations from Redshift into S3. We have multiple Sagemaker notebooks that train on that data and give recommendations.”
- Data serving: The recommendations from Sagemaker are written to DynamoDB and exposed via an API.
- Data visualization: For the business intelligence part, Blinkist uses Periscope Data and Amplitude. Both connect to the final calculations in S3 / Redshift. “We use Periscope Data for general and marketing reporting, for its flexibility. User engagement data however goes into Amplitude, to analyze the product analytics side more in depth.”
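As an illustration of the ingest stage, here is a minimal sketch of a Lambda handler that decodes, validates and enriches events arriving from a Kinesis stream. Kinesis delivers each record’s payload base64-encoded under `Records[].kinesis.data`; everything else is an assumption for the example, including the required fields, the `platform` default and the return value. The real Alchemist writes to Redshift via S3, which this sketch leaves out.

```python
import base64
import json

# Hypothetical minimal event schema; the real Alchemist schema is not public here.
REQUIRED_FIELDS = {"user_id", "event_name", "timestamp"}

def validate(item):
    """An event is valid if it is a JSON object with all required fields."""
    return isinstance(item, dict) and REQUIRED_FIELDS.issubset(item)

def handler(event, context):
    """Decode Kinesis records, keep valid events, and count rejected ones."""
    valid, rejected = [], 0
    for record in event["Records"]:
        payload = base64.b64decode(record["kinesis"]["data"])
        try:
            item = json.loads(payload)
        except ValueError:  # covers invalid JSON and invalid UTF-8
            rejected += 1
            continue
        if validate(item):
            # Enrichment step (assumed): default the platform when it is missing.
            item.setdefault("platform", "unknown")
            valid.append(item)
        else:
            rejected += 1
    # A real pipeline would write `valid` to S3 here; this sketch just returns it.
    return {"valid": valid, "rejected": rejected}
```

Counting rejected events instead of raising keeps one malformed record from failing the whole batch, and the count is the kind of metric that could feed monitoring.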
The output of the data pipeline from the warehouse is a set of fact tables, which Melisa describes as a “books-in-blinks interaction funnel”. In those fact tables, “we’ve distilled all the information about customer interactions, day by day, on different platforms.”
Melisa delivers the recommendation and transformation work with a team of 6 people. In addition, a data engineering team of 3 people focuses on the data infrastructure, including Alchemist. At the time of this post, Blinkist has 160 employees, so with a total of 6 + 3 = 9 people, roughly 5.6% of their headcount is dedicated to the core data platform.
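The shape of such a fact table can be sketched as a day-by-day rollup of raw events. This is an assumption-laden illustration, not Blinkist’s model: the event names, field names and the `daily_facts` function are hypothetical, and in their stack this step would run as a transformation inside Redshift rather than in Python.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical raw events; field and event names are illustrative only.
events = [
    {"user_id": "u1", "platform": "ios", "event_name": "blink_opened", "timestamp": 1700000000},
    {"user_id": "u1", "platform": "ios", "event_name": "blink_opened", "timestamp": 1700003600},
    {"user_id": "u1", "platform": "ios", "event_name": "blink_finished", "timestamp": 1700003700},
    {"user_id": "u2", "platform": "web", "event_name": "blink_opened", "timestamp": 1700086400},
]

def daily_facts(events):
    """Count interactions per (UTC day, user, platform, event name)."""
    facts = Counter()
    for e in events:
        day = datetime.fromtimestamp(e["timestamp"], tz=timezone.utc).date().isoformat()
        facts[(day, e["user_id"], e["platform"], e["event_name"])] += 1
    return facts

facts = daily_facts(events)
for key, count in sorted(facts.items()):
    print(key, count)
```

Each key is one row of the fact table, so downstream reporting and model training can read daily counts without touching the raw event stream.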
Best practices and lessons learned
Melisa has a few lessons to offer from her work over the past few years. If she were doing it all over again, “I think I would be a little bit more disruptive and aggressive to launch data science within the company. That applies to all companies who want to switch from traditional BI to data science.”
The key is to increase the level of acceptance for data science and data products within a company. Melisa has three key recommendations.
- Make every team member a data science evangelist, and embed them into the organization.
- Be mindful and listen to your audience and your colleagues—machine learning can do nothing if the people building the models don’t understand and don’t work together with the business.
- Build small samples and more proofs of concept: “MVP” (“minimum viable product”) versions of big projects that show just the tip of the iceberg, saying “Hey, this can help. This is just an example of what we can do.”
“In every company, no matter how data aware they are, they are used to descriptive reports. They are used to seeing what they’ve done.” But that’s a different level than “estimating what to expect, predicting what’s going to happen, and recommending what to do.” And so “showing people what we can do with data, how they can incorporate it into their work and show real life examples”, is what distinguishes successful data teams from the rest.