The art of Storytelling and Data Science


Data Science and storytelling are inseparable. And the bridge between the two is data.

I touched on storytelling in my article about Business Analyst/Data Scientist learnings, but the topic deserves a fuller treatment of its own.

Imagine you are a data science executive in an organization. You would be responsible for building machine learning models and analyzing data using statistical techniques to achieve business goals. There will come a time when you will have to present your work to the stakeholders of your project. In a corporate setup, a project has a mix of technical and business stakeholders. Explaining everything to this mixed audience can get tricky. For example –

  • How do you explain the complex model building process to a non-technical stakeholder?
  • Is it possible to translate model parameters to business actionables and recommendations?
  • Is there some way to address this gap?

Storytelling to the rescue!

Not sure how? Let me explain!

Data science and machine learning are technical in nature. They involve complex mathematical computations and methodologies, along with their implementation in R or Python. As a result, machine learning models are often hard to interpret for someone without a relevant background. Such stakeholders might not follow model complexity, computation time, or other model parameters. They are more likely to be driven by business goals: how the model affects the company’s top line, or how well it shows up in the numbers at a corporate level.

In all such situations, the data scientist needs to assume the role of a storyteller. A data scientist has to make sure that the complex mathematical model yields a useful takeaway for every stakeholder involved in the project, and that each of them can understand it irrespective of their background in machine learning. Hence, the art of storytelling becomes paramount. Storytelling for data science professionals has the following advantages:

  1. Helps the stakeholders better understand the value addition of the model
  2. Increases retention: humans are trained to remember stories better
  3. Creates a series of logically connected business points leading to the ultimate outcome
  4. Invokes participation, generates interest

I will be honest: I have been in similar situations, where I had to present a solution to a set of stakeholders with varying levels of expertise in machine learning and data science algorithms. After experimenting with different storytelling formats, I would like to share the one I prefer most.

How can I improve my storytelling skills?

I like to call my method of storytelling the “What-Why-How” format.

To understand the framework better, consider the following scenario – 

John has spent over a month working on a project to predict employee attrition in the organization. Today is D-day. He is in the meeting room with all the stakeholders involved in the project. After exchanging pleasantries, he starts the presentation:

“Good afternoon everyone. The data provided to us exhibits non-linear behavior. Hence, we have built a Support Vector Machine model with an RBF kernel in Python. After a grid search over the hyperparameters, we found that the model performs best with C = 1 and gamma = 0.1. The accuracy of the model is 95%.”

There is silence for five seconds, and some of the stakeholders look confused.

But surely a model with 95% accuracy should have made everyone happy.

What went wrong?

Well, John overlooked the fact that not all the stakeholders in the meeting are technical. Some come from the organization’s finance and human resources departments. He needed to translate his model into a language that they could understand.
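As a rough illustration of what such a translation could look like, here is a minimal Python sketch that turns the counts behind an accuracy figure into plain sentences. The function name `describe_for_stakeholders` and all four counts are hypothetical; the counts were picked only so that the accuracy works out to John's 95%, and they are not results from any real model.

```python
def describe_for_stakeholders(tp, fp, fn, tn):
    """Turn confusion-matrix counts into plain-language sentences.

    tp: employees flagged as likely to leave who did leave
    fp: employees flagged as likely to leave who stayed
    fn: employees not flagged who left
    tn: employees not flagged who stayed
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    leavers = tp + fn   # everyone who actually left
    flagged = tp + fp   # everyone the model flagged
    return [
        f"The model was right about {accuracy:.0%} of {total} employees.",
        f"Of the {leavers} employees who left, it flagged {tp} in advance.",
        f"Of the {flagged} employees it flagged, {tp} actually left.",
    ]


# Hypothetical counts, chosen only so that accuracy comes out at 95%.
summary = describe_for_stakeholders(tp=120, fp=20, fn=30, tn=830)
for sentence in summary:
    print(sentence)
```

The second and third sentences are usually the ones a human resources stakeholder can act on, because they say how many leavers the model would have caught and how often a flag turns out to be right.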

The “What-Why-How” framework

The What-Why-How framework helps you break down the business problem into 3 parts.

  • What was the business problem?
  • Why did we want to address that business problem?
  • How did we solve the problem (followed by its impact and recommendations)?

Once we translate the project into this framework, we facilitate:

  • A better understanding of the business problem
  • Its impact
  • Efficacy of the suggested solution

Consequently, every stakeholder leaves with an actionable takeaway. (Each machine learning model should always be followed by a set of recommendations for the stakeholders.)

For better understanding, let us try and break the attrition problem into the What-Why-How framework.

What is the problem?
  • High employee attrition rate in the organization

Why address it?
  • Hiring a new employee is a high investment for the organization
  • High attrition lowers employee morale

How can we solve it?
  • Use machine learning models to estimate which factors contribute the most to attrition
  • Help the human resources department make informed, data-driven decisions based on employee feedback

Some of the insights that John could have presented to the stakeholders are –

  • Employees that have stayed in the organization for more than 2 years are less likely to leave
  • High average overtime leads to high attrition
  • Employees who left did not receive any promotion in the last 2 years

When the model's findings are presented as insights like these, the human resources stakeholders have something concrete to act on. Rather than telling them that the model accuracy was 95% with a given set of hyperparameters, John could have told them how the model will impact the organization as a whole. That would have resonated with the stakeholders, because they could have related to it.
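An insight such as the overtime one often comes down to comparing attrition rates between groups of employees. The sketch below shows one way this might be done in plain Python. The helper `attrition_rate_by`, the field names `overtime` and `left`, and the eight toy records are all made up for illustration; they are not John's data or results.

```python
from collections import defaultdict


def attrition_rate_by(records, key):
    """Return {segment: share of employees in that segment who left}."""
    totals = defaultdict(int)
    leavers = defaultdict(int)
    for row in records:
        segment = row[key]
        totals[segment] += 1
        leavers[segment] += row["left"]  # True counts as 1, False as 0
    return {segment: leavers[segment] / totals[segment] for segment in totals}


# Made-up toy records for illustration only; not real HR data.
employees = [
    {"overtime": "high", "left": True},
    {"overtime": "high", "left": True},
    {"overtime": "high", "left": False},
    {"overtime": "high", "left": True},
    {"overtime": "low", "left": False},
    {"overtime": "low", "left": False},
    {"overtime": "low", "left": True},
    {"overtime": "low", "left": False},
]

rates = attrition_rate_by(employees, "overtime")
for segment, rate in rates.items():
    print(f"{segment} overtime: {rate:.0%} of employees left")
```

A statement of the form "X% of employees with high overtime left, compared with Y% of the rest" is something every stakeholder in the room can follow, whatever model sits behind it.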


Today, when advances in technology have made model building quick and easy, it has become essential for data science professionals to keep their skills in step with the trend. What machines have not yet been able to do is make stories out of data: to create a series of logically connected events that engages the stakeholders. There is a need to bind business acumen cohesively with data science skills. Hence, every data science professional or enthusiast should focus on building stories out of the data, not just models.

I would like to end this article by saying – Do not just be a data scientist, be a storyteller.

This brings us to the end of this article. Until next time, keep learning!
