As I explained in a previous article, “Building a Data Science Team,” a data science team should consist of three to five members, including the following:
Together, the members of the data science team engage in a cyclical step-by-step process that generally goes like this:
Data science teams also commonly run experiments on data to enhance their learning. This will help the team collaborate on many data-driven projects.
Experiments generally comply with the scientific method:
Suppose your data science team works for an online magazine. At the end of each story posted on the site is a link that allows readers to share the article. The data analyst on the team ranks the stories from most shared to least shared and presents the following report to the team for discussion.
The research lead asks, “What makes the top-ranked articles so popular? Are articles on certain topics more likely to be shared? Do certain key phrases trigger sharing? Are longer or shorter articles more likely to be shared?”
Your team works together to create a model that reveals correlations between the number of shares and a number of variables, including the following:
The research lead is critical here because she knows most about the business. She may know that certain writers are more popular than others or that the magazine receives more positive feedback when it publishes on certain topics. She may also be best at coming up with key words and phrases to include in the correlation analysis; for example, certain key words and phrases, such as “sneak peek,” “insider,” or “whisper” may suggest an article about rumors in the industry that readers tend to find compelling. This will create a visualization that can communicate even big data to people without a data skill set.
Based on the results, the analyst develops a predictive analytics model to be used to forecast the number of shares for any new articles. He tests the model on a subset of previous articles, tweaks it, tests it again, and continues this process until the model produces accurate “forecasts” on past articles.
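To make the analyst's workflow concrete, here is a minimal Python sketch of the two steps described above: measuring how strongly one variable correlates with shares, then fitting a simple model and checking its "forecasts" against past articles it has not seen. Everything in it is an assumption made for illustration: the article records are invented, word count stands in for whichever variables the team actually chooses, and a single-variable straight-line fit is only one of many possible models.

```python
from math import sqrt

# Hypothetical past articles as (word_count, shares). Invented for illustration only.
ARTICLES = [
    (400, 120), (650, 180), (800, 260), (1000, 300),
    (1200, 390), (1500, 450), (1800, 560), (2000, 610),
]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


def mean_abs_error(model, rows):
    """Average absolute gap between predicted and actual shares."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in rows) / len(rows)


# Hold out every fourth article so the model is tested on articles it never saw.
train = [row for i, row in enumerate(ARTICLES) if i % 4 != 3]
test = [row for i, row in enumerate(ARTICLES) if i % 4 == 3]

train_x, train_y = zip(*train)
r = pearson(train_x, train_y)
model = fit_line(train_x, train_y)
error = mean_abs_error(model, test)

print(f"correlation={r:.2f}, held-out mean absolute error={error:.1f} shares")
```

If the held-out error is too large, the analyst would adjust the model (for example, by adding variables such as topic or author) and repeat the test, which mirrors the tweak-and-retest loop described above.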
At this point, the project manager steps in to communicate the team’s findings and make the model available to the organization’s editors, so it can be used to evaluate future article submissions. She may even recommend the model to the marketing department to use as a tool for determining how to charge for advertising placements — perhaps the magazine can charge more for ads that are positioned alongside articles that are more likely to be shared by readers.
Although you generally want to keep your data science team small, you also want people on the team who approach projects with different perspectives and have diverse opinions. Depending on the project, consider adding people to the team temporarily from different parts of the organization. If you run your team solely with data scientists, you’re likely to lack a significant diversity of opinion. Team member backgrounds and training will be too similar. They’ll be more likely to quickly come to consensus and sing in a chorus of monotones.
I once worked with a graduate school that was trying to increase its graduation rate by looking at past data. The best idea came from a project manager who was an avid scuba diver. He looked at the demographic data and suggested that a buddy system (a common safety precaution in the world of scuba diving) might have a positive impact. No one could have planned his insight. It came from his life experience.
This form of creative discovery is much more common than most organizations realize. In fact, a report from the patent office suggests that almost half of all discoveries are the result of simple serendipity. The team was looking to solve one problem and then someone’s insight or experience led in an entirely new direction.
So what is data visualization? Data visualization is the process of communicating data graphically in the form of tables, graphs, maps, timelines, matrices, tree diagrams, flow charts, and so on. The purpose of these visuals is to convey relationships, comparisons, distributions, compositions, trends, and workflows more clearly and succinctly than words alone can. You can think of a data science team’s reports as employing two forms of communication:
When building a report, the data science team combines the two forms of communication to tell the story revealed by the data with maximum clarity and impact. Visuals often provide the means of communicating complex information and insights with the greatest simplicity and effectiveness. Often, the audience immediately “gets it” upon viewing a simple graphic that summarizes the data.
When doing data visualizations, a key first step involves choosing the chart type that’s the best fit for the data and what you’re trying to illustrate. The following table provides general guidance to help you make the right choice.
Purpose | Chart Types |
Compare values | Bar, Column, Line, Pie, Scatter plot, Spider chart |
Show composition | Area, Pie, Stacked bar, Stacked column, Waterfall |
Show distribution | Bar, Column, Line, Scatter plot |
Show trends | Column, Dual-axis line, Line |
Show relationships | Bubble, Line, Scatter plot |
Show locations | Map |
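If your team builds reports programmatically, the guidance in the table above can be captured as a simple lookup. The following Python sketch is only an illustration: the dictionary mirrors the table, while the names `CHART_TYPES`, `suggest_charts`, and `purposes_for` are hypothetical and not part of any charting library.

```python
# Purpose-to-chart mapping copied from the table above.
# All names here are hypothetical and exist only for this example.
CHART_TYPES = {
    "compare values": ["bar", "column", "line", "pie", "scatter plot", "spider chart"],
    "show composition": ["area", "pie", "stacked bar", "stacked column", "waterfall"],
    "show distribution": ["bar", "column", "line", "scatter plot"],
    "show trends": ["column", "dual-axis line", "line"],
    "show relationships": ["bubble", "line", "scatter plot"],
    "show locations": ["map"],
}


def suggest_charts(purpose):
    """Return candidate chart types for a purpose, or an empty list if unknown."""
    return CHART_TYPES.get(purpose.strip().lower(), [])


def purposes_for(chart):
    """Return every purpose for which the table lists the given chart type."""
    wanted = chart.strip().lower()
    return [purpose for purpose, charts in CHART_TYPES.items() if wanted in charts]


print(suggest_charts("Show trends"))
print(purposes_for("Pie"))
```

A lookup like this only narrows the options. The team still has to judge which candidate communicates the data most clearly.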
Keep in mind that content and purpose should drive form. Don’t choose a chart or other visual just because it looks pretty. I’ve seen some beautiful charts that do a poor job of communicating the data, as well as ugly charts that are very informative. Ideally, you want a beautiful chart that’s informative and communicates the point you’re trying to make. However, if you have to make trade-offs, clarity trumps beauty.
Creating data visualizations is a team sport. The data analyst should work closely with the other members of the data science team to develop data visualizations that communicate the data most effectively. If the data analyst has to explain the charts to the research lead, they’re probably too complex for other stakeholders in the organization. The team is a good testing ground for ensuring that the visuals in a report will be effective.
Remember that your team works together to explore the data, which means that the majority of the first round of reports you design will be for each other. The research lead drives interesting questions; the data analyst creates a quick and dirty report to explore possible answers; and then the team might come up with a whole series of new questions. This means that most of your initial data visualizations will be quick exchanges — more like visual chitchat than full data reports.
After the team reaches consensus on the data and the visuals, spend some time polishing the data visualizations to share them with the rest of the organization. Your final data visualizations should be even simpler and easier to understand than the versions you shared with team members.
Think of your first round of data visualizations as whiteboard presentations in your data science team meetings. Although you’ll probably do most, if not all, of your data visualizations on a computer, treat them like mock-ups or scribbles on a whiteboard. These data visualizations may be oversimplified. Their purpose is to initiate productive and creative discussions. You may start with a quick and simple scatter plot or linear regression chart and then fine-tune it as you ask more questions and collect and analyze more data. Obtaining and responding to feedback from other team members is the best way to create effective and attractive data visualizations.
Your best charts will be the product of an emergent design. Start with simple reports and improve them over time. You’ll produce much better reports by going through several revisions.
If you’re interested in discovering more about data visualization, I recommend the following two books:
Note: There’s typically nothing in the training of data analysts that prepares them for producing good visualizations. Most graduate programs are still very much rooted in math and statistics. Good data visualization relies on aesthetics and design. It’s a learned skill and may not come easily.
Building a data science culture means different things to different organizations. It may mean introducing a new data science team to the organization, democratizing the data so everyone has access to the data and the business intelligence (BI) tools to do their jobs, or encouraging the entire organization to develop a data-science mindset.
Whatever the meaning, organizational change is difficult, especially if your organization strongly resists any major change, as many do. To effect a big change, you need some degree of competence in the field of change management: the strategies and techniques used to prepare, support, and assist individuals, teams, and organizations as they adapt to new ideas.
Although change management is a complex topic, in this post I offer several suggestions to overcome common obstacles in implementing any change, including a change in your organization's culture.
Changing an organization's culture is an ongoing, often cyclical process, but before you start, draw up a linear step-by-step plan to ensure that you set out in the right direction. Here's a sample plan that you may want to tweak for your own use:
Having a top-level sponsor to cheer on your team while you do the hard work to effect a change is better than having no top-level support at all. However, any tangible support your top-level sponsor provides adds fuel to the tank and sends a signal to the rest of the organization that people at the top truly support your efforts. Tangible support may be provided in various forms, including the following:
Transforming a culture in which status and expertise drive the decision-making process to one in which data drives the process requires a major overhaul in how everyone in the organization thinks. It requires a never-ending process of continuous improvement. If your expectations are too high regarding the level of change and the time in which it occurs, you and others may get discouraged when you don't see quick, dramatic improvements.
To improve your chance of long-term success, manage everyone's expectations, including your own. Prepare your organization for a long and bumpy ride. Steer clear of quick fixes. Slow and steady wins the race. While this approach may sap some of the energy that drives change, it will help to prevent major disappointments, which tend to threaten overall success.
Building a data science culture is about much more than building a data warehouse and rolling out state-of-the-art business intelligence tools. It's about changing the way people think about what they do and how they do it. According to some schools of thought, you can change people’s thinking by changing their behaviors. Others believe that you can change people’s behaviors by changing their thoughts. I recommend doing both:
In any organization, you'll find pockets of resistance and even vocal critics of any proposed change. Don't ignore this resistance or merely try to steamroll a change over or past your critics. Listen to them and engage them in discussion. If data science truly holds value for your organization, you should have no trouble convincing skeptics. In addition, your critics may point out real weaknesses in your plan that you need to address for a successful implementation.
Many organizations hire outside consultants to implement a desired change in the organization. Some even treat consultants as disposable change agents — hiring a consultant to drive the change and then firing her when it fails. This practice gives management a convenient scapegoat.
A better approach is to choose a well-respected and longtime employee to drive the change internally with the mindset that the change is inevitable — failure is not an option. One or more consultants can then be brought in to provide expert knowledge and insight on how to more effectively implement a data science team. A charismatic insider can more effectively lead the charge by having some skin in the game and communicating in a language that the rest of the organization understands using examples that resonate with the organization's existing culture.
In my previous post, Data Science Culture, I describe four common cultures that develop in various types of organizations:
Certain cultures are more conducive to data science than others, and in some cultures, the organization may resist any attempt to introduce a data science team into the mix. When you're starting a data science team, the ability to overcome any cultural resistance to your team can make or break the initiative.
After you've identified your organization's culture, the next step is to implement change in the organization. The change you're trying to implement is to get the organization to accept a data science mind-set and perhaps even adopt that mind-set.
One of the best books on implementing organizational change is Fearless Change: Patterns for Introducing New Ideas, by Mary Lynn Manns and Linda Rising. This book is geared toward "powerless leaders" (those who have no authority), so it is suitable for anyone at any level of the organization who wants to implement change.
In Fearless Change, the authors describe several myths of organizational change, including the following:
In addition to busting these and other common myths, the book provides strategies and techniques for steering clear of the pitfalls that often derail efforts to effect change. For example, you discover ways to deal effectively with skeptics and even learn from them (after all, they might be right and see something that you overlooked). Powerless leaders are encouraged to rally small groups of people to reach consensus and build momentum to drive change forward. You also discover ways to keep proponents on board, so they don't flip on you and undermine your efforts.
Fearless Change is based on the notion that people accept ideas at different rates:
When you're trying to initiate a change in your organization, such as a change in attitude in favor of data science, start with the natural innovators and then expand the movement down through the other groups. You and a small group of innovators can introduce the idea, create some buzz about it, and start getting the others thinking about the potential benefits of data-driven decision-making and innovation. Early in the process, you're softening up the organization to make it more receptive to the change.
When you're introducing data science to various groups and individuals in your organization, avoid abstract language, such as "productivity" and "quality." Instead, talk about data science as a way to better understand the customer or a way to solve specific problems or answer specific questions the person or group is likely to encounter on a daily basis.
Better yet, provide concrete examples of how data science can help a given individual or group do their jobs better, faster, or easier and how data and analytics can make them more innovative.
One of the most effective ways to win over a majority in an organization is to prove the value of the change, so be sure to celebrate and publicize "wins" — any successful application of data science within the organization. When others in the organization see the benefits in action and see how data science has enabled others to excel at what they do, they will flock to your data science team to help them do the same.
Many organizations struggle with the technology to democratize data. Without the right technology in place, running a query or creating a report can take hours or even days depending on the volume and variety of the data, the number of queries being run, and the compute power available. In some cases, a person may wait several hours for a report only to find out that the query crashed the system and needs to be run again. When others in the organization hear these horror stories, they become reluctant to adopt the technology. However, as soon as the kinks are worked out and user-friendly business intelligence tools are rolled out, adoption rates across the organization soar. As soon as the barriers to adoption fall away and the benefits become apparent, people who resisted often become the biggest proponents of the change.
Whenever you endeavor to introduce a big change in your organization, remember to start slow and build momentum. It probably took a very long time for certain beliefs and behaviors to become ingrained in your organization, so don't expect any big change to be embraced and adopted overnight. Most importantly, don't get discouraged.
Every organization has a culture that strongly influences employee beliefs, thinking, decision-making, and behaviors. According to MIT Professor Edgar Schein, an organization's culture is:
A pattern of shared basic assumptions that the group learned as it solved its problems that has worked well enough to be considered valid and is passed on to new members as the correct way to perceive, think, and feel in relation to those problems.
In his book The Reengineering Alternative, William Schneider identifies four common corporate cultures, which are described in greater detail in the sections that follow:
Some cultures are more conducive to data science than others. For example, a collaboration culture tends to be more open to curiosity, transparency, exploration, and discovery, all of which are conducive to data science. On the other hand, a control culture tends to value certainty over curiosity. Instead of transparency, leadership operates in a cloistered environment of secrecy. Instead of exploration and discovery, goal-setting, planning, and meeting milestones are the objectives.
Unfortunately, an organization's culture can be deeply ingrained and difficult to change, even on a small scale. And if the existing culture is counterproductive to the data science team's mission, it can totally undermine a team's efforts.
When you're trying to get a new data science team up and running, one of the first steps is to identify the culture in which the team will operate, so the team will be more aware of any resistance it may encounter.
A control culture has a distinct pecking order characterized by a corporate hierarchy with an emphasis on compliance. Everyone in a control culture knows who they work for and who works for them. The role of the individual is to comply with the supervisor's requirements. The head of these organizations communicates a vision, and everyone down the line is responsible for implementing it.
Data science teams often struggle in control cultures for several reasons, including the following:
Even with these challenges, many data science teams succeed in organizations with strong control cultures, which often rely heavily on their data and ability to use it to make data-driven decisions.
A competence culture is centered on knowledge and skills and tends to be organized into areas of expertise, so specialization is rewarded. The most highly competent individuals in the organization become the de facto managers. They set the standards and create and delegate tasks. This culture is prevalent in organizations driven by specialized knowledge such as engineering firms and software development firms.
Competence cultures tend to struggle with the data science mindset. Data science tends to be interdisciplinary. Team members are more generalists than specialists. In addition to a familiarity with statistics, mathematics, programming, and storytelling, data science team members need general knowledge that spans all functional areas of the organization. Cultures that put a lot of emphasis on specialization may have trouble appreciating what the data science team has to offer. They may also have trouble accepting the fact that the data science team requires cooperation from other functional areas to do its job; the best questions often come from outside the data science team.
In a cultivation culture, leadership focuses on empowering and enabling people to become the best possible employees. These organizations tend to be structured like a wheel, with employees at the center surrounded by mentors and resources to make each employee successful.
A great deal of emphasis is placed on expressing yourself. Charismatic individuals can quickly rise in the ranks according to their ability to harness the collective talent of team members to solve problems.
Generalists do well, so a data science team is a natural fit in a cultivation culture. However, don't expect quick, decisive action on anything your team proposes. Decision-making can be a long, drawn-out process, because it is highly participatory and organic. The drive is toward consensus, which can be difficult to reach with a large number of diverse opinions.
True cultivation cultures are rare. Some organizations may think they have a cultivation culture, but if you look closely, you'll see that they don't really follow a lot of the key practices. A lot of these organizations are just control cultures with a thin veneer of a cultivation culture.
A collaboration culture is similar to a control culture in that decision-making power is concentrated in the upper levels of the organization. However, instead of a strict top-down management structure, authority is concentrated in groups across the organization. Collaboration is mostly within these groups rather than among them. These collaborative groups tend to make decisions via brainstorming sessions along with some experimentation.
You're likely to encounter the collaborative culture in training organizations, in which leaders tend to be team builders and coaches, and in family-run businesses, where authority is based on relationships.
Compared to the control and competence cultures, the collaborative culture is more open to change, which makes collaborative organizations more likely to embrace a data science mind-set. However, keep in mind that authority is concentrated in pockets and may or may not be pushed down to the team level. These organizations are only slightly more democratic than those with a control culture.
If your organization has a collaboration or cultivation culture, it will have an easier time embracing the key components of a data science mindset, because such cultures value generalists and are accustomed to communicating and collaborating across teams. You can expect more resistance in a strong control or competence culture. Here are a couple of suggestions for overcoming such resistance:
One way to approach storytelling is to do everything right. In several previous posts, including Data Science Storytelling and Data Storytelling Structure, I offer guidance on how to tell a data story the right way. Another approach is to avoid the most common data storytelling pitfalls, including the following:
In this post, I provide suggestions on how to avoid these common pitfalls.
Imagine scheduling a doctor's visit to review the results of recent lab tests. Your doctor hands you a copy of the results and leads you through the document. Perhaps your fasting blood sugar level is 100 mg/dL; your total cholesterol is 270 mg/dL, LDL is 220, and HDL is 50; and your triglycerides are at 160 mg/dL. Your doctor says, "Well, the data speaks for itself."
Or imagine turning on the local news and having the meteorologist present a bunch of charts that show changes in temperature, humidity, and barometric pressure over the last 48 hours, along with maps of low- and high-pressure systems across the country. She wraps up by saying, "Well, the data speaks for itself."
As you can see, raw data, even when accompanied by data visualizations such as tables, charts, and maps, can be meaningless without expert interpretation of that data. When you consult an expert, you want the expert's opinion and practical advice — expert insight drawn from the data. In the same way, as a member of the data science team, you must interpret the data for your audience or at least lead the audience through the process of understanding the data and drawing reasonable conclusions of their own.
If your data science team is working in the context of a traditional corporate culture with a strong hierarchy, your team may be discouraged from telling stories or interpreting the data. In organizations like these, presenting the data and visualizations and letting management interpret the data are the politically safe options. Your team simply plays the role of impartial presenter.
The problem with this approach is that the data science team is responsible for the outcome, even if management misinterprets the data.
Although the data science team should certainly be open to different interpretations of the data, team members should interpret the data on their own and clearly communicate their findings. The team should do this by telling a story that connects the dots and extracts meaning from the data. Don't give anyone else carte blanche over interpreting your team's data and visualizations.
Data science is a high-tech pursuit that involves a great deal of specialized language and acronyms. This specialized language is like shorthand — it enables people in the field to communicate efficiently and effectively. Every field has its own specialized language (jargon). If you've ever read a study published in a medical journal, you probably needed a translator to define some of the terminology. However, when a doctor meets with a patient, the doctor uses more common terminology to explain the patient's diagnosis and treatment protocol.
In the same way, when you tell a story, consider your audience and speak to them in a language they understand. Don't use the same language you use with your colleagues on the data science team.
New data science teams often struggle with the idea of creating a story from data. Some data just looks like lifeless columns of numbers. Data visualizations are more attractive but can be equally cryptic. How do you tell a story with a chart?
It's a real challenge for data science teams to reverse engineer tables and charts to tell the story behind the data. Frankly, it’s one of the biggest challenges. One way to overcome this challenge is to humanize your reports. For example, instead of calling a report "Upcoming consumer trends," call it something like, "What people are buying." This simple solution makes it easier to think about your data in terms of real-world events and activities.
Business intelligence (BI) tools produce a dizzying array of data visualizations, making it incredibly tempting to create and use every visualization imaginable to illustrate your presentation. Avoid the temptation. Slides are great for displaying data that supports your claims, but if you or your audience becomes too focused on the data, you will all be distracted from what's most valuable — the interpretation of that data.
Count your slides. As a rule of thumb, if you have 30 slides for a 60-minute presentation, you have too many, and you're not telling a story. Keep in mind that the charts are the first things your audience will forget. To achieve maximum impact, focus on the things your audience will remember. Your audience is more likely to remember a clear, interesting story.
Like any skill, data storytelling takes time to improve. Start thinking about the key elements of a story — plot, setting, characters, conflict, and resolution. Then strive to weave those elements into a story around the data that reveals its meaning and significance and will connect with the target audience.
Over time, your stories will become more robust and interesting. You might even draw stronger conclusions and bolder interpretations. Try to remember to have fun with your stories and your audience. It will improve your stories and make you a more interesting storyteller.
A metaphor is a figure of speech used to describe an object or action in comparison to something that is dissimilar but has something in common. Here are a few examples:
Numerous metaphors are woven into the fabric of data science itself, such as "data warehouse," "data lake," and "data mining." Metaphors are essential to the way humans communicate and process ideas. They enable us to more quickly and easily grasp and assimilate the unfamiliar by comparing it to what is familiar.
When people use metaphors to communicate, you may not even realize they're doing so; in fact, they may not even realize they're using metaphors. However, when composing a data science story, you should consciously look for opportunities to tap the power of data storytelling metaphors, especially when introducing new or complex ideas or concepts. In fact, your entire story may be a metaphor — in the form of a parable or fable — used to illustrate a point you are trying to convey. Remember that metaphors link the unknown to the familiar; your audience is more likely to feel a connection with a story that's familiar to them.
You probably know about other figures of speech that involve comparisons, including the following:
Simile: A comparison between two objects or actions typically introduced with a word such as "like," "as," or "than." For example, "quiet as a mouse" or "more fun than a barrel of monkeys."
Analogy: An extended comparison of two similar but distinct objects or actions for the purpose of showing that the two are similar in more ways than one. (Some people think of "analogy" as a comparison and "simile" and "metaphor" as two ways to express that comparison.)
Allegory: A symbolic, fictional story, poem, or picture that conveys a broader message or lesson about life. For example, the novel Animal Farm, by George Orwell, serves as a satire of totalitarian governance.
Tip: Don't get caught up in the differences between metaphor, simile, analogy, and allegory. In storytelling, think in terms of making comparisons and connecting the unfamiliar to the familiar.
Imagine your data science team is working for a chain of movie theaters. Whenever a new release is available, management wants to know how many screens to show it on in each theater to maximize revenue. Showing the movie on too many screens leaves a lot of empty seats. Showing the movie on too few screens fails to capitalize fully on potential ticket sales.
Your team decides to develop a predictive analytics algorithm to calculate the number of screens on which to show the new release. Your team gathers structured and unstructured data. The structured data shows that people are watching the trailer on numerous websites. The unstructured data indicates high volumes of "mentions" about the movie on Twitter, Facebook, and other social sites.
When your data science team presents its findings to the client, it has two options. The first option is to present the data in language that's familiar to the data science team; for example, "Our analyses of both structured and unstructured data suggest a broad interest in the new release."
The other option is to speak in a language that's familiar among movie theater managers and harnesses the power of metaphors. For example, you may say something along the lines of, “We are picking up a lot of friendly chatter on social media, and traffic on sites that are showing the trailer is through the roof. You are definitely looking at a potential blockbuster.”
Through the use of more descriptive language, including metaphors, you convey the information and insight in a way that's easy for theater managers to understand. They immediately know not only the value of the data but also the sources — websites and social media venues. Instead of asking you to define "structured data" and "unstructured data," they'll ask meaningful questions, such as, "How accurate is the level of friendly chatter in predicting ticket sales?"
Your team may use other metaphors, as well, such as describing tickets for the show as hot tickets or explaining that a few weeks after the initial release of the movie, theaters could expect to see interest in the movie cool off. These metaphors make the story more interesting and fun, which keeps your audience engaged and helps them extract meaning from the data.
Metaphors are great for breaking down language barriers that stand between you and your audience, especially when you have complex concepts or ideas to convey. When you're working with colleagues on your data science team, you naturally use terminology that's familiar to team members. Don't assume that same language is understandable or interesting to others.
Metaphors not only make your story sound more interesting, but they also lower the bar to participation. The more your audience engages, the more likely they are to grasp the meaning and significance of the data and the conclusions you've drawn from it.
In Paul Smith's book Lead with a Story, he describes how the CEO of Procter & Gamble would come to presentations and sit with his back to the screen. Smith describes how he delivered a presentation to the CEO, who didn't once turn around to look at the slides. After the presentation, he realized that this wasn't by accident. CEOs in large companies see data all the time. They know the data is the vehicle and that the story the presenter tells holds all the value.
When your data team is composing a story, don't get hung up on the data and on creating dazzling visualizations. Spend more time making the story personal. Without a great story behind them, the data and visualizations are relatively useless. The story connects the dots, reveals the data's meaning and significance, and educates and transforms the audience.
Your job as a data science team is to reveal the humanity behind the numbers. You need to get personal.
Suppose you're walking through an airport and you notice a cell phone on an empty seat with nobody near it. You pick up the phone to discover that it's unlocked. Being the considerate person you are, you want to return the phone to its rightful owner. How would you go about doing that?
You carefully consider your options. Maybe you should just leave the phone where it is, because you know that there's a high probability that the person who left it there would soon remember and return to the spot to reclaim the phone. However, there's also a high probability that someone else will pick up the phone before its owner returns. Maybe you should turn it in to the desk at the nearest gate, thinking that the person may have boarded a plane at the gate and hoping that one of the flight attendants could get the phone to the passenger prior to departure. Or, maybe you should hold on to the phone assuming the owner would use a borrowed phone to call his own phone and find out where it was.
Now suppose you're telling your story later and recounting the thought process you engaged in to decide on the best course of action. You're a data scientist, so after the event, you perform some research on lost cell phones, analyze the data, and create charts to illustrate the probability of the different scenarios you considered. Now the time has come to tell your story.
You have two options. First option: You could flip through your slides and explain each one in turn. Maybe you have a slide that shows a correlation between the phone value and the likelihood that the owner would return to claim it. You may have another slide that shows the percentage of phones that are turned in at airports and train stations and never claimed. A third slide shows a correlation between lost phones and the number of owners who find their phones by calling their own numbers.
Second option: You get personal. You tell the story in a more human way. For example, you might start with the following:
Two weeks ago, I was passing through LAX, when I spotted a cell phone on a seat at one of the gates with nobody near it. I picked up the phone and was surprised to discover that it wasn't locked. I checked text messages and emails to see if I could find any flight information. I walked over to the desk at the nearest gate and asked the attendant whether anyone had reported a lost phone. She pulled out a box from below the counter and showed me its contents — about 20 phones that were turned in only this past week. I returned to the seat where I found the phone and sat there for about ten minutes hoping that the owner would return to claim the phone or would borrow a phone from another passenger to call. No such luck . . .
Which presentation would you find more interesting — the slide show or the story? Rhetorical question. Of course the story is more interesting, but why? With the slide show, you're removing the human element from the story. You have no characters — no you, no owner of the lost phone, no flight attendant. You have no plot, no setting, no conflict, no resolution. All you have are numbers, statistics, and slides. Boring.
When you tell your data science story, you want to take the focus off the data and place it squarely on the story. You want all eyes on you, pens down, and electronic devices stored safely and quietly away. Audience members should only glance occasionally at the slides. If they spend too much time looking at charts or graphs, chances are they’re thinking about something else. Only after the story hooks the audience members and connects with them personally and on an emotional level will they be receptive to the knowledge and insight you impart and be inspired to embrace whatever change you recommend.
In several previous posts, including "Data Science Storytelling," I cover the essential elements of a data science story, but storytelling goes beyond the story itself. It's the telling of the story that separates a gripping story from one that is merely an accurate accounting of events. To make data storytelling pop, you need to believe it and be passionate about it. Only then will you be able to transfer that passion and conviction to your audience. In short, you need to sell it.
When I was in law school, I took a course on litigation. I learned that part of being a successful lawyer is the ability to make the jury empathize with your client. A jury always wants to know the backstory: how the plaintiff and defendant arrived at this point, what made the plaintiff file suit, and what the defendant did and why.
The professor, who had several years of jury trials behind him, offered some common-sense advice: say what you believe and say it with clarity and passion. He warned against trying to make ordinary stories extraordinary because what you know is the only account you can truthfully represent. Making up a far-fetched story about what happened is only likely to undermine your credibility.
The same holds true when you're telling a data science story. Don't try to fake it or feign interest in a topic or issue that is of no interest to you. The audience will quickly pick up on any insincerity, and at that point, your credibility is shot.
If you've ever been on the receiving end of a good sales pitch, you know the secret ingredients: a salesperson who loves what she does and truly believes that the product would significantly improve your life in some way. It's almost as if the salesperson would buy one for you, if she could afford it, just so you could experience its benefits for yourself. With a good sales pitch, you can hear the passion and conviction in the person's voice and witness it in the person's body language.
On the other hand, if you've ever been on the receiving end of a lousy sales pitch, you probably could feel that you were being sold to — that the salesperson was overselling the product and was motivated by profit, not by a commitment to serve your best interests. Or maybe you felt that the person hated her job or was reluctant to sell this particular product; in other words, the salesperson wasn't sold on it herself.
When a data science team lacks conviction, it often becomes apparent in the team's use of data visualizations. Instead of telling a convincing story and backing it up with visualizations, the team uses the visualizations as a distraction to draw attention away from the fact that the story really isn't all that interesting. The team thinks that if it dangles a little eye candy in front of the audience, the audience won't notice that the team has nothing important to say.
Following are a few suggestions for presenting a story in an interesting way:
Remember that you are the most important part of your presentation. Beautiful charts, clever anecdotes, and piles of data won’t make up for a lack of passion, humor, and grace. Even the most extraordinary data will seem boring if you can't tell it in an interesting way. The key is to make sure that you believe that the story is interesting. If you can’t convince yourself, you won't convince an audience.
In a previous post, "Data Storytelling Structure," I provide guidance on the big picture of storytelling — nailing down the five key elements of a story: characters, setting, plot, conflict, and resolution. However, if you've ever heard someone tell a story, you know it takes more than those five elements to make it interesting. The devil is in the details. Skilled storytellers embellish their stories with plenty of details that feed the imagination and stimulate the senses. They make you feel as though you're watching the action unfold before your eyes.
In a similar fashion, your data science team should include plenty of details in every story it tells to flesh it out and make it more memorable. Details are like little mental sticky notes that help the audience remember the characters, setting, plot, conflict, and resolution. In addition, the details provide supporting evidence to the larger observations and claims being presented by the team.
An organization I once worked for was struggling to get enough people to participate in its medical studies. The data science team was called in to figure out why. The team conducted some research and discovered that some people are afraid of needles, others are afraid of having their blood drawn, and many are afraid of both. This overlapping group represented a lot of people.
The data science team asked some good questions and made some interesting discoveries. One such discovery was that people who participated in and had a positive experience with a medical study that did not involve needles or blood draws were more inclined to participate in future studies that did involve needles or blood draws.
The research lead (a nurse) had a great idea on how to tell that story with impact. She would start with a case study, changing the participant's name and a few details to protect the patient's anonymity. Her story went something like this:
When I was a nurse I could always tell who was afraid of needles. They always crossed their arms in a certain way. They grabbed both of their elbows as a way to protect themselves from the poke in the arm. There are a lot of people out there like that, and we need them to participate in our medical studies. So I'm going to tell you a little bit about someone I found in one of our reports.
Let's call her Tracy. She participated in one of our medical studies for a drug being developed to help people sleep. The first day of the study she showed up with her own pillow. She must've been optimistic about how well it would work. She was hoping that this new pill would help her since she had some trouble sleeping during periods of high stress.
It turned out that Tracy was one of the participants who didn't get any benefit from the drug. When she left, she told the nurse that her father was a doctor, so she felt some obligation to participate in medical studies. She said she could never be a doctor because she was scared of blood and needles. A few months later she decided to participate in a flu vaccine trial. The study required needles for the vaccination and for later blood tests.
So why did Tracy decide to participate?
The obvious answer to the research lead's question is that Tracy participated because she felt an obligation to do so. After all, she didn't actually benefit in any way from the sleep study. She felt as though she couldn't contribute to helping others with their health issues directly by being a doctor or nurse, so she would do her part by participating in studies.
Now, think about the story you just read. What do you recall? Clearest in your mind are probably the details — the description of how people held their arms when they were afraid of needles, Tracy's name, Tracy bringing her pillow to the sleep study, what her dad did for a living, the trials she participated in, and so on. All of these details make it easier to remember the story and to remember the conclusion drawn from the story — that Tracy participated in medical studies because she felt obligated to do so.
When you tell a data science story, try to use details to paint a picture in words. They help your audience connect to characters, setting, plot, conflict, and resolution.
Data science is a combination of science and art. The data science team follows the scientific method to explore and discover — to add to the organization's growing body of knowledge and insight. The team then uses the art of storytelling to convey that knowledge and insight to people across the organization in a compelling and memorable way.
Business presentations are boring. They're not structured to be interesting. They're static. They communicate the current state of affairs. They’re like a verbal “reply all” to the organization's stakeholders. That’s usually fine for status meetings, but it falls short when you need to convey a point, make it stick, and transform the audience in a positive way.
Avoid the temptation to merely deliver reports or presentations. Use the data and the findings from your analysis to tell a compelling story. And be sure to include the details.