Extract Transform Load (ETL) comes from data warehousing. The ETL process is about getting data from multiple data sources and using an ETL tool to extract value.
In several previous posts, including "Data Science Storytelling," I cover the essential elements of a data science story, but storytelling goes beyond the story itself. It's the telling of the story that separates a gripping story from one that is merely an accurate accounting of events. To make data storytelling pop, you need to believe it and be passionate about it. Only then will you be able to transfer that passion and conviction to your audience. In short, you need to sell it.
When I was in law school, I took a course on litigation. I learned that part of being a successful lawyer is the ability to make the jury empathize with your client. A jury always wants to know the backstory: how the plaintiff and defendant arrived at this point, what made the plaintiff file suit, and what the defendant did and why.
The professor, who had several years of jury trials behind him, offered some common-sense advice: say what you believe and say it with clarity and passion. He warned against trying to make ordinary stories extraordinary because what you know is the only account you can truthfully represent. Making up a far-fetched story about what happened is only likely to undermine your credibility.
The same holds true when you're telling a data science story. Don't try to fake it or feign interest in a topic or issue that is of no interest to you. The audience will quickly pick up on any insincerity, and at that point, your credibility is shot.
If you've ever been on the receiving end of a good sales pitch, you know the secret ingredients — a salesperson who loves what they do and truly believes that the product would significantly improve your life in some way. It's almost as if the salesperson would buy one for you, if she could afford it, just so you could experience its benefits for yourself. With a good sales pitch, you can hear the passion and conviction in the person's voice and witness it in the person's body language.
On the other hand, if you've ever been on the receiving end of a lousy sales pitch, you probably could feel that you were being sold to — that the salesperson was overselling the product and was motivated by profit, not by a commitment to serve your best interests. Or maybe you felt that the person hated her job or was reluctant to sell this particular product; in other words, the salesperson wasn't sold on it herself.
When a data science team lacks conviction, it often becomes apparent in their use of data visualizations. Instead of telling a convincing story and backing it up with visualizations, the team uses the visualizations as a distraction to draw attention away from the fact that the story really isn't all that interesting. The team thinks that by dangling a little eye candy in front of the audience, it can keep the audience from noticing that the team has nothing important to say.
Following are a few suggestions for presenting a story in an interesting way:
Remember that you are the most important part of your presentation. Beautiful charts, clever anecdotes, and piles of data won’t make up for a lack of passion, humor, and grace. Even the most extraordinary data will seem boring if you can't tell it in an interesting way. The key is to make sure that you believe that the story is interesting. If you can’t convince yourself, you won't convince an audience.