
Data Science vs Software Engineering Projects

Published June 19, 2017
Doug Rose
Author | Agility | Artificial Intelligence | Data Ethics

In my previous post, "Data Science Projects," I pointed out the differences between project management and data science. These differences are summarized in the following table:

Project Management | Data Science
Planning | Exploring and experimenting
Goals and objectives | Discovery and knowledge
Schedule- and budget-driven | Data-driven
Certainty | Curiosity
Execution | Innovation

You can see how these differences play out when comparing traditional software projects to typical data science projects, as presented in the following table. While traditional software projects focus more on achieving a goal and delivering an end product, data science projects are more exploratory and open-ended. Both have deliverables, but with software projects, the deliverables are more tangible and deadline-oriented, whereas data science tends to deliver a less tangible and ever-growing body of knowledge and insights, which may be of even greater value to the organization.

Traditional Software Project | Typical Data Science Project
Develop a new customer self-help portal | Better understand a customer’s needs
Create new software based on customer feedback | Create a data model to predict churn
Install a new server farm to increase scalability | Discover new markets and opportunities
Convert legacy code into updated software | Verify assumptions about customer behaviors

Despite their differences, software project management is fast becoming more like data science with the growing popularity of agile software development methodologies, such as Scrum, Extreme Programming (XP), Lean and Kanban, and Dynamic Systems Development Method (DSDM).

Like data science, many of these newer software development methodologies follow the scientific method, at least to some degree. That is, they often begin with research to assess the customer's (end user's) needs, and they build the software gradually in multiple, iterative cycles (commonly referred to as "sprints"). Team members are encouraged to experiment during these cycles to innovate and build knowledge that the team can draw on to achieve continuous improvement, both in the product being developed and the process used to create that product.

In many cases, the software development cycle never ends: the software is in continuous development, improving with each development cycle and each new release. As with data science, the focus is more on the process than on the product, and that process is open-ended, a continuing cycle of building knowledge and insight and driving innovation. In software development, this knowledge and insight are applied to continuously improve the software. In data science, they are applied to continuously improve the organization.

Spotify, the digital music, podcast, and video streaming service, follows this same iterative approach in the development of its platform. The company nurtures a creative, failure-friendly culture, as reflected in its values:

  • Agile > Scrum
  • Chaos > Bureaucracy
  • Community > Structure
  • Cross pollination > Standardization
  • Enable > Serve
  • Failure recovery > Failure avoidance
  • Impact > Velocity
  • Innovation > Predictability
  • People > * (anything else)
  • Principles > Practices
  • Servant > Master
  • Trust > Control

Spotify's approach to software development is rooted in the Lean Startup approach of "Think it, build it, ship it, tweak it." The organization even hosts "hack days" and "hack weeks," encouraging its development teams (called "squads") to spend ten percent of their time building whatever they want with whomever they want.

Squads are given a great deal of creative license to develop and test new features with the condition that they try to "limit the blast radius." They accomplish this by decoupling the architecture to enable each squad or "tribe" (a collection of squads) to work on an isolated part of the platform, so any mistakes are limited to that part; and by rolling out new features gradually to more and more users.
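To make the idea of rolling out a feature gradually more concrete, here is a minimal Python sketch of a percentage-based rollout. It is an illustration only, not a description of how Spotify implements its releases. The function names (rollout_bucket, is_enabled), the feature name, and the user IDs are all hypothetical. The sketch assumes each user has a stable ID, hashes that ID into one of 100 buckets, and enables the feature only for users whose bucket falls below the current rollout percentage.

```python
import hashlib


def rollout_bucket(user_id: str, feature: str) -> int:
    """Map a user to a stable bucket in the range 0..99 for a given feature.

    Hashing the feature name together with the user ID means the same user
    always lands in the same bucket for that feature, while different
    features get independent bucket assignments.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(user_id: str, feature: str, rollout_percent: int) -> bool:
    """Return True if the feature is switched on for this user.

    Raising rollout_percent only ever adds users; nobody who already has
    the feature loses it, because each user's bucket never changes.
    """
    return rollout_bucket(user_id, feature) < rollout_percent


if __name__ == "__main__":
    # Hypothetical user population and feature name, for illustration only.
    users = [f"user-{i}" for i in range(1000)]
    for percent in (1, 10, 50, 100):
        enabled = sum(is_enabled(u, "new-playlist-view", percent) for u in users)
        print(f"{percent:>3}% rollout -> {enabled} of {len(users)} users see the feature")
```

Because the bucket assignment is deterministic, a team can start at a small percentage, watch for problems, and widen the rollout step by step. If something goes wrong early, only the users in the lowest buckets are affected, which is one way to keep the "blast radius" small.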

Spotify also places an emphasis on "capturing the learning." Teams experiment with new tools, features, and methods and then discuss the results to figure out ways to improve both product and process. They document what they learn and share it with other teams, so everyone in the organization is better equipped to make data-driven decisions instead of decisions driven by authority, ego, or opinion.

Organizations would be wise to follow Spotify's lead not only in developing new software but also in managing their data science teams or, even better, in allowing and enabling the data science teams to manage themselves. Your organization's data science team should feel free to ask questions, challenge assumptions, formulate and test their own hypotheses, and cross-pollinate (reach out to others in the organization for insight and feedback). The team's mission should be more about exploration, innovation, and discovery than about setting goals, meeting milestones, and staying on budget or on schedule.
