Module 7 Data Science Basics
by Patrick Boily and Jen Schellinck
In October 2012, the Harvard Business Review published an article calling the role of data scientist the “sexiest job of the 21st century” and comparing data scientists with the ubiquitous “quants” of the ’90s: a data scientist is a “hybrid of data hacker, analyst, communicator, and trusted adviser”.
Would-be data scientists are usually introduced to the field via machine learning algorithms and applications. While we will discuss these topics in later modules, we would like to start with some important non-technical (and semi-technical) notions that are, unfortunately, often swept aside in favour of diving head first into murky analysis waters.
In this module, we focus on some of the fundamental ideas and concepts that underlie and drive forward the discipline of data science, as well as the contexts in which these concepts are typically applied. We also highlight issues related to the ethics of practical data science. We conclude by getting a bit more concrete and considering the analytical workflow of a typical data science project, the types of roles and responsibilities that generally arise during such projects, and some basics of how to think about data, as a prelude to more technical topics.
Note: we encourage readers to take a look at the Programming Primer before diving into data science proper.
7.5 Getting Insight From Data
7.5.1 Asking the Right Questions
7.5.2 Structuring and Organizing Data
7.5.3 Basic Data Analysis Techniques
7.5.4 Common Statistical Procedures in
7.5.5 Quantitative Methods