Module 10 Data Engineering and Management
by Aditya Maheshwari
In this module, we briefly explain some of the basic concepts that help data scientists go beyond theoretical or small-scale projects (mostly used for experiments, local research, or conceptual solutions), and we introduce the concepts and frameworks that allow data scientists, in conjunction with data teams, to build data science products that process data and deliver results at scale. We do so by exploring the role of data engineering in data projects and by providing an overview of some of the types of data pipeline infrastructure commonly involved in these projects.
In the current data ecosystem, most data scientists are still not required to understand the inner workings of data engineering and data management; however, as modeling tools become increasingly automated and as machine learning solutions move from the conceptual to the practical, most data project requirements become engineering-focused.
We only provide a cursory look at the topic in this module; in-depth information is available at , , , , and , while shorter overviews can be found at , . Learners interested in database design should consult .
10.1 Background and Context