dshegde/DataEngineering

DataEngineering_InClassWork

Course Overview:

This course explores the challenges of designing, building and maintaining data processing pipelines. It focuses on concepts, techniques and technologies for gathering, validating, transporting, transforming, enhancing, storing, integrating and maintaining the diverse data sets common to modern enterprises.

Objectives:

  1. Understand how distributed data pipelines are designed and implemented
  2. Analyze ethical issues related to gathering, processing and storing of data
  3. Identify and use common best practices for gathering and validating data
  4. Develop software to check and maintain the validity and quality of data
  5. Explain how software should be designed for transport of data in a distributed system
  6. Design and implement data transformation software
  7. Develop data enhancement modules using appropriate technologies
  8. Recognize opportunities for integration of diverse data sets
  9. Consider diverse technologies for data storage and maintenance

Project:

Build, develop, test and monitor a small-scale data pipeline
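
As a rough illustration of what such a small-scale pipeline might look like, the sketch below chains gather, validate, transform and store stages using only the Python standard library. It is not the course's actual project code: the record fields (`sensor_id`, `speed_mph`), the `readings` table and all function names are assumptions made up for this example.

```python
import json
import sqlite3

# Hypothetical sample input; the field names are assumptions for illustration.
RAW_RECORDS = [
    '{"sensor_id": "a1", "speed_mph": 30}',
    '{"sensor_id": "a2", "speed_mph": -5}',
    'not valid json',
]


def gather(lines):
    """Parse raw JSON lines, skipping any that cannot be decoded."""
    for line in lines:
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue


def validate(records):
    """Keep only records with a string sensor_id and a non-negative speed."""
    for record in records:
        if not isinstance(record, dict):
            continue
        sensor_id = record.get("sensor_id")
        speed = record.get("speed_mph")
        if isinstance(sensor_id, str) and isinstance(speed, (int, float)) and speed >= 0:
            yield record


def transform(records):
    """Convert miles per hour to kilometres per hour, rounded to 2 decimals."""
    for record in records:
        yield (record["sensor_id"], round(record["speed_mph"] * 1.609344, 2))


def store(rows, conn):
    """Write the transformed rows to a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings (sensor_id TEXT, speed_kmh REAL)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    conn.commit()


def run_pipeline(lines, conn):
    """Run all stages and return the stored row count as a basic health metric."""
    store(transform(validate(gather(lines))), conn)
    return conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print("rows stored:", run_pipeline(RAW_RECORDS, connection))
```

Each stage is a generator, so records flow through one at a time and any stage can be tested in isolation. The returned row count stands in for monitoring here; a real pipeline would more likely also track rejected records and stage timings.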
