DATA ENGINEERING AND MLOps (BAD714C)

Course Code: BAD714C
CIE Marks: 50
Teaching Hours/Week (L:T:P:S): 3:0:0:0
SEE Marks: 50
Total Hours of Pedagogy: 50
Total Marks: 100
Credits: 04
Exam Hours: 3
Examination Type (SEE): Theory




Module-1

Data Engineering: Definition, The Data Engineering Lifecycle, Evolution of the Data Engineer, Data Engineering and Data Science, Data Engineering Skills and Activities, Data Maturity and the Data Engineer, The Background and Skills of a Data Engineer, Business Responsibilities, Technical Responsibilities, The Continuum of Data Engineering Roles, Data Engineers Inside an Organization, Internal-Facing Versus External-Facing Data Engineers, Data Engineers and Other Technical Roles, Data Engineers and Business Leadership.

Data Engineering Lifecycle: The Data Lifecycle Versus the Data Engineering Lifecycle, Generation: Source Systems, Major Undercurrents Across the Data Engineering Lifecycle.

Textbook 1: Chapter 1 (1.1–1.5), Chapter 2 (2.1–2.4)




Module-2

Data Architecture: Enterprise Architecture Defined, Data Architecture Defined, “Good” Data Architecture, Principles of Good Data Architecture, Major Architecture Concepts, Domains and Services, Distributed Systems, Scalability, and Designing for Failure, Tight Versus Loose Coupling: Tiers, Monoliths, and Microservices, User Access: Single Versus Multitenant, Event-Driven Architecture, Examples and Types of Data Architecture.

Choosing Technologies Across the Data Engineering Lifecycle: Team Size and Capabilities, Speed to Market, Interoperability, Cost Optimization and Business Value, Total Cost of Ownership, Total Opportunity Cost of Ownership, FinOps, Today Versus the Future: Immutable Versus Transitory Technologies, Hybrid Cloud, Multicloud, Decentralized: Blockchain and the Edge, Monolith Versus Modular, Serverless Versus Servers, Server Versus Serverless Evaluation.

Textbook 1: Chapter 3 (3.1–3.7), Chapter 4 (4.1–4.6)




Module-3

MLOps Challenges, MLOps to Mitigate Risk, Risk Assessment, Risk Mitigation, MLOps for Responsible AI, MLOps for Scale.

Key MLOps Features: Model Development, Establishing Business Objectives, Data Sources and Exploratory Data Analysis, Feature Engineering and Selection, Training and Evaluation, Reproducibility, Responsible AI, Productionalization and Deployment, Model Deployment Types and Contents, Model Deployment Requirements, Monitoring.

Developing Models: Machine Learning Model, Required Components, Different ML Algorithms, Different MLOps Challenges, Data Exploration, Feature Engineering and Selection, Feature Engineering Techniques, How Feature Selection Impacts MLOps Strategy, Experimentation, Evaluating and Comparing Models, Choosing Evaluation Metrics, Cross-Checking Model Behavior, Impact of Responsible AI on Modeling, Version Management and Reproducibility.

Textbook 2: Chapter 1 (1.1–1.3), Chapter 2 (2.1–2.4)




Module-4

Preparing for Production: Runtime Environments, Adaptation from Development to Production Environments, Data Access Before Validation and Launch to Production, Final Thoughts on Runtime Environments, Model Risk Evaluation, The Purpose of Model Validation, The Origins of ML Model Risk, Quality Assurance for Machine Learning.

Deploying to Production: CI/CD Pipelines, Building ML Artifacts, The Testing Pipeline, Deployment Strategies, Categories of Model Deployment, Considerations When Sending Models to Production, Maintenance in Production, Containerization, Scaling Deployments, Requirements and Challenges.

Textbook 2: Chapter 3 (3.1–3.5), Chapter 4 (4.1–4.4)




Module-5

Monitoring and Feedback Loop: How Often Should Models Be Retrained, Understanding Model Degradation, Ground Truth Evaluation, Input Drift Detection, Drift Detection in Practice, Example Causes of Data Drift, Input Drift Detection Techniques, The Feedback Loop, Logging, Model Evaluation, Online Evaluation.

Model Governance: Who Decides What Governance the Organization Needs, Matching Governance with Risk Level, Current Regulations Driving MLOps Governance, Pharmaceutical Regulation in the US: GxP, Financial Model Risk Management Regulation, GDPR and CCPA Data Privacy Regulations, The New Wave of AI-Specific Regulation, The Emergence of Responsible AI, Key Elements of Responsible AI (Elements 1 to 5), A Template for MLOps Governance (Steps 1 to 8).

Textbook 2: Chapter 5 (5.1–5.4), Chapter 6 (6.1–6.3)




Suggested Learning Resources:

Textbooks:

1. Joe Reis and Matt Housley, Fundamentals of Data Engineering, O’Reilly, 2022.

2. Mark Treveil and the Dataiku Team, Introducing MLOps, O’Reilly, 2020.
