Weekly Topics
📌 Important:
The weekly topics may be subject to minor adjustments. Some topics might be added, removed, or shifted slightly during the course based on class progress and feedback.
Introduction to Databases and Data Concepts (1 week)
Session 1
- Overview of course structure, grading, and final project.
- DIKW pyramid (Data → Information → Knowledge → Wisdom).
- Ethical considerations and bias in data.
- Data types: structured, semi-structured, unstructured.
- Levels of data analytics: descriptive, diagnostic, predictive, prescriptive.
Session 2
- Database and Application Architecture
- Basic database terminology (tables, rows, columns).
Database Models and ER Design Basics (1 week)
Session 3
- Introduction to Entity-Relationship (ER) modeling.
Session 4
- Entities, attributes, relationships, and cardinality.
- Weak entities, composite attributes, multi-valued attributes.
Relational Model and SQL Fundamentals (1 week)
Session 5
- Converting ER to a relational model.
- Normalization vs. Denormalization
Session 6
- Creating databases and tables in PostgreSQL.
- Basic SQL commands:
SELECT,INSERT,UPDATE,DELETE. - Query filtering with
WHERE, ordering withORDER BY.
Intermediate SQL and Advanced SQL Queries (3-4 weeks)
Session 7
- Aggregate functions:
COUNT,SUM,AVG,MIN,MAX. - Grouping with
GROUP BYand filtering withHAVING. - Advanced Text Search
Session 8 -> Midterm Exam
Session 9
- Set operations:
UNION,INTERSECT,EXCEPT. - Joining tables: inner join, left join, right join, full outer join.
Session 10
- Multi-table query exercises.
- Subqueries.
- Common Table Expression (CTE)
Session 11
- Window functions.
- Views
- Transactions
Session 12
- Triggers
- Overview of indexing
Session 13
- Condition Expressions
- User-defined Functions
- Denormalized Data Types
- Array
- XML
- JSON
- Cast
Introduction to NoSQL and Modern Trends (1-2 weeks)
Sessions 14–15-16
- Basics of NoSQL databases (document, key-value, graph).
- Comparing NoSQL and relational approaches.
- Use cases where NoSQL is more suitable.
- MongoDB
OLAP vs OLTP and Analytical Databases (1–2 weeks)
Session 17
- Transactional vs analytical database systems (OLTP vs OLAP).
- Use cases for each system in real-world applications.
- Data Warehouse vs Data Lake vs Lakehouse
- Introduction to DuckDB as an embedded analytical database.
- Importing and querying data directly from CSV/Parquet files.
- Performing descriptive analytics with DuckDB.
Sessions 18-19
- Combining DuckDB with Pandas for hybrid workflows.
- Summary statistics, grouping, and filtering for analysis.
- Handling missing values and outliers.
- Preparing data for visualization and reporting.
- Introduction to Power BI and Streamlit.
- Connecting databases to visualization tools.
- Building dashboards and simple interactive reports.
Big Data and Analytical Frameworks (1 week)
- Overview of big data concepts and challenges.
- Introduction to PySpark for large-scale data processing (conceptual).
- When and why to scale beyond single-node databases.
Vector Databases and AI Search (Overview) (1 week)
- Concept of vector embeddings and semantic search.
- Introduction to pgvector and similarity queries in PostgreSQL.
- Brief mention of retrieval-augmented generation (RAG) use cases.
