Data Engineering Resources.

Data pipelines, ETL, and big-data tooling: books, the foundational distributed-data papers, repositories, blogs, and the free Data Engineering Zoomcamp course.

Books

Resource | What | Link
Fundamentals of Data Engineering (Reis) | The field's basics. | site
Designing Data-Intensive Applications (Kleppmann) | Data systems bible. | book
The Data Warehouse Toolkit (Kimball) | Dimensional modeling. | book
Building Data Science Apps with FastAPI | Data applications. | site

Research Papers

Resource | What | Link
MapReduce | Distributed processing. | site
The Google File System | Distributed storage. | site
Bigtable | Distributed NoSQL. | site
Spark: Resilient Distributed Datasets | Spark core paper, NSDI '12. | pdf

GitHub Repositories

Resource | What | Link
Awesome Data Engineering | Curated resources. | repo
Apache Spark | Big data processing. | repo
Apache Airflow | Workflow orchestration. | repo
Data Engineering Zoomcamp | Free course repo. | repo

Videos & Courses

Resource | What | Link
Data Engineering Zoomcamp | Free course. | video
Apache Spark Tutorials | Spark tutorials. | video

Articles & Blogs

Resource | What | Link
Seattle Data Guy | Data engineering blog. | site
Locally Optimistic | Data team blog. | site
Data Engineering Podcast | Podcast + blog. | site
Airflow Blog | Airflow updates. | site

Resource | What | Link
Data Engineer Roadmap | Learning path. | repo
Data Engineering Cookbook | Andreas Kretz's guide. | repo
Data Engineering Wiki | Community wiki. | repo

Where to start: read Fundamentals of Data Engineering and Designing Data-Intensive Applications (DDIA), do the free Data Engineering Zoomcamp, and learn Spark and Airflow.
© cvam — written in plaintext, served warm