What Is Airflow? How to Use It to Automate EMR/Spark/ETL Jobs and MLOps/ML Training/ML Testing

Airflow is a tool used to automate jobs, especially data engineering pipelines and, if needed, ML training and MLOps.

The Airflow library is used from Python. Airflow organizes work into tasks, which are grouped into workflows (called DAGs in Airflow). Tasks can be scheduled on a time basis or triggered by events. Airflow displays all tasks visually in a web interface. Airflow can automate database creation and updates, and it can trigger EMR Spark/PySpark jobs.

Airflow can also trigger an MLflow job in response to events. For example, once an EMR PySpark job finishes extracting data, Airflow can call PySpark/Python code that uses MLflow to train and test an ML model. At the end, the MLflow dashboard shows the results of ML training and ML testing. MLflow is used for monitoring and visualizing ML training and testing.
