Set up Apache Airflow on Ubuntu to run multiple DAGs and tasks in parallel using MySQL
Ubuntu || Apache Airflow || MySQL
By default, Airflow uses a SQLite database to store its metadata. SQLite doesn't support concurrent writes, so out of the box tasks can only execute sequentially. Here I'm going to switch to MySQL to enable parallel execution.
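You can see the limitation for yourself with Python's built-in sqlite3 module. This is just an illustrative sketch (the file path and table name are arbitrary): while one connection holds a write lock, a second writer is refused, which is exactly why a SQLite-backed Airflow can't run tasks in parallel.

```python
import os
import sqlite3
import tempfile

# A throwaway database file standing in for Airflow's metadata DB
path = os.path.join(tempfile.mkdtemp(), "meta.db")

# isolation_level=None gives explicit transaction control;
# a short timeout makes the lock failure immediate instead of blocking
a = sqlite3.connect(path, timeout=0.1, isolation_level=None)
b = sqlite3.connect(path, timeout=0.1, isolation_level=None)

a.execute("CREATE TABLE t (x INTEGER)")
a.execute("BEGIN IMMEDIATE")           # connection a takes the write lock
a.execute("INSERT INTO t VALUES (1)")

locked = False
try:
    b.execute("BEGIN IMMEDIATE")       # connection b tries to write too...
except sqlite3.OperationalError:       # ...and gets "database is locked"
    locked = True

print(locked)  # True -- SQLite allows only one writer at a time
a.execute("COMMIT")
```

A server database like MySQL handles many concurrent connections, so the executor's worker processes can all talk to the metadata DB at once.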
MySQL Installation Guide ==> MySQL-Setup
Airflow-MySQL Setup:
- Open a terminal and log in to MySQL
- mysql -u root -p
- mysql> CREATE DATABASE airflow;
- mysql> CREATE USER 'airflow'@'localhost' IDENTIFIED BY 'airflow';
- mysql> GRANT ALL PRIVILEGES ON airflow.* TO 'airflow'@'localhost';
- mysql> FLUSH PRIVILEGES;
- Airflow needs a home; ~/airflow is the default, but you can lay the foundation somewhere else if you prefer
export AIRFLOW_HOME=~/airflow
- Install Airflow using pip
sudo pip install apache-airflow
- To talk to MySQL, Airflow also needs a MySQL driver; the mysql extra pulls one in
sudo pip install 'apache-airflow[mysql]'
- Create a subfolder for your DAGs
mkdir ~/airflow/dags
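To confirm the folder is picked up later, you can drop a minimal DAG into it. This is just a sketch assuming the Airflow 1.x API (the same era as the initdb CLI used below); the file name, dag_id, and task commands are all arbitrary:

```python
# ~/airflow/dags/hello_dag.py -- a minimal example DAG
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator  # Airflow 1.x import path

dag = DAG(
    dag_id="hello_dag",
    start_date=datetime(2019, 1, 1),
    schedule_interval="@daily",
)

# Two independent tasks with no dependency between them;
# with the LocalExecutor they can run in parallel
t1 = BashOperator(task_id="print_date", bash_command="date", dag=dag)
t2 = BashOperator(task_id="say_hello", bash_command="echo hello", dag=dag)
```

Because t1 and t2 have no dependency between them, the scheduler is free to run both at once — which is the whole point of moving off SQLite.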
Change the Airflow configuration for parallel execution
- Open airflow.cfg, which lives in your Airflow home directory (~/airflow by default)
Change the executor and the database connection
- executor = LocalExecutor
- sql_alchemy_conn = mysql://airflow:airflow@localhost:3306/airflow
- Note: Airflow's MySQL backend may also require explicit_defaults_for_timestamp = 1 in your MySQL server config
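The pieces of that connection string map directly onto what was created in the MySQL steps above. As a quick illustration (plain Python stdlib, nothing Airflow-specific), here's how the URI breaks down:

```python
from urllib.parse import urlparse

# The SQLAlchemy-style URI from airflow.cfg, split into its parts
uri = urlparse("mysql://airflow:airflow@localhost:3306/airflow")

print(uri.scheme)              # dialect: mysql
print(uri.username)            # the user created above: airflow
print(uri.password)            # that user's password: airflow
print(uri.hostname, uri.port)  # server location: localhost 3306
print(uri.path.lstrip("/"))    # database name: airflow
```

If any of those values differ in your setup (a different password, a remote host), adjust the corresponding segment of the URI.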
Initialize the database
airflow initdb
Start the web server, default port is 8080
airflow webserver -D
- Start the scheduler
airflow scheduler -D
- Visit localhost:8080 in your browser to open the Airflow UI.