Build optimized data pipelines that the data science team can leverage for ML, data analysis, data mining, ETL, feature engineering, and data modelling
Very strong programming skills in Python and SQL
Good knowledge of Spark and Spark Streaming
Good database design skills
Experience working on cloud platforms (e.g., GCP, AWS)
In-depth understanding of fundamental data structures and algorithms as they apply to Big Data
Exposure to at least one data warehouse or data lake product
Understanding of data security
Some exposure to Airflow, Azkaban, or similar workflow management tools