At least 3 years of commercial experience implementing, developing, or maintaining Big Data systems and data governance/data management processes.
Strong programming skills in Python (or Java/Scala): writing clean code, OOP design.
Hands-on experience with Big Data technologies such as Spark, Cloudera Data Platform, Airflow, NiFi, Docker, Kubernetes, Iceberg, Hive, Trino, or Hudi.
Excellent understanding of dimensional modeling and other data modeling techniques.
Experience implementing and deploying solutions in cloud environments.
Consulting experience with excellent communication and client-management skills, including prior direct client-facing work.
Ability to work independently and take ownership of project deliverables.
Fluent in English (at least C1 level).
Bachelor’s degree in a technical or mathematical field.
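The "clean code, OOP design" expectation above can be illustrated with a minimal sketch. Everything here (the `Record` and `RecordCleaner` names, the speed-validation rule) is invented for illustration and is not part of any named codebase:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    """One raw input row, e.g. a single vehicle telemetry reading (hypothetical fields)."""
    vehicle_id: str
    speed_kmh: float

class RecordCleaner:
    """Encapsulates validation in small, independently testable methods."""

    def __init__(self, max_speed_kmh: float = 300.0):
        self.max_speed_kmh = max_speed_kmh

    def is_valid(self, record: Record) -> bool:
        # A record needs a non-empty id and a physically plausible speed.
        return bool(record.vehicle_id) and 0 <= record.speed_kmh <= self.max_speed_kmh

    def clean(self, records: list[Record]) -> list[Record]:
        """Return only the records that pass every validation rule."""
        return [r for r in records if self.is_valid(r)]

cleaner = RecordCleaner()
raw = [Record("v1", 80.0), Record("", 50.0), Record("v2", 900.0)]
print(len(cleaner.clean(raw)))  # 1
```

The point of the shape, rather than the toy rule, is what interviewers tend to probe: typed data containers, one responsibility per class, and logic that can be unit-tested without any Spark cluster.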
Optional
Experience with an MLOps framework such as Kubeflow or MLflow.
Familiarity with Databricks, dbt or Kafka.
Your responsibilities
Develop and maintain a high-performance data processing platform for automotive data, ensuring scalability and reliability.
Design and implement data pipelines that process large volumes of data in both streaming and batch modes.
Optimize data workflows to ensure efficient data ingestion, processing, and storage using technologies such as Spark, Cloudera, and Airflow.
Work with data lake technologies (e.g., Iceberg) to manage structured and unstructured data efficiently.
Collaborate with cross-functional teams to understand data requirements and ensure seamless integration of data sources.
Monitor and troubleshoot the platform, ensuring high availability, performance, and accuracy of data processing.
Leverage cloud services (AWS) for infrastructure management and scaling of processing workloads.
Write and maintain high-quality Python (or Java/Scala) code for data processing tasks and automation.
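The ingest → process → store pattern running through these responsibilities can be sketched in plain Python. This is a toy batch illustration under stated assumptions: in the actual role this logic would run on Spark, be orchestrated by Airflow, and write to an Iceberg table rather than an in-memory dict; the field names are hypothetical:

```python
from collections import defaultdict

def ingest(rows: list[dict]) -> list[dict]:
    """Ingest step: keep only well-formed rows (here: ones with a vehicle id)."""
    return [r for r in rows if r.get("vehicle_id")]

def process(rows: list[dict]) -> dict[str, float]:
    """Process step: aggregate average speed per vehicle, as a batch job might."""
    sums: dict[str, float] = defaultdict(float)
    counts: dict[str, int] = defaultdict(int)
    for r in rows:
        sums[r["vehicle_id"]] += r["speed_kmh"]
        counts[r["vehicle_id"]] += 1
    return {v: sums[v] / counts[v] for v in sums}

def store(aggregates: dict[str, float], sink: dict) -> None:
    """Store step: write results to a sink (a dict stands in for the real table)."""
    sink.update(aggregates)

sink: dict = {}
batch = [
    {"vehicle_id": "v1", "speed_kmh": 60.0},
    {"vehicle_id": "v1", "speed_kmh": 80.0},
    {"vehicle_id": None, "speed_kmh": 10.0},
]
store(process(ingest(batch)), sink)
print(sink)  # {'v1': 70.0}
```

Keeping each stage a pure function of its input is the same design choice that makes Spark jobs and Airflow tasks easy to retry and to test in isolation.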