Lead Data Engineer (Spark) (Remote)

Addepto

Warszawa +8 more
31920 zł/mth.
Remote
🐍 Python
SQL
Spark
🤖 Airflow
☁️ AWS
Cloudera
🐳 Docker
Java
Scala
🌐 Remote
📊 Big Data
Apache Spark
🚢 Kubernetes
Iceberg

Requirements

Expected technologies

Python

SQL

Spark

Airflow

AWS

Cloudera

Docker

Java

Scala

Optional technologies

Kubernetes

Kafka

Hadoop

Iceberg

Operating system

Windows

macOS

Our requirements

  • 5+ years of proven commercial experience in implementing, developing, or maintaining Big Data systems.
  • Strong programming skills in Python or Java/Scala: writing clean code and applying OOP design.
  • Experience in designing and implementing data governance and data management processes.
  • Familiarity with Big Data technologies like Spark, Cloudera, Airflow, NiFi, Docker, Kubernetes, Iceberg, Trino or Hudi.
  • Proven expertise in implementing and deploying solutions in cloud environments (with a preference for AWS).
  • Excellent understanding of dimensional data and data modeling techniques.
  • Excellent communication skills and consulting experience involving direct interaction with clients.
  • Ability to work independently and take ownership of project deliverables.
  • Master’s or Ph.D. in Computer Science, Data Science, Mathematics, Physics, or a related field.
  • Fluent English (C1 level) is a must.

Your responsibilities

  • Design and develop scalable data management architectures, infrastructure, and platform solutions for streaming and batch processing using Big Data technologies such as Apache Spark, Hadoop, and Iceberg.
  • Design and implement data management and data governance processes and best practices.
  • Contribute to the development of CI/CD and MLOps processes.
  • Develop applications to aggregate, process, and analyze data from diverse sources.
  • Collaborate with the Data Science team on data analysis and Machine Learning projects, including text/image analysis and predictive model building.
  • Develop and organize data transformations using dbt and Apache Airflow.
  • Translate business requirements into technical solutions and ensure optimal performance and quality.
Views: 2
Published: 2 days ago
Expires: in 18 days
Work mode: Remote