At least 3 years of commercial experience implementing, developing, or maintaining Big Data systems and data governance/data management processes.
Strong programming skills in Python (or Java/Scala): writing clean code, OOP design.
Hands-on experience with Big Data technologies such as Spark, Cloudera Data Platform, Airflow, NiFi, Docker, Kubernetes, Iceberg, Hive, Trino, or Hudi.
Excellent understanding of dimensional modeling and other data modeling techniques.
Experience implementing and deploying solutions in cloud environments.
Consulting experience with excellent communication and client-management skills, including prior direct client-facing work.
Ability to work independently and take ownership of project deliverables.
Fluent in English (at least C1 level).
Bachelor’s degree in a technical or mathematical field.
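The "clean code, OOP design" expectation above can be illustrated with a minimal sketch. Everything here (the `Record` and `RecordCleaner` names, the speed-validation rule) is invented for illustration and is not part of any named codebase:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    """One raw input row, e.g. a single vehicle telemetry reading (hypothetical fields)."""
    vehicle_id: str
    speed_kmh: float

class RecordCleaner:
    """Encapsulates validation in small, independently testable methods."""

    def __init__(self, max_speed_kmh: float = 300.0):
        self.max_speed_kmh = max_speed_kmh

    def is_valid(self, record: Record) -> bool:
        # A record needs a non-empty id and a physically plausible speed.
        return bool(record.vehicle_id) and 0 <= record.speed_kmh <= self.max_speed_kmh

    def clean(self, records: list[Record]) -> list[Record]:
        """Return only the records that pass every validation rule."""
        return [r for r in records if self.is_valid(r)]

cleaner = RecordCleaner()
raw = [Record("v1", 80.0), Record("", 50.0), Record("v2", 900.0)]
print(len(cleaner.clean(raw)))  # 1
```

The point of the shape, rather than the toy rule, is what interviewers tend to probe: typed data containers, one responsibility per class, and logic that can be unit-tested without any Spark cluster.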
Optional
Experience with an MLOps framework such as Kubeflow or MLflow.
Familiarity with Databricks, dbt or Kafka.
Your responsibilities
Develop and maintain a high-performance data processing platform for automotive data, ensuring scalability and reliability.
Design and implement data pipelines that process large volumes of data in both streaming and batch modes.
Optimize data workflows to ensure efficient data ingestion, processing, and storage using technologies such as Spark, Cloudera, and Airflow.
Work with data lake technologies (e.g., Iceberg) to manage structured and unstructured data efficiently.
Collaborate with cross-functional teams to understand data requirements and ensure seamless integration of data sources.
Monitor and troubleshoot the platform, ensuring high availability, performance, and accuracy of data processing.
Leverage cloud services (AWS) for infrastructure management and scaling of processing workloads.
Write and maintain high-quality Python (or Java/Scala) code for data processing tasks and automation.
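The ingest → process → store pattern running through these responsibilities can be sketched in plain Python. This is a toy batch illustration under stated assumptions: in the actual role this logic would run on Spark, be orchestrated by Airflow, and write to an Iceberg table rather than an in-memory dict; the field names are hypothetical:

```python
from collections import defaultdict

def ingest(rows: list[dict]) -> list[dict]:
    """Ingest step: keep only well-formed rows (here: ones with a vehicle id)."""
    return [r for r in rows if r.get("vehicle_id")]

def process(rows: list[dict]) -> dict[str, float]:
    """Process step: aggregate average speed per vehicle, as a batch job might."""
    sums: dict[str, float] = defaultdict(float)
    counts: dict[str, int] = defaultdict(int)
    for r in rows:
        sums[r["vehicle_id"]] += r["speed_kmh"]
        counts[r["vehicle_id"]] += 1
    return {v: sums[v] / counts[v] for v in sums}

def store(aggregates: dict[str, float], sink: dict) -> None:
    """Store step: write results to a sink (a dict stands in for the real table)."""
    sink.update(aggregates)

sink: dict = {}
batch = [
    {"vehicle_id": "v1", "speed_kmh": 60.0},
    {"vehicle_id": "v1", "speed_kmh": 80.0},
    {"vehicle_id": None, "speed_kmh": 10.0},
]
store(process(ingest(batch)), sink)
print(sink)  # {'v1': 70.0}
```

Keeping each stage a pure function of its input is the same design choice that makes Spark jobs and Airflow tasks easy to retry and to test in isolation.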