Data Engineer with Databricks & PySpark

Capgemini Polska

Gdańsk, Oliwa (+5 more locations)
hybrid
📊 Databricks
PySpark
🐍 Python
☁️ AWS
☁️ Microsoft Azure
🔍 Google Cloud Platform
⚙️ Backend
Cloud computing
🔄 DevOps
☁️ AWS S3
☁️ AWS Glue
☁️ AWS EMR
☁️ AWS Lambda
☁️ AWS Redshift
☁️ Azure

Requirements

Expected technologies

  • Databricks
  • PySpark
  • Python
  • AWS
  • Microsoft Azure
  • Google Cloud Platform

Optional technologies

  • SQL
  • Kafka
  • Terraform
  • Apache Spark
  • Snowflake Data Cloud

Our requirements

  • You have hands-on experience in data engineering and are comfortable working independently on moderately complex tasks.
  • You’ve worked with Databricks and PySpark in real-world projects.
  • You’re proficient in Python for data transformation and automation.
  • You’ve used at least one cloud platform (AWS, Azure, or GCP) in a production environment.
  • You communicate clearly and confidently in English.

Optional

  • Solid SQL skills and understanding of data modeling.
  • Exposure to CI/CD pipelines, Terraform, or other DevOps tools.
  • Familiarity with streaming technologies (e.g., Kafka, Spark Streaming).
  • Knowledge of cloud data storage solutions (e.g., Data Lake, Snowflake, Synapse).
  • Relevant certifications (e.g., Databricks Certified Data Engineer Associate).

Your responsibilities

  • Develop and maintain data processing pipelines using Databricks and PySpark (see the illustrative sketch after this list).
  • Collaborate with senior engineers and architects to implement scalable data solutions.
  • Work with cloud-native tools to ingest, transform, and store large datasets.
  • Ensure data quality, consistency, and security in cloud environments.
  • Participate in code reviews and contribute to continuous improvement initiatives.
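For candidates wondering what this kind of work looks like in practice, below is a minimal, hypothetical PySpark sketch of an ingest-transform-store pipeline like the one described above. The storage path, table name, and column names are illustrative assumptions, not details of any actual Capgemini project.

```python
# Minimal, hypothetical PySpark pipeline: ingest raw CSV files from
# cloud storage, clean them, and store the result as a Delta table.
# The path, table name, and columns below are illustrative, not real.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("example-pipeline").getOrCreate()

# Ingest: read raw events from object storage (example S3 path).
raw = spark.read.option("header", True).csv("s3://example-bucket/raw/events/")

# Transform: cast types and drop records missing key fields.
cleaned = (
    raw.withColumn("event_ts", F.to_timestamp("event_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .dropna(subset=["event_ts", "amount"])
)

# Store: write a managed Delta table for downstream consumers.
cleaned.write.format("delta").mode("overwrite").saveAsTable("analytics.events_clean")
```

On Databricks, Delta is the default table format, so the same ingest-transform-store pattern applies whether the underlying storage is AWS, Azure, or GCP, with only the storage path adjusted.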

Published: 3 days ago
Expires: in 20 days
Work mode: hybrid