Data Engineer with AWS & PySpark

Capgemini Polska

Gdańsk, Oliwa (+5 more locations) · hybrid
Categories: Backend, Cloud computing, DevOps, Databricks, AWS, Microsoft Azure, Business analytics

Requirements

Expected technologies

  • PySpark
  • AWS S3
  • AWS Glue
  • AWS EMR
  • AWS Lambda
  • AWS Redshift
  • Python
  • Azure
  • Google Cloud Platform

Optional technologies

  • SQL
  • Terraform
  • CloudFormation
  • Kafka
  • Kinesis

Our requirements

  • You have hands-on experience in data engineering and can independently handle moderately complex tasks.
  • You’ve worked with PySpark in distributed data processing scenarios.
  • You are familiar with AWS data services such as S3, Glue, EMR, Lambda, Redshift, or similar (see the sketch after this list).
  • You have experience with a second major cloud platform (Azure or GCP).
  • You are proficient in Python for ETL and automation tasks.
  • You communicate clearly and confidently in English.
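
For illustration only, a minimal PySpark sketch of the kind of pipeline this role covers: reading raw CSV from S3 and writing partitioned Parquet back. The bucket names, paths, and column names are hypothetical placeholders, not details from the posting.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Hypothetical S3 locations, for illustration only.
    RAW_PATH = "s3://example-raw-bucket/events/"
    CURATED_PATH = "s3://example-curated-bucket/events/"

    spark = SparkSession.builder.appName("events-etl").getOrCreate()

    # Read raw CSV and derive a date column to partition on.
    df = (
        spark.read.option("header", "true").csv(RAW_PATH)
        .withColumn("event_date", F.to_date("event_timestamp"))
    )

    # Write curated Parquet, partitioned by day.
    df.write.mode("overwrite").partitionBy("event_date").parquet(CURATED_PATH)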

Optional

  • Strong SQL skills and understanding of data modeling.
  • Exposure to CI/CD pipelines, Terraform, or CloudFormation.
  • Familiarity with streaming technologies like Kafka or Kinesis (a minimal sketch follows this list).
  • AWS certifications.
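
As a flavor of the optional streaming stack, a minimal boto3 sketch that publishes one record to a Kinesis stream, assuming standard AWS credentials are configured; the stream name and record fields are hypothetical.

    import json

    import boto3

    # Hypothetical stream name, for illustration only.
    STREAM_NAME = "example-events-stream"

    kinesis = boto3.client("kinesis")
    record = {"user_id": 42, "action": "login"}

    # The partition key determines which shard receives the record.
    kinesis.put_record(
        StreamName=STREAM_NAME,
        Data=json.dumps(record).encode("utf-8"),
        PartitionKey=str(record["user_id"]),
    )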

Your responsibilities

  • Design and build data pipelines using AWS services and PySpark.
  • Process large-scale datasets efficiently and reliably.
  • Collaborate with architects and team members to implement scalable data solutions.
  • Ensure data quality, consistency, and security in cloud environments (see the validation sketch after this list).
  • Participate in code reviews and contribute to continuous improvement efforts.
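
On the data-quality point, a hedged sketch of a simple pre-publish validation step in PySpark; the required column names are assumptions for illustration.

    from pyspark.sql import functions as F

    def validate(df, required_cols=("event_id", "event_date")):
        """Fail fast if required columns are missing or contain nulls."""
        missing = [c for c in required_cols if c not in df.columns]
        if missing:
            raise ValueError(f"Missing columns: {missing}")
        # Count nulls per required column in a single global aggregation.
        null_counts = df.select(
            [F.sum(F.col(c).isNull().cast("int")).alias(c) for c in required_cols]
        ).first()
        bad = {c: null_counts[c] for c in required_cols if null_counts[c]}
        if bad:
            raise ValueError(f"Null values found per column: {bad}")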

Published 3 days ago · Expires in 20 days