Solution Architect – Site Reliability (SRE) & Observability | f/m/d

ERGO Technology & Services S.A.

Warsaw
Solution Architect
Site Reliability Engineering
Observability
IaC
Terraform
☁️ Azure
🚢 Kubernetes
📊 Databricks
English fluency
German (nice to have)

Summary

Solution Architect – SRE & Observability (full-time) in Warsaw/Gdańsk. Responsible for the strategic vision, design, and governance of observability, IaC, and monitoring, for mentoring SRE teams, and for negotiating with vendors. Requires strong experience in SRE, observability tools, Terraform, and Azure/K8s, plus fluent English; German is a plus.

Keywords

Solution Architect, Site Reliability Engineering, Observability, IaC, Terraform, Azure, Kubernetes, Databricks, English fluency, German (nice to have)

Benefits

  • medical package
  • sports card and sports sections
  • flexible working hours
  • confidential employee assistance program
  • possibility of remote work
  • gaming room and dog-friendly office in Warsaw
  • workshops and training courses, hackathons, meetups
  • e-learning platforms and language courses
  • CSR activities
  • bike races, soccer matches, film marathons in the company cinema room
  • diverse and inclusive work environment

Job description

What you will do

As a Solution Architect, you will be responsible for defining the strategic direction of the Site Reliability Engineering (SRE) service, including observability and monitoring. This role focuses on making architectural decisions, designing integrations, ensuring best practices, and advising SRE engineers and customer teams on how to automate their service operations and use observability tools (e.g. Datadog) effectively.

How you will get the job done

  • defining the strategic vision for site reliability engineering, observability, and platform engineering, and planning tactical steps for implementation
  • leading the design and governance of automated service operations and observability tooling, ensuring scalability, security, and cost efficiency
  • scouting and analysing new observability features – matching them to business needs and notifying the engineers about potential improvements
  • designing collaboration, automation and integration models
  • defining standards and best practices for automated service operations and the observability framework, including alerting, SLOs, and distributed tracing across digital products
  • configuring, integrating, administering, and maintaining observability for all relevant digital products, using Infrastructure as Code (IaC)
  • ensuring comprehensive monitoring coverage across digital products
  • supporting, advising, and coaching SRE engineers on the best ways to automate service operations and use observability tools
  • supporting SRE engineers in troubleshooting and optimizing monitoring configurations
  • guiding and mentoring engineers in implementing provisioning and configuration of observability tools using Infrastructure as Code
  • engaging with the observability tool vendors to discuss complex technical issues and feature enhancements
  • answering technical questions from product teams
  • negotiating technical aspects of observability tools during procurement discussions to ensure optimal setup

What we offer

Let's be healthy – medical package, sports card, and numerous sports sections – these are some of the benefits that help our employees stay in good shape.

Let's be balanced – work-life balance is a key aspect of a healthy workplace. We offer our employees flexible working hours, a confidential employee assistance program, and the possibility of remote working. However, with our in-office gaming room and dog-friendly office in Warsaw, staying at home won't be easy.

Let's be smart – we organize numerous workshops and training courses. Thanks to hackathons and meetups, our specialists share their expertise with others. Additionally, we have a wide range of digital learning platforms and language courses.

Let's be responsible – each year, we participate in several CSR activities, during which, together with our colleagues, we do our best to create a better future.

Let's be fun – company-wide bike races and soccer matches, film marathons in our cinema room, and other engaging team-building activities – we've got it covered!

Let's be diverse – every team member is valued, regardless of gender, nationality, religious beliefs, disability, age, or sexual orientation or identity. Your qualifications, experience, and mindset are our greatest benefit!

Requirements

  • fluency in English
  • strong Site Reliability Engineering (SRE), Platform Engineering and Observability Architecture experience
  • expertise in observability tools (architecture, governance, integrations, APM, security best practices) and automating service operations
  • strong Infrastructure as Code (IaC) knowledge and experience (e.g. Terraform)
  • experience designing log management, APM, infrastructure monitoring, and synthetic testing solutions
  • knowledge of distributed tracing, metrics, and telemetry collection
  • familiarity with cloud environments (Azure, Kubernetes, Databricks)
  • strong strategic thinking and vision-setting for observability and reliability
  • excellent stakeholder communication and coaching abilities
  • experience negotiating with vendors and external service providers
  • ability to lead and mentor engineers, ensuring effective implementation of observability tooling

Nice to have

  • German language proficiency
