Strong experience in leading an SRE/DevOps team in an agile working environment.
Self-starter focused on deliverables, with a bias for action and a flexible, agile attitude towards responsibilities and change.
Strong hands-on experience in designing and maintaining mission-critical Java & Linux-based application environments on-premises and in the cloud (ideally AWS/GCP).
Strong hands-on experience in automating support and delivery operations with Terraform/Ansible/Puppet, CI/CD pipelines and scripts, including building observability with Grafana, Prometheus, Splunk, AppDynamics or similar tools.
Strong experience in collaborating with globally dispersed, cross-functional teams and vendors in project delivery and incident & problem management. Excellent oral and written communication and presentation skills. Able to perform out-of-hours production support and project delivery duties.
Strong hands-on experience in performance optimization and troubleshooting issues on Oracle & PostgreSQL databases, Network, Storage, Linux OS & JVMs.
Good full-stack engineering skills in some of these programming languages: Java, Python, Node.js, Go.
Experience in maintaining Atlassian applications in a large enterprise and migrating them from on-premises to Cloud/SaaS is preferable but not required.
Your responsibilities
Work as a DevOps Engineer/SRE for our global, highly available (24x7) deployment of Jira and Confluence.
Design, implement and maintain a highly available, performant, scalable and secure Atlassian Jira and Confluence estate.
Automate deployment, monitoring, recovery and other support operations of Atlassian applications using infrastructure-as-code tools, CI/CD pipelines and programming scripts.
Lead incident response, root cause analysis and post-incident reviews with our global Atlassian team, infrastructure teams (Network, Linux, Database, Storage, etc.) and application vendors.
Work in partnership with the product owner to define operational metrics (KPIs/SLIs/SLOs); build & maintain observability to continuously monitor the operational performance; create & execute action plans to address failures to meet the desired metrics.
Plan and execute application/infrastructure migration, DR failover/testing, and upgrade activities.
Review user support queries and enhance automation and self-service capability to improve user experience and reduce manual toil in user support.
Coach junior team members to uplift the team’s DevOps and SRE capabilities and strengthen collaboration. Support the team’s hiring and people management activities.
Published: 8 days ago
Expires: in 5 days
Work mode: hybrid