Control Plane - Site Reliability Engineer (Hosted Infrastructure)

Elastic, Work From Home, Wellington, New Zealand
3 days ago
Job description

Overview

Elastic is the Search AI Company, enabling real-time answers across data at scale. We integrate, scale, and evolve multi-cloud infrastructure across 4 cloud service providers (CSPs), 70+ globally distributed regions, and tens of thousands of compute instances to power Elastic Cloud. We automate with Infrastructure as Code (IaC), configuration management, and software that minimizes toil while improving reliability and efficiency for our customers.

If this kind of work excites you, your experience can help us continue delivering an outstanding customer experience across a diverse suite of cloud infrastructure.

What You Will Be Doing

  • Applying software engineering methods to automate large-scale systems administration.
  • Optimizing the lifecycle and reliability of compute across multiple cloud providers.
  • Ensuring proactive monitoring and alerting to prevent incidents before they happen.
  • Growing our global infrastructure to meet increasing scaling demands by developing and maintaining software, tooling, and automations.
  • Collaborating in an inclusive environment with a focus on Operational Excellence and constructive feedback.
  • Being part of an SRE on-call rotation responding to operational needs and incidents.

What You Bring

  • 2+ years in software engineering using Golang.
  • 2+ years operating hundreds (or more) of cloud compute instances via automated solutions.
  • 2+ years with Linux systems; proficient with terminal and shell.
  • 2+ years working with containerized services (such as Docker).
  • A customer-first approach in solving operational problems from an SRE perspective.
  • Comfortable working remotely on distributed teams.

Bonus Points

  • Experience with Terraform, Puppet, Ansible, Argo CD, Argo Workflows, CUE, Kubernetes, or other programming languages besides Golang in production environments.
  • Experience being on-call during incidents and using observability tools (e.g., Elastic Stack, Graphite, Prometheus, Influx) to diagnose issues, quantify impact, and confirm mitigations.
  • Experience designing, implementing, and engineering solutions with the Elastic Stack.

Additional Information - We Take Care Of Our People

As a distributed company, diversity drives our identity. Elastic is an equal opportunity employer and is committed to creating an inclusive culture that celebrates different perspectives, experiences, and backgrounds. We strive to have parity of benefits across regions and, while regulations differ, we believe taking care of our people is the right thing to do.

Benefits you may expect include:

  • Competitive pay based on the work you do here and not your previous salary
  • Health coverage for you and your family in many locations
  • Flexible locations and schedules for many roles
  • Generous vacation days
  • Donation matching up to $2000 (or local currency equivalent) for charitable donations
  • Up to 40 hours per year for volunteer projects
  • Parental leave (minimum of 16 weeks)

We welcome individuals with disabilities and strive to create an accessible experience. To request an accommodation, please email us. We reply within 24 business hours of submission.

Applicants have rights under federal employment laws. See the relevant posters and the Privacy Statement for more information.

Locations and Roles

New Zealand – Auckland, Christchurch, Wellington (various postings and timelines listed in job feed).
