Overview

Kontakt.io is building a platform that care operations run on.
We reduce waste, cut costs, and increase revenue by improving throughput, asset utilization, and staff productivity.
Our platform uses AI, RTLS, and EHR data to enable self-learning agents to automate workflows, adapt in real-time, and orchestrate all care delivery operations.
It is easy to deploy and scale, giving a clear picture of spaces, equipment, and people to eliminate inefficiencies and enhance the patient experience.
Kontakt.io has measurable ROI and supports multiple use cases for better and faster care delivery operations.

We are looking for a Lead Software Engineer - SRE with a strong software engineering foundation and a strategic mindset to drive the reliability, scalability, and performance of our platform.
This role is part of our Infrastructure Engineering team and will shape the architecture and direction of our SRE function.
You will lead the design and build of resilient infrastructure, development automation, tooling, and fault-tolerant systems across our AWS-based platform.

You'll work hands-on in designing resilient systems, improving deployment pipelines, and driving incident management practices.
As a technical leader, you'll mentor engineers, shape technical strategy, and promote a culture of accountability, ownership, and continuous improvement.

Responsibilities

- Lead the design and implementation of scalable, fault-tolerant, and self-healing infrastructure and services across AWS and Kubernetes
- Collaborate with Product, Engineering, and Infrastructure teams to align SRE initiatives with business priorities and platform needs
- Define and drive adoption of SLIs, SLOs, and SLAs to ensure consistent performance and high reliability across the platform
- Own and evolve observability strategies using Prometheus, OpenTelemetry, Grafana, and related tooling
- Design and maintain infrastructure as code (Terraform) and drive GitOps best practices
- Oversee major incident response and on-call practices, including incident reviews and long-term remediation planning
- Mentor and support the growth of SRE and platform engineers, fostering a culture of engineering rigor and operational excellence
- Contribute to the long-term reliability roadmap and architecture of high-throughput, real-time systems in healthcare operations
- Drive process improvements in CI/CD, service ownership, chaos engineering, disaster recovery, and secure deployment

What You Bring / Qualifications

- 5+ years of experience in Site Reliability Engineering, Cloud Infrastructure, or Platform Engineering
- 5+ years of software engineering experience building production-grade systems (Java, Python, Go, or similar)
- Proven success scaling high-traffic, mission-critical platforms in SaaS, IoT, or healthcare environments
- Deep expertise in cloud platforms (especially AWS), Kubernetes, and distributed system architecture
- Hands-on experience with monitoring, logging, and observability tools (Prometheus, OpenTelemetry, Datadog, etc.)
- Extensive knowledge of CI/CD automation, GitOps workflows, and infrastructure-as-code (Terraform, Helm, ArgoCD)
- A track record of leading major incident response and running postmortems with a blameless, learning-focused approach
- Strong understanding of networking, access control, and security within regulated environments (HIPAA, SOC 2)
- A leadership mindset: able to drive cross-functional alignment, lead initiatives, and mentor a high-performance SRE team

Why You'll Love It Here

- Own Mission-Critical Reliability: Ensure hospitals and care facilities always stay online with a 99.99% uptime healthcare platform
- Scale AI-Powered Infrastructure: Work on real-time automation and self-healing cloud systems that orchestrate care delivery
- Drive Big Impact in Healthcare: Help reduce waste, optimize resources, and improve patient care with technology that delivers 10X ROI
- Automation-First Culture: Minimize manual ops with cutting-edge automation, observability, and incident response strategies
- Join a High-Performing Team: Work with top engineers, AI experts, and healthcare innovators solving real-world challenges

Ready to Build the Future of Healthcare?
Apply now and help scale the platform that care operations run on.
Software Engineer • Auckland, New Zealand