Overview
About Docuvera
Our mission is to lead the digital transformation of the Life Science industry, drastically improving time to market, productivity, patient safety, and patient outcomes.
Docuvera is a global SaaS leader in component-based structured content authoring, transforming how pharmaceutical companies create, manage, and reuse product-related content.
Our platform empowers life sciences organizations to drive digital transformation through efficient, compliant, and multiformat content creation.
Headquartered in Wellington, New Zealand, our distributed team works across New Zealand, Asia, and the United States, with customers throughout the world.
The part you'll play
You'll design, build, and run the AI infrastructure that powers Docuvera's enterprise AI and our AI-driven content management platform.
Day to day, you'll turn machine-learning models into reliable products, build scalable data pipelines, and provide the foundation for AI-powered workflows.
You'll lead MLOps practices, stand up vector databases and knowledge graphs, and work closely with data scientists and engineers to deploy and monitor models in production.
Your work enables key programs like company-wide knowledge assistants, AI-assisted bug triage, and intelligent content curation.
You'll automate ML workflows, keep AI services highly available, and ensure everything meets life-sciences regulations (FDA 21 CFR Part 11, GxP).
What you'll focus on
AI infrastructure & MLOps platforms
Design, deploy, and operate AI/ML infrastructure on AWS (e.g., Bedrock, SageMaker, Lambda, S3, Aurora PostgreSQL, DynamoDB, EventBridge, SQS, EKS, ECS, CloudFormation/CDK).
Use the right data store for the job: relational workloads on Aurora PostgreSQL (read replicas, point-in-time recovery, RDS Proxy) and low-latency key-value workloads on DynamoDB (with DAX when helpful).
Set up and manage vector databases (OpenSearch, Pinecone, or Milvus) and a graph database (Neptune) to support semantic search and enterprise knowledge.
ML pipeline development
Build end-to-end MLOps pipelines with SageMaker Pipelines, MLflow, or Kubeflow for training, validation, versioning, and deployment.
Orchestrate event-driven ML lifecycles with EventBridge; trigger retraining on data-quality signals or upstream changes; decouple components with SQS (FIFO/Standard) and DLQs.
Automate safe releases with A/B tests, canary deployments, and fast rollbacks.
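To make the canary-release idea concrete, here is a minimal, self-contained sketch of the promote-or-rollback decision a release pipeline might automate. The function name, thresholds, and the "wait / promote / rollback" states are illustrative assumptions, not Docuvera's actual release logic.

```python
def canary_decision(baseline, canary, max_ratio=1.5, min_samples=100):
    """Decide whether to promote a canary model version or roll back.

    baseline / canary: (error_count, request_count) tuples for each fleet.
    Promote only when the canary has seen enough traffic and its error
    rate stays within max_ratio of the baseline's.
    """
    b_err, b_req = baseline
    c_err, c_req = canary
    if c_req < min_samples:
        return "wait"  # not enough canary traffic to judge yet
    b_rate = b_err / b_req if b_req else 0.0
    c_rate = c_err / c_req if c_req else 0.0
    if b_rate == 0.0:
        # Baseline is error-free: any canary error is a regression.
        return "promote" if c_rate == 0.0 else "rollback"
    return "promote" if c_rate <= b_rate * max_ratio else "rollback"
```

In practice a pipeline would feed this from live metrics (e.g., CloudWatch) and wire "rollback" to an automated traffic shift back to the previous model version.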
Data engineering for AI
Ingest and prepare data from tools like Confluence, Jira, GitLab, Slack, and SharePoint.
Build real-time sync using AWS Glue, Lambda, EventBridge (buses/schemas), and SQS (including SNS-to-SQS fan-out and back-pressure handling).
Design reliable data flows: idempotent processing, outbox/CDC patterns, and exactly-once semantics where needed (Aurora logical decoding/CDC, DynamoDB Streams into EventBridge/SQS).
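The idempotent-processing pattern above can be sketched in a few lines. This toy consumer deduplicates by message id before applying the handler; the in-memory set stands in for a durable idempotency store (such as a DynamoDB conditional put), and the class and field names are illustrative assumptions.

```python
class IdempotentConsumer:
    """Toy SQS-style worker that processes each message id at most once,
    turning at-least-once delivery into effectively-once processing."""

    def __init__(self, handler):
        self.handler = handler
        self._seen = set()  # stands in for a durable idempotency table

    def consume(self, message):
        msg_id = message["id"]
        if msg_id in self._seen:
            return "skipped"  # duplicate redelivery: side effect already applied
        self.handler(message["body"])
        self._seen.add(msg_id)  # production: record id atomically with the side effect
        return "processed"
```

The key design point is that the "seen" record and the side effect must be committed together (the outbox pattern), otherwise a crash between the two reintroduces duplicates.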
Vector search & retrieval
Deploy and tune vector databases and embedding pipelines for semantic search.
Implement retrieval-augmented generation (RAG) with Bedrock Knowledge Bases, using metadata-based access control and near-real-time content updates via EventBridge and SQS.
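A tiny sketch of the retrieval side of RAG with metadata-based access control: documents carry an allowed-groups attribute, the access filter is applied before similarity ranking, and cosine similarity stands in for the vector database's nearest-neighbour search. All names and the document shape are assumptions for illustration, not the Bedrock Knowledge Bases API.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec, docs, user_groups, top_k=2):
    """Return the top_k most similar documents the user may see.

    docs: list of {"text", "embedding", "allowed_groups"} dicts.
    Access control runs *before* ranking, so restricted content never
    enters the candidate set handed to the LLM."""
    visible = [d for d in docs if d["allowed_groups"] & user_groups]
    ranked = sorted(visible,
                    key=lambda d: cosine(query_vec, d["embedding"]),
                    reverse=True)
    return [d["text"] for d in ranked[:top_k]]
```

Filtering before ranking (rather than redacting afterwards) is what keeps unauthorized content out of the prompt entirely.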
Model deployment & monitoring
Containerize and ship models with Docker; run on Kubernetes (EKS), ECS, and serverless where it fits.
Monitor model performance, drift, and quality with CloudWatch and New Relic.
Track SQS health (oldest message age, queue depth, visibility timeout), EventBridge failures, and Aurora performance (Performance Insights, enhanced monitoring).
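The queue-health checks above reduce to threshold rules over a few metrics. Here is a minimal sketch of such an evaluator; the metric keys mirror SQS concepts (oldest message age, queue depth, DLQ depth), but the thresholds and alarm names are illustrative assumptions, not recommendations.

```python
def queue_alerts(metrics, max_oldest_age_s=900, max_depth=10_000):
    """Evaluate SQS-style health metrics and return the alarms that fire.

    metrics: {"oldest_message_age_s": ..., "queue_depth": ..., "dlq_depth": ...}
    """
    alarms = []
    if metrics.get("oldest_message_age_s", 0) > max_oldest_age_s:
        alarms.append("stale-messages")  # consumers are falling behind
    if metrics.get("queue_depth", 0) > max_depth:
        alarms.append("backlog")         # sustained back-pressure
    if metrics.get("dlq_depth", 0) > 0:
        alarms.append("dead-letters")    # poison messages need triage
    return alarms
```

In production these rules would live as CloudWatch alarms rather than application code, but the logic is the same.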
Knowledge graph operations
Build and maintain Neptune graphs linking documents, code, tickets, and business entities.
Optimize graph queries and relationships to enable cross-domain reasoning.
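Cross-domain reasoning over such a graph is, at its core, multi-hop traversal: from a ticket to the commits that closed it to the documents those commits touched. This sketch uses a plain adjacency dict and breadth-first search; a Neptune Gremlin traversal would play the same role in production, and the entity names are made up for illustration.

```python
from collections import deque

def related_entities(graph, start, max_hops=2):
    """Breadth-first walk over a toy knowledge graph (adjacency dict),
    returning everything reachable within max_hops of `start`."""
    seen = {start}
    frontier = deque([(start, 0)])
    found = []
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # don't expand past the hop budget
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                found.append(neighbour)
                frontier.append((neighbour, depth + 1))
    return found
```

Bounding the hop count is the simplest lever for keeping graph queries fast as the graph grows.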
Compliance & security for AI
Implement AI-specific controls: prompt/response logging, audit trails, and data classification for regulated workflows.
Enforce encryption in transit and at rest (KMS) across Aurora, DynamoDB, SQS, and EventBridge.
Apply least-privilege IAM, resource policies for event buses, VPC endpoints/PrivateLink, and database auditing (e.g., pgaudit).
Define data residency, retention, backups, and PITR across databases, object storage, and queues in line with FDA 21 CFR Part 11 and GxP.
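One way to make prompt/response audit trails tamper-evident is hash chaining: each record includes the previous record's hash, so any later edit breaks the chain. The sketch below illustrates the idea only; the record fields are assumptions, and nothing here constitutes a 21 CFR Part 11 compliance claim.

```python
import hashlib
import json
from datetime import datetime, timezone

def audit_record(prev_hash, user, prompt, response, classification):
    """Build one tamper-evident audit entry for an AI interaction."""
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "prompt": prompt,
        "response": response,
        "classification": classification,
        "prev": prev_hash,  # chains this record to its predecessor
    }
    # Hash the canonical (sorted-key) JSON form of the entry itself.
    entry["hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    return entry
```

Verification is then a linear scan: recompute each hash and check it matches the next record's "prev" field.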
CI/CD for models & data platforms
Extend GitLab CI/CD for MLOps: automated tests, validation gates, and deployment orchestration.
Manage infrastructure as code with AWS CDK, CloudFormation, or Terraform (Aurora clusters/subnets/proxies, DynamoDB tables, EventBridge buses/rules/targets, SQS queues/DLQs) with environment guardrails.
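"Environment guardrails" usually means policy checks run against synthesised IaC output before deployment. This toy validator shows the shape of such a check; the resource schema and the three rules are illustrative assumptions, not Docuvera's actual policy set.

```python
def check_guardrails(env, resources):
    """Flag IaC resource configs that violate simple environment rules.

    resources: list of dicts roughly as a template synthesiser might emit,
    e.g. {"name": "q1", "type": "sqs_queue", "encrypted": True, "dlq": "q1-dlq"}.
    Returns a list of human-readable violation strings (empty = pass).
    """
    violations = []
    for r in resources:
        if not r.get("encrypted", False):
            violations.append(f"{r['name']}: encryption at rest required")
        if r["type"] == "sqs_queue" and "dlq" not in r:
            violations.append(f"{r['name']}: queue needs a dead-letter queue")
        if env == "prod" and r["type"] == "aurora_cluster" and not r.get("deletion_protection"):
            violations.append(f"{r['name']}: prod clusters need deletion protection")
    return violations
```

Hooked into a GitLab CI validation gate, a non-empty return value would fail the pipeline before anything reaches production.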
What you'll bring to the role
Some or all of the following technical skills, experience, and knowledge
AWS AI/ML & data: Bedrock, SageMaker, Lambda, S3, Aurora PostgreSQL (Aurora/RDS), DynamoDB, EventBridge, SQS, EKS, ECS, Neptune, OpenSearch; hands-on architecture and cost tuning.
Relational data & PostgreSQL: Strong SQL, schema design, indexing/partitioning, query tuning, connection management (RDS Proxy), HA/DR (Multi-AZ, read replicas, PITR), CDC/outbox patterns.
MLOps platforms: MLflow, Kubeflow, SageMaker Pipelines for lifecycle management, experiment tracking, and automated deployments.
Event-driven systems: EventBridge (rules, schedules, schema registry) and SQS (FIFO/Standard, DLQs, ordering/deduplication) for loosely coupled services at scale.
Vector search & RAG: Implementing and tuning Pinecone/Milvus/Weaviate/OpenSearch and embedding workflows in production RAG systems.
Data pipelines: Real-time ingestion with Glue, Kinesis, Lambda; integrating enterprise APIs/webhooks; EventBridge buses and SQS workers for reliable, idempotent processing.
Containers & Kubernetes: Docker, EKS, and serverless model serving; autoscaling for AI workloads.
Graph databases: Neptune or Neo4j with Gremlin/Cypher/SPARQL.
Programming & automation: Python/Bash and IaC (CDK, Terraform, CloudFormation).
Model operations: Deploying and monitoring LLMs, embeddings, and custom ML models with performance optimization.
Enterprise integration: Model Context Protocol (MCP), API gateways, and connectors for systems like Confluence, Jira, and SharePoint.
Observability & resilience: CloudWatch/New Relic dashboards, SLOs/SLIs, synthetic checks; queue latency/depth alerts; EventBridge failure handling; DB health and slow-query monitoring.
AI governance: Model risk management, validation frameworks, and compliance logging for regulated AI apps.
Brings a positive, hands-on approach to complex AI infrastructure challenges.
Is innovative and resourceful, and proactive about improving MLOps.
Is outcome-focused and adaptable to changing AI needs and timelines.
Is a strong communicator who partners well with research, engineering, platform, and business teams.
These core qualities and attributes
Focuses on what matters most – prioritises work that drives real results, aiming for progress not perfection, and using time wisely to make smart, impactful choices.
Proactive and curious – shows genuine curiosity and interest, takes initiative, and invests discretionary effort to deliver.
Agency and ownership – pragmatic and independent, but knows when to seek help or share information early for the benefit of the team.
Considered and constructive – listens and responds thoughtfully, is honest about challenges, and looks for ways to improve.
Resilient and adaptable – willing to adjust and tackle change with positive intent. Change is constant; how we respond defines us.
Collaborative yet accountable – unites as a team, but takes notice when things aren't working, acts on it, and holds self and others responsible.
Strong camaraderie – builds others up, is respectful and encouraging, debates ideas not people, and recognises effort, giving credit where it's due.
Positive and professional outlook – maintains optimism and commitment, and takes pride in delivering quality work.
These certifications or qualifications are sought after (desired but not required)
AWS Certified Machine Learning – Specialty
AWS Certified DevOps Engineer – Professional
AWS Certified Solutions Architect – Professional
AWS Certified Security – Specialty
MLOps / AI Engineering certifications (e.g., MLOps Institute, Coursera)
What we can offer you
We operate in a high trust environment, and we really walk the talk.
We aspire for everyone to be themselves and be comfortable at work, so we put great emphasis on ensuring our people have what they need to be at their best.
This includes offering:
a digital-first, fully flexible working style; we've embraced asynchronous, hybrid working as the norm
modern tools and systems, with a big focus on our use of AI
a focus on personal growth with career, learning, and development tools available, plus some dedicated 'tools down' personal development time
in NZ, an additional week of paid leave, staff appreciation leave at Christmas, plus a day off for your birthday
in the USA, unlimited PTO and fully funded health benefits, including medical and dental
all within a tight-knit, supportive, and inclusive global community.
Key information about our recruitment process
We'll request a minimum of two references and require a clean background check from your home country.