Posted 2026-06-29

Principal Site Reliability Engineer

Description

As a Principal Site Reliability Engineer, you will shape the long-term strategy for the infrastructure behind one of the most demanding platforms in sports betting and gaming. You will drive the architectural direction of cloud and on-premise platforms, helping engineering teams build, deploy, and operate highly reliable systems at scale. Working across Platform Engineering and Site Reliability Engineering, you will influence how the infrastructure is modernized, strengthen operational excellence, and prepare the platform for the next generation of growth.

Responsibilities

Define and execute the long-term strategy for our Kubernetes platform across Google Kubernetes Engine, Amazon Elastic Kubernetes Service, RKE2, and on-premise environments, ensuring reliability, scalability, and operational consistency.
Drive architectural decisions across critical infrastructure, including cluster lifecycle management, networking, identity and access management, observability, autoscaling, capacity planning, and cost optimization.
Lead large-scale platform initiatives across multiple engineering teams, establishing technical direction, engineering standards, and measurable outcomes that improve platform reliability and developer experience.
Establish and evolve reliability practices by defining service level objectives, service level indicators, and error budget frameworks that align platform performance with business priorities.
Build automation-first infrastructure through Infrastructure as Code, GitOps workflows, self-healing systems, and internal platform tooling that improve engineering velocity and reduce operational overhead.
Champion the responsible adoption of AI-powered engineering capabilities that improve operational efficiency, accelerate incident response, and enhance developer productivity.
Lead critical platform incidents, drive post-incident improvements, and strengthen platform resilience through automation, capacity planning, and operational excellence.
Mentor senior engineers, influence technical strategy across the organization, and elevate engineering excellence through architecture reviews, coaching, and technical leadership.

Requirements

A Bachelor's Degree in Computer Science or a related technical field. (required)
At least 8 years of experience designing, operating, and scaling distributed cloud and on-premise infrastructure. (required)
At least 3 years operating at the Staff, Principal, or equivalent technical leadership level. (required)
Proven experience leading large-scale infrastructure or platform initiatives that require cross-functional alignment and long-term technical ownership. (required)
Deep expertise with Kubernetes, including cluster architecture, networking, storage, security, operators, lifecycle management, and large-scale production operations. (required)
Extensive experience building and operating production infrastructure in AWS and Google Cloud Platform using Infrastructure as Code technologies such as Terraform, Pulumi, or similar tools. (required)
Strong software development experience in Go, Python, or both, with expertise in GitOps, continuous integration and continuous delivery, observability, distributed systems, Linux, and reliability engineering principles. (required)
Experience incorporating AI-powered tools into engineering workflows while applying sound judgment around reliability, security, and operational risk. (required)
Exceptional communication and leadership skills with a proven ability to mentor engineers, influence technical strategy, and drive engineering excellence. (required)
Experience in regulated industries or hybrid cloud environments, contributions to open-source projects, or cloud certifications. (preferred)

Benefits

Base salary range of 200,000.00 USD to 250,000.00 USD, plus bonus and equity.
Comprehensive health benefits including medical, dental, and vision plans.
Wellbeing program including free therapy sessions with Lyra, employee assistance programs, the Calm App, and virtual yoga classes.
14 weeks of 100% paid parental leave and workplace lactation support.
Partnership with Care.com for care finder and backup childcare services.
Family planning support for adoption, surrogacy, and fertility treatments.
Flexible PTO.
Pet insurance.
Gym reimbursement.
Financial planning support including 401(k) matching and Origin Financial programs.
Commuter benefits.
Tuition reimbursement program.

About DraftKings

DraftKings Inc. is a leading American digital sports entertainment and gaming company, headquartered in Boston, Massachusetts. Founded in 2012, it started in daily fantasy sports and has grown into one of the largest US online sportsbook and iGaming operators. The company offers mobile sports betting, online casino, and daily fantasy contests across regulated North American markets. DraftKings is listed on the Nasdaq stock exchange and employs several thousand people.
