Jobs / Kindred Group (FDJ United) / Site Reliability Engineer
Posted 2026-05-21

Site Reliability Engineer

Description

We're looking for a Site Reliability Engineer (SRE) to join our Platform Delivery and Reliability Services (PDRS) team and help us grow and maintain our Observability Platform.

Responsibilities
  • Build, deploy, and maintain telemetry pipelines and observability platforms that provide real-time insight into system performance and reliability.
  • Design and integrate monitoring and observability solutions with an automation-first mindset.
  • Collaborate with Development and Infrastructure teams to ensure observability requirements — metrics, traces, logs, and alerts — are built into the design of new systems.
  • Apply SRE principles to reduce repetitive work and improve lead time to detect and resolve issues.
  • Help troubleshoot and prevent production issues to keep our systems stable and performant.
  • Share knowledge and contribute to team documentation and best practices.
Requirements
  • Hands-on experience as a Site Reliability Engineer or in a similar infrastructure/operations role.
  • Good coding and scripting skills — Bash, Python, or similar — including experience with code reviews.
  • Solid understanding of Linux OS and distributed systems.
  • Experience with Time Series Databases and visualisation tools such as Grafana.
  • Familiarity with alerting and monitoring tools like Prometheus, Icinga, Datadog, or Nagios.
  • Experience with CI/CD tools (Jenkins, GitLab, GitHub Actions) and Infrastructure as Code (Terraform, Ansible, or similar).
  • Experience with logging platforms such as Splunk, Elasticsearch, Loki, Fluentd, or Vector is a plus.
  • Knowledge of container orchestration platforms (Kubernetes, Nomad, OpenShift) and cloud environments (AWS, Azure, OpenStack) is a plus.
  • Interest in or experience with AI tools is an advantage.
Benefits
  • Well-being allowance
  • Learning and development opportunities
  • Inclusion networks
  • Charity days
  • Long service awards
  • Social events and activities
  • Private medical insurance
  • Life assurance and income protection
  • Employee Assistance Programme
  • Pension