DevOps/Platform Engineer

UK Remote
Full-time
Permanent employee

Your role

Overview:
  • Build and run a reliable platform for services and data workflows across Kubernetes and Prefect.
  • Own CI/CD, observability, security, and developer experience for Python/Go/Rust services.
Responsibilities:
  • Design, provision, and operate Kubernetes workloads (deployments, networking, autoscaling, storage).
  • Build and maintain GitLab CI/CD pipelines for Python, Go, and Rust services (build, test, scan, release).
  • Operate Prefect (agents, work queues, deployments, concurrency limits, task execution environments).
  • Implement environment strategy and promotion flow (dev/staging/prod) with clear release gates.
  • Create golden paths and templates for FastAPI microservices and Prefect flows.
  • Manage secrets, configuration, and access (e.g., GitLab variables, K8s secrets).
  • Establish observability: logging, metrics, traces, alerting, runbooks, and SLOs.
  • Operate data stores (MySQL, PostgreSQL, Redis): provisioning, backups, migration execution, monitoring, and capacity planning.
  • Optimise build and runtime costs (container images, caching, autoscaling, resource requests/limits).
  • Lead incident response, postmortems, and reliability improvements.

Your profile

You have:
  • 4+ years in DevOps/SRE/Platform roles with production Kubernetes.
  • Strong GitLab CI/CD experience (pipelines, runners, caching, artifact management).
  • Proficiency with containers and image optimisation; comfortable with Linux internals and networking.
  • Hands-on experience with Prefect in production (deployments, flow orchestration, storage, results).
  • Familiarity with operating MySQL/PostgreSQL/Redis in production (availability, performance, backups).
  • Scripting/automation with Python or Go; ability to read Rust build pipelines.
  • Solid understanding of security fundamentals (least privilege, image scanning, SBOM, secret hygiene).
  • Experience instrumenting systems and creating actionable alerts.
Nice to have:
  • Helm/Kustomize, policy-as-code (OPA), and basic gRPC.
  • Performance tuning for high‑throughput data or API services.
  • Experience in multi‑tenant or multi‑cluster environments.

About us

We are an applied AI team advancing the future operating systems of AI. Traditional, fragmented AI stacks are struggling to scale under the demands of production, governance, and security, creating barriers to reliable, enterprise-grade AI deployment.

With deep expertise ranging from carrier and data centre architecture to applied AI and agent orchestration, we close that gap by delivering production-scale, compliant systems for enterprises that handle high-value, high-risk data across space, retail, media, and entertainment. We embed ethics, governance, and responsible AI practices at every stage.