Senior Platform Engineer

Remote Full-time
Attio is the CRM built for the AI era. Designed for the most ambitious go-to-market teams, it gives companies the power to understand every customer, automate at scale, and build their go-to-market motion exactly as they need. We've raised $116M from some of the world's best investors: GV (Google Ventures), Redpoint, Balderton, Point Nine, and 01A.
We hire builders who thrive on complex technical challenges, hold themselves to a high bar, and genuinely care about delighting the people who use what they build. The team here brings sharp judgement, real craft, and the drive to do exceptional work. We're obsessed with the details and energized by the frontier.
If you want to do the best work of your career, this is the right place.

About the Role
We are seeking highly skilled and experienced Platform Product Engineers to join our Security, Infrastructure and Performance team. This is a crucial, dual-faceted role that combines high-level engineering strategy with hands-on operational excellence. The successful candidates will be responsible for building, operating, and continuously enhancing the internal technology platform, fundamentally treating this platform as a product with all development teams as its primary customers.
The Platform Product Engineer role is centered around embodying and executing DevOps principles, specifically focusing on:
Automation: Systematically removing manual toil from the software development lifecycle (SDLC) through the creation of robust tooling, CI/CD pipelines, and infrastructure-as-code (IaC).

Collaboration: Fostering a tight, cooperative partnership with product development teams, gathering requirements, and delivering solutions that accelerate their productivity and time-to-market.

Continuous Improvement: Instilling a culture of iterative enhancement for the platform's reliability, cost-efficiency, and developer experience.

This mandate is underpinned by a rigorous Site Reliability Engineering (SRE) mindset. The successful candidates will be instrumental in defining and upholding Service Level Objectives (SLOs) and Service Level Indicators (SLIs), implementing effective monitoring and alerting strategies, and leading operational incident response processes.

What you'll do
The core responsibility is to implement, maintain, and continuously improve the foundational platform infrastructure that powers all engineering services. This necessitates a relentless focus on ensuring high reliability, exceptional scalability, and optimal performance across the entire stack.
Platform Infrastructure: Build and maintain platform infrastructure using declarative IaC tools (e.g., Terraform, Pulumi), ensuring all environments are reproducible, version-controlled, and auditable. Proactively manage the capacity of the infrastructure to consistently meet or exceed Service Level Objectives for latency, error rates, and availability.

Incident Response and Post-Mortems: Act as first-line responders for critical system incidents. Triage, diagnose, and resolve complex production issues rapidly. Drive a culture of blameless post-mortems, ensuring root causes are identified and long-term preventative measures are implemented (e.g., via runbooks, automation, or system design changes).

Tooling & Automation: Own the stack of supporting tools necessary for operational excellence and developer enablement, including:
Continuous Integration and Continuous Delivery (CI/CD) Pipelines: Implement, maintain, and evolve the fully automated CI and CD pipelines. This includes establishing best practices for fast, reliable, and secure build, test, and deployment processes.

Observability: Implement and manage robust systems for monitoring (metrics), logging (centralised log aggregation), and distributed tracing to provide deep insights into application and infrastructure health.

What you'll bring
Applied DevOps and SRE Principles:
Must have: Demonstrable, hands-on experience applying core DevOps and Site Reliability Engineering (SRE) principles to manage, monitor, and scale production systems.

Must have: A deep understanding of the SRE mindset, including SLO/SLA creation and monitoring, error budget management, toil reduction, and post-incident review (blameless postmortems).

Desirable: Proven ability to drive cultural and process change that fosters a collaborative approach between development and operations teams.

Cloud Infrastructure and Containerisation Expertise:
Must have: Expertise in one or more major public cloud providers (AWS, GCP, or Azure), encompassing network configuration, security best practices (IAM, security groups, etc.), compute services (EC2, GKE, ECS, etc.), and managed services (databases, queues, serverless functions).

Must have: In-depth knowledge of container technologies, specifically Docker, and extensive experience orchestrating them at scale using Kubernetes (K8s). This includes designing, deploying, and managing Kubernetes clusters, understanding networking (CNI), storage (CSI), and security configurations within the Kubernetes ecosystem.

Automation and Programming Skills:
Must have: Proficiency in one or more modern software languages (e.g., TypeScript, Go, Python, Rust) and associated frameworks used for building high-performance, resilient production systems.

Must have: Proven experience developing robust, maintainable, and well-tested automation scripts, services and pipelines to manage infrastructure, deployments, and operational tasks.

Operational Tooling and Observability Management:
Must have: Experience owning, managing, and maintaining mission-critical operational tooling.

Desirable: Proven background in implementing and managing centralised logging solutions or similar platforms (e.g., Splunk, Datadog).

Desirable: Familiarity with distributed tracing tools (e.g., Jaeger, Zipkin) and Application Performance Monitoring (APM) solutions.

What we offer
Equity in an early-stage tech company on an incredible trajectory

Apple hardware

Team off-sites in fun places! (We've been to Barcelona, Lisbon, Malta, and Split so far)