[Remote] Senior Cloud Engineer, Observability

Remote Full-time
Note: This is a remote position open to candidates in the USA.

Bayer is a company committed to solving the world’s toughest challenges in health and agriculture. They are seeking a Senior Cloud Engineer specializing in Observability to enhance their digital farming technology by improving observability practices, collaborating with teams, and driving reliability outcomes within their AWS platform.

Responsibilities

- Be the hands-on SME for our observability toolchain (e.g., Datadog, CloudWatch, OpenSearch), including log pipelines, tracing/telemetry standards, and platform templates
- Run office hours, produce exemplars, and pair with teams to implement “known-good” instrumentation and alerting
- Triage and resolve observability-related platform requests (new service onboarding, log/metric gaps, noisy alerts, dashboard standards) with clear ownership and measurable outcomes
- Establish and operationalize SLIs/SLOs for key platform components and enable teams to define service SLOs without reinventing the wheel
- Maintain opinionated “golden paths” for:
  - Logging (standard fields/tags, retention, routing, searchability)
  - Metrics (naming conventions, cardinality guardrails, standard RED/USE views)
  - Tracing (service maps, critical spans, propagation standards)
  - Dashboards (starter dashboards by service type + curated views for platform reliability)
- Provide reusable templates for alerting patterns (latency, error-rate, saturation, dependency failures), tuned for actionable paging vs. noise
- Reduce MTTR by improving detection, triage paths, runbooks, and “what changed” visibility
- Drive reliability reviews focused on observability gaps: missing signals, unclear ownership, bad alerts, and uninstrumented failure modes
- Partner with delivery teams to turn recurring incidents into durable fixes (instrumentation + alerting + automation + documentation)
- Embed observability checks into CI/CD and platform workflows (e.g., telemetry guardrails, dashboard/monitor templates, logging standards checks)
- Partner with Security/Compliance to ensure telemetry supports auditability and incident investigation without ad-hoc effort
- Define and report platform observability KPIs: alert noise rate, % actionable alerts, MTTA/MTTR trends, onboarding time to “fully observable,” runbook coverage, incident recurrence
- Run lightweight experiments to improve signal quality (threshold tuning, monitor redesign, dashboard UX), and ship improvements like a product owner
- Create cost-aware telemetry standards (log volume controls, metric cardinality guidance, sampling strategies, retention tiers)
- Help teams optimize spend while improving reliability outcomes (“cheaper + better” logging/metrics patterns)
- Serve as a trusted partner to delivery units, Security, and Data, turning pain points into paved-road improvements
- Mentor engineers and uplift organizational practices for incident response, reliability signals, and operational excellence

Skills

- Bachelor's in computer science/engineering or equivalent experience
- 5+ years hands-on AWS experience operating production workloads
- Deep practical experience with observability in production, including:
  - Datadog and/or CloudWatch (dashboards, monitors/alerts, log search, correlation)
  - Designing actionable alerts (noise reduction, ownership, runbook-first alerts)
  - Defining/using SLIs/SLOs and reliability metrics to drive behavior
- Strong proficiency with Infrastructure as Code (Terraform; CloudFormation a plus)
- Strong programming for automation/tooling (Python, Go, or similar)
- Solid grasp of cloud architecture, networking, and security fundamentals
- Experience productizing observability enablement (templates, golden paths, standards, onboarding workflows)
- CI/CD at scale (GitLab pipelines), including integrating reliability/telemetry guardrails into delivery workflows
- Logging/telemetry platforms beyond CloudWatch/Datadog (e.g., ELK/OpenSearch) and experience managing scale concerns (volume, retention, cardinality)
- Container platforms (ECS/EKS) and common AWS data services (RDS/Aurora, S3/lake patterns, MSK/Kinesis)
- FinOps experience related to observability (tagging, allocation, optimizing telemetry cost)
- Relevant AWS certifications and excellent communication skills

Benefits

Additional compensation may include a bonus or incentive program (if relevant).

- Health care
- Vision
- Dental
- Retirement
- PTO
- Sick leave

Company Overview

Bayer is a life science company that specializes in the areas of health care and agriculture. It was founded in 1863 and is headquartered in Leverkusen, Nordrhein-Westfalen, Germany, with a workforce of more than 10,000 employees. Its website is https://www.bayer.com.

