[Remote] Staff Backend Engineer - Adaptive Telemetry | USA | Remote
Note: This is a remote role open to candidates in the USA.

Grafana Labs is a leading company in the observability cloud space, dedicated to open source and innovative solutions. They are seeking a Staff Backend Engineer for their Adaptive Telemetry group, responsible for driving technical strategy, leading large projects, and ensuring the reliability and performance of critical systems.

Responsibilities

Drive technical strategy and roadmap. Proactively define the architectural vision, prioritize work that unlocks major product or platform improvements, and influence product and engineering decisions.
Lead end-to-end delivery of large, cross-functional projects. Own planning, design, execution, rollout and long-term operation of large initiatives.
Own architecture, reliability, performance and cost for critical systems. Make pragmatic architecture choices that balance scalability, availability, latency and cost while ensuring systems remain maintainable and evolvable.
Define SLOs/SLIs and lead incident response. Establish measurable reliability targets, run high-severity incident response, lead blameless post-mortems, and drive systemic fixes and automation to prevent recurrence.
Improve observability, automation and operational readiness. Champion telemetry, alerting, runbooks, capacity planning and automation efforts that reduce toil, speed debugging and lower MTTR.
Align stakeholders and remove blockers. Coordinate across Product, Design and other teams to align priorities, negotiate tradeoffs, and unblock delivery for large initiatives.
Mentor and grow engineering talent. Coach senior and mid-level engineers, lead design reviews, raise engineering standards, and help teammates make sound technical tradeoffs.
Represent engineering internally and externally. Communicate technical strategy clearly to non-engineering stakeholders and represent the team in cross-team planning.

Skills

Proven delivery of large distributed systems. Experience shipping and operating complex systems that span multiple teams, with clear evidence of technical leadership and impact.
Strong systems-design instincts. Deep understanding of tradeoffs around latency, consistency, availability, scaling and cost.
Hands-on cloud and platform experience. Solid experience with cloud-native architectures (microservices, containers/Kubernetes, IaC) and the operational practices that keep them healthy.
Reliability and performance ownership. Comfortable defining SLOs/SLIs, doing capacity planning, tuning performance, and driving reliability work end-to-end.
Excellent coding and design skills. You write clear, maintainable, well-tested code and can lead technical designs - we use Go, but Python/C/C++/Rust or similar translate well.
Comfort with AI-assisted development. We embrace AI and agentic development, so we expect you to be curious and comfortable using AI-powered developer tools and ideally have practical experience folding them into a team's workflow.
Experience with messaging and telemetry. Familiarity with streaming/messaging systems (e.g., Kafka) and observability tooling (Prometheus, Grafana or equivalents).
Influence without authority. Ability to align cross-functional stakeholders, set priorities and drive outcomes in a remote-first environment.
Strong communicator. Clear written and verbal communication that works across engineers and non-technical stakeholders.

Benefits

Benefits include equity, bonus (if applicable) and other benefits listed here.
You can use modern AI coding assistants as part of your daily workflow (your choice of tools, within security guidelines), backed by a company-funded usage budget so you can iterate quickly without unnecessary friction. You'll also have access to frontier models (e.g., GPT-Codex 5/3, Claude Opus 4.6, Gemini 3 Pro).
100% Remote, Global Culture - As a remote-only company, we bring together talent from around the world, united by a culture of collaboration and shared purpose.
Scaling Organization - Tackle meaningful work in a high-growth, ever-evolving environment.
Transparent Communication - Expect open decision-making and regular company-wide updates.
Innovation-Driven - Autonomy and support to ship great work and try new things.
Open Source Roots - Built on community-driven values that shape how we work.
Empowered Teams - High trust, low ego culture that values outcomes over optics.
Career Growth Pathways - Defined opportunities to grow and develop your career.
Approachable Leadership - Transparent execs who are involved, visible, and human.
Passionate People - Join a team of smart, supportive folks who care deeply about what they do.
In-Person Onboarding - We want you to thrive from day 1 with your fellow new 'Grafanistas' to learn all about what we do and how we do it.
Balance is Key - We operate a global annual leave policy of 30 days per annum. 3 days of your annual leave entitlement are reserved for Grafana Shutdown Days to allow the team to really disconnect. *We will comply with local legislation where applicable.

Company Overview

Dice is a job-searching platform for technology professionals. It is a sub-organization of DHI Group. It was founded in 1990 and is headquartered in Santa Clara, California, USA, with a workforce of 201-500 employees. Its website is http://www.dice.com.

Company H1B Sponsorship

Dice has a track record of offering H1B sponsorships, with 2 in 2022, 4 in 2021, and 5 in 2020. Please note that this does not guarantee sponsorship for this specific role.