Production Support Lead

Professional Services - Service Practice
Full-time · Remote (India / US / EU)
Experience: 5-8 years

About Lyzr
Lyzr.ai's agentic AI platform powers intelligent, autonomous workflows for enterprise clients. Production Support Engineers are the front line that keeps those workflows healthy: triaging incidents, resolving tickets, digging into logs, and escalating the right issues to the right teams before clients feel the pain.
This role suits someone who thrives in a fast-paced technical environment, takes ownership seriously, and genuinely enjoys the detective work of diagnosing why something broke in production. You will work within a global follow-the-sun support model, reporting to the Production Support Lead.
What you’ll do
Incident command & escalation
Own the full incident lifecycle (detection, triage, war-room coordination, resolution, and post-mortem) for P1/P2 issues across all production tenants.

Act as the primary escalation point for Production Support Engineers; make the call on severity reclassification and client communication timing.

Drive RCA completion within SLA windows and ensure corrective actions are tracked to closure in Jira/Confluence.

Maintain and continuously improve the P1 runbook library, escalation trees, and on-call rotation schedules.

Team leadership & operations
Manage and mentor a team of 3–6 Production Support Engineers; run weekly 1:1s, set KPIs, and own the performance review cycle.

Build and optimise the shift rota for 24x7x365 follow-the-sun coverage across India, EU, and US time zones.

Define and track operational metrics: MTTR, SLA attainment by priority tier, re-open rate, and backlog aging.

Partner with Engineering and Platform teams to advocate for supportability improvements, observability tooling, and bug-fix prioritisation.

Client & commercial accountability
Serve as the named support contact for strategic accounts during critical incidents; provide executive-level written updates under pressure.

Review monthly SLA performance reports with client stakeholders; identify systemic patterns and propose proactive remediation.

Contribute to SLA definition in new SOWs, ensuring commitments are operationally deliverable.

Support the renewal and expansion process by demonstrating support maturity and service quality data.

Process & tooling
Own the support toolchain: ticketing (Jira Service Management or equivalent), monitoring dashboards, alerting rules, and on-call tooling (PagerDuty / OpsGenie).

Establish knowledge management practices (internal runbooks, a known-error database, and a tiered FAQ) to reduce repeat escalations to Engineering.

Define and enforce severity classification criteria and ticket hygiene standards across the team.

What you bring
Experience: 5–8 years in production/application support; 2+ years in a lead or senior role

Domain: SaaS / AI / ML platform support; ideally agentic or LLM-based systems

Incident management: ITIL Foundation or equivalent; proven P1 incident commander

Tooling: Jira SM, PagerDuty / OpsGenie, Datadog / Grafana, Confluence

Leadership: Direct team management experience; mentoring junior engineers

Communication: Executive-level written updates under high-pressure conditions

Additionally, you will have:
Hands-on familiarity with cloud infrastructure (AWS / GCP / Azure) and container environments (Kubernetes, Docker).

Ability to read logs, traces, and basic Python/SQL to independently diagnose issues before engaging Engineering.

Bonus: experience supporting multi-tenant SaaS at scale, or prior work with AI/ML pipelines in production.

Bonus: familiarity with enterprise client SLA frameworks, such as P1/P2/P3 tiering and OLA/UC structures.
