Staff Software Engineer - Supernal

Remote Full-time
About Supernal

Supernal helps small-to-medium businesses hire their first AI employee. Our AI teammates are built using intelligent, agentic workflows deployed on a proprietary platform. We deliver working, value-generating AI Employees (not tools) that handle real business processes alongside human teams.

The Role

We're looking for a Staff/Principal Software Engineer to own and evolve the core platform that powers our AI employees. This is a technical leadership position responsible for the systems that enable our agents to scale reliably: the Django backend, distributed task infrastructure, event-driven architecture, Kubernetes deployments, and observability stack.

You'll work across the full system, from database query optimization to Helm chart tuning to designing new platform abstractions. You'll be a force multiplier for the engineering team, driving architectural decisions, eliminating scaling bottlenecks, and establishing patterns that make the platform more robust and developer-friendly.

This role reports to the Director of Engineering and involves significant autonomy in shaping technical direction.

What You'll Own

- Drive platform architecture decisions and align the team on scalable patterns and long-term maintainability
- Review a high volume of code, design docs, and architectural proposals for scalability, reliability, security, and operability
- Be a technical mentor and force multiplier: unblock engineers, raise the bar on production readiness, and establish platform best practices
- Own and evolve the core backend platform (Django/DRF/ASGI) performance and correctness
- Scale async execution across Celery + Dramatiq + Temporal/Cortex; implement resilient workflow patterns (retries, circuit breakers, graceful degradation)
- Optimize PostgreSQL/pgvector (query tuning, connection pooling) and caching strategies
- Maintain and improve Kubernetes deployment infrastructure (GKE, Helm, Terraform/OpenTofu) and CI/CD + rollout strategies
- Own KEDA autoscaling policies and resource allocation across worker pools
- Own reliability of RabbitMQ, Redis, and PostgreSQL infrastructure; lead incident response and post-mortems
- Extend OpenTelemetry + Datadog instrumentation, dashboards, alerts, and SLOs; profile and reduce latency/memory bottlenecks

What We're Looking For

Required

- 10+ years building and operating production backend systems at scale
- Deep expertise in Python (Django preferred) and relational databases (PostgreSQL)
- Hands-on experience with Kubernetes, Helm, and cloud infrastructure (GCP preferred)
- Strong background in distributed systems: message queues, event sourcing, workflow orchestration
- Production experience with async task systems (Celery, Dramatiq, or similar)
- Track record of debugging complex production issues across multiple services
- Ability to work autonomously and drive technical initiatives without close supervision
- Clear technical communication: able to explain tradeoffs and build consensus

Preferred

- Experience with Temporal or similar workflow engines
- Background in LLM infrastructure, RAG systems, or AI/ML platforms
- Familiarity with OpenTelemetry, Datadog, or similar observability stacks
- Experience with KEDA or other Kubernetes autoscaling solutions
- Contributions to multi-tenant SaaS platform architecture
- History of improving developer experience and platform abstractions

What Success Looks Like

- Platform services maintain high availability with predictable performance under load
- Scaling bottlenecks are identified and resolved proactively
- New features ship faster because platform primitives are well-designed and documented
- Incidents are rare, quickly detected, and thoroughly addressed
- Engineers across the team adopt platform patterns and best practices
- Technical debt is systematically identified and paid down
- You're a trusted technical voice in architectural discussions

Compensation & Logistics

- Compensation: Competitive salary commensurate with experience (Staff/Principal level)
- Location: Remote
- Type: Full-time
- Requirements: Overlap with Americas timezones for collaboration; reliable high-speed internet