[Remote] Principal + Staff Software Engineers
Note: This is a remote role open to candidates in the USA.

Unstructured is defining the standard for enterprise data transformation in the age of LLMs and generative AI. They are seeking Staff and Principal Software Engineers to define the architectural foundation for processing and transforming unstructured data for LLM applications, ensuring systems are performant and resilient.

Responsibilities
- Define and evolve the end-to-end architecture for Unstructured's data transformation and retrieval platform
- Build and scale distributed systems that process massive volumes of unstructured data across diverse formats and sources
- Serve as the company-wide authority on Kubernetes orchestration, cluster design, performance tuning, and reliability
- Lead Python architecture and best practices, ensuring performance, modularity, and maintainability across services
- Design and optimize Postgres schemas, queries, and indexing strategies to support large-scale metadata and retrieval pipelines
- Mentor senior engineers through design reviews and code guidance, raising the bar for technical excellence across the org
- Partner with the infrastructure and product teams to translate research prototypes into production-grade systems
- Evaluate emerging technologies and open-source tools in LLM infrastructure, retrieval, and orchestration, deciding where and how to integrate them

Skills
- Have 10+ years of software engineering experience with a focus on distributed systems, infrastructure, or data architecture
- Are a Python expert, capable of building frameworks and performance-critical services from scratch
- Have deep Kubernetes expertise; you can design, deploy, and debug at scale and could teach others how to productionize it securely
- Are fluent in Postgres: you understand query planning, partitioning, and tuning for high-throughput environments
- Are obsessed with clean, scalable architecture and can lead design reviews that shape how entire systems evolve
- Have experience in high-performance data or AI/ML systems, especially those involving retrieval pipelines, embeddings, or hybrid workloads
- Thrive in fast-moving, ambiguous environments where technical depth and judgment matter more than process
- Experience building or scaling LLM-powered or RAG systems in production
- Familiarity with open-source orchestration frameworks, vector databases, or hybrid cloud infrastructure
- Contributions to open-source projects in Python, Kubernetes, or distributed systems

Benefits
- Company offsites
- Best-in-tech swag
- The tools you need to do your best work, wherever you're based
- Medical, dental, and vision coverage effective the 1st of the month following your start date
- Life and disability insurance
- Unlimited PTO
- Flexible parental leave
- A 401(k) with company match
- Equity
- $500 work-from-home stipend
- $70/month internet reimbursement
- Team/company offsites throughout the year

Company Overview
Unstructured is the data infrastructure company solving the most critical bottleneck in enterprise AI: making unstructured data accessible to AI applications. It was founded in 2022 and is headquartered in Rocklin, California, USA, with a workforce of 51-200 employees. Its website is https://unstructured.io.

Company H1B Sponsorship
Unstructured has a track record of offering H1B sponsorships, with 1 in 2024. Please note that this does not guarantee sponsorship for this specific role.