[Remote] AI Software Engineer, Legal Prompting & LLM Dev.
Note: This is a remote position open to candidates in the USA.

Nixon Peabody LLP is a law firm that values innovation and collective thinking. It is seeking an AI Software Engineer specializing in Legal Prompting and LLM Development. This role involves building production-grade applications that use Large Language Models to enhance legal workflows, and it requires expertise in software engineering, prompt engineering, and AI infrastructure.

Responsibilities
- Design, develop, and deploy LLM-integrated applications that enhance legal workflows across transactional, litigation, regulatory, and advisory practice areas
- Develop backend services across the Microsoft stack and in languages such as TypeScript/JavaScript, Python, C#, and others as needed, that interact with LLM providers (OpenAI, Anthropic, etc.), external APIs, SQL and NoSQL databases, and document management systems
- Build and maintain RESTful and event-driven APIs that expose AI capabilities to internal applications and downstream consumers
- Write and refine persona-based prompts, system instructions, and few-shot examples to guide LLMs in delivering accurate, defensible, and legally appropriate responses
- Build prompt evaluation harnesses, regression test suites, and offline/online evaluation pipelines (e.g., LLM-as-judge, golden datasets) to measure quality, hallucination rates, and latency
- Continuously test and iterate on prompts and code to optimize model performance, cost, and user experience
- Design, build, and operate Model Context Protocol (MCP) servers that expose firm systems, including document management (e.g., iManage, NetDocuments), time and billing, CRM, research platforms, and internal knowledge bases, as secure, governed tools for AI agents
- Define tool schemas, authentication flows, rate limiting, and audit logging for MCP endpoints, ensuring outputs are scoped to user permissions and ethical walls
- Maintain a catalog of reusable MCP tools and resources that can be composed across multiple AI products at the firm
- Build and tune retrieval-augmented generation pipelines, including chunking strategies, embedding model selection, hybrid search (lexical + semantic), and reranking
- Work with vector databases (e.g., Pinecone, Weaviate, pgvector, Azure AI Search) and orchestration frameworks (e.g., LangChain, LlamaIndex, Semantic Kernel) to ground LLM outputs in firm and client data
- Develop multi-step and multi-agent workflows that combine planning, tool use, and human-in-the-loop checkpoints for sensitive legal tasks
- Implement guardrails, content filters, PII redaction, and citation/verification layers to ensure responsible use
- Containerize services (Docker) and deploy via CI/CD pipelines to cloud environments (Azure preferred; AWS/GCP a plus), using infrastructure-as-code (Terraform, Bicep) where appropriate
- Instrument applications with logging, tracing, and LLM-specific observability tools (e.g., LangSmith, Arize, Weights & Biases, OpenTelemetry) to monitor quality, cost, and drift in production
- Partner with Information Security and the Office of the General Counsel to ensure solutions meet client outside counsel guidelines, data residency requirements, and confidentiality obligations
- Collaborate with attorneys, legal professionals, and product teams to understand domain-specific needs and translate them into technical solutions
- Assess the integration of LLMs into existing legal workflow systems and recommend improvements
- Perform other duties as assigned

Skills
- 4-6 years of production-level software engineering experience on a commercial or internal product team
- Bachelor's degree in Computer Science, Engineering, or a related technical field
- Strong programming skills in modern object-oriented languages such as TypeScript/JavaScript, C#, Python, or Java (typing, async, packaging, testing), with the ability to work fluently across the Microsoft technology stack
- Experience designing and consuming RESTful APIs and working with SQL databases; familiarity with NoSQL and vector stores a plus
- Hands-on experience with LLM APIs (OpenAI, Anthropic, Cohere, Azure OpenAI) and/or open-source models (e.g., LLaMA, Mistral)
- Proficiency with prompt engineering techniques (chain-of-thought, structured outputs/JSON mode, function/tool calling, few-shot design)
- Experience building or integrating with Model Context Protocol (MCP) servers, custom tools, or function-calling endpoints for agentic systems
- Familiarity with orchestration frameworks such as LangChain, LlamaIndex, LangGraph, Semantic Kernel, or Pydantic AI
- Experience implementing RAG pipelines with embeddings, vector databases, and reranking models
- Experience with evaluation frameworks (Ragas, DeepEval, promptfoo) and LLM observability platforms
- Familiarity with containerization (Docker), CI/CD, and cloud deployment (Azure preferred)
- Excellent written communication skills, especially in crafting clear and effective LLM prompts and technical documentation
- Ability to translate legal context and goals into prompt instructions, tool definitions, and system requirements
- Strong analytical and problem-solving capabilities, with sound judgment about when to use deterministic code versus probabilistic models
- Ability to thrive both independently and as part of a collaborative team
- Prior experience developing software solutions in the legal industry strongly preferred
- Legal background highly preferred (e.g., J.D., paralegal, legal tech industry experience, or work with legal software vendors)
- Demonstrated experience in legal practice or support roles is a plus

Benefits
In addition to a standard benefits package, this role may be eligible for additional contingent compensation based on an array of factors, including but not limited to: work performance, geographic location, work experience, education, and qualifications.

Company Overview
Nixon Peabody LLP is an American Lawyer top-100 law firm in the United States and has 15 offices worldwide.
It was founded in 1999 and is headquartered in Boston, Massachusetts, USA, with a workforce of 1001-5000 employees. Its website is http://www.nixonpeabody.com/.