[Remote] Senior Machine Learning Engineer
Note: This is a remote position open to candidates in the USA.

Mathpix is looking for a Senior Machine Learning Engineer with deep expertise in computer vision, sequence modeling, and multimodal AI. In this role, you will advance the state of the art in OCR and related applications by building custom models for text recognition and document understanding.

Responsibilities
- Research, design, and implement custom deep learning models for OCR and multimodal document understanding tasks
- Build and train sequence-to-sequence and attention-based architectures for text recognition, translation, and generation tasks
- Lead development of multimodal language models that combine vision and text for real-world applications (e.g., image-to-text, document parsing)
- Optimize and extend PyTorch-based training pipelines for large-scale datasets and high-performance inference
- Collaborate with product and engineering teams to integrate models into production systems, ensuring scalability, robustness, and efficiency
- Work closely with the in-house data team to define, generate, and curate high-quality training data, enabling rapid iteration on bug fixes and the development of new features
- Mentor junior engineers and provide technical leadership in model architecture, experimentation, and deployment best practices

Skills
- PhD in Computer Science, Machine Learning, Computer Vision, NLP, or a related field
- 3+ years of hands-on experience in deep learning research and development
- Strong expertise in sequence-to-sequence models, attention mechanisms, and Transformer-based architectures
- Proven experience building and training custom models in PyTorch (not using off-the-shelf models)
- Track record of work in one or more of the following areas: machine translation, text generation, speech-to-text, OCR, image captioning, or related multimodal tasks
- Deep understanding of core ML concepts: optimization, regularization, model scaling, and distributed training
- Demonstrated ability to take models from research to production in a high-stakes environment
- Experience with large-scale multimodal foundation models and techniques for fine-tuning/adaptation
- Knowledge of advanced evaluation methodologies for sequence and multimodal models
- Publications in top ML/AI/vision conferences or journals (e.g., NeurIPS, CVPR, ACL, ICML)
- Experience mentoring teams and driving research agendas in applied AI settings
- Experience at a startup or high-growth company; founding/early-team experience is a bonus
- Contributions outside of work: personal projects, open-source, articles, or blog posts

Company Overview
Mathpix is an AI-powered document conversion cloud built for research. It was founded in 2017 and is headquartered in Brooklyn, New York, USA, with a workforce of 11-50 employees. Its website is https://mathpix.com.

Company H1B Sponsorship
Mathpix has a track record of offering H1B sponsorships, with 1 in 2025. Please note that this does not guarantee sponsorship for this specific role.