[Remote] ML Engineer
Note: This is a remote position open to candidates in the USA.

Talener's client is a global newswire and media organization whose content reaches over half the global population daily. They are seeking a Senior Machine Learning Engineer to build and optimize inference systems for processing millions of media assets across text, image, and video pipelines.

Responsibilities
- Building and optimizing inference systems that run in production at scale
- Working across text, image, and video pipelines
- Processing millions of media assets to power news intelligence products
- Profiling a transformer and rewriting its serving path for a 2-3x latency improvement
- Tuning an HNSW index
- Making smart infrastructure decisions on SageMaker instance selection to hit p95 targets at the lowest cost
- Partnering closely with MLOps, platform engineering, data scientists, and product teams
- Owning model performance, inference logic, and pipeline efficiency

Skills
- 5+ years building production ML inference systems
- Python - core to everything in this role
- PyTorch (TorchScript, ONNX, FastAPI/TorchServe) and TensorFlow (SavedModel, tf.data, XLA, TFLite) - both required
- Deep hands-on experience with transformer-based models (BERT family - DistilBERT, SBERT, etc.) in production
- Inference optimization at scale - quantization, distillation, compilation, kernel/profile-level performance work
- AWS infrastructure - EC2, Batch, Lambda, SageMaker across different media workload types
- Hybrid search architecture experience - BM25 + vector search + cross-encoder reranking
- Asynchronous processing systems - reliability, caching, deduplication, observability
- Data pipeline and workflow orchestration (Airflow or similar)
- Video frameworks - FFmpeg, large-scale frame-level inference
- Must have experience in the media industry
- Must have experience working with large amounts of data, including text, images, and videos
- Experience with TransNetV2 or similar video shot boundary detection
- Familiarity with HuggingFace open source LLMs
- OpenAI API or other foundation model provider experience
- Hybrid CPU/GPU environment experience at scale

Benefits
- 15% bonus target

Company Overview
Talener is a staffing firm dedicated to finding great opportunities for technology professionals. It was founded in 2007 and is headquartered in New York, New York, USA, with a workforce of 11-50 employees. Its website is http://www.talener.com.