Data Scientist
Lilly is a global healthcare leader headquartered in Indianapolis, Indiana, focused on discovering and delivering life-changing medicines. They are seeking a Data Scientist to apply data science, machine learning, and AI techniques to solve business problems, collaborating with stakeholders to identify opportunities for data-driven solutions.

Responsibilities

- Partnering with business stakeholders across GRA, GSC, and GSS to understand their workflows, identify high-impact problems, and frame them as data science opportunities, translating business questions into analytical approaches
- Developing and deploying machine learning models (classification, regression, clustering, time-series forecasting) to solve problems such as submission timeline prediction, document classification, regulatory risk scoring, and resource optimization
- Building and evaluating NLP and generative AI solutions, leveraging LLMs, RAG architectures, text extraction, entity recognition, and document summarization to automate regulatory authoring, scientific literature analysis, and content generation workflows
- Designing and executing experiments to evaluate model performance, using rigorous statistical methods, A/B testing, and evaluation frameworks (including RAGAS for RAG systems) to ensure solutions meet quality and accuracy thresholds before deployment
- Designing and building AI agents and agentic workflows: creating multi-step, tool-using systems that can autonomously execute complex tasks such as regulatory document drafting, data extraction and transformation, and cross-system orchestration, moving beyond single-prompt interactions to production-grade agent architectures that operate reliably in a validated environment
- Collaborating with full stack engineers and platform teams to productionize models: building APIs, integrating into existing applications, deploying on AWS infrastructure (Lambda, EKS, SageMaker, Databricks), and monitoring model performance in production
- Communicating findings and recommendations to both technical and non-technical audiences, using data visualization, storytelling, and clear business-impact framing to ensure your work drives actual decisions
- Staying current with emerging techniques in machine learning, generative AI, and data science: evaluating new tools, frameworks, and approaches for applicability to the GRA/GSC/GSS portfolio and sharing knowledge with the broader team

Skills

- Bachelor's degree in Data Science, Statistics, Computer Science, Mathematics, or a related quantitative field
- 1 year of professional data science experience with Python, R, and core data science libraries
- Qualified applicants must be authorized to work in the United States on a full-time basis. Lilly will not provide support for or sponsor work authorization now or in the future for this role, including but not limited to F-1 CPT, F-1 OPT, F-1 STEM OPT, J-1, H-1B, TN, O-1, E-3, H-1B1, or L-1.
- Experience with machine learning frameworks and model deployment patterns
- Academic background in Data Science
- Hands-on experience with NLP techniques and/or generative AI: LLM APIs (OpenAI, Anthropic), RAG architectures, vector databases, prompt engineering
- Familiarity with cloud data platforms: AWS (SageMaker, Lambda, S3), Databricks, or similar
- Knowledge of statistical methods: hypothesis testing, experimental design, Bayesian methods, regression analysis
- Experience with SAS programming
- Strong communication skills: ability to present technical findings to non-technical audiences and translate business questions into analytical frameworks
- Collaborative mindset and experience working with cross-functional teams including engineers, product owners, and business partners

Benefits

- Company bonus
- 401(k)
- Pension
- Vacation benefits
- Eligibility for medical, dental, vision and prescription drug benefits
- Flexible benefits (e.g., healthcare and/or dependent day care flexible spending accounts)
- Life insurance and death benefits
- Certain time off and leave of absence benefits
- Well-being benefits (e.g., employee assistance program, fitness benefits, and employee clubs and activities)

Company Overview

BioSpace is the leading online community for industry news and careers for life science professionals. It was founded in 1988 and is headquartered in San Francisco, California, USA, with a workforce of 11-50 employees. Its website is