Lilly is a global biotechnology and pharmaceuticals healthcare company. Founded by Colonel Eli Lilly in 1876, the company is based in Indianapolis, Indiana, and maintains a strong
Clinical Trials Data Associate
Location
United States
Posted
3 days ago
Salary
$79.2K - $91.1K / year
Seniority
Mid Level
Job Description
Clinical Trials Data Associate
Lilly
Role Description
Eli Lilly and Company seeks a Clinical Trials Data Associate (P1) to be responsible for trial-level clinical data strategy, including:
- Database structure, content and meaning
- Acquisition, storage, retrieval, interchange, delivery, and representation
Key responsibilities include:
- Collaborating with key study partners to define, implement, and deliver clinical data management packages
- Providing trial leadership and ownership for a particular trial, set of trials, or programs
- Ensuring that data management timeline and results are delivered to scope, cost, and time objectives
- Driving data flow design through consultation, review, and approval of vendor work
- Defining and approving data quality and submission outputs and results
Qualifications
- Bachelor’s degree in Statistics, Information Technology, Biochemistry, Epidemiology, or a related field
- 3 years of experience with data collection, data flow management, data quality, data technology, dataset delivery, archiving, and data standards
- 3 years of experience articulating the flow of data from patient to analysis
- 3 years of experience applying knowledge of clinical/drug development and collaborating with study team members
- 3 years of experience leading development of creative data solutions to address clinical development challenges
- 3 years of experience setting and implementing plans to improve complex clinical data management processes and capabilities
- 3 years of experience with Tableau, PowerBI, Python, SAS, ‘R’ and Shiny for reporting, metrics and visualization, P-SQL, and T-SQL for DBMS
Requirements
- Telecommuting benefit available
Benefits
- Eligibility to participate in a company-sponsored 401(k)
- Pension
- Vacation benefits
- Eligibility for medical, dental, vision and prescription drug benefits
- Flexible benefits (e.g., healthcare and/or dependent day care flexible spending accounts)
- Life insurance and death benefits
- Certain time off and leave of absence benefits
- Well-being benefits (e.g., employee assistance program, fitness benefits, and employee clubs and activities)
Actual compensation will depend on a candidate’s education, experience, skills, and geographic location. The anticipated wage for this position is $79,206.00 - $91,086.90 per year. Full-time equivalent employees also will be eligible for a company bonus (depending, in part, on company and individual performance).
More Data Scientist Jobs
Role Description
We are looking for a Senior Data Scientist to help build advanced analytics, machine learning, and AI solutions for manufacturing operations. This role will focus on using factory data to detect anomalies, improve quality, reduce downtime, optimize throughput, and support reusable data models that connect fragmented manufacturing systems into a common intelligence layer. The ideal candidate has strong applied machine learning skills, practical experience working with complex operational data, and the ability to partner with manufacturing, data engineering, platform, and software teams to move analytical solutions toward production.
This is not a pure research role. We are looking for someone who can move from problem framing to data understanding, model development, validation, stakeholder alignment, and production support. The candidate should be able to learn unfamiliar domains quickly, challenge assumptions constructively, and push back when requirements, data quality, or model expectations are not realistic.
Manufacturing experience is strongly preferred, but we are also open to candidates from adjacent industrial, operations, quality, aerospace, semiconductor, supply chain, or equipment-heavy environments who can learn the manufacturing domain quickly.
Qualifications
- Bachelor’s or Master’s degree in Data Science, Computer Science, Statistics, Industrial Engineering, Mechanical Engineering, Manufacturing Engineering, Operations Research, Applied Mathematics, or a related technical field.
- 5+ years of experience applying data science, machine learning, statistical modeling, optimization, or advanced analytics in a professional environment.
- Strong Python skills using libraries such as pandas, NumPy, scikit-learn, SciPy, XGBoost, PyTorch, TensorFlow, statsmodels, or similar tools.
- Strong SQL skills and experience working with large, complex datasets.
- Experience with supervised and unsupervised machine learning methods, including classification, regression, clustering, anomaly detection, time-series analysis, forecasting, or process optimization.
- Experience building features from machine, sensor, process, quality, maintenance, production, or operational datasets.
- Experience working with cloud-based data and analytics platforms such as GCP, AWS, Azure, or similar environments.
- Understanding of MLOps concepts such as experiment tracking, model deployment, model monitoring, CI/CD, version control, testing, model registry, and retraining.
- Ability to work with noisy, incomplete, high-frequency, or fragmented operational data.
- Ability to communicate technical findings clearly to plant teams, engineers, leaders, and non-technical stakeholders.
- Professional confidence to challenge assumptions, push back constructively, and influence stakeholders with evidence.
- Demonstrated ability to learn new technical and business domains quickly.
Requirements
- Experience applying data science or machine learning in manufacturing, industrial, automotive, aerospace, semiconductor, supply chain, quality, maintenance, or operations environments.
- Experience with automotive manufacturing, stamping, body shop, paint shop, final assembly, battery manufacturing, or powertrain operations.
- Understanding of manufacturing KPIs such as throughput, cycle time, downtime, OEE, JPH, FTT, FRC, scrap, rework, takt time, bottlenecks, quality escapes, and safety events.
- Basic understanding of manufacturing systems such as MES, SCADA, PLCs, historians, CMMS, QMS, ERP, or industrial IoT platforms.
- Familiarity with graph databases or semantic technologies such as RDF, OWL, SPARQL, Neo4j, Stardog, GraphDB, or similar tools.
Benefits
- Immediate medical, dental, vision and prescription drug coverage.
- Flexible family care days, paid parental leave, new parent ramp-up programs, subsidized back-up child care and more.
- Family building benefits including adoption and surrogacy expense reimbursement, fertility treatments, and more.
- Vehicle discount program for employees and family members and management leases.
- Tuition assistance.
- Established and active employee resource groups.
- Paid time off for individual and team community service.
- A generous schedule of paid holidays, including the week between Christmas and New Year's Day.
- Paid time off and the option to purchase additional vacation time.
• Develop machine learning and statistical models to support manufacturing use cases such as anomaly detection, quality prediction, equipment health, process monitoring, throughput improvement, and decision support.
• Apply supervised, unsupervised, and semi-supervised learning methods, including classification, regression, clustering, anomaly detection, time-series analysis, statistical process control, and model explainability.
• Build anomaly detection solutions using methods such as control limits, isolation forests, clustering, Mahalanobis distance, autoencoders, time-series models, and supervised classification where labeled defects are available.
• Evaluate model performance using appropriate metrics, ground truth definitions, validation strategies, false positive and false negative analysis, and business impact measures.
• Identify when data is insufficient, labels are unreliable, ground truth is weak, or a machine learning approach is not appropriate, and communicate those limitations clearly.
• Partner with plant teams and domain experts to understand process behavior, validate assumptions, and determine whether model outputs reflect real operating conditions.
• Lead and support multiple Data Engineering teams, fostering a collaborative, innovative, and high-performance environment.
• Provide technical mentorship and guidance to engineers, helping them grow their skills and capabilities.
• Promote engineering best practices, code quality, and continuous improvement across teams.
• Design, review, and oversee scalable data solutions using Databricks and modern cloud technologies.
• Define and enforce coding standards, architecture principles, and documentation guidelines.
• Conduct code reviews and provide technical feedback to ensure maintainability, performance, and reliability.
• Drive initiatives focused on process optimization, automation, and engineering excellence.
• Stay up to date with industry trends, emerging technologies, and best practices in Data Engineering and Analytics.
• Review and validate technical deliverables to ensure alignment with engineering standards and client expectations.
• Establish and improve quality assurance processes focused on data integrity, reliability, and governance.
• Support the implementation of scalable and sustainable data architecture practices.
• Partner with clients to understand business and technical requirements.
• Act as a trusted technical advisor, ensuring solutions align with client goals and standards.
• Facilitate communication between technical teams, stakeholders, and business partners to ensure successful project delivery.
• Your primary role will be to extract the maximum potential from deep learning and computer vision algorithms, transforming video streams into structured translation data.
• You will work end-to-end: from aligning with the Linguists team to screen reference videos to packaging models for the Engineering team to deploy.
• Actively participate in continuous improvements to multimodal LLM-based models.



