Principal/Sr. Software Engineer, Data/ML Platform

Data Engineer | Other | Remote | Lead | Team 11-50

Location

United States

Posted

89 days ago

Salary

0

Seniority

Lead

Job Description

Clarvos LLC

Reports to: Head of ML Data Infrastructure
Department: AI Engineering Platform
FLSA Category: Exempt
Position Type: Full-Time, Mid-Senior Level
Travel Requirement: 0-10%, Quarterly for meetings
Office Location: Remote, US Based

JOB SUMMARY

We are seeking an experienced Principal/Senior Software Engineer, Data/ML Platform to define and lead the technical vision for our large-scale data and machine learning infrastructure. You will architect and build the systems that enable our teams to deliver intelligent, data-driven, and agentic AI products. This is a hands-on leadership role combining deep technical expertise with strategic impact across the organization.

Essential Functions & Key Responsibilities:

- Define and implement the data architecture and software systems that underpin our ML and AI platforms.
- Lead design and development of scalable data pipelines, APIs, and services enabling Data-as-a-Service for ML use cases.
- Architect real-time and batch data serving frameworks for training, inference, and feedback loops.
- Drive engineering excellence and platform scalability across distributed environments.
- Collaborate with AI Platform leadership, MLOps, and backend teams to shape long-term technical strategy.
- Mentor and guide senior engineers, establishing standards for design, testing, and deployment.
- Evaluate emerging technologies and tools to strengthen the platform’s reliability and performance.
- Champion best practices in software architecture, data quality, and performance optimization.

Qualifications & Experience Requirements:

- Bachelor’s in Computer Science, Engineering, or related field (Master’s preferred).
- 7+ years of experience in software engineering, or ML/Data platform development.
- Expertise is preferred in Python, Java, and/or PySpark for distributed data and service development.
- Proven experience architecting cloud-native data and ML systems (preferably GCP + Databricks).
- Deep understanding of system design, data modeling, and distributed computing.
- Demonstrated leadership in scaling large, data-intensive systems and mentoring engineering teams.
- Excellent communication, technical leadership, and stakeholder management skills.
- Strong system design and architecture skills.
- Excellent debugging and troubleshooting abilities.
- Expertise with automated testing.
- Ability to thrive in a highly dynamic, fast-paced environment.

Why This Role Matters

You will define the data foundation of our Agentic-AI platform, transforming raw information into a trusted, scalable data service that directly fuels brand growth solutions in MarTech and AdTech. Your work ensures our Application & ML engineers can build inference-as-a-service products faster, cheaper, and with higher accuracy.

PHYSICAL REQUIREMENTS/WORKING CONDITIONS:

- Standing/Walking/Mobility: Must have mobility to attend meetings remotely and in person.
- Climbing/Stooping/Kneeling: 0% - 10% of the time.
- Lifting/Pulling/Pushing: 0% - 10% of the time.
- Fingering/Grasping/Feeling: Must be able to write, type and use a telephone system 100% of the time.
- Sitting: Sitting for prolonged and extended periods of time.

This job description reflects management’s assignment of essential functions; it does not prescribe or restrict the tasks that may be assigned. Management may revise duties as necessary without updating this job description.

For more information about the company, please visit our website: https://clarvos.com/

Clarvos is an Equal Opportunity Employer and does not discriminate against any employee or applicant for employment because of race, color, sex, age, national origin, religion, sexual orientation, gender identity, status as veteran, disability or any other federal, state or local protected class. Clarvos complies with federal and state disability laws and makes reasonable accommodation for applicants and employees with disabilities. If you require reasonable accommodation in completing the application, interviewing, completing any pre-employment testing, or otherwise participating in the employee selection process, please direct your inquiries to hrsupport@Clarvos.com.

More Data Engineer Jobs

Troveo is the largest licensable video library for AI model training. We partner with thousands of content licensors, ranging from top-tier studios and production houses to leading YouTube creators, to supply video content to the world’s foremost research labs. Our mission is to rapidly deliver massive volumes of video content, to exact specifications, fueling next-generation generative and world-understanding AI models.

Data Engineering is central to our success. Each week, we process petabytes of video data quickly, cost-effectively, and with uncompromising quality. As a data engineer at Troveo, you’ll focus on:

- Lowering costs and reducing turnaround times for processing content.
- Enhancing and transforming video data for our customers, to make it easier to discover and more valuable.

We are seeking a Principal or Senior Data Engineer with demonstrated expertise in Python and large-scale data management. Practical experience with AWS services (S3, EC2, etc.), search, and large databases is essential. Familiarity with video data is a plus, but not required.

Responsibilities

- Data Pipeline Development: Design, build, and maintain scalable, efficient data pipelines in Python.
- AWS Ecosystem: Leverage services like S3 for data storage (including multiple tiers of storage) and EC2 for compute (currently running clusters of 50k G instances), retrieval, and processing in production environments.
- Big Data Handling: Develop and optimize systems to handle petabyte-scale datasets with a focus on performance, reliability, and cost-effectiveness.
- Metadata Generation: Leverage self-hosted open source LLMs and managed APIs to generate reliable metadata to power discovery and enhance the value of the content we deliver.
- Discovery: Build search capabilities from the ground up, leveraging visual, semantic and taxonomic data to deliver the right content to our customers.
- Monitoring & Reliability: Implement robust monitoring, alerting, and logging to ensure smooth data flow and quickly troubleshoot issues.
- Collaboration: Work cross-functionally with data scientists, software engineers, and product teams to understand data needs and deliver optimized solutions.
- Video Processing (Preferred): If applicable, process and manage video data for analytics, quality control, and other use cases.

Required Qualifications

- Python Proficiency: Strong coding skills in Python (including familiarity with libraries for data manipulation and analysis).
- AWS Expertise: Hands-on experience using core AWS services (S3, EC2, possibly Lambda, EMR, or ECS).
- Big Data Skills: Demonstrated ability to work with large-scale datasets (petabyte-level), ensuring high performance and scalability.
- Database & Storage: Familiarity with large Postgres databases.
- Automation & Scripting: Comfortable building CI/CD pipelines and automating repetitive tasks.

Nice to Have

- Video Processing: Experience handling or transforming video data (e.g., transcoding, extracting metadata, compiling FFmpeg).
- Machine Learning Pipelines: Familiarity with ML and Computer Vision workflows or frameworks (OpenCV, TensorFlow, PyTorch, etc.).
- Security Best Practices: Understanding of AWS IAM, encryption, and SOC 2 compliance standards.

What We Offer

- An opportunity to work with massive data sets and cutting-edge technologies in the cloud, serving the biggest companies in tech building the next generation of AI models.
- A collaborative environment with a talented, diverse team of engineers and data experts.
- Competitive compensation and benefits with room for career growth and professional development.
- This job is remote/work from home with the option of meeting up from time to time if you are located in the SF Bay Area.

United States
Other | Remote | Team 201-500

SMSI provides expert management consulting, program and project management, and technical consulting services to government and private sector clients. SMSI has grown and evolved by building an outstanding reputation for client-focused performance and for delivering results that enable clients to meet commitments and milestones.

SMSI is an Equal Employment Opportunity employer; all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.

Job Summary

SMSI LLC is seeking a Solutions Data Architect to assist our client, Los Alamos National Laboratory (LANL). This position will be mostly remote, with occasional trips onsite in Los Alamos, NM as needed.

Responsibilities

Los Alamos National Laboratory has selected SAP for their next-generation Cloud ERP solution. In parallel, they will require a future state data strategy to take advantage of new tools and technology to help modernize their current on-prem operational data store. This work plays a critical role in supporting key institutional initiatives such as enabling real-time analytics, AI/ML capabilities, and enhanced operational efficiency. This role will be critical in the transformation of their legacy data environment into a modern, cloud-native architecture. In concert with institutional knowledge experts and ERP implementation planners and integrators, you will define the data strategy, standards, and blueprints for migrating and integrating high-volume operational data into a new platform, ensuring data integrity, security, and accessibility across the enterprise.

Job Requirements

- 5+ years of progressive experience in data architecture, data modernization, and cloud-based data solutions.
- Extensive experience implementing solutions with one or more of the following: Snowflake, Databricks, and/or SAP data platforms (S/4HANA, Datasphere, etc.).
- Proven experience as a senior data architect or equivalent, with demonstrated leadership on large-scale architecture projects.
- Proven work experience in designing and implementing complex data architecture, with significant experience in data warehousing and large-scale data modernization projects.
- Strong understanding of major cloud platforms and their native data services.
- Extensive data migration experience from on-prem to cloud systems and cloud to cloud.
- Proven ability to work with cross-functional teams, including business analysts, ERP functional consultants, developers, and infrastructure teams.
- Strong stakeholder management skills with a track record of presenting architecture decisions to executive leadership.
- Experience with leading a team and providing technical direction to personnel for a wide variety of projects and initiatives.
- Excellent verbal and written communication skills.
- Candidate must have the ability to work with many team members across multiple disciplines.
- US Citizenship is required.

Preferred Qualifications

- Master’s degree in Computer Science, Information Systems, or related technical discipline.
- Experience with SAP BDC (Business Data Cloud), SAP BTP (Business Technology Platform), and Integration Suite.
- Lead architectural decisions around data platforms (e.g. Snowflake, Databricks, Data Lakes, Data Lakehouses, ETL/ELT, streaming architecture).
- Experience in designing and implementing a comprehensive data catalog to improve data discovery and governance.
- Experience with big data technologies and machine learning concepts.
- ERP certification or training (e.g., SAP Data Architect, Oracle ERP Cloud, Workday Integration).
- Experience implementing or supporting data governance programs, especially in ERP environments.
- Familiarity with data privacy regulations and compliance requirements relevant to ERP data (e.g., SOX, GDPR, HIPAA).
- Active (or recent) Q clearance is a plus but not required.

Education Requirements

- Bachelor’s degree, preferably in computer science, engineering, information systems, or related discipline, along with 12 years of experience. Equivalent combination of education and experience will be considered.
- Certifications are a plus.

United States

Data Product Engineer

FAIR Health is unable to provide sponsorship for this position.

About FAIR Health

FAIR Health, Inc., a national, independent nonprofit organization, was established in October 2009 with the mission to help assure fairness and transparency in health insurance information. FAIR Health has created a database of de-identified commercial healthcare claim records, the largest in the country, which receives claims monthly and which serves as the foundation for a variety of data products, custom analytics and consumer tools. FAIR Health has also been certified by the Centers for Medicare & Medicaid Services as a national “Qualified Entity” and thus holds the entire collection of Medicare Parts A, B and D claims from 2013 to the present, which get routinely refreshed.

Our standard data modules, custom analytics and technological tools serve all participants in the healthcare sector nationwide; they are licensed to payors, third-party administrators, bill review companies, self-insured employers, government agencies, academic researchers and consultants. Our medical and dental data are used to inform statutes and regulations, healthcare cost indices, fee schedules, benefit and provider network design, practice/facility expansion, health systems research and dispute resolution, among other uses. We also offer a suite of consumer-oriented tools and resources available on our consumer website (fairhealthconsumer.org), as well as a mobile application, which can be licensed by other organizations.

Summary of Position

The Data Product Engineer will be responsible for the design, development and implementation of data-driven processes used to create FAIR Health data products. This individual will play a significant role in identifying areas in which data processes can be optimized and automated and will help to implement such changes. Working with multiple FAIR Health stakeholders, the Data Product Engineer will have responsibilities throughout the product research and development life cycle, translating business needs into data-driven deliverables and analyses.

The successful candidate will have a strong technical, analytic and statistical background with the ability to extract information from the database to inform insightful data solutions. This person should be self-motivated with the ability to problem-solve independently. The ideal candidate will have an understanding of healthcare claims data, a working knowledge of healthcare coding concepts (ICD, HCPCS, CPT® etc.) and a broad knowledge of healthcare fee schedules and data products.

Primary Responsibilities

- Design, develop and automate custom data processes using workflow tools to create FAIR Health data products.
- Coordinate efforts across multiple teams to optimize and automate data processes.
- Translate business needs into development requirements and data-driven deliverables.
- Debug, investigate and resolve errors that occur within database-stored procedures.
- Create and understand advanced SQL queries to analyze large datasets.
- Work with database technologies, such as Oracle, and statistical analytical tool sets, such as SAS/R, to evaluate datasets and produce result sets.
- Understand and analyze healthcare claims data to identify trends, patterns and other salient information.
- Design and oversee the implementation of data product enhancements and the creation of new data products.
- Create and execute test plans of functional and nonfunctional test cases that verify requirements and validate functionality within the database and application layers.
- Define and execute regression testing of database and application changes.
- Liaise between cross-functional teams (Information Technology, Quality Assurance and business users) in order to assure the complete and accurate execution of requirements gathering, development and testing efforts, the implementation of data product enhancements and the creation of new data products.

Knowledge and Skill Requirements

Required

- Master’s degree in computer science, statistics, public health or related field, or equivalent work experience.
- 3-7 years’ professional work experience.
- Self-motivated and intellectually curious with strong analytic and problem-solving skills.
- Ability to conduct research and analysis on large data sets, interpret results and communicate findings.
- Proficient in SQL, PL/SQL or equivalent database query language.
- Experience with Python for the purpose of data processing and analytics.
- Strong interpersonal skills, including the ability to work in a highly cross-functional, multilocation organization.
- Ability to work independently and appropriately self-manage multiple projects simultaneously.
- Excellent written and verbal communication skills.
- Expert proficiency with Microsoft Excel and Word.

Preferred

- Advanced data mining skills and experience with SAS, R or related statistical software.
- 2+ years’ experience working for a health insurer or healthcare-related organization (including consulting or technology companies) with responsibility for analysis of large healthcare datasets.
- Knowledge of medical claims data, including, but not limited to, professional claims, outpatient claims, inpatient claims, specialty pharmacy and pharmaceutical claims.
- A working knowledge of healthcare coding (CPT, HCPCS, ICD, revenue codes, DRG).
- Familiarity with healthcare fee schedules, including, but not limited to, Medicare, workers’ compensation and Medicaid.

Location: Remote or work from our New York City or Syracuse office.

Salary is commensurate with experience in the range of $80,000 – $105,000 annually.

Interested candidates should submit their resume here.

FAIR Health, Inc., is an equal opportunity employer and is an E-Verify participant. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity or national origin.

FAIR Health offers a competitive compensation package and includes the following benefits: Medical, Dental, Vision, Flexible Spending and Dependent Care Accounts, Life and Disability Insurance, Paid Time Off, Paid Holidays, 401(k) and Discretionary Bonus.

United States
$80K - $105K / year
Job Closed

Senior Solution/Data Architect – Regulatory Risk, SA‑CCR, Basel III Endgame

Huron

Huron is a global professional services firm elevating the vision of what's possible and then putting it into practice.

Data Engineer | 89 days ago
Full Time | Remote | Team 5,001-10,000 | Since 2002 | H1B Sponsor

• Lead the end‑to‑end SA‑CCR architecture, defining the framework for RC, PFE and EAD calculations, including netting sets, supervisory factors and collateral/CSA logic
• Design a modern Databricks architecture using Delta Lake, Unity Catalog and Spark/PySpark to support versioned, reproducible and regulator‑defensible exposure calculations
• Develop high‑performance compute pipelines that support intraday recalculation, scenario testing and rapid exposure analytics across complex derivatives portfolios
• Establish strong governance and auditability, embedding lineage, versioning, entitlements, and evidence‑ready data flows throughout the architecture
• Architect an AI‑ready platform, enabling explainable, controlled and reproducible AI‑assisted recalculation and optimisation, without compromising regulatory expectations
• Collaborate with stakeholders across Risk, Quants, Technology, Capital/Treasury, Security and Front Office to align architecture with regulatory, operational and strategic requirements
• Own architectural deliverables, including target state blueprints, integration patterns, data contracts, governance standards and audit artefacts
• Support programme milestones, including parallel run, validation, remediation and optimisation, delivering architectural leadership with minimal onboarding.

Poland
Job Closed