Job Closed

This listing is no longer active.

Staff Data Science Engineer

Data Engineer Full Time Remote Lead Team 201-500

Location

United States

Posted

44 days ago

Salary

$183.8K - $184K / year

Seniority

Lead

AI/ML Observability/Monitoring Python NumPy Pandas JAX TensorFlow PyTorch Docker Kubernetes unittest pytest scikit-learn

Job Description

Role Description

Staff Data Science Engineer sought by SimSpace Corporation (Boston, MA).

- Design, implement, and deploy advanced mathematical and machine-learning algorithms (e.g., supervised, unsupervised, reinforcement learning, NLP, anomaly detection) to support cyber-range simulations, delivering production models with documented accuracy, latency, and throughput metrics.
- Develop and maintain end-to-end AI/ML pipelines (data ingestion, feature engineering, model training, validation, inference, monitoring), ensuring test coverage, reproducibility of experiments, and documented performance benchmarks.
- Construct and optimize numerical methods and computational models using Python, NumPy, SciPy, Pandas, and JAX/TensorFlow/PyTorch to solve large-scale (10M+ row) data and optimization problems relevant to cyber-range operations.
- Architect scalable model-serving systems in Docker/Podman/Kubernetes, achieving reliable deployments with measured service uptime of 99 percent or greater and documented resource-utilization improvements.
- Develop and integrate new AI-driven cybersecurity capabilities (e.g., automated scoring engines, classification systems, reinforcement-learning-based adversary behaviors) with quantified gains in accuracy, precision/recall, or scenario realism, validated against internal evaluation datasets.
- Author and maintain production-quality Python services, enforcing code standards, implementing unit/integration testing with unittest/pytest, and reducing defect rates via measurable static/dynamic analysis reports.
- Design, evaluate, and improve model performance using quantitative metrics (e.g., AUC, F1, perplexity, reward curves, convergence rates), generating written model-evaluation reports used in release readiness decisions.
- Perform algorithmic research on emerging ML/AI/cyber methods, producing technical assessments, prototypes, and feasibility studies that directly inform quarterly engineering and product roadmaps.
- Lead cross-team technical initiatives, producing written design documents, conducting architecture reviews, and driving the integration of DS/AI services across engineering, product management, platform teams, and cybersecurity content engineering.
- Mentor senior-level engineers and data scientists by conducting formal code reviews, mathematical model reviews, and algorithm correctness checks, with documented feedback that improves model accuracy, stability, or performance.
- Apply computational mathematics methods (e.g., linear algebra, numerical optimization, differential equations, stochastic processes) to design, implement, and validate algorithms and models with documented quantitative results.
- Produce internal documentation (design specs, API references, model cards, validation reports) ensuring compliance with internal engineering, security, and AI governance standards.
- Define and establish technical standards, best practices, and design patterns for AI/ML development across the Data Science team.
- Drive high-performance computing initiatives to optimize AI/ML system performance, including distributed computing and GPU acceleration strategies.
- Collaborate with cross-functional teams, including product development, engineering, cybersecurity content developers, and external stakeholders to align technical solutions with organizational objectives.
- Prepare and deliver technical reports, presentations, and briefings to leadership, stakeholders, and customers on project status, technical approaches, and strategic recommendations.
- Evaluate and recommend new technologies, tools, and methodologies to advance SimSpace's AI/ML and cybersecurity capabilities.
- Attend and participate in team and company meetings as well as contribute to strategic planning and technical roadmap development.
- May work remotely from anywhere in the US.

Qualifications

- Ph.D. in Computational Mathematics, Computer Science, Applied Mathematics, or a closely related field.
- 1 year of experience in computational mathematics, scientific computing, machine learning, data science, or algorithm development. Experience may be gained through employment, research, or doctoral work.
- Demonstrated experience applying mathematical or machine-learning algorithms (e.g., regression, classification, clustering, reinforcement learning, NLP, numerical optimization) to datasets of at least 1 million observations or high-dimensional data.
- Demonstrated experience developing scientific or ML software in Python using at least three of the following packages: NumPy, Pandas, SciPy, Matplotlib.
- Demonstrated experience implementing, training, and validating machine-learning models using at least three of the following frameworks: PyTorch, TensorFlow, JAX, scikit-learn.
- Demonstrated experience writing automated tests for ML or scientific code using at least two of the following: unittest, pytest, hypothesis.
- Demonstrated experience building and deploying containerized applications using at least one of the following: Docker, Podman, Kubernetes.
- Demonstrated experience producing documented research or production-quality software artifacts (e.g., peer-reviewed publications, open-source contributions, internal enterprise algorithms or models) demonstrating algorithm correctness or performance validation.
- Demonstrated experience applying computational mathematics methods (e.g., linear algebra, numerical optimization, differential equations, stochastic processes, network or graph analysis) to design or evaluate algorithms or models, with documented quantitative results.
- Demonstrated understanding of statistics, computational complexity and performance, parallelization, databases, optimization, linear programming, hypothesis testing, research methodology, and existing scientific literature and results in the field of data science and AI/ML.

Requirements

- *Experience may be gained through academic coursework during or after master’s or PhD degree.
- *Experience may be gained concurrently.
- **Demonstrated knowledge or experience is equivalent to at least 6 months of experience as it cannot be learned during a reasonable period of on-the-job training.
- **May work remotely from anywhere in the US.

Benefits

- Salary Range: $183,801-$184,000
- Please e-mail resume to careers@simspace.com.

Company Description

SimSpace is an Equal Opportunity Employer:

- In compliance with federal law, all persons hired will be required to verify identity and eligibility to work in the United States and to complete the required employment eligibility verification document form upon hire.
- SimSpace is committed to providing an inclusive and welcoming environment for all members of our staff, clients, volunteers, subcontractors, and vendors.
- Research shows that women and people from underrepresented groups only apply to jobs if they meet all of the qualifications. However, no one ever meets 100% of the qualifications. SimSpace encourages you to break that statistic and to apply.
- We also consider qualified applicants regardless of criminal histories, in accordance with applicable law. We are committed to providing reasonable accommodations for qualified individuals with disabilities in our job application procedures.
- SimSpace does not accept unsolicited resumes from employment agencies.
- Actual compensation for the position is based on a variety of factors, including, but not limited to, affordability, skills, qualifications and experience, and may vary from the range.

More Data Engineer Jobs

Product Owner-Data

MDxHealth

Mdxhealth seeks talented people who are passionate about improving the diagnosis and treatment of cancer patients. Mdxhealth is building a world-class healthcare company, providing significant career development and financial opportunities.

Data Engineer, 44 days ago

Full Time Remote

Role Description

The Product Owner – Data will serve as the primary owner of mdxhealth’s data product ecosystem, spanning laboratory, clinical, and commercial data sources. This role will be responsible for shaping and executing the product vision for enterprise data capabilities, including:

- Data architecture
- Data quality
- Analytics
- Future AI-driven initiatives

This individual will act as the subject matter expert on mdxhealth’s business needs related to data and data analysis, bridging healthcare operations, clinical science, and technical execution.

Qualifications

- 3+ years of experience as a Product Owner, Business Analyst, or similar role in an agile environment
- Strong background in healthcare IT, data platforms, and data structures
- Hands-on experience working with complex datasets and multiple enterprise data sources (e.g., LIMS, CRM, clinical systems)
- Agile or Product certifications (CSPO, CSM, SAFe PO/PM) - Preferred
- Proven ability to translate business and analytical needs into clear product requirements
- Strong understanding of agile product delivery and backlog management
- Excellent communication, facilitation, and stakeholder management skills

Requirements

- Hiring salary range: $130,000 - $160,000. The actual rate will be determined based on experience and other factors permitted by law.

Benefits

- Comprehensive compensation and benefits package
- Competitive salary
- Company paid medical, dental, vision, and life insurance coverage
- 401(k) with company match
- Generous employee discounts
- Casual, but driven work environment
- Ability to make a real difference as a key contributor to our growth

Company Description

Mdxhealth seeks talented people who are passionate about improving the diagnosis and treatment of cancer patients. Mdxhealth is building a world-class healthcare company, providing significant career development and financial opportunities. Mdxhealth is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or protected veteran status and will not be discriminated against on the basis of disability.

Accessibility: If you need an accommodation as part of the employment process, please contact Human Resources at: 866-259-5644.

AI LIMS CRM

United States

$130K - $160K / year

Data Modeller

CGI

CGI, established in 1976, is one of the world’s leading information technology and business-process service firms. With more than 70,000 team members in 40 co

Data Engineer, 44 days ago

Full Time Hybrid

Title: Data Modeller
Location: Melbourne, Victoria, Australia
Hybrid, Full-time

Job Description:

Position Description:

Recognised as one of the world's largest IT and business consulting firms, CGI has offices across Australia, supporting local public and private sector clients to solve real business problems. We are looking to hire a Data Modeller who will focus on translating complex business requirements into clear, scalable, conceptual, logical, and physical data models that support analytics, reporting, and operational needs. Working closely with data engineers, analysts, and business stakeholders, the Data Modeller ensures that data is well-organized, standardized, and aligned with governance and quality frameworks to enable efficient data-driven decision making across the organization.

Flexible work is available, including hybrid work from the client's site at Port Melbourne.

Your future duties and responsibilities:

- Contribute to the design, development, and continuous refinement of conceptual, logical, and physical data models within the Data Transformation team.
- Create and maintain SQL DDL scripts, along with detailed mapping and ETL documentation, to support Data Engineers in constructing and loading data models.
- When required, carry out reverse engineering of existing data models from databases or SQL code to ensure alignment with current architecture and standards.
- Develop and update business documentation, including process maps, taxonomies, and ontology diagrams, to provide clarity and traceability of data flows.
- Maintain a strong emphasis on conceptual and business data modelling, ensuring that structures align with organisational objectives.
- Interpret and translate business requirements into scalable data models that support long-term analytical and operational needs.
- Assist in managing and maintaining controlled vocabularies and the corporate data catalogue to promote consistency and reuse of data assets.
- Participate in modelling workshops and collaborative sessions with other Data Modellers to align on best practices and design approaches.
- Adhere to existing Data Quality and Data Governance frameworks, contributing to their ongoing enhancement and ensuring compliance with modelling standards.
- Build and maintain effective working relationships with subject matter experts and business stakeholders across the organisation.
- Keep stakeholders and senior management informed of prioritisation decisions, project progress, and delivery timelines, managing expectations clearly and proactively.

Required qualifications to be successful in this role:

- Sound understanding of data modelling methodologies, including Kimball, Inmon, Top-down/Bottom-up, Relational and Dimensional Modelling, Data Warehousing, and 3NF approaches.
- Ability to think conceptually and apply modelling techniques such as generalisation, subtyping, and super-typing to create efficient and flexible models.
- Skilled in producing Entity Relationship Diagrams (ERDs) using a range of notations, such as Crow's Foot and UML.
- Strong technical understanding of databases, ETL/ELT pipelines, and programming languages (typically SQL), with the ability to connect these technologies to data modelling practices. (This is a hands-on role involving active work with data.)
- Solid comprehension of business processes, with the ability to capture requirements accurately and translate them into effective technical designs.
- Confident communicator, capable of engaging in technical discussions with both technical and non-technical audiences across all organisational levels.
- Experience with cloud-based data technologies, particularly within the Microsoft Azure ecosystem, including:
  - Azure Data Lake
  - Azure Data Factory
  - Azure Databricks (SQL and Python)
  - Azure SQL Server
  - Azure DevOps / Git

Skills:

- GIT
- SQLite

What you can expect from us:

Together, as owners, let's turn meaningful insights into action.

Life at CGI is rooted in ownership, teamwork, respect and belonging. Here, you'll reach your full potential because…

You are invited to be an owner from day 1 as we work together to bring our Dream to life. That's why we call ourselves CGI Partners rather than employees. We benefit from our collective success and actively shape our company's strategy and direction.

Your work creates value. You'll develop innovative solutions and build relationships with teammates and clients while accessing global capabilities to scale your ideas, embrace new opportunities, and benefit from expansive industry and technology expertise.

You'll shape your career by joining a company built to grow and last. You'll be supported by leaders who care about your health and well-being and provide you with opportunities to deepen your skills and broaden your horizons.

Come join our team, one of the largest IT and business consulting services firms in the world.

SQL ETL Azure Databricks Python Microsoft SQL Server Azure DevOps Git SQLite

Australia

Staff Data Engineer

Imagine Pediatrics

Reimagining pediatric health care. Together.

Data Engineer, 44 days ago

Full Time Remote Team 51-200 H1B No Sponsor

• As a Staff Data Engineer at Imagine Pediatrics, you will be the first dedicated Data Engineer on a hybrid team with Analytics Engineers, responsible for defining how data moves through our platform and owning the data pipelines that power clinical analytics, operational reporting, and external integrations.
• You will ensure that data ingestion and integration decisions are made with a clear understanding of downstream analytical usage, including how data freshness, grain, and structure impact downstream processes and systems.
• You will partner closely with Analytics Engineers, Product Engineers and Platform Engineers to deliver a platform built for a high-growth, mission-driven healthcare organization.
• Design, build, and maintain scalable ELT pipelines that ingest data from clinical systems, APIs, and third-party integrations.
• Architect and manage event-driven data pipelines in AWS, including cross-account configurations and dead-letter queue handling.
• Write and maintain infrastructure-as-code to deploy and manage data ingestion workloads, primarily extending existing modules and patterns.
• Orchestrate pipeline execution and monitoring using Dagster, ensuring observability and reliability across all workflows.
• Implement data quality checks, alerting, and lineage tracking across the pipeline.
• Identify and eliminate systemic failure modes in pipelines, improving reliability through long-term fixes rather than repeated incident remediation.
• Partner with Analytics Engineers to ensure upstream data supports correct and consistent downstream models.
• Set technical direction for data architecture and mentor other engineers.

AWS Cloud JavaScript Python SQL Terraform TypeScript Go

United States

$180K - $200K / year

Data Engineer

ARB Interactive

Building interactive technology

Data Engineer, 44 days ago

Full Time Remote Team 51-200 Since 2022 H1B No Sponsor

• Build and maintain ETL/ELT pipelines that move and transform data reliably across our stack
• Model clean, well-documented datasets to support analytics, reporting, and experimentation
• Collaborate with data analysts and product teams to improve data quality and accessibility
• Contribute to data quality monitoring and alerting to catch issues early
• Help instrument new product features and events alongside engineering and product teams
• Write readable, tested, well-documented SQL and Python code
• Participate in code reviews, give and receive constructive feedback
• Learn from senior engineers and contribute ideas to how we build and scale our data infrastructure

Airflow Amazon Redshift AWS Azure BigQuery Cloud ETL Google Cloud Platform Python SQL Terraform

United States
