Job Closed

This listing is no longer active.

MetroStar

Powering Change

Senior Data Engineer III

Data Engineer | Other | Remote | Senior | Team 201-500 | H1B No Sponsor

Location

United States

Posted

158 days ago

Salary

$138K - $147K / year

Seniority

Senior

Bachelor Degree | 10 yrs exp | English | ETL | Python | SQL

Job Description

  • Design, maintain, and validate data schemas supporting federation and integration between C2SET and external systems
  • Build and support SQL- and Python-based ETL pipelines for operational, simulation, and analytics data
  • Ensure data integrity, correctness, and performance across distributed and multi-tier data sources
  • Troubleshoot data mismatches, malformed messages, schema drift, and integration issues
  • Partner with modeling and simulation engineers to analyze simulation outputs and support tuning of behaviors and decision logic
  • Design and execute data-driven experiments to evaluate model changes and operational impacts
  • Develop datasets, scripts, and tooling to support repeatable validation and performance analysis
  • Support Government analysts and integrators by delivering timely, reliable data refreshes and analysis
  • Perform database performance tuning and optimization for operational workloads
  • Maintain data documentation, metadata repositories, and data governance artifacts
  • Act as a functional lead for data engineering and analytics activities within the integrations team

Job Requirements

  • An active U.S. Government-issued Secret security clearance
  • Bachelor’s degree in Computer Science, Data Engineering, Information Systems, Mathematics, or a related technical field, or equivalent experience
  • More than ten years of experience in data engineering, database engineering, database administration, or applied analytics
  • Strong hands-on experience with SQL and Python, including building and maintaining ETL pipelines
  • Proven experience designing, implementing, and maintaining complex data schemas and data models
  • Experience working with structured and semi-structured data formats such as JSON and XML
  • Ability to design, validate, and troubleshoot data exchanges across distributed or federated systems
  • Experience with database performance tuning, indexing strategies, and query optimization
  • Experience analyzing large or complex data sets and translating results into actionable technical recommendations
  • Ability to collaborate effectively with software engineers, modeling and simulation engineers, and Government stakeholders

Benefits

  • Health, dental, and vision insurance
  • 401(k) retirement plan with company match
  • Paid time off (PTO) and holidays
  • Parental leave and dependent care
  • Flexible work arrangements
  • Professional development opportunities
  • Employee assistance and wellness programs

More Data Engineer Jobs

Data Migration Specialist

Prospyr Medical

A HIPAA compliant solution that makes it easy for Aesthetics providers to manage and grow their practices.

Data Engineer | 158 days ago
Other | Remote | Team 1-10 | H1B No Sponsor

  • Own end-to-end data migrations for new and expanding customers
  • Import and validate patient, appointment, invoice, payment, provider, service, and membership data
  • Map data from a wide range of legacy systems (EMRs, POS tools, spreadsheets, exports)
  • Identify, clean, normalize, and reconcile inconsistent or incomplete data
  • Perform QA checks to ensure data accuracy, completeness, and integrity post-migration
  • Work directly with customers during onboarding to define migration scope and timelines
  • Explain data requirements, limitations, and tradeoffs in a clear, customer-friendly way
  • Support go-live readiness by ensuring migrated data aligns with customer workflows
  • Troubleshoot and resolve migration issues quickly and accurately
  • Maintain and improve migration playbooks, templates, and checklists
  • Document repeatable patterns for common legacy systems
  • Partner with Engineering and Product to improve migration tooling and automation
  • Surface recurring data issues and upstream product improvements
  • Partner closely with Customer Experience and Implementation teams on go-live execution
  • Coordinate with Engineering on complex migrations or edge cases
  • Provide internal visibility into migration status, risks, and blockers

United States
Data Engineer | 158 days ago
Full Time | Remote | Team 201-500 | H1B No Sponsor

  • Act as the transition point between Prompt Engineering and Data Labeling, translating model and product requirements into concrete data and annotation workflows.
  • Design, implement, and maintain scalable data workflows for dataset generation, curation, and ongoing maintenance.
  • Ensure data quality and consistency across labeling projects, with a focus on operational reliability for production AI systems.
  • Create, review, and maintain high-quality annotations across multiple modalities, including text, audio, conversational transcripts, and structured datasets.
  • Identify labeling inconsistencies, data errors, and edge cases; propose and enforce corrective actions and improvements to annotation standards.
  • Utilize platforms such as Labelbox, Label Studio, or Langfuse to manage large-scale labeling workflows and enforce consistent task execution.
  • Use Python and SQL for data extraction, validation, transformation, and workflow automation across labeling pipelines.
  • Leverage LLMs (e.g., GPT-4, Claude, Gemini) for prompt-based quality checks, automated review, and data validation of annotation outputs.
  • Implement automated QA checks and anomaly-detection mechanisms to scale quality assurance for large datasets.
  • Analyze annotation performance metrics and quality trends to surface actionable insights that improve labeling workflows and overall data accuracy.
  • Apply statistical analysis to detect data anomalies, annotation bias, and quality issues, and partner with stakeholders to mitigate them.
  • Collaborate with ML and Operations teams to refine labeling guidelines and enhance instructions based on observed patterns and error modes.
  • Work closely with Prompt Engineering, Data Labeling, and ML teams to ensure that data operations align with model requirements and product goals.
  • Document data standards, annotation guidelines, and workflow best practices for use by internal teams and external labeling partners.

India
Job Closed
Data Engineer – Geolocation Team

IPinfo.io – IP Data Provider

We're the trusted source for IP address data, handling over 40 billion API requests per month for 500,000+ users.

Data Engineer | 158 days ago
Other | Remote | Team 11-50 | H1B No Sponsor

  • Design, build, and operate data collection and analysis pipelines
  • Work with large-scale internet measurement data (we collect 75+ TB per week, including BGP, DNS, ping, and traceroute data from 1200+ global vantage points)
  • Research, apply, and implement techniques from cutting-edge internet measurement research
  • Maintain a high bar for signal quality and defensibility, prioritizing observable network behavior over heuristics or guesswork
  • Communicate findings clearly by contributing to blog posts, technical documentation, and research publications, both internally and externally

United States
Job Closed
Senior Data Engineer

fusionSpan

Bridging Gaps Through Technology.

Data Engineer | 160 days ago
Other | Remote | Team 51-200 | Since 2011 | H1B Sponsor

  • Utilize extract/transform/load (ETL) technologies using Snowflake and other cloud data platforms
  • Interpret data, analyze results using statistical techniques, and provide ongoing reports
  • Develop and implement databases, data collection systems, data analytics, and other strategies that optimize statistical efficiency and quality
  • Acquire data from primary or secondary data sources and maintain databases/data systems
  • Evaluate and optimize data structures
  • Identify, analyze, and interpret trends or patterns in complex data sets
  • Filter and “clean” data by reviewing computer reports, printouts, and performance indicators to locate and correct code problems
  • Monitor, troubleshoot, and improve pipeline transparency, performance, scalability, and reliability, using Snowflake OpenFlow and related ELT/ETL tools
  • Ensure AI/ML readiness of data by preparing and maintaining semantic models, ensuring robust data quality, and establishing and enforcing data access
  • Produce field mapping and translation documentation for use in both manual and scripted migrations
  • Work within Agile methodology, managing tasks and tickets as assigned
  • Communicate with clients and team members for requirements gathering, clarification, and planning for data conversions
  • Document work and work processes for use by team members

United States
Job Closed