Job Closed
This listing is no longer active.
Data Engineer I
Location
Maryland
Posted
27 days ago
Salary
$76.4K - $98.1K / year
Seniority
Senior
Job Description
Data Engineer I
eSimplicity
- Develop production-grade ETL workflows using Python and Microsoft-based frameworks to ingest, transform, and validate large-scale structured and unstructured data.
- Implement schema enforcement, data validation, and quality checks to maintain integrity across diverse sources.
- Optimize pipelines for performance, scalability, and fault tolerance using open-source and cloud-native patterns.
- Manage Azure-based data solutions, including Data Lake Storage, Azure SQL, and cloud storage access from Python services.
- Deploy workflow orchestration using Azure Data Factory or Foundry for scheduling, monitoring, and automation.
- Ensure secure integration of APIs and services within the Microsoft ecosystem for seamless data exchange.
- Build Python-based data services leveraging libraries such as Pandas, PyTorch, and other open-source frameworks for high-performance processing.
- Implement logging, monitoring, and performance tuning for robust operational reliability.
- Develop API endpoints and microservices to enable interoperability with analytics and ML platforms.
- Work closely with data scientists, analysts, and cloud architects to deliver clean, reliable data for predictive modeling and real-time dashboards.
- Apply data governance best practices, ensuring compliance, reproducibility, and auditability across workflows.
- Contribute to Agile team processes, driving iterative delivery, shared problem-solving, and continuous improvement.
- Engage closely with project managers, technical leads, client representatives, and cross-functional teams to provide timely updates, resolve issues, and ensure alignment with business goals.
- Translate technical specifications into code and design documents.
Job Requirements
- Bachelor’s degree or equivalent professional experience in Data Science, Computer Science, Engineering, or related field.
- All candidates must pass public trust clearance through the U.S. Federal Government.
- 3+ years developing and deploying advanced statistical and machine learning models or supporting data pipelines for such models.
- Proficiency in Python (Pandas required; scikit-learn, NumPy, and related libraries preferred).
- Strong SQL skills and experience integrating data from relational databases.
- Familiarity with open-source data processing libraries (Pandas, PyTorch, TensorFlow, etc.).
- Open-source frameworks for production-grade data pipelines.
- ETL development using Python and Microsoft technologies.
- Data validation, schema enforcement, and quality assurance.
- API development within Microsoft ecosystem.
- Performance optimization, logging, and monitoring for large-scale systems.
- Azure Data Lake Storage integration and Azure SQL connectivity.
- Workflow orchestration with Azure Data Factory.
- Deployment and operation of Python-based data services in Azure.
- Strong attention to detail with a commitment to delivering high-quality and accurate work.
- Excellent communication skills, both written and verbal, with the ability to collaborate effectively across teams.
- Proven ability to manage time and prioritize tasks in a fast-paced environment.
- Demonstrated problem-solving skills with a proactive and solution-oriented mindset.
Benefits
- Highly competitive salary
- Full healthcare benefits
More Data Engineer Jobs
- Architect data infrastructure roadmap centered on a Lakehouse architecture
- Establish data governance, cataloging, and lineage frameworks
- Partner with OT and Engineering teams to operationalize IIoT and supply chain data
- Oversee transition from legacy BI tools to modern analytics platforms
- Lead the transition to MLOps and DataOps methodologies
- Collaborate with business unit leaders to identify data products for growth
- Build and mentor a high-performing team of data engineers, ML engineers, and data architects
Senior Data Specialist
Alex Staff Agency
We help the best professionals and companies find each other despite borders.
Role Description
We need someone who understands data deeply and uses Python to wrangle it: not a platform engineer, not a pure pipeline builder, but a data specialist who's comfortable with research, investigation, and the unglamorous work of making messy energy market data actually usable.
You'll spend significant time on tasks like:
- Mapping BM units to power plants and fuel types
- Reconciling legacy data formats with current ones
- Ensuring consistency between different Elexon message types
- Cleaning time-series data (outliers, gaps, overlaps)
Some of this requires genuine investigation: cross-referencing sources, making judgment calls, documenting edge cases. There's no API that solves these problems for you. Python is your primary tool (Pandas, NumPy, standard libraries) to minimise manual effort, but you should be comfortable that some detective work is unavoidable.
If you find satisfaction in truly understanding a dataset's structure and quirks, rather than just piping data through and hoping for the best, this role is for you.
Qualifications
- Strong Python skills for data work: fluent with pandas, comfortable writing clean, testable code, and able to build reusable data processing logic.
- Solid SQL skills: complex queries, window functions, CTEs in PostgreSQL.
- Experience with messy, real-world data: reconciliation, cleaning, or mapping work.
- Methodical and detail-oriented: you notice inconsistencies and want to understand root causes.
- Good documentation habits: you understand that undocumented mappings and assumptions are technical debt.
- Self-directed: able to own ambiguous problems, do research, and communicate findings clearly.
Requirements
- Map BM units from Elexon to their corresponding power plants, substations, and fuel types.
- Map substations to ETYS zones and grid supply points.
- Build and maintain reference/master datasets that link identifiers across disparate sources.
- Document mappings, assumptions, and known limitations clearly for downstream users.
- Reconcile legacy data formats with current formats.
- Ensure consistency between different Elexon message types.
- Investigate discrepancies between data sources and determine authoritative values.
- Clean time-series data: detect outliers, fill gaps appropriately, resolve overlapping or duplicate timestamps.
- Develop reusable Python-based cleaning routines that can be applied across datasets.
- Write and maintain Python data grabbers for energy market APIs.
- Build dbt models to transform raw data into clean, analysis-ready datasets.
- Orchestrate workflows via GitHub Actions.
- Design PostgreSQL schemas that reflect your understanding of the domain.
Benefits
- Remote-first with async collaboration (Slack, GitHub, documented decisions).
- Core overlap with UK business hours expected (at least 4 hours daily).
- Competitive compensation based on location and experience.
- Plenty of opportunities for learning and professional growth.
- B2B contract with paid vacation.
- Design and implement enterprise data architecture leveraging Snowflake for structured and unstructured data
- Develop scalable data models, data lakes, and data warehouses to support analytics, AI/ML, and reporting
- Define and enforce data governance, security, and compliance frameworks (HIPAA, FDA, GDPR)
- Architect data pipelines and ETL/ELT processes using modern tools (e.g., Snowpipe, dbt, Informatica, Azure Data Factory)
- Collaborate with cross-functional teams (Engineering, Clinical, Regulatory, Quality, and Business stakeholders)
- Ensure data integrity, lineage, and traceability aligned with medical device standards (e.g., ISO 13485, GxP)
- Optimize Snowflake performance, cost management, and workload management
- Support cloud data architecture (AWS, Azure, or GCP) and hybrid environments
- Enable advanced analytics, including AI/ML and real-world evidence (RWE) use cases
- Establish best practices for data lifecycle management (ingestion, storage, archival, and retention)
- Design, develop, and maintain reliable software in line with technical requirements, focusing on performance and availability
- Analyze requirements, review designs, and estimate user stories following the project methodology (Agile, Waterfall, etc.)
- Proactively propose code refactoring and optimization improvements in line with software development best practices and coding standards
- Help maintain and improve high-quality standards within the developer community by sharing knowledge, conducting tech talks, and participating in the internal promotion verification process
- Stay up to date with modern technology and obtain professional certifications
- Support less experienced developers by providing training and by distributing and monitoring tasks


