8(a) HUBZone IT consultancy w/ advanced partnerships w/ Amazon Web Services, Microsoft Azure & Google Cloud Platform
Mid-Level Data Engineer
Location
United States
Posted
2 days ago
Salary
0
Seniority
Senior
Job Description
Simple Technology Solutions
- Develop new ETL pipelines and data ingestion processes alongside senior engineers using AWS Glue (Spark-based, PySpark), MWAA (Airflow), Lambda, and SNS
- Integrate the agency's ETL Common Library into Glue jobs for standardized orchestration, error handling, metadata recording, and SNS notifications for all success and error job events
- Ingest structured and semi-structured datasets (CSV, XML, JSON, Avro, pipe-delimited) into S3 landing, raw, and curated zones using Apache Iceberg tables
- Configure static ETL metadata in the centralized PostgreSQL metadata store; ensure dynamic metadata records job status and timestamps for all key execution steps
- Monitor assigned production jobs and participate in operations support rotations
- Ensure ETL Load Reports are populated in real time and ETL Gap Reports are updated weekly
- Build and maintain materialized views and semantic layer objects in Trino and Athena to ensure optimized query performance and consistent business logic
- Produce and maintain required documentation for each assigned dataset: Business Requirements, ETL Design Documents, Data Models, Data Dictionaries, Mapping Documents, Deployment Documents, O&M Guides, and ETL Test Plans
- Write unit and integration tests achieving the 90% minimum code coverage threshold; complete security scans at least once per sprint
- Deploy ETL resources using CloudFormation templates through the agency CI/CD pipeline
- Support transition of ETL jobs from other agency teams and disaster recovery exercises
Job Requirements
- US Citizenship is required
- Bachelor's Degree is required
- Minimum of 3-5 years of position-related experience is required
- Hands-on experience with Python (PEP 8), PySpark, and SQL for ETL pipeline development
- Experience with AWS services including Glue, S3, MWAA (Airflow), Lambda, SNS, and SQS
- Familiarity with Apache Iceberg, Parquet, and ORC file formats and S3 data lake zone concepts
- Experience with PostgreSQL and basic familiarity with Redshift or Oracle
- Familiarity with Trino or Athena for query and semantic layer development
- Experience with CloudFormation, GitHub branching workflows, and CI/CD-integrated deployments
- Ability to produce clear ETL documentation including data models (Mermaid format) and data dictionaries
- Understanding of ETL metadata concepts including static and dynamic metadata, load reports, and gap reports
- Experience in agile development environments with sprint-based delivery
- Experience supporting IV&V and/or User Acceptance Testing (UAT) processes in a federal or technical program environment
- Experience with automated testing frameworks; ability to write unit and integration tests achieving defined code coverage thresholds
- Familiarity with FISMA, NIST 800-53, and OWASP ASVS Level 2 is a plus
- Must be able to work 8am-5pm Eastern Time regardless of home location
- Active federal public trust suitability determination or ability to obtain one required
Benefits
- Flexible work arrangements
- Continuous learning
- Professional development
- Special incentives for team members living in qualified HUBZones
More Data Engineer Jobs
Professor/Author | Content Curation and Validation - POWER QUERY AND ETL (M LANGUAGE) - EdTech | Vitru
Vitru Educação
Vitru Educação is the largest private distance-learning (EAD) institution in Brazil and a national benchmark for on-campus education and Medicine. Together, Vitru, Unicesumar, and Uniasselvi are dedicated to individual and collective development, prioritizing learning and making knowledge a way of life.
Role Description
Vitru Educação is seeking a professor-author to perform technical review, academic validation, and conceptual curation for the course POWER QUERY AND ETL (M LANGUAGE), part of the Power BI and Advanced Data Visualization (Storytelling and DAX) program, offered as a distance-learning (EAD) postgraduate program.
ABOUT THE ACTIVITY:
- Content curation and validation
- Technical review of 3 topics (20 to 30 pages each)
- Analysis of conceptual currency, language, and suitability for the audience
- Checking and, where needed, supplementing references
- Making adjustments directly in the material (not just suggestions)
- Assessment and supplementary activities
- Validation of 3 self-study questions per topic
- Review and validation of 15 multiple-choice questions per topic
- Validation of supplementary materials (1 case study + 10 questions)
- Validation of scripts and recording of 3 video lessons (10 to 15 minutes each)
COURSE SYLLABUS:
- Data connectors
- Data transformation and cleaning with Power Query
- M language
- Parameters and custom functions
Qualifications
- Postgraduate degree, lato sensu or stricto sensu (Specialization, Master's, or Doctorate), in the subject area*
- Familiarity with Google Docs and/or Word
- Organization, deadline management, and clear writing
- Prior experience producing books or educational materials will be considered a plus
- Power BI and Advanced Data Visualization (Storytelling and DAX)
Benefits
- 100% home-office activity
- Specialized support throughout every stage
- Publications registered with an ISBN
- Development of digital authoring skills
- National visibility and academic recognition
Company Description
Vitru values and supports diversity in all its forms. All applications are welcome.
- Build & Enhance Pipelines: Design, develop, and maintain scalable data pipelines to ingest, transform, and enrich complex healthcare data using Databricks and Spark.
- Optimize Data Workflows: Analyze and improve data intake processes and optimize SparkSQL/Python workloads for performance, scalability, reliability, and cost efficiency.
- Design Data Models: Develop and maintain data marts, semantic models, and curated datasets that support analytics products, reporting, and business intelligence initiatives.
- Ensure Data Quality & Reliability: Monitor pipeline health, troubleshoot production issues, implement data validation frameworks, and maintain high standards for data quality and governance.
- Collaborate Across Teams: Partner with product, analytics, and engineering teams to understand business requirements and deliver scalable data solutions.
- Drive Continuous Improvement: Contribute to architecture decisions, engineering standards, automation efforts, and best practices across the data platform.
Supervisor - Electronic Data Interchange
Astrana Health, Inc.
Astrana Health is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. All employment is decided on the basis of qualifications, merit, and business need. If you require assistance in applying for open positions due to a disability, please email us at humanresourcesdept@astranahealth.com to request an accommodation. The job description does not constitute an employment agreement between the employer and employee and is subject to change by the employer as the needs of the employer and requirements of the job change.
Role Description
Supervisor - Electronic Data Interchange (Astrana Health Management Inc.; Alhambra, CA): Develop data structures and software applications, and improve software platforms and systems to support business needs. Design, test, deploy, improve, and maintain software that will be used internally and externally by physicians and patients.
Key Responsibilities
- Apply expertise on emerging or existing massive data to create and scale insights.
- Develop large-scale models to support specific business needs and enhance the company's efficiency.
- Improve system design and architecture to ensure performance and high stability of services.
- Troubleshoot, debug, and upgrade large-scale distributed systems using a systematic problem-solving approach to improve product performance and the whole lifecycle of software services.
- Monitor system scalability and reliability and understand the importance of experimentation frameworks, monitoring, and alerting.
- Produce specifications and determine operational feasibility, document and maintain software functionality, and integrate software components into a fully functional software system.
- Manage project requirements, priorities, and deliverable timelines effectively.
- Collaborate with different functional teams to integrate software applications and technical solutions.
- Work with business users, engineers, and other stakeholders to execute mission-critical projects.
- Identify high-priority engineering tasks/projects and review, analyze, and deliver technical requirements.
- Understand product objectives to align with user demands.
- Collaborate with engineering teams, customers, and PM and PSO teams to fulfill fast-growing technical demand.
- Supervise EDI Analyst I & II (2).
- Telecommuting permitted from within the U.S.
Qualifications
- Bachelor's degree or foreign equivalent in Computer Science, Computer Engineering, Information Systems, or related field.
- Two (2) years of experience as a Software Engineer, Software Developer, or related occupation.
Requirements
- Experience with designing and building large-scale distributed systems.
- Improving software system scalability, performance, and reliability.
- Applying data structures, algorithms, and software engineering principles.
- Object-oriented programming language: Java, Python, or C#.
- Debugging and optimizing complex systems.
- Software Development Life Cycle (SDLC), including testing, deployment, and maintenance.
- Collaborating with cross-functional teams to deliver technical solutions.
Benefits
- Compensation: $129,667 - $140,000 / year.
- Employment Type: Full Time.
- Location: Remote.
Role Description
Join a team where your work truly makes an impact. In this role, you'll leverage your expertise in Data Warehousing and Business Intelligence to design scalable, flexible, and resilient data solutions that support the continued growth and mission of TWIA/TFPA. You'll play a key role in transforming data from multiple sources into meaningful insights, powering critical decision-making across the organization. Working closely with teams throughout our Property & Casualty insurance business, you'll help deliver accurate, actionable intelligence to a wide range of stakeholders, turning complex data into real business value. If you're passionate about building data-driven solutions and collaborating across the enterprise, this is an opportunity to make a lasting impact. Candidates selected for the Data Warehouse Developer position with TWIA/TFPA will be placed at a level that aligns with their demonstrated skills, competencies, and professional experience.
Qualifications
- Four-year college degree, preferably in CS, engineering, or another technical field, or equivalent experience.
- 7-12 years of directly relevant experience in Data Warehouse/Business Intelligence.
- 3-8 years of hands-on experience in building ETL, reporting, and Data Visualization solutions using tools (e.g., SAP Data Services, Informatica, DataStage, Microsoft SSIS packages, QlikView, Tableau).
- 1-5 years of work experience in Guidewire Data Management tools.
- Experience in building conceptual, logical, and physical data models; dimensional modeling preferred.
- Significant experience in software development processes and best practices.
- Understanding of business operations, preferably property and casualty insurance.
- Clear, effective verbal and written communication skills, including the ability to actively listen, problem solve, and communicate with both technical and business users.
- Demonstrated experience working both as an individual contributor and as part of a team.
- Experience in providing guidance to junior developers, quality assurance, and business analysts.
- Attention to detail with self-discipline, strong ownership and accountability, and drive for results.
Requirements
- Develops recommendations and conducts analysis in support of department and business line strategic efforts.
- Confers with clients, technical staff, and team members to plan, design, develop, implement, and enhance applications, scripts, procedures, and metadata for relational databases.
- Champions enterprise data management best practices and provides recommendations for data architecture structure and standards.
- Participates in end-to-end architecture and data warehouse life cycle management activities and provides recommendations.
- Manages numerous requests concurrently and strategically, prioritizing when necessary.
- Designs, develops, and implements the data architecture for a data warehouse, including high-performance ETL (Extract, Transform, and Load) tools and routines and the infrastructure.
- Recognizes and leverages data sources both internally and externally to enhance analysis capabilities.
- Reverse engineers data specifications from source systems through data discovery, analysis, and other techniques.
- Develops and maintains Data Models, including the Enterprise Data Dictionary.
- Establishes methods and procedures for tracking data quality, completeness, redundancy, and improvement.
- Participates with application and infrastructure design architects to provide guidance for development and releases.
- Influences technology direction and/or adjustments to incorporate into business plans.
- Influences the selection of hardware and software product standards and the design of standard configurations; makes significant contributions to market-centric technology roadmaps and architectural principles and frameworks.
- Reviews major architectural designs and ensures they are consistent, maintainable, flexible, and cost-effective solutions.
- Leads retrospectives and defines opportunities for experiments and POCs.
- Performs troubleshooting on all ETL processes and resolves issues effectively within IT service level agreements.
- Leads by example, engaging hands-on in complex data challenges and Information Management projects to derive deep insights from data.
- Oversees business intelligence/data warehouse security and data load and implementation strategies.
- Leads enterprise projects through all phases of the System Development Life Cycle (SDLC)/Agile process.
- Performs other duties as assigned.
Benefits
- Comprehensive medical, dental, vision, group life, AD&D, and dependent life insurance coverage.
- Flexible Spending Accounts (FSA) and Health Savings Accounts (HSA) with a generous annual HSA employer contribution.
- A 401(k) retirement plan with employer matching contributions up to 6%, in addition to a pension plan that begins vesting at 5 years of employment.
- Active support for professional growth, including ongoing training and professional development opportunities sponsored by TWIA, support to obtain professional certifications, and a tuition reimbursement program.
- Generous paid time off, including vacation, sick leave, paid holidays, and personal days.
- Flexible scheduling to support work-life balance.

