GitHub is the world’s leading AI-powered developer platform with 150 million developers and counting. We’re also home to the biggest open-source community on earth (and 99% of the world’s software has open-source code in its DNA). Many of the apps and programs you use every day are built on GitHub. Our teams are dreamers, doers, and pioneers, leading the way in AI, driving humanitarian efforts around the globe, and even sending open source to Mars (and beyond!). At GitHub, our goal is to create the space you need to do your best work. We’re remote-first and offer competitive pay, generous learning and growth opportunities, and excellent benefits to support you, wherever you are—because we know that people flourish when they can work on their own terms. Join us, and let’s change the world, together.
Software Engineer II, Data Engineering
Location
United States
Posted
50 days ago
Salary
$83.4K - $221.4K / year
Seniority
Mid Level
Job Description
Software Engineer II, Data Engineering
GitHub, Inc.
About GitHub
GitHub is the world’s leading platform for agentic software development, powered by Copilot to build, scale, and deliver secure software. Over 180 million developers, including more than 90% of the Fortune 100 companies, use GitHub to collaborate, and more than 77,000 organisations have adopted GitHub Copilot.

Locations
In this role you can work from Remote, United States

Overview
As a software engineer at GitHub, you will enhance the collaboration experience at GitHub by working closely with a community of engineers and designers as part of a distributed, diverse and passionate team delivering the services that millions of developers depend on. In this role you will design, prototype, implement, ship and support highly performant and inspiring user experiences with your team. We are looking for creative problem solvers and diverse thinkers, people who care about culture as well as customers and features. We believe that how we do things is as important as what we do. Big vision, a common purpose, passion for quality, curiosity, dedication, and investment in fun and collaboration are what lead to great results. Great products reflect the teams that build them.

Responsibilities
- Design, develop, test and ship high-quality technical solutions that scale across multiple GitHub services; become intimately familiar with the systems you build and take pride in writing maintainable code.
- Provide technical leadership, mentorship, pairing opportunities, and code reviews to encourage the growth of others; support teams in producing extensible and maintainable code, ensuring integration with downstream dependencies and adherence to quality standards.
- Own and advocate for the health and quality of the systems that the team builds, including participating in on-call for first responder rotations and live incidents.
- Write architecture briefs and proposals and carry out code experiments.
- Design and implement APIs to facilitate seamless integration between software components.
- Utilize CI/CD tools to set up automated pipelines for continuous integration and delivery.
- Collaborate with cross-functional teams, partner with stakeholders, and lead discussions for technical solutions, including design and cost considerations.
- Create and guide others in 1) developing clear testing plans to assure solution quality, reliability, and performance; 2) defining success metrics; and 3) integrating customer feedback for continuous improvement, all while ensuring system architecture meets security and compliance standards.
- Maintain executional and operational excellence within and potentially across teams/organizations.
- Apply debugging tools and telemetry to verify assumptions, proactively resolve issues, and optimize code performance and maintainability.

Qualifications
Required Qualifications:
- 2+ years experience in Software Engineering, Computer Science, or related technical discipline with proven experience maintaining and delivering production software in languages including, but not limited to, C, C++, C#, JavaScript, Go, Ruby, Rust, or Python
- OR Associate’s Degree in Computer Science, Electrical Engineering, Electronics Engineering, Math, Physics, Computer Engineering, or related field AND 3+ years experience
- OR Bachelor's Degree in Computer Science, Electrical Engineering, Electronics Engineering, Math, Physics, Computer Engineering, or related field AND 2+ years experience in Computer Science, or related technical discipline with proven experience coding in languages including, but not limited to, C, C++, C#, JavaScript, Go, Ruby, Rust, or Python
- OR equivalent experience.
Preferred Qualifications:
- Demonstrated experience with large-scale system architecture and design, particularly in cloud-based environments, with a strong understanding of distributed systems and microservices.
- Experience in one or more scripting languages (e.g., Bash, Python, or a similar language)
- Experience with cloud environments and/or Cloud Native Computing Foundation (CNCF) concepts
- Experience working with both relational (e.g. MySQL) and, most importantly, non-relational datastores (e.g. Cosmos DB)
- Experience working with Azure resources such as Azure Storage (blob and table particularly), Azure Redis Cache, Azure Data Explorer Clusters.
- Experience operating Cosmos DB clusters at scale.

Compensation Range
The base salary range for this job is USD $83,400.00 - USD $221,400.00 /Yr. These pay ranges are intended to cover roles based across the United States. An individual's base pay depends on various factors including geographical location and a review of the applicant's experience, knowledge, skills, and abilities. At GitHub certain roles are eligible for benefits and additional rewards, including annual bonus and stock. These rewards are allocated based on individual impact in role. In addition, certain roles also have the opportunity to earn sales incentives based on revenue or utilization, depending on the terms of the plan and the employee's role.

GitHub values
- Customer-obsessed
- Ship to learn
- Growth mindset
- Own the outcome
- Better together
- Diverse and inclusive

Manager fundamentals
- Model
- Coach
- Care

Leadership principles
- Create clarity
- Generate energy
- Deliver success

EEO Statement
GitHub is made up of people from a wide variety of backgrounds and lifestyles. We embrace diversity and invite applications from people of all walks of life. We don't discriminate against employees or applicants based on gender identity or expression, sexual orientation, race, religion, age, national origin, citizenship, disability, pregnancy status, veteran status, or any other differences. Also, if you have a disability, please let us know if there's any way we can make the interview process better for you; we're happy to accommodate!
More Data Engineer Jobs
Senior Data Engineer
UnitedHealth Group
UnitedHealth Group is a healthcare and well-being company that’s dedicated to improving the health outcomes of millions around the world. We are comprised of
Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by diversity and inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health equity on a global scale. Join us to start Caring. Connecting. Growing together.

Use your sharp technical (ETL / SQL Server) skills alongside your ability to quickly grasp operational data to help support short- and long-term operational / strategic business activities. You’ll develop, enhance, and maintain our databases and ETLs. You’ll leverage SSIS daily to wrangle data into our environment from untapped enterprise databases. Effectively you’ll be a pseudo-DBA directly embedded within the engineering team. This is a role that, if done well, increases the effectiveness of our entire team! You will leverage AI tools including GitHub Copilot on a daily basis to get work done faster as well as to automate work.

To succeed, you will be empowered to proactively and continuously refine our SQL Server structure, permissions, schemas, custom objects, standards, as well as the content therein. We will look forward to you advising and executing on best practices based on your extensive history in the field. You will be responsible for routine monitoring of the health of our environment. Ideally, the individual who fills this role will become the go-to expert, overseer, advisor, and implementer of our SQL Server ambitions; we want a best-in-class, meticulously maintained but fully pragmatic environment that can stand toe-to-toe with any other at UHG.

We’re looking for that rare individual who can not only cast a vision but also deftly execute on it in such a way as to elate those that are supported by the fruit of this labor. We have been gifted with talents of understanding, manipulating, and conveying data in many forms. We aim for excellent quality and wise counsel. We genuinely serve our partners with capable means, a flexible approach, and a positive attitude. We offer our talents in our daily work as a service for others at this company which in turn provides for ourselves, our families, and our communities.

You will enjoy the flexibility to telecommute* from anywhere within the U.S. as you take on some tough challenges.

Primary Responsibilities:
- Generating ad-hoc reports for operations
- Performing data analysis
- Troubleshooting production issues. Supported clients: UHC, Medica, CareSource (onshore only), Centene, Elevance, etc.
- RADV - Support is time-sensitive and critical. Workload has increased significantly due to a new CMS mandate resulting in a manifold rise in charts processed through RADV
- The position will leverage AI tools extensively for code/query/script writing using Copilot
- Resource must be onshore to access PHI/PII daily for support
- Leverage AI tools to automate manual processes
- Design, develop, and deploy AI-powered solutions using no-code, low-code, and advanced platforms, translating business needs into scalable applications that enhance products, workflows and decision-making.

You’ll be rewarded and recognized for your performance in an environment that will challenge you and give you clear directions on what it takes to succeed in your role as well as provide development for other roles you may be interested in.

Required Qualifications:
- Bachelor’s degree in CS or IT related field
- 7+ years of SQL (CTEs, indexed temp tables, Stored Procedures, etc.) and SQL Server experience (server level, database level, securable, roles, grants, fragmentation, backups, schemas, sprocs, functions, etc.)
- 5+ years of SSIS experience (Event Handling, Project Level Connections, Containers, Job Scheduling using SQL Server Agent, Delayed Validation, etc.)
- 2+ years of .NET experience
- 2+ years of cloud experience with either Azure, AWS or GCP
- 1+ years of experience with AI tools including GitHub Copilot and Microsoft Copilot or similar
- Ability to do On-Call/After hour support

Preferred Qualifications:
- 1+ years of Terraform experience
- 1+ years of Azure Data Factory experience
- Experience in business analysis, process improvement, workflow, benchmarking, or evaluation of business processes
- Experience in Retrospective Risk Adjustment - Healthcare domain
- Health care industry experience
- Ability to work in a self-directed environment
- Ability to work with less structured, more complex issues
- Proven eye for detail and design sense
- Demonstrated intermediate level of understanding of .NET (integration with SQL Server, flow, expectations, advising, etc.)

*All Telecommuters will be required to adhere to UnitedHealth Group’s Telecommuter Policy.

Pay is based on several factors including but not limited to local labor markets, education, work experience, certifications, etc. In addition to your salary, we offer benefits such as a comprehensive benefits package, incentive and recognition programs, equity stock purchase and 401k contribution (all benefits are subject to eligibility requirements). No matter where or when you begin a career with us, you’ll find a far-reaching choice of benefits and incentives. The salary for this role will range from $91,700 to $163,700 annually based on full-time employment. We comply with all minimum wage laws as applicable.

Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Application Deadline: This will be posted for a minimum of 2 business days or until a sufficient candidate pool has been collected. Job posting may come down early due to volume of applicants.

At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone–of every race, gender, sexuality, age, location, and income–deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups, and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes, an enterprise priority reflected in our mission.

UnitedHealth Group is an Equal Employment Opportunity employer under applicable law and qualified applicants will receive consideration for employment without regard to race, national origin, religion, age, color, sex, sexual orientation, gender identity, disability, or protected veteran status, or any other characteristic protected by local, state, or federal laws, rules, or regulations.

UnitedHealth Group is a drug-free workplace. Candidates are required to pass a drug test before beginning employment.
Senior Data Engineer
ZoomInfo Technologies LLC
ZoomInfo (NASDAQ: GTM) is the Go-To-Market Intelligence Platform that empowers businesses to grow faster with AI-ready insights, trusted data, and advanced automation. Its solutions provide more than 35,000 companies worldwide with a complete view of their customers, making every seller their best seller.
ZoomInfo is where careers accelerate. We move fast, think boldly, and empower you to do the best work of your life. You’ll be surrounded by teammates who care deeply, challenge each other, and celebrate wins. With tools that amplify your impact and a culture that backs your ambition, you won’t just contribute. You’ll make things happen–fast.

About the Role
We are looking for a highly skilled Senior Data Engineer to become part of our core Data & AI Engineering team. In this pivotal role, you will be responsible for designing and expanding enterprise-level data infrastructure that enables ZoomInfo's internal teams to interact with data comprehensively (extracting, exploring, analyzing, and generating insights) through various platforms using ZI's internal chat agent.

The ideal candidate has a strong background in big data processing, pipeline orchestration, and data modeling, with a proven track record of delivering scalable and high-quality data solutions in fast-paced, data-centric product environments. Given the dynamic nature of emerging technologies, this role requires an individual who excels at exploration and embraces continuous learning as core responsibilities. You'll constantly research and implement innovative solutions while integrating vast, diverse data sources into our AI applications, including our industry-leading LLM-powered systems.

What you’ll do:
- Design, develop, and maintain high-performance, product-centric data pipelines using Airflow, DBT, and Python.
- Architect and optimize the massive-scale data warehouse and lakehouse that serves as our single source of truth for all customer data, primarily using Snowflake.
- Lead the integration of diverse structured and unstructured data sources (e.g., web data, third-party APIs) into our data ecosystem, ensuring high-quality and reliable ingestion.
- Implement and enforce Model Context Protocol (MCP) or similar architectures to feed accurate and contextual data into our LLM-powered products for applications like Retrieval Augmented Generation (RAG) and advanced search.
- Collaborate with ML engineers, data scientists, and product managers to translate business needs into scalable data solutions that directly enhance customer value.
- Define, monitor, and enforce data quality SLAs across all pipelines and products, ensuring data accuracy and lineage are a top priority.
- Mentor and coach junior engineers, promoting best practices in code quality, data architecture, and operational excellence.
- Participate in architectural decisions and long-term strategy planning for our enterprise-wide data infrastructure, with a focus on cost, performance, and reliability.

What you bring:
- Expert-level SQL for building performant, scalable queries and transformations on massive datasets.
- Strong Python programming skills with a focus on distributed computing, data manipulation, and building robust APIs.
- Production-level experience with large-scale batch and streaming data processing.
- Hands-on experience with DBT (Data Build Tool) for advanced data modeling and transformations in a modern data stack.
- Deep knowledge of Snowflake data warehouse design, optimization, and cost modeling.
- Experience implementing Model Context Protocol (MCP) or similar architectures to feed structured and unstructured data into LLM-powered systems.
- Strong understanding of data architecture concepts, including data lakes, event-driven architectures (e.g., Kafka), ETL/ELT, and data mesh.
- Proficiency with cloud platforms (GCP and/or AWS) and infrastructure as code (e.g., Terraform).

Nice to Have
- Familiarity with LLMOps, LangChain, or RAG (Retrieval Augmented Generation) pipelines.
- Experience with building embedding models or pipelines for Named Entity Recognition (NER).
- Knowledge of data cataloging tools (e.g., OpenLineage) and lineage tracking.
- Familiarity with other distributed systems and databases (e.g., DynamoDB, Flink).

Required Non-Technical Skills
- Excellent communication skills – ability to explain complex technical concepts to both engineering teams and non-technical stakeholders.
- Strategic & Product-Oriented Thinking – can translate business objectives and customer needs into scalable, high-impact data solutions.
- Leadership & Mentorship – experience guiding and uplifting engineering teams to achieve their full potential.
- Stakeholder Management – able to collaborate effectively across departments (Product, Engineering, Sales, Compliance).
- Agility & Adaptability – thrives in ambiguous, evolving environments and can rapidly prototype and iterate on solutions.
- Strong documentation habits and ability to evangelize best practices across the organization.

Qualifications
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- 8+ years of progressive experience in data engineering, with a track record of leadership and impact. Demonstrated experience in implementing or scaling data infrastructure for a data-centric product company.

ZoomInfo is committed to protecting your privacy when you apply for jobs with us. Please review our Job Applicant Privacy Notice for more details on how we handle your personal information.

ZoomInfo may use a software-based assessment as part of the recruitment process. More information about this tool, including the results of the most recent bias audit, is available here.

ZoomInfo is proud to be an equal opportunity employer, hiring based on qualifications, merit, and business needs, and does not discriminate based on protected status. We welcome all applicants and are committed to providing equal employment opportunities regardless of sex, race, age, color, national origin, sexual orientation, gender identity, marital status, disability status, religion, protected military or veteran status, medical condition, or any other characteristic protected by applicable law. We also consider qualified candidates with criminal histories in accordance with legal requirements.

For Massachusetts Applicants: It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability. ZoomInfo does not administer lie detector tests to applicants in any location.
• Accurately input, update, and maintain data across multiple business and software systems
• Support data migration and conversion activities as part of the Facets implementation
• Perform routine data management tasks, ensuring data quality, accuracy, and completeness
• Validate and reconcile data across systems to ensure consistency during migration
• Identify and report data discrepancies, supporting resolution efforts
• Collaborate with data teams, QA, and business stakeholders to support integration and migration activities
• Follow established data governance standards and processes
• Maintain documentation related to data processes, workflows, and validation activities
• Support senior team members with data-related tasks and deliverables
About Ancestry:
When you join Ancestry, you join a human-centered company where every person’s story is important. Ancestry®, the global leader in family history, connects everyone with their past so they can discover, preserve, and share their unique family stories. With our unparalleled collection of more than 65 billion records, over 3.5 million subscribers, and over 27 million people in our growing DNA network, customers can discover their family story and gain a new level of understanding about their lives. Over the past 40 years, we’ve built trusted relationships with millions of people who have chosen us as the platform for discovering, preserving, and sharing the most important information about themselves and their families.

We are committed to our location flexible work approach, allowing you to choose to work in the nearest office, from your home, or a hybrid of both (subject to location restrictions and roles that are required to be in the office; see the full list of eligible US locations HERE). We will continue to hire and promote beyond the boundaries of our office locations, to enable broadened possibilities for employee diversity.

Together, we work every day to foster a work environment that's inclusive as well as diverse, and where our people can be themselves. Every idea and perspective is valued so that our products and services reflect the global and diverse clients we serve.

Ancestry encourages applications from minorities, women, the disabled, protected veterans and all other qualified applicants. Passionate about dedicating your work to enriching people’s lives? Join the curious.

We are looking for candidates with strong experience in genomic data analysis and enthusiasm for human populations and/or family history. Through regular mentorship from our scientists, you will gain valuable research experience for the next step in your career. You will have the opportunity to apply cutting-edge computational and statistical approaches to the largest database of human genome and pedigree data in the world. You will develop methods to help millions of people understand their ancestry, their family, and themselves. Help us make an impact in this exciting field!

Join the DNA Science team as a Postdoctoral Fellow and conduct innovative research on the world’s largest genomic and pedigree database to further our knowledge of human populations and family history.

What you will do:
- You will analyze Ancestry’s massive-scale genomic datasets to uncover deep insights into population structure, demographic history, and fine-scale genetic relationship inference.
- You will develop novel statistical models and computational methods, leveraging advanced Machine Learning and Deep Learning to solve complex genomic challenges.
- You will conduct exploratory research while maintaining a focus on demonstrating progress, identifying clear next steps, and moving toward concrete milestones.
- You will aim to publish original research in high-impact scientific journals and present your findings at top-tier conferences.
- While your primary focus is research, you will have the chance to collaborate with other scientists to explore opportunities to integrate your research outcomes into the Ancestry DNA product ecosystem to benefit millions of customers.

Who you are:
- Ph.D. in Population Genetics, or Genomic Data Science, or Machine Learning (with experience applying it to genomic data).
- Demonstrated expertise in applying and developing sophisticated computational methods. Experience with modern AI methods is a plus.
- High proficiency in Python, C, or any other language that supports efficient, large-scale data analysis and machine learning.
- A mindset for critical thinking and innovation, with a demonstrated ability to show results and progress while in an exploratory mode.
- Strong communication skills to present complex genomics and statistical concepts to both technical and non-technical audiences.

Additional Information:
Ancestry is an Equal Opportunity Employer that makes employment decisions without regard to race, color, religious creed, national origin, ancestry, sex, pregnancy, sexual orientation, gender, gender identity, gender expression, age, mental or physical disability, medical condition, military or veteran status, citizenship, marital status, genetic information, or any other characteristic protected by applicable law. In addition, Ancestry will provide reasonable accommodations for qualified individuals with disabilities.

All job offers are contingent on a background check screen that complies with applicable law. For candidates who live in San Francisco, CA, pursuant to the San Francisco Fair Chance Ordinance, Ancestry will consider for employment qualified applicants with arrest and conviction records.

Ancestry is not accepting unsolicited assistance from search firms for this employment opportunity. All resumes submitted by search firms to any employee at Ancestry via email, the Internet or in any form and/or method without a valid written search agreement in place for this position will be deemed the sole property of Ancestry. No fee will be paid in the event the candidate is hired by Ancestry as a result of the referral or through other means.

