Reach peak performance | IT consulting and software engineering backed by our expertise in Dev Experience, ML and Scala
Staff Data Engineer
Location
New York
Posted
92 days ago
Seniority
Lead
Job Description
VirtusLab
- building web crawling or large-scale data systems from scratch
- designing scalable, fault-tolerant distributed systems
- leading complex technical initiatives
- mentoring engineers and promoting a collaborative culture
- operating ETL/ELT pipelines
- extracting structured/unstructured web data
Job Requirements
- Proven experience building web crawling or large-scale data systems from scratch
- Strong architectural skills in designing scalable, fault-tolerant distributed systems
- Track record leading complex technical initiatives and driving architecture direction for teams
- Demonstrated ability to evolve production systems incrementally while maintaining reliability
- Experience mentoring engineers at all levels and promoting a collaborative culture
- Deep background in large-scale data engineering (terabytes daily)
- Hands-on experience with cloud data warehouses (BigQuery, Snowflake)
- Experience with Apache Kafka, Kubernetes (GKE/EKS), and orchestration tools (Airflow)
- Familiarity with multi-cloud environments (GCP + AWS)
- Expertise in designing and operating ETL/ELT pipelines
- Deep expertise in web crawling technologies and advanced scraping (Scrapy or similar)
- Experience in extracting structured/unstructured web data and SERP extraction
- Knowledge of proxy infrastructure management, anti-bot detection, and ethical crawling
- Familiarity with crawling vendors and AI/LLM-based extraction approaches
- Support the VirtusLab U.S. and international teams by lending senior technical expertise to client-facing activities, including technical discovery sessions, workshops, and solution architecture
- Conduct requirements analysis and solution discovery, identifying business and technical needs
- Provide technical consulting and advisory services, recommending appropriate data architectures aligned with customer goals
- Prepare and review technical sections of commercial offers, including solution descriptions, statements of work (SoWs), project estimates, timelines, and delivery models
Benefits
- self-development opportunities
- good working conditions
More Data Engineer Jobs
Relativity Archiving Analyst
Contact Government Services
Contact Review prides itself on finding high-quality, high-accountability, barred attorneys specifically tailored to the needs of our project. Assists with document review, privilege review, expert testimony, legal research, and foreign language translation. Fosters a culture where every team member sees themselves as an extension of the project's team. Looks for ways to improve efficiency and streamline workflows.
This description is a summary of our understanding of the job description. Click on 'Apply' button to find out more.
Role Description
CGS is seeking a Relativity Archiving Analyst, who will be responsible for vetting Relativity workspaces and file share folders and archiving or purging them. File shares will be moved to archive locations. Relativity workspaces will be archived using both Relativity ARM and a flat format which can be fully restored in Relativity or another system.
- Collaborating with DOJ management, lead attorneys, and Section Chiefs on the disposition of data/files
- Archiving older file shares
- Archiving full Relativity workspaces using ARM
- Archiving images, natives, text
- Archiving in flat format the metadata, coding fields, choices/tags
- Documenting user interface
- Documenting the archiving process for approval by the Senior IT Manager
- Evaluating and resolving any archiving issues
Qualifications
- At least 3 years of hands-on experience with backend Relativity 2022 and prior
- At least 3 years of hands-on experience with archiving Relativity workspaces
- At least 3 years of hands-on experience with restoring Relativity archives workspaces
- Knowledge of Windows permissions and file transfer utilities
- Excellent written and oral communication skills required
- Experience working in a collaborative environment
- Must be a US Citizen
- Must be able to obtain a Public Trust security clearance
Requirements
- An undergraduate degree is strongly preferred; preferably in the computer science or management information/technology disciplines
- Experience in storage technology planning, performance capacity planning, and modeling applications
Benefits
- Health, Dental, and Vision
- Life Insurance
- 401k
- Flexible Spending Account (Health, Dependent Care, and Commuter)
- Paid Time Off and Observance of State/Federal Holidays
Lead Microsoft Data Engineer
Procentrix, LLC
Delivering practical solutions to solve complex challenges with an eye on maximizing customers' current IT investments
This description is a summary of our understanding of the job description. Click on 'Apply' button to find out more.
Role Description
The Lead Azure Data Engineer is responsible for migration, governance, and data integration for the implementation of a new Microsoft Cloud-based document routing and task management system. They will ensure alignment with the technical, security, and document management requirements such as:
- Leading data discovery and mapping
- Validating metadata standards
- Managing data migration into Dataverse and SharePoint Online
- Ensuring proper implementation of retention, auditability, and role-based access controls within the Microsoft SaaS architecture
Qualifications
- A minimum of 6 years IT experience with focus on data integration and data migration using Azure Data Services
- Strong experience designing data models, configuring Dataverse tables, managing relationships, enforcing row- and field-level security, and supporting workflow-driven data structures
- Knowledge of SharePoint Online document libraries, metadata schemas, retention labels, records management, and large-volume document storage design
- Proficiency in data mapping, transformation, validation, and migration using APIs, Azure services, Power Automate, and REST-based integrations
- Experience with Azure data services (e.g., Azure Data Factory, Azure SQL, Azure AI Search, Azure storage components) in FedRAMP-aligned environments
- Understanding of producing audit logging, identity integration (Entra ID), and support for records retention and legal hold requirements
- Ability to structure data for Power BI dashboards, performance reporting, and AI-enabled search and discovery
- Establishing metadata standards, validation rules, lifecycle controls, and ensuring data accuracy and consistency across enterprise workflows
- Bachelor's degree or equivalent
- Must be a US Citizen
Requirements
- Microsoft Azure data services certifications
- Active Federal Government Public Trust clearance
Benefits
The projected compensation range for this position is $135K - $160K annualized (USD). The final salary offered will generally fall within this range and is determined by various factors, including but not limited to the individual's particular combination of education, knowledge, skills, competencies, and experience, as well as internal pay equity, location, contract-specific affordability and other organizational requirements.
Data Engineer I
Centene Corporation
Transforming the health of the communities we serve, one person at a time.
You could be the one who changes everything for our 28 million members by using technology to improve health outcomes around the world. As a diversified, national organization, Centene's technology professionals have access to competitive benefits including a fresh perspective on workplace flexibility.
Position Purpose: Develops and operationalizes data pipelines to make data available for consumption (reports and advanced analytics). This includes data ingestion, data transformation, data validation/quality, data pipeline optimization, orchestration, and engaging with DevSecOps Engineer during continuous integration and continuous deployment.
- Designs and implements standardized data management procedures around data staging, data ingestion, data preparation, data provisioning, and data destruction (scripts, programs, automation, assisted by automation, etc.)
- Designs, develops, implements, tests, documents, and operates large-scale, high-volume, high-performance data structures for business intelligence analytics
- Designs, develops, and maintains real-time processing applications and real-time data pipelines
- Ensures quality of technical solutions as data moves across Centene’s environments
- Provides insight into the changing data environment, data processing, data storage, and utilization requirements for the company and offer suggestions for solutions
- Develops, constructs, tests, and maintains architectures using programming language and tools
- Identifies ways to improve data reliability, efficiency, and quality; use data to discover tasks that can be automated
- Helps maintain the integrity and security of company data
- Performs other duties as assigned
- Complies with all policies and standards
Education/Experience: A Bachelor's degree in a quantitative or business field (e.g., statistics, mathematics, engineering, computer science) and requires 0 – 2 years of related experience. Or equivalent experience acquired through accomplishments of applicable knowledge, duties, scope and skill reflective of the level of this position. Or completion of a Centene-sponsored emerging talent program.
Technical Skills:
- Experience with Microsoft SQL Servers; SQL Server System
- Experience with Big Data; Data Manipulation; Data Mining; Power Tool Operation
- Experience building / operating highly available, distributed systems of data extraction, ingestion, and processing of large data sets
- Plus if you have experience with Teradata and Snowflake
Soft Skills:
- Beginner - Seeks to acquire knowledge in area of specialty
- Beginner - Ability to identify basic problems and procedural irregularities, collect data, establish facts, and draw valid conclusions
Pay Range: $27.02 - $48.55 per hour
Centene offers a comprehensive benefits package including: competitive pay, health insurance, 401K and stock purchase plans, tuition reimbursement, paid time off plus holidays, and a flexible approach to work with remote, hybrid, field or office work schedules. Actual pay will be adjusted based on an individual's skills, experience, education, and other job-related factors permitted by law, including full-time or part-time status. Total compensation may also include additional forms of incentives. Benefits may be subject to program eligibility.
Centene is an equal opportunity employer that is committed to diversity, and values the ways in which we are different. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, veteran status, or other characteristic protected by applicable law. Qualified applicants with arrest or conviction records will be considered in accordance with the LA County Ordinance and the California Fair Chance Act.
- Design, implement, and optimize relational data models for web applications and data warehouses
- Develop and optimize complex batch and near-real-time SQL queries to meet product requirements and business needs.
- Design and implement core foundational datasets that are reusable, scalable, and performant.
- Architect, implement, deploy, and maintain data-driven solutions in Snowflake and Databricks.
- Develop and manage data pipeline supporting multiple reports, tools, and applications.
- Engagement in data governance council to define, classify, and review standards/guidelines to promote data quality and best practices.
- Lead, plan, and collaborate in an agile scrum team
- Lead and engage in requirement planning, trainings, and presentations with technical and non-technical teams



