Distro

Distro is a marketplace to find, hire, and pay technical talent in over 200 countries. Join now for free.

Data Scraper

Data Engineer / Full Time / Remote / Mid Level / Team 1-10 / Since 2021 / H1B Sponsor

Location

Worldwide

Posted

2 days ago

Salary

$350 - $500 / month

Seniority

Mid Level

Job Description

Data Scraper

Distro

Role Description

We are looking for a Data Scraping Specialist to collect, organize, and normalize data from public and government sources into reliable formats.

🕒 Schedule: Monday to Friday, 12:00 PM – 8:00 PM CST

What will be your main challenges? (Responsibilities)

- Research and identify public and government data sources.
- Extract, transform, and normalize data from websites, APIs, feeds, FTP sources, and online repositories.
- Design and build reusable, scalable, and maintainable ETL processes and workflows (no one-off scripts!).
- Apply advanced web scraping techniques using Python, HTTP requests, and HTML parsing.
- Ensure quality: Identify inconsistencies, validate data samples, and document methodologies and processes.
- Collaborate and version control: Maintain repositories using Git under development best practices and maintain clear communication with stakeholders.

Qualifications

- Solid experience in web scraping, data scraping, and structured/unstructured data extraction.
- Technical proficiency: Hands-on experience programming in Python (or similar languages), knowledge of APIs, HTTP, FTP, HTML parsing, and relational databases like PostgreSQL.
- Language: Advanced English level (fluent written and technical communication).
- Analytical mindset: Ability to solve complex data acquisition problems, optimize solutions, and work independently while taking on technical challenges.
- Quality focus: Strong emphasis on data validation, normalization, and documentation.

Benefits

- $350 - $500 a month

Company Description

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses and identifying potential inconsistencies or verification signals in application materials based on available information. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

More Data Engineer Jobs

Data Engineer
Miami, FL / Remote / Hybrid / Tampa, FL
Information Technology / Full Time Hybrid / Hybrid

Job Description

Data Engineer
Employment Type: Full-Time, Mid-level
Department: Business Intelligence

CGS is seeking a passionate and driven Data Engineer to support a rapidly growing Data Analytics and Business Intelligence platform focused on providing solutions that empower our federal customers with the tools and capabilities needed to turn data into actionable insights. The ideal candidate is a critical thinker and perpetual learner, excited to gain exposure and build skillsets across a range of technologies while solving some of our clients' toughest challenges.

CGS brings motivated, highly skilled, and creative people together to solve the government's most dynamic problems with cutting-edge technology. To carry out our mission, we are seeking candidates who are excited to contribute to government innovation, appreciate collaboration, and can anticipate the needs of others. Here at CGS, we offer an environment in which our employees feel supported, and we encourage professional growth through various learning opportunities.

Skills and attributes for success

- Complete development efforts across the data pipeline to store, manage, and provision data to data consumers.
- Be an active and collaborating member of an Agile/Scrum team and follow all Agile/Scrum best practices.
- Write code to ensure the performance and reliability of data extraction and processing.
- Support continuous process automation for data ingest.
- Achieve technical excellence by advocating for and adhering to lean-agile engineering principles and practices such as API-first design, simple design, continuous integration, version control, and automated testing.
- Work with program management and engineers to implement and document complex and evolving requirements.
- Help cultivate an environment that promotes customer service excellence, innovation, collaboration, and teamwork.
- Collaborate with others as part of a cross-functional team that includes user experience researchers and designers, product managers, engineers, and other functional specialists.

Qualifications

- Must be a US Citizen.
- Must be able to obtain a Public Trust Clearance.
- 7 years of IT experience, including experience in design, management, and solutioning of large, complex data sets and models.
- Experience with developing data pipelines from many sources, from structured and unstructured data sets in a variety of formats.
- Proficiency in developing ETL processes and performing test and validation steps.
- Proficiency in manipulating data (Python, R, SQL, SAS).
- Strong knowledge of big data analysis and storage tools and technologies.
- Strong understanding of agile principles and ability to apply them.
- Strong understanding of CI/CD pipelines and ability to apply them.
- Experience with relational databases such as PostgreSQL.
- Work comfortably in version control systems such as Git repositories.

Ideally, you will also have

- Experience creating and consuming APIs.
- Experience with DHS and knowledge of DHS standards a plus.
- Candidates will be given special consideration for extensive experience with Python.
- Ability to develop visualizations utilizing Tableau or PowerBI.
- Experience in developing Shell scripts on Linux.
- Demonstrated experience translating business and technical requirements into comprehensive data strategies and analytic solutions.
- Demonstrated ability to communicate across all levels of the organization and communicate technical terms to non-technical audiences.

Our Commitment

Contact Government Services (CGS) strives to simplify and enhance government bureaucracy through the optimization of human, technical, and financial resources. We combine cutting-edge technology with world-class personnel to deliver customized solutions that fit our clients' specific needs. We are committed to solving the most challenging and dynamic problems.

For the past seven years, we've been growing our government-contracting portfolio, and along the way, we've created valuable partnerships by demonstrating a commitment to honesty, professionalism, and quality work. Here at CGS, we value honesty through hard work and self-awareness, professionalism in all we do, and to deliver the best quality to our consumers, mending those relations for years to come.

We care about our employees. Therefore, we offer a comprehensive benefits package:

- Health, Dental, and Vision
- Life Insurance
- 401k
- Flexible Spending Account (Health, Dependent Care, and Commuter)
- Paid Time Off and Observance of State/Federal Holidays

Contact Government Services LLC is an Equal Opportunity Employer. Applicants will be considered without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.

Join our team and become part of government innovation!

$112,597.33 - $152,810.66 a year

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

Florida
$112.6K - $152.8K / year

Snowflake Data Engineer

Evolved Ideas

We don't just build technology, we build highly scalable technology-enabled businesses.

Data Engineer / 3 days ago
Full Time / Remote / Team 51-200 / Since 2007 / H1B No Sponsor

Role Description

We are looking for a Data Engineer to join the Evolved Ideas team for one of our projects.

- Contributing to a high-scale, complex product and seeing the real-time impact of your work

Qualifications

- 4+ years' experience of the Snowflake data platform
- SQL, Python
- Experience of GitHub / Workflows / Azure DevOps
- Experience of Open flow (Snowflake hosted would be beneficial)
- Experience of DBT: DBT Core / DBT Project (not cloud)
- Experience of using Cortex and AI functions to process documentation and extract key information
- Experience of contractual documentation extraction and reporting
- Experience of data modelling (Star schema)
- Experience of Power BI report creation (design, build and testing)
- Client facing role, so communication skills are needed

Benefits

- Healthcare insurance
- Educational budget
- Challenging tasks and professional development, knowledge & best practice sharing

Company Description

We are a multi-award winning team of over 100 engineers, designers and analysts based in Leicester, with development hubs in Ukraine and Spain. We specialise in bespoke software development, extended teams / staff augmentation and support as a service.

Ukraine

AI Data Architect

BraunAbility

Devoted to making life a moving experience for all.

Data Engineer / 3 days ago
Full Time / Remote / Team 1,001-5,000 / Since 1972

• Build, optimize, and maintain the data pipelines that feed BAA's AI solutions
• Automate the ingestion and structuring of unstructured data to enable highly accurate Retrieval-Augmented Generation (RAG) architectures
• Map complex enterprise data schemas to support the cognitive architecture
• Design and implement strict Role-Based Access Control (RBAC) and enterprise security protocols within the AI environment
• Monitor data flows for system drift, broken APIs, or context collapse
• Establish and maintain Service Level Agreements (SLAs) for pipeline uptime
• Develop secure API integrations between enterprise AI platforms and BAA's existing technology stack

United States

Senior Data Specialist II
Location:
- Baltimore, MD
- Rockville, MD
Full-time
Hybrid

Job Description

Senior Data Specialist II
Full-Time
Experienced
Department: eDiscovery

CGS is seeking an experienced Senior Data Specialist II with extensive knowledge of litigation discovery processes to provide assistance in the EDRM workflow for a large Federal agency initiative.

CGS brings motivated, highly skilled, and creative people together to solve the government's most dynamic problems with cutting-edge technology. To carry out our mission, we are seeking candidates who are excited to contribute to government innovation, appreciate collaboration, and can anticipate the needs of others. Here at CGS, we offer an environment in which our employees feel supported, and we encourage professional growth through various learning opportunities.

Skills and Attributes for Success

• Performs file manipulation, loading, conversion services, database indexing, and quality checks of loads.
• Develops, evaluates, and modifies methodologies and procedures for manipulating files for use with COTS products and litigation support applications.
• Responsible for ensuring that incoming productions are made pursuant to the applicable ESI specifications.
• Performs advanced tasks related to exporting data from contractor and client databases, including identifying data for export, confirming redactions and other markups, ensuring that exports comply with applicable ESI specifications, and quality checking exported data.
• Supports client attorneys, investigators, and paralegals by tracking and processing incoming documents, subpoena returns, and data; creating, loading, and managing document review databases; producing documents to opposing parties in litigation; and tracking produced documents.
• Applications used include Everlaw, Relativity, Eclipse, Trial Director, NUIX LAW, EZManage, CaseView, Metadata Assistant, Beyond Compare, eScan-IT, CaseMap, TextMap, TimeMap, Camtasia, and other applications as directed or as required to complete processing.

Responsibilities

• Under guidance from the client attorneys, manages documents and data, including the use of document review tools.
• Documents and data include physical documents, a wide range of Electronically Stored Information (ESI) discovery, forensic images, subpoena returns, PDFs, audio/video files, pictures, forms, email, and others as required to support the client attorneys.
• Document review tools include those listed in item.
• Contractor will work with the Litigation Support Manager to ensure that incoming productions are made pursuant to the applicable ESI specifications and, when deficiencies are found, provide the Litigation Support Manager with detailed notice of deficiencies.
• Coordinate with the client's Technology Service Center regarding litigation support projects that are outsourced to the client.
• Contractor will ensure that all exports for productions are made pursuant to applicable ESI specifications and/or the requirement of the requesting party or client personnel, using the guidelines utilized by the Litigation Support Unit.
• Work with the Litigation Support Manager and client attorneys when issues arise in discovery negotiations with defense counsel.
• Contractor will work with the Litigation Support Specialist in modifying and manipulating files for use with COTS products and litigation support applications.

Qualifications

• Undergraduate degree preferred, preferably in computer science or a related field.
• Requires knowledge of the litigation discovery process and the Electronic Discovery Reference Model (EDRM) workflow.
• Knowledge of the Government's IT environment, including office automation, networks, PC, and server-based applications, preferred.
• Working knowledge of personal computers, including Windows, document review software, and encryption methods.
• Experience with LAW, IPRO, Relativity, or other document processing platform.
• Familiarity with ICONECT, Relativity, MS Office Suite, and West LiveNote valued.
• At least two years of experience performing eDiscovery roles, including but not limited to electronic files processing, EFP image and data file conversion, data culling using review tools, quality assurance, database loads and retrieval, and data analysis.

Our Commitment

Contact Government Services (CGS) strives to simplify and enhance government bureaucracy through the optimization of human, technical, and financial resources. We combine cutting-edge technology with world-class personnel to deliver customized solutions that fit our clients' specific needs. We are committed to solving the most challenging and dynamic problems.

For the past seven years, we've been growing our government-contracting portfolio, and along the way, we've created valuable partnerships by demonstrating a commitment to honesty, professionalism, and quality work. Here at CGS, we value honesty through hard work and self-awareness, professionalism in all we do, and to deliver the best quality to our consumers, mending those relations for years to come.

We Care About Our Employees

We offer a comprehensive benefits package, including:

• Health
• Dental and Vision
• Life Insurance
• 401k
• Flexible Spending Account
• Health, Dependent Care, and Commuter
• Paid Time Off and Observance of State/Federal Holidays

Contact Government Services LLC is an Equal Opportunity Employer. Applicants will be considered without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.

Join our team and become part of government innovation.

Explore additional job opportunities with CGS on our Job Board: https://cgsfederal.com/join-our-team

For more information about CGS, please visit: https://www.cgsfederal.com

Salary Range

$120,000 - $150,000 per year

Note

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

Maryland
$120K - $150K / year