Job Closed

This listing is no longer active.

Arbitration Forums Inc.

AF is a remote working environment.

Data Catalog Specialist

Data Engineer | Full Time | Remote | Mid Level

Location

United States

Posted

7 days ago

Seniority

Mid Level

AI/ML, AI, SQL, NoSQL, Hadoop, Azure, Snowflake, Databricks, ETL, Apache HTTP Server, dbt, Informatica, Matillion, AWS, GCP, Unity, Power BI, Python, R

Job Description

Role Description

This role at Arbitration Forums is as unique as it is rewarding because of the AF IPAAL Values (Integrity, Passion, Accountability, Achievement, Leadership) and TRI Model (Trust, Respect, Inclusion). The Data Catalog Specialist drives and supports our data governance initiatives and ensures the integrity, quality, and security of our organization's data assets. This role is responsible for defining, implementing, and managing our data cataloging and metadata management practices as part of our overall data governance program. This role is an SME for AF in the areas of data classification, cataloging, lineage harvesting and plotting, and metadata. It contributes to the quality and security of our data by ensuring that data assets are documented, discoverable, and accessible for use in advanced analytics, data science, machine learning, and AI-powered applications.

Departmental Expectation of Employee
- Adheres to AF Policy and Procedures and the AF IPAAL Values and TRI Model
- Acts as a role model within and outside AF
- Performs duties as workload necessitates
- Maintains a positive and respectful attitude
- Communicates regularly with the departmental leader about department issues
- Demonstrates flexible and efficient time management and ability to prioritize workload
- Consistently reports to work on time, prepared to perform duties of the position
- Meets Department productivity standards

Essential Duties and Responsibilities
- Develop, implement, and maintain strategies for data discovery, cataloging, and metadata management
- Ensure that the quality of the data cataloged meets company standards and adheres to internal guidelines
- Curate and validate metadata for integrity and accuracy, ensuring that taxonomies and structures are captured to provide additional context
- Maintain clear lineage information and establish source of truth for data items, including mapping and reconciliation of information
- Work with data owners to ensure that taxonomies are aligned with business definitions
- Define and measure the key performance indicators (KPIs) associated with data cataloging and lineage, identifying deviations and recommending corrective actions
- Enable effective data stewardship and accountability through the definition of supporting processes and implementation of procedures in cataloging tools
- Lead and drive to completion efforts to capture and maintain metadata, including data lineage, data quality, and data usage information
- Provide guidance to employees on data governance and cataloging processes and procedures, as well as training on the data governance tool stack
- Raise awareness of data management best practices and their importance to business operations
- Foster a culture of data stewardship, collaboration, and continuous improvement
- Monitor compliance with data governance policies and assess effectiveness of controls
- Prepare and present data governance reports, dashboards, and metrics to senior leadership

Qualifications
- Bachelor’s or Master’s degree in Computer Science, Information Systems, Data Science, or a related field
- 6 years of experience with data cataloging tools and platforms
- In-depth understanding of metadata management principles and practices, as well as strong working knowledge of frameworks, processes, and tools for data lineage plotting and harvesting
- Strong understanding of data governance principles, frameworks, and best practices
- Familiarity with regulatory requirements and industry standards related to data privacy and security
- Technical Skills:
  - Proficiency in data modeling, database design, and data warehousing (e.g., SQL, NoSQL, Hadoop, MS Azure cloud, etc.)
  - Understanding of cloud-based data platforms (Snowflake, Databricks)
  - Experience with ETL/ELT tools and data integration technologies (e.g., Apache, dbt, Informatica, Matillion, Talend)
  - Working knowledge of cloud services (e.g., MS Azure, AWS, Google Cloud)
  - Strong understanding of data catalogs and lineage plotting practices and tools (Unity, Purview)
  - Familiarity with data visualization and reporting tools (e.g., WebFOCUS, Power BI)
  - Proficiency in programming languages such as Python or R
- Soft Skills:
  - Excellent analytical and problem-solving abilities
  - Strong communication and interpersonal skills to collaborate with cross-functional teams
  - Ability to lead projects and mentor junior staff
- Auto Insurance claims industry experience preferred

Americans with Disabilities Act Specifications

Physical Demands

The physical demands described here are representative of those that must be met by an employee to successfully perform the essential functions of this job. While performing the duties of this job, the employee is occasionally required to stand; walk; sit; use hands to finger, handle, or feel objects, tools, or controls; reach with hands and arms; climb stairs; balance; stoop, kneel, crouch or crawl; talk or hear; taste or smell. The employee must occasionally lift and/or move up to 25 pounds. Specific vision abilities required by the job include close vision, distance vision, color vision, peripheral vision, depth perception, and the ability to adjust focus.

Work Environment

This is a fully remote position requiring reliable high-speed internet access and a dedicated workspace. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.

More Data Engineer Jobs

Data Engineering Analyst, Mid-level

LIBBS FARMACÊUTICA LTDA

Because it's about life

Data Engineer | 7 days ago

Full Time | Remote | Team 1,001-5,000 | Since 1958 | H1B No Sponsor

• Design and implement robust, scalable data pipelines to meet various business and analytics team needs.
• Integrate and process large volumes of data from diverse sources (structured and unstructured).
• Contribute to defining and standardizing data engineering best practices and ensure data quality and integrity.
• Develop automation solutions for data ingestion, transformation and provisioning routines.
• Support business areas with technical solutions aimed at democratizing data (self-service BI, catalogs, data models).
• Actively participate in agile ceremonies and collaborate with multidisciplinary teams across data, product and technology.
• Document processes, data flows and technical standards to promote reuse and scalability.

Airflow, AWS, Azure, Cloud, Google Cloud Platform, Python, Spark, SQL

Brazil

Senior Data Engineer – AI Automations, Azure, AWS

Lumini IT Solutions

Transforming businesses through technology! ☁️💡

Data Engineer | 7 days ago

Contract | Remote | Team 51-200 | Since 2008 | H1B No Sponsor

• Design and develop data ingestion pipelines from external sources (ETLs, APIs, relational and non-relational databases, files, telemetry sensors, etc.);
• Integrate and consolidate data from different sources within the Azure environment, ensuring data quality and consistency;
• Implement reference architecture for data processing and storage;
• Create and manage data lakes, data warehouses, and data marts within the Azure ecosystem;
• Develop and optimize ETL/ELT processes using Microsoft tools such as Azure Data Factory, Databricks, Synapse Analytics, and Azure Functions;
• Ensure governance, security, and scalability of data pipelines;
• Monitor and optimize data load performance, identify bottlenecks, and propose improvements;
• Work closely with data scientists and functional analysts to meet business needs.

Azure, ETL, Kafka, MongoDB, PostgreSQL, Python, SOAP, Spark, SQL

Brazil

Data Infrastructure Engineer

Guidehouse

Solving big problems, building trust in society, and empowering our clients to shape the future.

Data Engineer | 7 days ago

Full Time | Remote | Team 10,001+ | Since 2018 | H1B Sponsor

Title: Data Infrastructure Engineer
Location: US - Remote (Any location)

Job Description:
Job Family: Data Science & Analysis
Travel Required: None
Clearance Required: Ability to Obtain Public Trust

We are seeking a Data Infrastructure Engineer to build and operate the data platform that powers AI/ML analytics modules. You will design and implement scalable data ingestion pipelines, robust ETL/ELT, and a modern data lake / delta lake (lakehouse) on AWS. You’ll also establish a managed metadata repository and governance layers (catalog, lineage, quality, access controls) and deliver automated cloud provisioning plus CI/CD for data pipelines to enable reliable, repeatable deployments across environments. This role is ideal for an engineer who enjoys platform building, automation, and enabling advanced analytics through trusted, well-governed data.

What You Will Do:

Build & Operate Data Pipelines (Batch + Streaming)
- Design and implement batch and streaming ingestion from APIs, relational databases, file drops, event streams, and external partners.
- Build and optimize ETL/ELT pipelines to produce curated, analytics-ready datasets for reporting and ML consumption.
- Implement incremental processing patterns, change data capture (CDC) approaches where appropriate, and data contract standards.

Deliver a Modern Lakehouse (Data Lake / Delta Lake)
- Build and manage a scalable lakehouse on AWS object storage (e.g., S3) using open table/file formats and delta/lakehouse concepts (e.g., ACID tables, schema evolution, time travel patterns).
- Optimize performance and cost through partitioning, compaction, lifecycle policies, and efficient compute/storage usage.
- Establish environment standards for dev/test/prod and consistent promotion across stages.

Metadata, Governance, Lineage & Quality (Trust Layer)
- Implement a managed metadata repository for dataset cataloging, ownership, glossary/definitions, tagging, and discoverability.
- Enable end-to-end lineage (source → transformations → consumption) to support auditability and impact analysis.
- Implement governance controls including policy-based access, data classification, retention, and secure data handling.
- Build operational data quality checks (freshness, completeness, validity, anomaly detection) and publish SLAs/SLOs.

AWS Automation + CI/CD for Data Pipelines
- Implement automated cloud provisioning in AWS using Infrastructure as Code (IaC) for consistent environments and secure-by-default baselines.
- Build and enhance CI/CD for data pipelines, including automated tests, validation gates, promotion workflows, and rollback strategies.
- Improve observability with metrics/logs/alerts, dashboards, runbooks, and incident response readiness.

Cross-Team Collaboration & Documentation
- Work closely with engineering, security, networking, and application teams to support mission needs and delivery timelines.
- Maintain high-quality engineering documentation including SOPs, system diagrams, and secure configuration baselines.
- Summarize and present findings and recommendations, both written and verbal, to technical and non-technical stakeholders.

What You Will Need:
- Bachelor’s degree in Engineering, IT, Computer Science, or related field (or equivalent experience).
- Zero (0) to two (2) years of experience.
- Experience building production data pipelines and/or data platforms.
- Knowledge of implementing data ingestion and ETL/ELT workflows, including data modeling and transformation best practices.
- Knowledge of building a data lake / delta lake (lakehouse) on AWS (or equivalent cloud) using object storage and modern table formats/patterns.
- Proficiency in SQL and one programming language commonly used for data engineering (Python preferred; Scala/Java acceptable).
- Knowledge of metadata management and governance: cataloging, lineage, ownership, access controls, classification and policy enforcement.
- Knowledge of implementing automated AWS provisioning using IaC and operating across multiple environments.
- Proven experience developing RAG applications.
- Solid security fundamentals: IAM/least privilege, encryption, secrets management, secure SDLC practices.
- Must be able to OBTAIN and MAINTAIN a Federal or DoD "PUBLIC TRUST"; candidates must obtain approved adjudication of their PUBLIC TRUST prior to onboarding with Guidehouse. Candidates with an ACTIVE PUBLIC TRUST or SUITABILITY are preferred.

What Would Be Nice To Have:
- Hands-on experience with Databricks.
- Experience operating CI/CD pipelines for data workflows (testing, packaging, deployment automation, environment promotion).
- Hands-on experience utilizing modern DevOps practices, including tools like Git, Terraform, Jenkins, AWS CodePipeline, and Docker.
- Experience utilizing AI-assisted coding tools (e.g., GitHub Copilot, ChatGPT, Cursor, Kiro) to safely accelerate implementation while maintaining strict code quality through testing, code reviews, and security practices.
- Knowledge graph and Graph RAG experience, including:
  - Graph modeling and ontology/taxonomy alignment
  - Entity resolution and relationship extraction
  - Hybrid retrieval approaches combining graph traversal with semantic/vector search to improve grounding and explainability

The annual salary range for this position is $65,000.00-$108,000.00. Compensation decisions depend on a wide range of factors, including but not limited to skill sets, experience and training, security clearances, licensure and certifications, and other business and organizational needs.

What We Offer:
Guidehouse offers a comprehensive, total rewards package that includes competitive compensation and a flexible benefits package that reflects our commitment to creating a diverse and supportive workplace.
Benefits include:
- Medical, Rx, Dental & Vision Insurance
- Personal and Family Sick Time & Company Paid Holidays
- Parental Leave
- 401(k) Retirement Plan
- Group Term Life and Travel Assistance
- Voluntary Life and AD&D Insurance
- Health Savings Account, Health Care & Dependent Care Flexible Spending Accounts
- Transit and Parking Commuter Benefits
- Short-Term & Long-Term Disability
- Tuition Reimbursement, Personal Development, Certifications & Learning Opportunities
- Employee Referral Program
- Corporate Sponsored Events & Community Outreach
- Care.com annual membership
- Employee Assistance Program
- Supplemental Benefits via Corestream (Critical Care, Hospital Indemnity, Accident Insurance, Legal Assistance and ID theft protection, etc.)
- Position may be eligible for a discretionary variable incentive bonus

About Guidehouse

Guidehouse is an Equal Opportunity Employer–Protected Veterans, Individuals with Disabilities or any other basis protected by law, ordinance, or regulation. Guidehouse will consider for employment qualified applicants with criminal histories in a manner consistent with the requirements of applicable law or ordinance including the Fair Chance Ordinance of Los Angeles and San Francisco. If you have visited our website for information about employment opportunities, or to apply for a position, and you require an accommodation, please contact Guidehouse Recruiting at 1-571-633-1711 or via email at RecruitingAccommodation@guidehouse.com. All information you provide will be kept confidential and will be used only to the extent required to provide needed reasonable accommodation. All communication regarding recruitment for a Guidehouse position will be sent from Guidehouse email domains including @guidehouse.com or guidehouse@myworkday.com. Correspondence received by an applicant from any other domain should be considered unauthorized and will not be honored by Guidehouse.
Note that Guidehouse will never charge a fee or require a money transfer at any stage of the recruitment process and does not collect fees from educational institutions for participation in a recruitment event. Never provide your banking information to a third party purporting to need that information to proceed in the hiring process. If any person or organization demands money related to a job opportunity with Guidehouse, please report the matter to Guidehouse’s Ethics Hotline. If you want to check the validity of correspondence you have received, please contact recruiting@guidehouse.com. Guidehouse is not responsible for losses incurred (monetary or otherwise) from an applicant’s dealings with unauthorized third parties. Guidehouse does not accept unsolicited resumes through or from search firms or staffing agencies. All unsolicited resumes will be considered the property of Guidehouse and Guidehouse will not be obligated to pay a placement fee.

AI/ML, ETL, AWS, CI/CD, Amazon S3, Infrastructure as Code, Observability/Monitoring, SQL, Data Engineering, Python, Scala, Java, Amazon IAM, SDLC, Databricks, Git, Terraform, Jenkins, Docker, GitHub

Worldwide

$65K - $108K / year

Senior Data Engineer – Data Governance, English

Capco

Capco, a Wipro company, is a management & technology consultancy dedicated to the financial services & energy industries

Data Engineer | 7 days ago

Full Time | Remote | Team 1,001-5,000 | Since 1998 | H1B Sponsor

• Prepare, organize and make data available for analytical, operational and governance use.
• Design, develop and maintain scalable data pipelines for extraction, integration, consolidation, transformation and processing of data from multiple sources.
• Create interfaces, flows and mechanisms that ensure secure, efficient and reliable access to information.
• Implement and administer solutions using Informatica Cloud Data Governance and Catalog (CDGC).
• Work together with business and technology teams to define and implement data governance policies, standards and best practices.
• Integrate governance and data catalog solutions with enterprise platforms and cloud environments.
• Support initiatives related to Data Quality, Master Data Management (MDM) and Reference Data Management (RDM).
• Ensure that data assets are reliable, traceable, accessible and compliant with organizational standards.

AWS, Cloud, Informatica, SQL

Brazil
