Job Closed

This listing is no longer active.

Data Platform, Cloud Infrastructure Engineer – Senior

Infrastructure Engineer | Other | Remote | Senior | Team 501-1,000 | H1B No Sponsor

Location

United States

Posted

111 days ago

Salary

0

Seniority

Senior

Job Description

Domus Global

  • Design, deploy and maintain cloud infrastructure for data and ML workloads using Infrastructure as Code.
  • Manage and evolve AWS-based data platform components running on Kubernetes (EKS).
  • Provision and maintain services such as EMR on EKS, SageMaker, MWAA (Managed Airflow), Lambda, API Gateway and Step Functions.
  • Implement and maintain IAM roles, permissions and governance policies aligned with compliance requirements.
  • Support orchestration frameworks used by data teams (DBT, Airflow, Step Functions).
  • Collaborate with data engineers to troubleshoot infrastructure or platform issues affecting pipelines.
  • Participate in platform observability initiatives (metrics, logging and monitoring).
  • Maintain Terraform modules and deployment pipelines.
  • Support platform migrations and organizational AWS changes when required.
  • Contribute to platform reliability, scalability and operational excellence.

Job Requirements

  • 3+ years of experience working with AWS cloud infrastructure
  • Strong experience with Terraform or similar Infrastructure as Code tools
  • Experience deploying and operating containerized workloads on Kubernetes / EKS
  • Solid understanding of AWS IAM, roles and security best practices
  • Experience with serverless architectures (Lambda, API Gateway, Step Functions)
  • Experience supporting data or ML platforms from an infrastructure perspective
  • DevOps mindset and experience managing CI/CD or infrastructure automation
  • Strong troubleshooting skills across distributed systems

Benefits

  • Remote
  • Professional development opportunities

More Infrastructure Engineer Jobs

Senior Infrastructure Architect

Fiserv

We aspire to move money and information in a way that moves the world.

Other | Remote | Team 10,001+ | Since 1984 | H1B Sponsor

  • Define and evolve enterprise infrastructure reference architectures and patterns for Azure Cloud, hybrid, and on-premises environments.
  • Lead end-to-end design for network, compute, storage, identity, and security controls aligned to regulatory and risk requirements.
  • Partner with product, security, compliance, and operations teams to translate objectives into architecture decisions and implementation guidance.
  • Produce high-quality architecture artifacts, diagrams, and blueprints; review solutions for adherence to standards and non-functional requirements.
  • Evaluate emerging infrastructure technologies; run proofs of concept and document recommendations with measurable outcomes.
  • Establish and govern standards for Infrastructure as Code (IaC) and configuration management using tools such as Terraform, Ansible, Puppet, or Chef.
  • Drive reliability and resilience through capacity planning, performance modeling, and disaster recovery architectures.

Iowa
Job Closed

Senior Critical Infrastructure Engineer

Serverfarm

Managing everything physical in the virtual world.

Other | Remote | Team 51-200 | Since 1999 | H1B No Sponsor

  • Develop and maintain globally consistent colocation design standards to drive equivalent resilience in new and existing data centers.
  • Support the deployment of new ServerFarm products in colocation data centers in a consistent way, including liquid cooling and high-density racks and systems to support hyperscale requirements for cloud compute, machine learning and AI services.
  • Review and suggest updates to global design standards and generational data center template designs.
  • Support development and operational engineering teams in reviewing and accepting proposed changes to site-specific infrastructure at ServerFarm data centers, validating them against standards and identifying and documenting accepted deviations when appropriate.
  • Work with internal and external global teams to drive consistent standard solutions that expedite review processes and improve cost efficiency.
  • Set up global strategies to deploy specific customer products or designs, and to drive consistency and efficiency in delivering customer requirements. Functionally decompose complex problems into simple, straightforward solutions.
  • Document, release and manage design guides, standards, specifications and procedures.
  • 35% domestic and international travel

United States
$120K - $180K / year
Job Closed

Oracle Cloud Infrastructure Engineer

TribalScale

A digital innovation firm with a mission to right the future. Our work spans industries, platforms, and continents.

Full Time | Remote | Team 51-200 | H1B No Sponsor

  • OCI Architecture: Architect, design, and implement scalable microservices using Spring Boot and Java specifically optimized for the OCI environment.
  • Infrastructure as Code (IaC): Apply IaC practices (Terraform, OCI Resource Manager) to automate infrastructure provisioning, management, and scaling.
  • Container Orchestration: Deploy and manage microservices using Docker and OCI Container Engine for Kubernetes (OKE).
  • Event-Driven Systems: Build and manage event-driven architectures, leveraging OCI Streaming or Apache Kafka.
  • Performance & Availability: Design and maintain high-performance, low-latency, and high-availability systems with a focus on OCI’s unique regional and AD (Availability Domain) structures.
  • DevOps Integration: Collaborate with teams to implement CI/CD pipelines using Jenkins, Gradle/Maven, BitBucket, and Ansible.
  • Security & Observability: Implement proactive monitoring using OCI Monitoring/Logging or tools like Splunk and Dynatrace. Ensure data encryption (PKI, TLS, HTTPS) at rest and in transit.

Canada

AI Infrastructure Engineer

Hydra Host

A distributed marketplace for compute

Other | Remote | Team 11-50 | H1B No Sponsor

  • Get AI Platform customers production-ready on Hydra: standing up Kubernetes clusters, configuring GPU drivers, validating networking, and troubleshooting the issues that surface when real workloads hit real hardware.
  • Own the bare metal ←→ platform layer: bridging GPU infrastructure (NCCL, InfiniBand, NVLink, storage) with orchestration layers (Kubernetes, SLURM) and MLOps tooling that customers actually use.
  • Configure, benchmark, and debug NVIDIA driver stacks: firmware versions, CUDA compatibility, NCCL tuning, MIG configurations.
  • Run quality benchmarks and diagnostics to validate performance for inference and training workloads across chip types.
  • Identify gaps before customers do: pressure-testing Hydra's infrastructure, APIs, and workflows to find what's missing or broken.
  • Turn customer learnings into product: working with Product and Engineering to build reusable templates, default configurations, and automated workflows that eliminate manual onboarding.
  • Advise customers on chip selection and tokenomics: helping AI platform customers understand price/performance trade-offs across GPU types, cost-per-token economics, and which hardware fits their inference or training workloads.

United States
$150K - $225K / year