Granted is the consumer-first, AI-native product that fights medical bills and helps people navigate health benefits. The U.S. healthcare system is confusing by design. Insurance coverage is opaque, benefits are hard to understand, and medical bills are often wrong. Nearly 40% of bills contain errors or are incorrectly denied, yet most people don’t have the time, expertise, or energy to challenge them or even know where to start. Insurers and providers rely on that confusion. Granted exists to change the balance of power. We use AI to turn healthcare advocacy into a product that works continuously in the background. By connecting directly to a user’s healthcare data, we help people understand their benefits, choose the right care, and fight incorrect bills and claim denials when they occur. Instead of endless phone calls and handoffs, users get clear guidance, fast answers, and an advocate that actually takes action. We go direct to consumers with an affordable product and a free tier accessible to anyone. Granted works for you, not your employer, insurance company, or healthcare providers. We support individuals and families across their entire healthcare journey, no matter their insurer, job changes, or life stage. We’re currently in stealth with strong early user satisfaction and are focused on scaling responsibly. Our goal is simple: help people save time and money, reduce stress, and feel confident navigating a system that too often works against them.

Infrastructure Engineer (Senior/Staff Level)

Infrastructure Engineer | Full Time | Remote | Senior | Team 23 | Since 2023

Location

United States

Posted

74 days ago

Salary

$150K - $225K / year

Seniority

Senior

Bachelor Degree | 9 yrs exp | English

AWS Datadog Docker GitHub Actions OpenTelemetry Pulumi Sentry TypeScript

Job Description

💡 Mission

The US healthcare system is complex, error-prone, and financially draining. Medical bills and insurance coverage shouldn’t be this hard to navigate. At Granted, we’re building the one solution every American can turn to for help. Thanks to AI and new regulations, Granted can fight claim denials, correct billing errors, negotiate bills, and make coverage easier to understand, saving people time, money, and stress. Our goal is simple: to be the #1 platform that empowers all Americans to take charge of their healthcare.

🩺 About Us

Founded by a former Oscar Health leader, we’re a seed-stage company with $17M in funding. We’re lucky to be backed by the founders and investors at Hugging Face, Rocket Money, Oscar Health, CaseText, Forerunner Ventures, RRE Ventures, and more. We are well-funded for the next few years.

Location: While parts of our engineering team operate remotely elsewhere in the US, we strongly encourage New York–based applicants who are excited to engage in a more in-person hybrid dynamic. Local employees typically work from our Chelsea office Monday through Thursday and from home on Fridays.

🔎 About the Role

- AI Integration: Accelerate developers and internal operations by exposing or configuring API and MCP access to our infrastructure, while keeping costs and security under control.
- Developer Experience: Enhance our TypeScript monorepo performance, optimize the Nx build system, migrate from CommonJS to ESM, and improve our Docker-based DevContainers.
- CI/CD Operations: Improve GitHub Actions workflows, implement faster CI runners, enhance caching strategies, and manage multi-environment deployments (dev/prod) using Pulumi.
- Infrastructure Management: Maintain AWS infrastructure including ECS, Aurora RDS (Gel), VPC networking, the monitoring stack (OpenTelemetry, Honeycomb, Sentry; soon Datadog), and AWS access provisioning.
- MDM, Security & Compliance: Support HIPAA compliance initiatives including RBAC implementation, encryption standards, and security monitoring with GuardDuty/CloudTrail. Set up stricter standards for MDM.
- Monitoring & Observability: Optimize our observability stack across web and mobile platforms, assist with on-call load reduction, and enhance performance monitoring.
- Access Policy: Implement and maintain role-based access controls, manage admin privileges in internal systems, and coordinate with user management systems.

👩‍💻 We'll be most excited if you

- Bring 5–10+ years of experience in infrastructure or platform engineering, with a strong understanding of SOC2 and security compliance requirements, and the ability to build and automate through code, not just configure systems.
- Are mission-driven and passionate about solving meaningful problems with real-world impact.
- Are expert in SaaS administration, MDM solutions, and IT security best practices.
- Thrive in ambiguity, demonstrating autonomy and adaptability in a fast-paced environment.
- Think big-picture, designing pragmatic solutions for complex technical challenges.
- Communicate clearly and effectively, both verbally and in writing.

📢 You don't need to check every single box. If you're passionate, driven, and excited about the impact AI can make on the healthcare system, we'd love to hear from you! 📩

🦸 Working with us

Granted is headquartered in New York. Our team is split evenly between our Chelsea office and the rest of the US. We marry the best of in-office culture with the best of a remote-first company. We’re a team of passionate individuals who show up every day to do the best work of their lives, following these principles:

- Move with Urgency and Focus: Prioritize what matters and find the shortcuts.
- Own the Outcome: Take initiative and see it through until the end.
- Commit to Your Craft: Excellence is the standard. Care about the details.
- Play to Win: If we win, Americans win, so we work hard and do what it takes for us to be #1.
- Feedback is Fuel: Intellectual honesty is at the core of everything we do.

Benefits

401(K), Commuter benefits, Company equity, Company-sponsored outings, Dental insurance, Disability insurance, Family medical leave, Flexible Spending Account (FSA), Free daily meals, Generous parental leave, Company-sponsored happy hours, Health insurance, Job training & conferences, Open door policy, Life insurance, Mean gender pay gap below 10%, Open office floor plan, Paid holidays, Pair programming, Paid sick days, Pet friendly, Promote from within, Lunch and learns, Relocation assistance, Free snacks and drinks, Team based strategic planning, OKR operational model, Unlimited vacation policy, Vision insurance, Some meals provided, Mental health benefits, Home-office stipend for remote employees, Hiring practices that promote diversity, Hybrid work model, Wellness days, Mother's room, Personal development training, Flexible time off, Bereavement leave benefits

More Infrastructure Engineer Jobs

Senior Data Infrastructure Engineer – Tech Lead

Dandy

Helping dentists achieve more by making the entire lab process digital — and effortless.

Infrastructure Engineer | 74 days ago

Full Time | Remote | Team 501-1,000 | Since 2020 | H1B Sponsor

• Set the vision and technical direction for Dandy’s Data Infrastructure team, collaborating with stakeholders.
• Own Dandy’s data pipelines and warehouse, ingesting data from hundreds of sources and scaling the tooling and platform to support the data engineering and analytics teams’ needs as the business grows.
• Develop and maintain infrastructure, systems, and tooling to support Dandy’s data engineering...
• Improve developer experience and productivity across our range of software repositories.
• Collaborate with stakeholders within the tech org to influence overall objectives and long-term goals of your team.
• Advocate for improvements to product quality, security, and performance that have a particular impact across your team and others.
• Design improvements to infrastructure quality, security, and performance.
• Craft code that meets our internal standards for style, maintainability, and best practices.
• Foster a culture of proactive collaboration and systems resiliency and performance. Drive scalable improvements to resiliency and effectiveness through automation.
• Recognize impediments to our efficiency as a team ("technical debt"), propose and implement solutions.

Airflow AWS Azure ETL GCP Python SQL Terraform

United States

$201.5K - $237K / year

Job Closed

SRE Observability SLO Engineer

General Electric - GE

Built on more than 130 years of experience, GE Vernova, a division of General Electric (GE), is leading a new era of energy by electrifying the world while work

Infrastructure Engineer | 75 days ago

Full Time | Remote

Role Description

GE Vernova's GridOS Platform Engineering team is building the next generation of SaaS reliability for critical energy infrastructure. The Observability & SLO Engineer is the eyes and ears of the GridOS SRE team. In this role you will build and own the full telemetry stack, from instrumentation standards to SLO dashboards to synthetic monitors, that gives GE Vernova and its utility customers real-time confidence in the reliability of mission-critical energy management systems. This is a cyclical, high-impact position: you will drive an intensive initial ramp to establish v1.0 observability coverage across all customer environments, then shift into an ongoing improvement cadence aligned to new product releases and customer onboarding.

Qualifications

- 2–3 years in SRE, observability engineering, or infrastructure reliability roles.
- Deep expertise with at least one major observability platform: Datadog, Grafana + Prometheus, AWS CloudWatch, Dynatrace, or New Relic.
- Hands-on experience implementing SLIs, SLOs, and error budget burn-rate alerting in a production SaaS environment.
- Strong understanding of distributed systems telemetry: metrics (Prometheus/CloudWatch), structured logging (CloudWatch Logs Insights, ELK), and distributed tracing (OpenTelemetry, AWS X-Ray).
- Experience with Kubernetes observability: kube-state-metrics, node exporters, Helm-deployed monitoring stacks, and namespace-level resource metrics.
- Proficiency in at least one query/visualization language: PromQL, Splunk SPL, Datadog Query Language, or CloudWatch Logs Insights query syntax.
- Experience designing alerting strategies that minimize alert fatigue through symptom-based and burn-rate approaches.
- Scripting skills in Python and/or Bash for automation of monitoring configuration and report generation.

Requirements

- Cloud Technologies: AWS Cloud Infrastructure (EKS, RDS, MSK, S3, EC2, EBS, SQS, etc.)
- Kubernetes: EKS, Rancher
- Infrastructure as Code: Terraform
- Deployment and Configuration Tools: Ansible, Chef, or Puppet
- Telemetry standards and tools: OpenTelemetry, CloudWatch, CloudTrail
- Observability tools and technology: Datadog, Splunk, New Relic, etc.
- Alerting and notification: AWS and Azure alerting notification
- Scripting: Go, Python, Groovy, Bash
- Strong Linux administration skills
- Strong analytical and problem-solving skills

Benefits

- Relocation Assistance Provided: Yes
- This is a remote position

Leadership

- Influences through others; builds direct and "behind the scenes" support for ideas.
- Preemptively sees downstream consequences and effectively tailors influencing strategy to support a positive outcome.
- Able to verbalize what is behind decisions and downstream implications.
- Continuously reflects on successes and failures to improve performance and decision-making.
- Understands and encourages change when needed.
- Proactively identifies and removes project obstacles or barriers on behalf of the team.
- Able to navigate accountability in a matrixed organization.
- Self-starter; communicates and demonstrates a shared sense of purpose. Learns from failure.

Personal Attributes

- Critical thinker; able to quickly adapt to changing environments.
- A hacker or tinkerer at heart.
- Risk taker, not afraid to think outside the box or challenge the status quo.
- Emotional intelligence, ability to influence up and out, and the ability to work independently.
- Must be a team player with a strong desire to win.
- Passionate about continuously learning.
- Highly organized and efficient; able to balance competing priorities and execute accordingly.
- Strong oral and written communication skills.

Datadog Grafana Prometheus Amazon CloudWatch New Relic OpenTelemetry AWS Helm Splunk Terraform Ansible Chef Puppet Python Groovy Shell Linux Amazon EKS Amazon RDS Amazon S3 Amazon EC2 Amazon SQS Azure

Worldwide

Job Closed

SRE Platform Engineer

General Electric - GE

Built on more than 130 years of experience, GE Vernova, a division of General Electric (GE), is leading a new era of energy by electrifying the world while work

Infrastructure Engineer | 75 days ago

Full Time | Remote

Role Description

The Platform System Reliability Engineer is the primary operations engineer and operator of our EKS Kubernetes environment, which serves as the foundation for our global grid software SaaS products. This role focuses on the "middle-mile" of software delivery, ensuring that the underlying compute, networking, and storage layers are secure, hardened, scalable, and resilient to support critical energy infrastructure in the cloud. You will be responsible for the full lifecycle of production clusters, from initial bootstrapping through performance tuning, patching, and securing.

Qualifications

- Bachelor's Degree in Computer Science or “STEM” Majors (Science, Technology, Engineering and Math) with advanced experience.
- 6–8 years in SRE or Platform Engineering roles supporting mission-critical, 24/7 cloud environments.

Requirements

- 5 years of experience operating production-grade Kubernetes clusters at scale.
- Expert-level knowledge of multi-cluster management and performance tuning, and experience implementing observability tools such as Prometheus/Grafana, Dynatrace, Splunk, Datadog, etc.
- Deep hands-on experience with AWS core services (EKS, EC2, ALB, S3, RDS, MSK).
- Proficiency in Terraform, Ansible, and Python or Go for infrastructure automation, and in deployment tools like ArgoCD or Flux.
- Strong understanding of and hands-on experience with cloud networking concepts such as VPCs, routing, and load balancing, and security configurations such as encryption and certificate management.

Benefits

- Relocation Assistance Provided: Yes
- This is a remote position

Roles and Responsibilities

Day 0: Provision & Infrastructure Hardening

- Kubernetes Cluster Orchestration: Help design and deploy hardened EKS clusters across multiple AWS regions, ensuring consistent security baselines.
- Infrastructure as Code (IaC): Build and maintain reusable Terraform and Ansible modules for automated provisioning of cloud infrastructure services including networking services, compute, storage, queue and cache, etc.
- Security Architecture: Implement "Policy as Code" guardrails and secure network perimeters (ESPs) in alignment with NERC CIP and IEC 62443 standards.
- Operationalize Cloud Infrastructure: Standardize the run books and operating processes required to run critical infrastructure with the highest reliability.

Day 1: Platform Readiness & Scaling

- Resource Governance: Define and enforce Kubernetes resource quotas, limit ranges, and Pod Priority classes to ensure mission-critical services receive prioritized compute resources.
- Connectivity & Ingress: Manage the ingress strategy and service mesh architecture to facilitate secure, performant connectivity between distributed microservices.
- Acceptance Testing: Lead platform-level smoke testing, load testing, and disaster recovery exercises to validate that the infrastructure can meet 99.99% uptime targets.
- Sizing & Optimization: Partner with application teams to right-size containerized workloads, optimizing for both performance and cloud cost (FinOps).

Day 2: Operational Excellence & Tier 3 Support

- L3 Escalation: Act as the highest technical escalation point for complex Kubernetes internals, troubleshooting issues such as failed pods, memory leaks, and network partitions.
- Incident Response: Lead root cause analysis (RCA) for platform-level outages, implementing systemic fixes to prevent recurring failures.
- Toil Elimination: Proactively identify and automate repetitive operational tasks, such as cluster upgrades and OS patching, to ensure the team spends at least 50% of their time on engineering improvements.
- Observability Integration: Institutionalize platform monitoring using Prometheus and Grafana, creating dashboards that surface the "Golden Signals" of cluster health.

Preferred Qualifications

- Practical knowledge of NERC CIP, SOC2, ISO 27001, or IEC 62443 compliance standards in a SaaS context.
- AWS Certified DevOps Engineer – Professional, CKA (Certified Kubernetes Administrator), or SRE Practitioner Certification.
- Experience supporting mission-critical systems in energy, utilities, or other high-stakes industrial sectors.

Personal Attributes

- High level of energy and enthusiasm with the ability to thrive in a rapidly changing environment.
- Demonstrated customer focus: evaluates decisions through the eyes of the customer; builds strong customer relationships; creates processes with customer viewpoint; partners with customers.
- Change oriented: actively generates process improvements; champions and drives change initiatives; confronts.
- Ability to work with global teams, act independently and as part of a team.
- Strong analytical and problem-solving skills: communicates in a clear and succinct manner and effectively evaluates information/data to make decisions; anticipates obstacles and develops plans to resolve.

Amazon EKS Kubernetes Prometheus Grafana Splunk Datadog AWS Amazon EC2 Amazon S3 Amazon RDS Terraform Ansible Python Argo CD Flux

Worldwide

Job Closed

NOC and SOC Infrastructure Engineer

CRH Talento de IT

Infrastructure Engineer | 75 days ago

Full Time | Remote

Role Description

We are looking for a Systems and Infrastructure Engineer with experience in NOC and SOC environments, responsible for designing, implementing, monitoring, and maintaining the organization's technology infrastructure. This role is key to ensuring the availability, performance, and security of systems in both on-premise and cloud environments. You will collaborate with multidisciplinary teams to ensure operational continuity, proactive incident detection, and a stronger cybersecurity posture.

Responsibilities

- Design, implement, and maintain IT infrastructure, including servers, storage, networks, and virtualization platforms (on-premise and cloud).
- Operate in NOC/SOC environments, monitoring systems, networks, and security events to ensure service continuity and timely incident handling.
- Configure and administer physical and virtual infrastructure components aligned with business requirements.
- Monitor system performance, capacity, and availability, implementing improvements to ensure high availability and reliability.
- Perform system administration tasks: installation, configuration, maintenance, and upgrades.
- Automate infrastructure processes through scripting, automation tools, and Infrastructure as Code (IaC).
- Manage backup and disaster recovery processes.
- Implement security controls, access management, and encryption mechanisms to protect information.
- Conduct vulnerability assessments and security scans, and support incident response.
- Administer and optimize cloud services (compute, storage, networking, and identity).
- Monitor cloud usage and costs, proposing optimization strategies.
- Stay current with technology trends and propose continuous improvements.

Qualifications

- Solid knowledge of server, network, and infrastructure administration.
- Experience in NOC (Network Operations Center) and SOC (Security Operations Center) environments.
- Knowledge of virtualization (VMware, Hyper-V, or similar).
- Experience with cloud platforms (AWS, Azure, or GCP).
- Proficiency in PowerShell, Bash, or other scripting languages.
- Knowledge of automation and Infrastructure as Code (Terraform, Ansible, etc.).
- Knowledge of cybersecurity: access controls, encryption, vulnerability management, and compliance.
- Analytical and problem-solving ability.
- Communication and teamwork skills.
- Organization, attention to detail, and the ability to handle multiple tasks.

Requirements

- Bachelor's degree in Systems, Information Technology, or a related field (desirable).
- Demonstrable experience in systems administration, infrastructure engineering, or similar roles.
- Experience implementing and supporting complex infrastructure.
- Experience with automation and version control tools.

Benefits

- Direct hire by the company.
- 100% payroll scheme.
- Statutory benefits.
- Savings fund.
- 30-day year-end bonus (aguinaldo).
- Life insurance.
- Major medical insurance.
- Grocery vouchers.

VMware AWS Azure GCP PowerShell Shell Terraform Ansible

Mexico
