Epic Kids

Little Legends Hub

Senior Software Engineer, Infrastructure

Full-stack Engineer | Software Engineer | Full Time | Remote | Senior | Team 11-50 | H1B No Sponsor

Location

United States

Posted

18 days ago

Salary

$160K - $200K / year

Seniority

Senior

Job Description

  • Drive the stability and reliability of Epic's GCP infrastructure: setting and tracking SLOs/SLIs, reducing toil, and engineering out recurring sources of instability
  • Build and operate Epic's GCP infrastructure for high availability, scalability, and cost efficiency
  • Manage and harden our Docker and GKE container platform, including workload scheduling, autoscaling, networking, and graceful failure handling
  • Maintain and improve CI/CD pipelines that enable fast, safe, low-risk delivery across engineering teams
  • Own and evolve the observability stack (metrics, logs, traces, dashboards, and alerts) so that signals are actionable, noise is low, and on-call has the context to resolve issues quickly
  • Write and maintain Terraform to codify infrastructure across the organization, with a focus on consistency, change safety, and reproducibility
  • Contribute to capacity planning, cost optimization, and architectural reviews, with reliability as a first-class consideration
  • Champion platform security best practices, including secrets management, IAM policies, and network segmentation
  • Support compliance-aware infrastructure practices (vulnerability management, access reviews, audit-evidence flows, and incident-response readiness) as we mature our SOC 2 and student-data compliance programs
  • Partner with data engineering to operate the orchestration platform and supporting infrastructure: deployment, scaling, reliability, and observability
  • Collaborate with backend and data engineers to troubleshoot service and platform issues
  • Lead by example in a frequent on-call rotation; drive incident response, blameless post-mortems, and the follow-through that turns one-time outages into systemic, lasting reliability improvements
  • Provide guidance to developers on infrastructure concerns and best practices

Job Requirements

  • Bachelor's degree or higher in Computer Science, Software Engineering, or a related field
  • 5+ years of experience in infrastructure, platform, DevOps, or a related engineering role
  • Hands-on experience with GCP (GCE, GCS, VPC, IAM, Cloud Monitoring, and related services)
  • Experience with Docker and Kubernetes (GKE)—containerizing workloads, deploying to GKE, Helm, and cluster fundamentals
  • Experience with CI/CD pipelines (GitHub Actions, ArgoCD, Jenkins, or similar)
  • Experience with an observability platform such as New Relic (metrics, logging, alerting, dashboards)
  • Proficiency in Terraform for managing infrastructure as code
  • Scripting/programming skills in Python, Bash, or similar
  • Comfort participating in a frequent production on-call rotation
  • Track record of measurably improving reliability of production systems—e.g., defining SLOs, reducing incident frequency or MTTR, eliminating recurring failure modes
  • Strong problem-solving skills, sense of ownership, and ability to work effectively in evolving systems
  • Fluency in English for daily collaboration and technical documentation
  • Proficiency in Mandarin Chinese to collaborate effectively with global engineering and business partners
