A digital currency exchange, Coinbase is used by consumers, merchants, and traders to buy and sell cryptocurrencies, such as Bitcoin, Ethereum, and Litecoin. Fo
Senior Site Reliability Engineer, Core AI Infrastructure
Location
California
Posted
5 days ago
Salary
$186.1K - $218.9K / year
Seniority
Senior
Job Description
Coinbase
• Own the reliability, monitoring, and incident response lifecycle for AI infrastructure services, including on-call support for AWS deployment pipelines, root cause analysis, and blameless retros.
• Build automation and tooling to streamline operational IT workflows, eliminate manual tasks, and improve deployment velocity across CI/CD frameworks and Kubernetes environments.
• Partner with the Coinbase Infrastructure team to extend CI/CD frameworks supporting IT services and enterprise network platforms, and with Security and Compliance to integrate surveillance tooling into deployment pipelines.
• Strengthen observability and documentation standards across IT engineering by defining metrics, implementing monitoring solutions, and maintaining technical documentation that sets a standard of excellence.
• Develop full-stack applications that power internal AI products and infrastructure with Go or Python.
Job Requirements
- 5+ years of experience automating and supporting cloud infrastructure (AWS) and network environments
- Proven experience deploying, managing, and troubleshooting containerized workloads using Docker and Kubernetes in production environments
- Proficiency in at least one scripting or programming language (Python, Bash, Ruby, or Go) and version control workflows using Git-based CI/CD pipelines
- Track record of leading incident response in environments with strict SLAs, including root cause analysis, blameless retros, and measurable reliability improvements
- Utilizes generative AI responsibly, maintaining human oversight to deliver business-ready outputs and drive measurable improvements in workflow efficiency, cost, and quality.
Benefits
- medical
- dental
- vision
- 401(k)
More DevOps Engineer Jobs
Role Description
- Own key parts of the DevOps effort for a global engineering team and a platform supporting millions of transactions per day and more than doubling annually.
- Drive improvement across our development process and tooling: source control, build, test, packaging, release, and deployment.
- Drive improvement in our infrastructure configuration, management, and cost efficiency.
- Drive enhancement of monitoring and observability across all infrastructure and services, for both availability and performance.
- Partner with other technical leaders to improve our architecture’s maintainability, scalability, and resilience.
- Serve as a technical lead to software and DevOps engineers on development and operations practices.
- Partner with other technical leaders to strengthen our information security posture.
- Contribute to the team’s development and operations standards and processes.
Qualifications
- Deep experience designing automated pipelines end to end: git-based source control and branching strategies, build and test automation (Jenkins, GitHub Actions, or similar), artifact and dependency management, and progressive, low-risk deployment patterns.
- Strong, hands-on experience with Terraform (or similar): reusable modules, state management, and managing multi-environment infrastructure (staging, beta, production) as code.
- Experience operating a GitOps workflow with ArgoCD (or similar) as the source of truth for declarative, auditable deployments.
- Strong experience building and operating cloud infrastructure at scale: compute, VPC networking, storage, message queues, serverless, DNS, load balancing, IAM, and logging.
- Production experience running and operating Kubernetes at scale: cluster lifecycle and upgrades, workload scheduling and resource management, autoscaling (HPA/cluster/event-driven), networking and ingress, and diagnosing complex cluster issues.
- Authoring, configuring, and maintaining Helm charts for templated, repeatable application deployments across environments.
- Experience deploying, operating, and tuning a mix of data stores: relational (PostgreSQL / CloudSQL), NoSQL document (MongoDB), wide-column (Cassandra), and cache (Redis), including replication, backups, scaling, and performance troubleshooting.
- Demonstrated track record deploying and monitoring large-scale, mission-critical services: defining SLIs/SLOs, building actionable alerting, and driving incident response and blameless post-mortems.
- Solid grounding in cloud and infrastructure security: IAM, secrets management, network policy, and supply-chain hygiene.
- Bonus: Java application release engineering.
Requirements
- Great communication and documentation skills.
- An obsession with automation and a desire to leave things better than you found them.
- A customer-first mindset and strong attention to detail.
- 5+ years as a DevOps or Site Reliability Engineer.
- Bonus: A sense of humor.
Benefits
- Be part of a well-funded, fast-growing company tackling complex, relevant challenges in sustainable last-mile delivery.
- Competitive compensation and equity incentives.
- Health, vision, and dental insurance.
- 401(k) plan.
- Flexible time off.
- 100% remote, open to candidates across the U.S.
- Compensation: The targeted base salary range for this role is listed in the compensation section below. Actual salary may be above or below this range based on location, skills, and relevant experience. In addition, this position may include additional compensation in the form of bonuses, equity, or commissions.
Role Description
The Sr. Manager Gift Card Operations leads the efforts of the gift card team in identifying and implementing new initiatives to allow for continued growth and competitive advantage. This position is also responsible for overseeing all work related to managing our Third Party Gift Card Sales relationships. Additionally, this role involves working with internal constituents and key distribution partners to develop new products and drive product enhancements to position our business for growth and capitalize on emerging market trends. Lastly, this position will manage the test & roll-out of a Third Party Gift Card Mall in THD stores.
Key Responsibilities
- 20% Manages Gift Card Sales relationships.
- 20% Develops strategic initiatives and products that leverage opportunities both in the short term (2-6 months) and long term (6+ months) to drive Gift Card Sales & THD profits.
- 15% Works closely with Gift Card Sales, Marketing, and Ops Managers to identify needs, competitive trends, industry direction, and B2B needs.
- 10% Develops Business Case and Implementation plans for new initiatives. Presents to Sr. Leadership and gains consensus for proposed initiatives.
- 25% Manages cross-functional teams to ensure flawless delivery of new initiatives/products on time, and within budget. Incorporates feedback and input from team members and cross-functional groups to ensure product viability for long-term success.
- 10% Negotiates contracts with providers and business partners.
Direct Manager/Direct Reports
- Position reports to Director HD Incentives
- Number of direct reports - 1
Travel Requirements
- Typically requires overnight travel less than 10% of the time.
Physical Requirements
- Most of the time is spent sitting in a comfortable position with frequent opportunities to move about.
- On rare occasions, there may be a need to move or lift light articles.
Working Conditions
- Located in a comfortable indoor area. Any unpleasant conditions would be infrequent and not objectionable.
Qualifications
- Must be eighteen years of age or older.
- Must be legally permitted to work in the United States.
Preferred Qualifications
- Must have strong technical knowledge, experience managing complex IT projects, and the ability to understand and articulate technical business requirements.
Minimum Education
- The knowledge, skills, and abilities typically acquired through the completion of a bachelor's degree program or equivalent degree in a field of study related to the job.
Minimum Years of Work Experience
- 8
Competencies
- Vendor negotiations
- Complex project management skills
- Experience managing internal and external business partners
DevOps / Site Reliability Engineer
AgileEngine
AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards.
Role Description
We are looking for a Senior Site Reliability Engineer to maintain operational resilience and 24/7 stability across a multi-cloud security program spanning Azure, AWS, and GCP. You will engineer automated security guardrails using Terraform, design and optimize enterprise CI/CD pipelines for continuous ASPM ingestion, and act on CSPM telemetry using Wiz to secure cloud workloads under Zero Trust and federated IAM principles. The role requires deep compliance experience in PCI-DSS or SOC2 environments.
Qualifications
- 5+ years of experience in Site Reliability Engineering, DevOps, Cloud Security, or related roles;
- In-depth architectural expertise in multi-cloud defense, federated IAM, and Zero Trust security principles;
- Ability to work autonomously while driving the architecture of complex automated runbooks and mentoring mid-level SREs;
- Extensive experience deploying, integrating, and tuning APIs from modern CNAPP/CSPM platforms, specifically Wiz;
- Experience building and operating platforms subject to strict financial compliance standards such as PCI-DSS and SOC2;
- Strong understanding of cloud security, automation, and infrastructure reliability;
- Upper-intermediate English level.
Requirements
- Scale and maintain operational stability across multi-cloud environments including Azure, AWS, and GCP;
- Engineer unified security policies and configuration baselines using Infrastructure as Code (Terraform) to prevent misconfigurations;
- Design, maintain, and optimize enterprise CI/CD pipelines supporting continuous ASPM ingestion and deployment;
- Act on continuous monitoring alerts and security findings using Cloud Security Posture Management (CSPM) platforms such as Wiz;
- Help secure cloud workloads through Zero Trust and federated IAM principles;
- Contribute to reliability, automation, and operational excellence initiatives across the platform.
Benefits
- Professional growth: Accelerate your professional journey with mentorship, TechTalks, and personalized growth roadmaps.
- Competitive compensation: We match your ever-growing skills, talent, and contributions with competitive compensation.
- Exciting projects: Join projects with modern solutions development and top-tier clients, including Fortune 500 enterprises and leading product brands.
- Flextime: Tailor your schedule for an optimal work-life balance, with options for remote work and flexible hours.
• Design, develop and maintain automation framework and scripts to streamline security processes and workflows
• Deploy and manage AWS resources using Terraform, ensuring secure and scalable infrastructure
• Implement innovative security solutions to reduce the mean time to detect and respond
• Implement and optimize AWS Step Functions and Lambda for serverless automation workflows
• Leverage AWS security-centric services (e.g., IAM, Control Tower, KMS, Macie, GuardDuty, CloudTrail, EventBridge) to enhance cloud security
• Collaborate with cross-functional teams to integrate security automation into CI/CD pipelines
• Monitor and troubleshoot AWS infrastructure to ensure high availability, performance and compliance
• Stay updated on AWS best practices, security trends, and emerging technologies to drive continuous improvements
• Must be able to perform hands-on support for a wide range of security technologies, including, but not limited to: Pipeline security, DevSecOps, CloudFormation templates, Terraform, Docker, Kubernetes, SIEM, CSPM and Vulnerability Scanners
• Work independently with minimal supervision, while providing guidance and collaborating with the team as needed


