Join the Vista Family. In March 2019, Vista Equity Partners acquired PlanSource, marking a new phase of growth. PlanSource is highly rated by our customers. Be proud of our sophisticated cloud-based technology that meets the needs of even the most complex benefit programs. Success is rewarded. With more than just a pat on the back, your success is recognized and rewarded. You can grow and develop professionally. PlanSource has a great track record of internal promotions. Share our values. Be part of a team that values diversity and representation at all levels of the organization.
Cloud DevOps Engineer
Location
United States
Posted
5 days ago
Salary
0
Seniority
Mid Level
Job Description
Cloud DevOps Engineer
PlanSource
Role Description
PlanSource is seeking a Cloud DevOps Engineer to design, build, and operate scalable, secure, and automated cloud infrastructure that enables high-quality, high-velocity software delivery. In this role, you will sit at the intersection of Cloud Engineering, DevOps, and Platform Reliability, owning infrastructure as code, CI/CD automation, and operational excellence across AWS environments. You will partner closely with Engineering, Security, and IT to improve reliability, reduce operational friction, and accelerate delivery through automation and AI-enabled tooling.

Primary Responsibilities
- Design, build, and maintain AWS infrastructure using Infrastructure as Code (Terraform/CloudFormation).
- Automate provisioning, configuration, and scaling of cloud environments.
- Own and enhance CI/CD pipelines to improve build, test, and deployment workflows.
- Support containerized applications and orchestration platforms (Docker, Kubernetes/EKS).
- Implement monitoring, logging, and alerting solutions to improve system reliability.
- Participate in incident response, root cause analysis, and continuous improvement efforts.
- Embed security practices into pipelines and infrastructure (IAM, secrets, vulnerability management).
- Optimize cloud environments for cost, performance, and scalability.

AI-Enabled DevOps
- Leverage AI-assisted tools for log analysis, anomaly detection, and incident investigation.
- Use AI tools to improve CI/CD pipelines and infrastructure automation.
- Apply AI-driven insights to identify reliability risks, performance bottlenecks, and capacity constraints.
- Contribute to responsible and governed use of AI within DevOps workflows.

Qualifications
- 5+ years of experience in Cloud Engineering, DevOps, or Site Reliability Engineering.
- Strong experience with AWS services (compute, networking, IAM, storage).
- Experience with Infrastructure as Code tools (Terraform preferred).
- Hands-on experience with CI/CD tools (GitLab CI, GitHub Actions, Jenkins).
- Experience with containerization and orchestration (Docker, Kubernetes).
- Strong scripting skills (Python, Bash, or PowerShell).
- Experience with monitoring/observability tools (Prometheus, Grafana, ELK, or similar).
- Strong Linux systems knowledge.
- Understanding of networking fundamentals (DNS, load balancing, VPCs).
- Familiarity with DevSecOps principles.

Requirements
- Experience using AI tools such as GitHub Copilot, Claude, or similar.
- Knowledge of AI-assisted automation, prompt engineering, or AIOps concepts.
- Certifications in AWS, Kubernetes, or DevOps disciplines.
- Experience with cost optimization and FinOps practices.

Benefits
- Comprehensive health coverage with multiple medical plan options, all covering 100% of in-network preventive care.
- Employer-funded Health Savings Account (HSA): up to $1,000 annually for family coverage.
- Dental & Vision plans with 100% coverage for routine dental care and $250 vision frame allowance, plus employee-only vision premiums at $0.
- 401(k) with immediate vesting and a 50% company match up to 6% of contributions.
- Generous paid parental leave, adoption assistance, and fertility benefits.
- Flexible PTO, paid holidays, a strong culture of work-life balance and Flex Fridays in the summer.
- Mental health & wellbeing support, including Employee Assistance Program (EAP), movement and wellness resources.
- Rewards and recognition programs that celebrate employees through peer recognition, awards, and quarterly recognition initiatives.

Company Description
- Join a company redefining how benefits work.
- Our platform powers some of the most complex benefits programs in the market.
- Recognized as a top workplace, PlanSource has earned multiple Great Place to Work certifications and numerous awards.
- At PlanSource, career growth doesn’t happen by accident.
- Our culture is rooted in connection, inclusion, and shared success.
More DevOps Engineer Jobs
Role Description
Stack AV Site Reliability Engineers are responsible for enabling and ensuring our production systems meet their service-level objectives. Through the implementation of centralized observability and automation, the SRE team constantly ensures the health, reliability, scalability, and performance of Stack AV’s infrastructure. Members of the team are expected to contribute to a culture of continuous learning, provide consultation on architecting for high-availability, and ultimately drive the uptime and performance of our systems.

Responsibilities
- Monitor and maintain mission-critical production services to ensure maximum uptime.
- Design and implement scalable distributed systems to facilitate the development of self-driving vehicles.
- Design and implement an incident management framework and build a culture of blameless postmortems and continuous learning.
- Scale the reliability and velocity of our systems and processes through increased automation.
- Document actions to build a comprehensive library of runbooks, which will act as a knowledge base and foundation for automation.
- Participate in an on-call rotation to uphold the SLOs and SLAs of production services.

Qualifications
- Expertise in at least one scripting language (e.g. Bash, Python).
- Fundamental understanding of Linux operating system internals, TCP/IP networking, and storage subsystems.
- Experience scaling and securing services in the cloud (AWS, GCP) or cloud native environments.
- Experience using infrastructure-as-code principles to automate the creation of infrastructure resources (e.g. Terraform, CloudFormation).
- Understanding of engineering design limitations and ability to provide guidance to teams to scale their services to achieve desired performance within budget.
- Strong experience implementing and debugging cloud native and open source tools such as Kubernetes, etcd, Prometheus, OpenTelemetry, and Istio.
- Strong communication skills and the ability to work effectively in a diverse and distributed team.

Company Description
Stack is developing revolutionary AI and advanced autonomous systems designed to enhance safety, reliability, and efficiency of modern operations. Stack's autonomous technology incorporates cutting-edge advancements in artificial intelligence, robotics, machine learning, and cloud technologies, empowering us to create innovative solutions that address the needs and challenges of the dynamic trucking transportation industry. With decades of experience creating and deploying real world systems for demanding environments, the Stack team is dedicated to developing an autonomous solution ecosystem tailored to the trucking industry's unique demands.
Staff Site Reliability Engineer – Volcano
Kong Inc.
Kong Inc. is a cloud connectivity company founded in 2017 to create software products that power connections. Well-known as the creator of Kong, a widely adopte
• Own reliability for Volcano end-to-end: Define and drive SLOs, error budgets, and incident response practices for all Volcano services (edge deployments, managed Postgres, auth, realtime, storage, and the control plane).
• Architect the platform's infrastructure: Design and build the multi-region Kubernetes infrastructure, networking, and data plane that powers Volcano's edge deployment pipeline and backend-as-a-service capabilities.
• Build the GitOps and CI/CD backbone: Establish deployment automation, canary pipelines, and preview environment provisioning using ArgoCD, Helm, and Terraform/Terragrunt, setting patterns the broader team will follow.
• Scale managed data services: Design, operate, and harden multi-tenant PostgreSQL clusters, Redis caching layers, and object storage, with a focus on data isolation, performance, and disaster recovery.
• Drive observability from day one: Instrument every Volcano service with meaningful SLIs; build dashboards, alerts, and runbooks using Datadog, Prometheus, and Grafana before services go live, not after incidents.
• Lead cross-functional reliability work: Collaborate with the OCTO team, product engineering, and security to bake reliability and compliance into Volcano's architecture, not bolt it on later.
• Set SRE culture and standards: Mentor engineers across Volcano's contributing teams on reliability principles; lead postmortems, define on-call practices, and build a blameless engineering culture.
• Evaluate and adopt emerging technologies: Given Volcano's greenfield nature, evaluate and make architectural decisions on edge runtimes, serverless compute, vector databases, and AI-native infrastructure components.
DevOps Engineer
CivicActions
CivicActions is a leading development, design, and strategy organization founded in 2004. It serves clients from nonprofit organizations to government agencies
Role Description
This position will join our cross-functional and highly collaborative team developing the next generation of digital services, using modern technologies and practices. This position is remote (work from home) and requires a federal background investigation and US residence for 3 of the last 5 years.
- Break down complex problems into understandable and iterative solutions
- Infrastructure-as-code development and operations on Kubernetes environments using Docker and Helm
- Familiarity with AWS services including EKS, RDS, S3, CloudWatch and managing infrastructure using Terraform
- Continuous integration & continuous deployment with tools such as GitLab CI, GitHub Actions or Jenkins
- Create and maintain documentation, timely and detailed ticket updates and communications around work
- Planning and implementing migration of systems and applications between hosts with minimal downtime
- Can work both collaboratively and solo, with experience navigating complex troubleshooting scenarios

Qualifications
- At least six years of DevOps, SRE, IT, sysadmin, security, developer or other relevant experience
- Site reliability engineering (SRE) and on-call rotation: must be able to respond nights and/or weekends, as necessary
- Experience with Infrastructure-as-code development and operations on Kubernetes environments using Docker and Helm
- Familiarity with AWS services including EKS, RDS, S3, CloudWatch and managing infrastructure using Terraform
- Experience with continuous integration & continuous deployment
- Experience working in Agile and cross-functional teams (with users, developers, product managers, security and compliance)

Requirements
- Nice to have: Team leadership and/or cloud architecture experience
- Experience working with distributed teams
- Experience with Lagoon, Ansible, GNU/Linux, Apache, PHP and/or Drupal configuration
- Previous federal background investigation

Benefits
- Fully remote work (always)
- Comprehensive medical, dental, vision, life, and disability coverage for employees, with company contributions toward dependent coverage
- 401(k) with a 3% company contribution
- Flexible time off policy
- 12 weeks paid parental leave
- Annual professional development stipend, $1,200
- Annual technology stipend, $820
- Employee growth plans, appreciation programs, and company summits to support connection and career development
SRE Engineer
Deutsche Telekom IT Solutions
As Hungary’s most attractive employer in 2025 (according to Randstad’s representative survey), Deutsche Telekom IT Solutions is a subsidiary of the Deutsche Telekom Group. The company provides a wide portfolio of IT and telecommunications services with more than 5300 employees. We have hundreds of large customers, including corporations in Germany and in other European countries. DT-ITS received the Best in Educational Cooperation award from HIPA in 2019 and was acknowledged as the Most Ethical Multinational Company in 2019. The company continuously develops its four sites in Budapest, Debrecen, Pécs and Szeged and is looking for skilled IT professionals to join its team.
Role Description
We are seeking a cloud engineer with a strong learning mindset to help build and operate our Google Cloud Platform (GCP) environment. You will work with senior engineers and architects to operate and implement infrastructure-as-code, networking, security guardrails, automation, and monitoring. This role suits someone with foundational, possibly theoretical, cloud knowledge who is eager to learn quickly and grow into a cloud engineering/architecture path. The role sits within a Site Reliability Engineering (SRE) team.
- Work with an agile mindset and according to agile methodology, collaborating closely within the team, with other teams in the Domain and Deutsche Telekom, and with software engineers using the cloud platform.
- Approach work with a DevOps and continuous improvement mindset.
- Contribute to the implementation of a reliable, scalable GCP platform under guidance from senior engineers.
- Help design and maintain the landing zone in line with security and privacy guidelines.
- Implement infrastructure-as-code (e.g., Terraform) for repeatable environments with code reviews and mentorship.
- Assist with configuring networking components (load balancing, troubleshooting, DNS config, interconnect, VPCs, routing, VPN, distributed networks and how to integrate networks with cloud services) following established patterns.
- Set up monitoring, logging, dashboards, and assist in alert tuning and runbook updates.
- Automate routine platform tasks (CI/CD pipelines, scripts, tooling).
- Maintain clear documentation and contribute to policies, standards, and guidelines.
- Maintain current technical knowledge and make recommendations to help the team, hub, and company excel.

Qualifications
- English language proficiency (team and hub language is English).
- 2 years of experience in cloud/platform/DevOps/SRE roles, or equivalent.
- Familiarity with at least one major cloud (GCP preferred): core services like compute, storage, IAM, and basic networking concepts.
- Proficiency in one programming language (Python, Go, Java, or Node.js) and shell scripting.
- Understanding of version control with Git and CI/CD fundamentals.
- Exposure to infrastructure-as-code concepts (Terraform or similar).
- Knowledge of containers (Docker) and microservices concepts.
- You are curious and interested in the latest technologies, especially cloud, agile methods, and DevOps.
- You are willing to enter unknown territory, make mistakes, and learn from them together.
- You enjoy varied topics in an interdisciplinary team and have high intrinsic motivation.
- You can present and communicate ideas (e.g., about architecture) in a visual form.

Requirements
- Hands-on labs or projects on GCP, AWS, or Azure.
- Intro-level certifications or in progress (e.g., Google Associate Cloud Engineer, AWS Solutions Architect – Associate, Azure Fundamentals).
- Basics of networking (HTTP, TLS, load balancing), databases (SQL/NoSQL), and platform security/IAM.
- Exposure to monitoring/logging tools (Cloud Monitoring/Logging, Prometheus, Grafana).

Benefits
- Dedicated mentorship from senior cloud architects/platform engineers.
- Learning time and budget for certifications (target: Google Associate Cloud Engineer within 6–12 months).
- Opportunities to present mini-architectures/POCs and contribute to platform standards.

Additional Information
- First months' success indicators:
- Ship reviewed Terraform code and CI/CD pipelines for production environments.
- Implement monitored, documented configurations aligned with security guidelines.
- Complete agreed GCP learning paths and attain at least one entry-level certification.
- Please be informed that our remote working possibility is only available within Hungary due to European taxation regulation.


