Smartsheet

Modern work management platform

Senior DevOps Engineer

DevOps Engineer | Full Time Remote | Senior | Team 1,001-5,000 | Since 2005 | H1B Sponsor

Location

Bulgaria

Posted

55 days ago

Salary

Seniority

Senior

Bachelor Degree | 5 yrs exp | English

AWS, Cloud, HAProxy, Kubernetes, Linux, NGINX, Node.js, Python, Terraform, Go

Job Description

• Own and evolve the edge proxy platform: Maintain, upgrade, and extend a high-performance reverse proxy, including maintaining the proxy binary and its configuration tooling, writing Go and Python automation, managing the full container image lifecycle on hardened Linux base images, and working across the broader edge layer, including CDN, WAF, and traffic management capabilities.
• Build and maintain cloud infrastructure as code: Design and implement Terraform/Terragrunt modules and live environment configurations managing EKS clusters, load balancers, IAM roles, VPC networking, ECR registries, and supporting AWS services across multiple regions including GovCloud.
• Operate Kubernetes clusters at scale: Manage multi-region, multi-cluster EKS deployments via FluxCD GitOps workflows and Helm charts, including node AMI rotation, add-on lifecycle management, and horizontal pod autoscaling.
• Build and own CI/CD pipelines: Design, maintain, and improve shared GitLab CI/CD pipeline templates used across all team repositories; build and operate alternative pipeline workflows for isolated government cloud environments.
• Automate operational toil: Build and maintain tooling for tasks such as container image patching, EKS AMI rotation, air-gapped ECR image sync to GovCloud, and automated MR creation for monthly version-bump patching cycles.
• Manage observability and on-call: Provision and maintain Datadog SLOs, monitors, and dashboards via Terraform; participate in the team's on-call rotation responding to edge proxy incidents across production and GovCloud environments.
• Support FedRAMP/GovCloud operations: Operate the GovCloud environment with its unique constraints: air-gapped image distribution, infrastructure automation in isolated networks, and alert management with compliance-aware data handling.
• Evaluate and adopt internal developer tooling: Research, prototype, and drive the adoption of internal tools that improve engineering productivity across the company, including developer portals, platform self-service capabilities, and other tooling that raises the bar for the developer experience at Smartsheet.
• Mentor and collaborate: Share knowledge across the team through code reviews, architecture discussions, and runbook authorship; foster a culture of engineering excellence and operational rigour.
• Strategically apply AI tools: Strategically apply and champion AI tools within your team's domain to improve project execution, infrastructure design, quality, and debugging, leading adoption of AI best practices.

Job Requirements

5+ years of experience in DevOps, platform engineering, or site reliability engineering.
A BS or MS in Computer Science, Engineering, or a related field, or equivalent industry experience.
Deep proficiency with Terraform and Terragrunt for managing production cloud infrastructure at scale across multiple environments and regions.
Strong Kubernetes expertise, including EKS cluster operations and Helm chart authoring.
Hands-on experience with AWS networking and container workload services: EKS, ALB/NLB, VPC, IAM, ECR, Route53, CloudWatch, and EventBridge.
Proficiency in at least one general-purpose programming language — Go or Python preferred — for building operational tooling and automation.
Solid understanding of reverse proxies, API gateways, or load balancers (NGINX, HAProxy, or equivalent).
Experience designing and maintaining CI/CD pipelines (GitLab CI preferred), including shared template libraries across multiple repositories.
Experience with container image security practices: hardened base images, vulnerability scanning, and image promotion workflows.
Strong operational instincts: comfort with on-call responsibilities, incident response, runbook authorship, and postmortems in production environments.
1 year of professional experience leveraging AI-based workflows to author, maintain, review, and deploy infrastructure or code.
Fluency in English is required.
Legally eligible to work in Bulgaria on an ongoing basis.

Benefits

Health insurance
Retirement plans
Paid time off
Flexible work arrangements
Professional development

More DevOps Engineer Jobs

AI Solutions - Agent Engineer

Marco Technologies

This is a remote-eligible position; however, Marco Technologies requires employees to reside within one of the following states: DE, FL, IA, IL, IN, KY, MD, MI, MN, MO, ME, NE, ND, NJ, PA, RI, SD, TX, WI.

DevOps Engineer | 55 days ago

Full Time Remote

Role Description

The AI Solutions Engineer is a hands-on technical role focused on building, deploying, and supporting AI-driven solutions, including AI and Digital Employee offerings, that address real-world client and internal business needs. Operating at the intersection of emerging AI technologies and practical implementation, this role is responsible for executing on defined solution architectures, developing proof-of-concept and production-ready systems, and integrating AI capabilities into scalable, supportable environments. This individual works closely with Innovation leadership, AI Architects, engineering, and customer-facing teams to bring AI solutions to life.

Responsibilities include:
- Develop and support AI-based managed services and Digital Employee offerings from prototype through production deployment.
- Engineer agentic systems with memory, tools, and orchestration.
- Help define the future Digital Workforce platform.
- Build and implement AI-driven solutions with a focus on LLMs, generative AI, AI agents, and automation platforms.
- Execute on defined solution architectures by developing, integrating, and deploying AI systems.
- Design and deliver proof-of-concept implementations to validate AI use cases.
- Collaborate with AI Architects, Innovation, engineering, and operations teams.
- Integrate AI solutions into client environments.
- Ensure AI solutions align with client business objectives and governance requirements.
- Support client engagements including technical discussions and ongoing optimization.
- Translate complex AI concepts into clear, practical guidance.
- Contribute to documentation, enablement materials, and internal training.
- Continuously monitor emerging tools, frameworks, and best practices.
- Participate in cross-functional planning sessions to align AI initiatives.

Qualifications
- Bachelor’s degree in Computer Science, Information Technology, Data Science, or a closely related discipline.
- 3–6+ years of experience in software engineering, solution implementation, automation, or related technical roles.
- Hands-on experience building or implementing artificial intelligence, machine learning, generative AI, or large language model (LLM)-based systems.
- Experience developing and integrating APIs, automation workflows, or distributed systems in production environments.
- Relevant certifications related to cloud platforms, AI/ML, or software engineering are preferred but not required.

Requirements
- Hands-on experience building and deploying AI solutions, including large language models (LLMs), agent-based systems, and automation workflows.
- Proficiency in software development and integration (e.g., Python, APIs, SDKs, or equivalent modern development tools).
- Experience working with cloud platforms and modern application architectures (cloud-native or hybrid environments).
- Ability to implement and integrate AI solutions into business processes and existing technology ecosystems.
- Strong problem-solving skills with the ability to move from concept to working solution quickly.
- Effective communication skills, with the ability to explain technical concepts to both technical and non-technical audiences.
- Proven ability to collaborate across engineering, innovation, sales, and delivery teams.
- Demonstrated curiosity and commitment to continuous learning in a rapidly evolving AI landscape.
- Ability to balance rapid experimentation with building stable, supportable solutions.

Benefits
- Pay Range: $109,855 - $175,768 annually.
- The pay range listed for this position is based on candidate's skill level, experience, relevant licenses, and educational background.
- For detailed information about our benefits, please visit our careers page at www.marconet.com/careers.
United States

$109.9K - $175.8K / year

DevOps / SRE Engineer - AI Platform

Makro PRO

Makro PRO is an exciting new digital venture by the iconic Makro. Our proud purpose is to build a technology platform that will help make business possible for restaurant owners, hotels, and independent retailers, and open the door for sellers. We welcome bold, energetic, and thoughtful people who share our belief in collaboration, diversity, excellence, and putting customers at the heart of our work.
- Clear focus
- Diverse Workplace (Our members are from around the world!)
- Non-hierarchical and agile environment
- Growth opportunity and career path

DevOps Engineer | 55 days ago

Full Time Remote

Role Description

The DevOps / SRE Engineer owns the operational substrate of an AI-native retail decisioning platform: infrastructure, CI / CD, observability, cost meter, and incident response for a system that runs production agents taking real business actions. The role builds on the enterprise Terraform standard, CI / CD spine, and FinOps tagging policy rather than reinventing parallel infrastructure. Remote candidates outside of Thailand are welcome to apply.

- Adopt the enterprise Terraform standard and module library for all platform infrastructure; author platform-specific modules where needed (agent runtime, vector DB, knowledge graph); run drift detection weekly.
- Build platform-specific CI / CD pipelines on the enterprise spine: service deploys, agent deploys, eval-gate enforcement; integrate eval gates so no agent reaches production without eval pass.
- Operate rollback orchestration with sub-15-minute recovery; quarterly game days.
- Own the platform observability stack: OpenTelemetry, Langfuse for LLM traces, custom dashboards for per-agent cost.
- Implement the per-agent cost meter end-to-end: token counts, vector queries, model inference, downstream LLM Gateway costs; surface cost data to the enterprise GenAI cost dashboard.
- Stand up the platform on-call rotation; author runbooks for every production agent and service; lead incident response with measurable corrective actions.
- Implement platform cost-tagging policy consistent with the enterprise standard (team, domain, environment, project, agent, suite, persona); report monthly to Cost Review.
- Drive cost optimisation: right-sizing, caching, model routing decisions, reserved compute.

Qualifications
- Bachelor's or Master's degree in Computer Science, Engineering, or a related discipline.
- 5+ years SRE / DevOps with production ownership.
- Terraform at scale: modules, state, drift, environment promotion.
- CI / CD for data + ML / AI services (GitLab CI / CD or comparable).
- Cloud platform (Azure preferred; AWS / GCP transferable).
- Observability: OpenTelemetry, Langfuse (or comparable LLM traces), custom dashboards.
- FinOps: tagging policies, attribution, optimisation.
- Incident response: on-call, post-mortems, runbook authorship.

Preferred Qualifications
- AI / agent platform SRE experience; cost-meter / chargeback systems built or operated.
- Multi-cloud production experience; open-source contributions to IaC / observability tooling.
- AI / ML / agent system observability instrumentation (LLM cost, agent cost, eval scores).
- Vendor certifications such as HashiCorp Terraform Associate / Professional, Azure Solutions Architect Associate, or Databricks Data Engineer Professional.

Worldwide

Senior Platform Engineer – SRE

Filigran

Uncover Threats. Take Action. Home of OpenCTI, OpenBAS and more.

DevOps Engineer | 55 days ago

Full Time Remote | Team 201-500 | Since 2022 | H1B No Sponsor

• Design, build, and operate production-grade Kubernetes clusters on bare metal and cloud
• Industrialize, automate, improve observability & monitoring
• Continue to create a culture of service delivery excellence
• Participate in on-call rotation, incident management, and post-incident reviews
• Design and drive projects around DevSecOps practices in the company

Ansible, AWS, Azure, Cloud, ElasticSearch, Google Cloud Platform, Grafana, Java, Kubernetes, Linux, Open Source, Prometheus, Python, RabbitMQ, Redis, Terraform, Go

France

DevOps Engineer

RemotePro.ph

We are a US-based IT services firm with a consistently growing and fully remote PH team.

DevOps Engineer | 55 days ago

Full Time Remote | Team 51-200 | Since 2013 | H1B No Sponsor

The best way to look at this role: you would have the main responsibility for owning our Linux systems, and for deploying to them and supporting the team that handles the maintenance and monitoring of the applications on them. This includes custom applications as well as more standard open source and proprietary applications. Our ideal candidate will also be able to support at least basic Windows DevOps needs, and more.

**RESPONSIBILITIES:**
- Plan/design, build, and define the monitoring of our Linux and Windows applications and the systems that run them.
- Implement client-requested integrations.
- Design/plan updates and fixes, and support the team in deploying them.
- Conduct root cause analysis when issues occur.
- Investigate and resolve technical issues, and create/provide resources to help the team prevent them.
- Develop scripts to automate processes and updates.
- Provide technical support and design procedures for system troubleshooting and maintenance.

Apache, Cloud, Linux, NGINX, Open Source, PHP, Python

Philippines

Job Closed
