Job Closed
This listing is no longer active.
We help companies achieve their goals and expand their business through technology.
NC - DevOps Engineer - 231
Location
Brazil
Posted
79 days ago
Salary
0
Seniority
Mid Level
Job Description
Thaloz
We are looking for a highly skilled DevOps Engineer with deep expertise in container orchestration and cloud infrastructure. You will be responsible for designing, deploying, and maintaining scalable, reliable, and secure infrastructure across AWS environments. You will work closely with development, security, and platform teams to accelerate software delivery and ensure operational excellence.
Responsibilities:
• Design, deploy, and manage containerized workloads using Amazon ECS (Elastic Container Service) and Amazon EKS (Elastic Kubernetes Service).
• Build and maintain CI/CD pipelines to automate software delivery workflows.
• Develop and manage Docker container images, registries (ECR), and container lifecycle best practices.
• Implement Infrastructure as Code (IaC) using tools such as Terraform, CloudFormation, or CDK.
• Monitor, troubleshoot, and optimize cloud infrastructure performance, availability, and cost.
• Enforce security best practices across containerized environments (IAM roles, network policies, secrets management).
• Collaborate with software engineers to containerize applications and migrate workloads to ECS/EKS.
• Manage Kubernetes cluster configurations, namespaces, Helm charts, and service mesh integrations.
• Define and maintain observability standards using tools like CloudWatch, Prometheus, Grafana, or Datadog.
• Participate in on-call rotations and incident response processes.
Job Requirements
- 5+ years of experience in a DevOps, Platform Engineering, or Site Reliability Engineering role.
- Advanced expertise in Amazon ECS – task definitions, services, capacity providers, Fargate & EC2 launch types.
- Advanced expertise in Amazon EKS – cluster provisioning, node groups, autoscaling, RBAC, and networking (VPC CNI, CoreDNS).
- Deep knowledge of Docker and container best practices (multi-stage builds, image optimization, security scanning).
- Strong experience with Kubernetes concepts: Deployments, StatefulSets, DaemonSets, Ingress, ConfigMaps, Secrets, HPA/VPA.
- Proficiency in Infrastructure as Code (Terraform preferred).
- Solid understanding of AWS networking (VPC, subnets, security groups, ALB/NLB, Route 53).
- Experience with CI/CD tools such as GitHub Actions, Jenkins, GitLab CI, or AWS CodePipeline.
- Strong scripting skills in Bash, Python, or similar languages.
- Familiarity with GitOps workflows (ArgoCD, Flux).
Nice to Have:
- AWS Certifications: AWS Certified DevOps Engineer – Professional, AWS Certified Solutions Architect.
- Kubernetes Certifications: CKA (Certified Kubernetes Administrator) or CKAD.
- Experience with service mesh technologies (Istio, AWS App Mesh).
- Knowledge of FinOps practices for container cost optimization.
- Experience with multi-account AWS Organizations and landing zone architectures.
- Familiarity with security tools such as Trivy, Snyk, or AWS Security Hub.
Soft Skills:
- Strong problem-solving and analytical mindset.
- Excellent communication skills – able to translate complex infrastructure topics to non-technical stakeholders.
- Proactive, self-driven, and able to work in a fast-paced, agile environment.
- Team player with a collaborative approach to cross-functional work.
More DevOps Engineer Jobs
DevSecOps Engineer
Deel
Deel is a financial services company that has developed a payroll system for remote teams, connecting localized payments and compliance in the convenience of one platform. The priv
• Develop and maintain automated security tools and processes to identify vulnerabilities, perform code analysis, monitor systems and conduct security testing. This includes integrating security scanners, static code analysis tools, and vulnerability assessment tools into the CI/CD pipeline.
• Work with infrastructure and operations teams to design and implement secure cloud infrastructure, network architecture, and deployment processes. This involves ensuring proper access controls, encryption, and monitoring are in place.
• Implement security monitoring tools and processes to proactively identify and respond to security events and anomalies. This includes log analysis, intrusion detection, and system monitoring.
• Foster collaboration and communication between development, operations, and security teams. Act as a liaison to ensure that security requirements are understood and integrated into the development process.
• Assist in compliance assessments and audits to ensure adherence to regulatory requirements and industry standards. Collaborate with auditors and provide necessary documentation and evidence of security controls.
Principal Engineer - Release Engineering
Fastly
Founded in 2001, Fastly is a privately-held internet company offering the Fastly Edge Cloud platform, a content delivery network that helps digital businesses s
Role Description
We are looking for a Principal Release Engineer to join Fastly’s Release Engineering team. The Release Engineer is responsible for the set-up, maintenance, and ongoing development of continuous build/integration and deployment infrastructure. In this role, you will create and maintain fully automated CI build processes for multiple environments, including our global edge cache fleet, internal applications, and applications hosted in AWS and GCP. The ideal candidate will care deeply about providing other engineers with a seamless release experience and have a deep understanding of what engineers care about, how they ship code, and what world-class delivery infrastructure looks like.
Responsibilities
- Design, build, and operate release tooling across building, packaging, signing, artifact management, and deploying software.
- Drive initiatives that make our engineers happier and more productive by reducing lead time for changes.
- Collaborate with development and SRE teams to develop policies, standards, guidelines, governance, and related guidance for CI/CD operations.
- Support developers with build automation, merge resolution, CI, test automation, and deployment, in line with tool usage policies and standards.
- Troubleshoot issues along the CI/CD pipeline.
- Participate in on-call support rotation.
Qualifications
- 10+ years of experience.
- Ability to excel within an "Agile" environment (e.g., user stories, sprints, iterative development, continuous integration, continuous delivery, shared ownership, test-driven development).
- Deep expertise in at least one of the following languages: Ruby, Python, Go.
- Expertise with automation tools such as Jenkins, GitHub Actions, or Dagger.
- Strong written and verbal communication skills.
- Experience with Infrastructure-as-Code frameworks such as Chef, Terraform, or Ansible.
- Familiarity with Varnish, Nginx, or other cache and proxy servers.
- Knowledge of source control management systems and configuration management (e.g., Git, GitHub) and code branching/merging strategies.
- Experience with Linux and containerization, particularly with Docker and orchestration platforms like Kubernetes.
- Experience with a cloud-based environment, particularly AWS and/or GCP. Both would be ideal!
- Good understanding of quality control and test automation in agile-based continuous integration environments.
- Experience with Omnibus and/or Debian packaging.
- Experience with artifact repositories such as Artifactory or Sonatype Nexus.
- Some experience with SQL and relational database administration (e.g., Oracle, MySQL).
- Open source license tracking, auditing, and reporting.
Requirements
- This position will require you to be available during core business hours and occasional nights and weekends as needed for on-call support.
Benefits
- We care about you. Fastly works hard to create a positive environment for our employees, and we think your life outside of work is important too.
- We offer a comprehensive benefits package designed to meet your needs. Our offerings may vary depending on the country where you work and are subject to change.
Company Description
- Fastly helps people stay better connected with the things they love.
- Fastly’s edge cloud platform enables customers to create great digital experiences quickly, securely, and reliably.
- Fastly’s customers include many of the world’s most prominent companies, including GitHub, Yelp, Paramount, and JetBlue.
- We're building a more trustworthy Internet.
Site Reliability Engineer - Observability
CluePoints
At CluePoints, we’re redefining how clinical trials are run. As the premier provider of Risk-Based Quality Management (RBQM) and Data Quality Oversight software, we harness advanced statistics, artificial intelligence, and machine learning to ensure the quality, accuracy, and integrity of clinical trial data, helping life sciences organizations bring safer, more effective treatments to patients faster.
- Ambitious, fast-growing technology scale-up.
- Dynamic and diverse international team representing more than 20 nationalities.
- Culture of collaboration, flexibility, and continuous learning.
- Values of Care, Passion, and Smart Disruption.
- Mission to create smarter ways to run efficient clinical trials and deliver AI-powered insights that improve human outcomes worldwide.
Role Description
The Site Reliability Engineer, Observability & RUM is responsible for improving end-to-end observability across our platforms and customer-facing applications, with a particular focus on frontend and Real User Monitoring (RUM). This role combines core SRE practices with ownership of monitoring, logging, tracing, alerting, and user-experience telemetry in production.
- Help evolve observability capabilities across Azure and Kubernetes environments.
- Improve incident detection and diagnosis.
- Support decisions around managed versus self-managed observability tooling.
- Partner closely with Engineering, Support, QA, and Security teams to ensure systems ship with actionable telemetry, dashboards, alerts, and operational runbooks.
Qualifications
- 5+ years of experience in Site Reliability Engineering, DevOps, Platform Engineering, or Observability Engineering roles.
- Strong hands-on experience with observability and monitoring platforms, including several of the following: Elastic, Grafana, Prometheus, OpenTelemetry, Sentry, monitoring agents, and managed APM/observability platforms.
- Experience implementing and supporting Real User Monitoring (RUM) and frontend/application observability in production environments.
- Ability to work across frontend, backend, and platform teams to improve telemetry, alerting, and incident diagnosis.
- Experience evaluating or operating managed observability platforms and understanding the trade-offs versus self-managed stacks.
Requirements
- (Nice to have) Experience supporting ML, AI, or LLM-backed services in production (RAG, LangSmith, Arize Phoenix, LangChain, LangGraph, Azure OpenAI, OpenAI, or Anthropic APIs).
Junior DevOps Engineer
Wilcore Technologies, Inc.
Elevating federal solutions by creating experiences that empower humanity.
• Help build, deploy, and manage cloud infrastructure and CI/CD pipelines in support of modern application development.
• Work across Azure environments, supporting automation, monitoring, and secure delivery practices that keep applications reliable, scalable, and resilient.



