Valid, the evolution of trust.
DevOps Engineering Analyst III
Location
Brazil
Posted
2 days ago
Salary
0
Seniority
Senior
Job Description
Valid
- Administer and optimize Linux environments at an advanced level.
- Manage Java application servers.
- Implement and maintain Kubernetes clusters with a focus on scalability and resilience.
- Maintain a strong focus on environment security.
- Configure and monitor web servers and load balancers (Apache, NGINX, HAProxy).
- Collaborate with cross-functional teams to ensure system reliability and drive innovation.
Job Requirements
- Bachelor's degree in a related field
- Advanced Linux administration
- Java application servers
- Kubernetes
- Apache, NGINX, and HAProxy
- Proxmox
- LDAP
- Python
- Ansible
- Availability to work in São Paulo or Rio de Janeiro
- Preferred: VSCode with AI features
- Service automation
- CI/CD
- GCP and AWS cloud
Benefits
- Medical insurance
- Dental plan
- iFood benefits
- Wellhub
- Transportation allowance (Vale-Transporte)
- Childcare assistance
- Profit-sharing (PLR)
- Life insurance
- Remote work model
- Day off
More DevOps Engineer Jobs
- Operate, administer, and continuously optimize our internal analytics platforms.
- Automate repetitive operational tasks and refine the workflows around them.
- Build and maintain observability, monitoring, and security mechanisms that keep services healthy.
- Troubleshoot end-to-end across Linux systems, databases, OCI services, CI/CD pipelines, APIs, dashboards, and data refresh workflows.
- Partner with technical and business teams to keep services reliable and performant.
- Help shape the scalability, stability, and resilience of our infrastructure.
DevOps Generalist
Digistore24
USA
A full-service vendor & affiliate platform with one of the world's largest affiliate marketplaces. #MoreSalesLessWork
Role Description
Are you an experienced SRE or DevOps engineer? Do you want the freedom to work remotely and to grow in the new field of site reliability at an internationally successful software and education company? Then take our reliability to the next level as part of our Site Reliability Engineering team :)
Your new dream job includes:
- Automation and Infrastructure as Code (IaC): Automate repetitive tasks, deployments, and system management to reduce human error and improve efficiency.
- Reliability and Performance Optimization: Continuously improve system uptime by identifying bottlenecks and optimizing system architecture.
- Capacity Planning and Scaling: Assess and predict system resource requirements (CPU, memory, storage) to ensure infrastructure can scale with increasing demand.
- System Monitoring and Incident Response: Monitor system performance, uptime, and reliability using tools like Prometheus, Grafana, or Elasticsearch.
- Incident Postmortems and Continuous Improvement: Conduct root cause analysis (RCA) after incidents to identify what went wrong and how to prevent similar issues in the future.
Qualifications
- Communication Mastery: Communicate precisely and in a recipient-friendly manner, defusing potential conflicts with sensitivity.
- Collaboration Wizardry: Collaborate with developers, stakeholders, and operations to get everyone on the same page.
- Automation Sorcery: Promote automation to save time and reduce errors, implementing tools that improve productivity.
- Problem-Solving Genius: Dive deep into problems, identify root causes, and come up with solutions.
- Self-organization: Thrive on autonomy and excel at organizing and structuring complex projects.
Requirements
- Tech stack: Kubernetes / container technology, CI/CD (GitHub Workflows, Helm, Kustomize), cloud services (preferably Google).
- Excellent spelling and grammar in German.
- PHP language experience would be a plus.
Benefits
- Play a crucial role in shaping cutting-edge projects in a collaborative work environment.
- Work in coworking spaces or from home, ensuring uninterrupted internet access.
- Regular further education opportunities.
- Stability of an extremely successful German high-tech company.
- Outcome-focused teams and a culture of direct feedback.
- Modern equipment: ThinkPad or MacBook.
- International, collaborative team with strong cohesion.
- Spectacular team events in various European countries.
- Autonomy from day one.
- Contribution to the retirement scheme.
- Work in your team on a first-name basis, without a dress code.
- Flexible working hours from Monday to Friday (core working hours from 10 AM to 4 PM).
- Support and improve many hybrid production infrastructure environments for more than 15 development teams handling 100+ products, 10K+ domains, and billions of hits per day.
- Help developers solve complex technical problems across multiple stacks and environments.
- Architect and plan improvements to a multi-datacenter development environment.
- Participate in design decisions, including new technology research and prototyping.
- Advocate for and help development teams migrate their services to more automated and elastic infrastructures leveraging best-of-breed technology (Cloud, Kubernetes, Serverless, etc.).
- Document processes and monitor performance metrics for all our environments and services.
- Promote and implement proper CI/CD strategies and practices.
- Help develop and improve critical cloud-based shared services serving many products.
- Support and expand the coverage of our Infrastructure as Code (IaC) setup.
- Mentor junior DevOps engineers as they gain experience and learn to follow best practices.
- Improve the deployment experience for our new system.
- Reduce operational bottlenecks that slow down engineering and feature delivery.
- Strengthen our AWS production setup, currently based on ECS and containers.
- Improve our GitHub Actions CI/CD workflows.
- Work with Terraform / OpenTofu to make infrastructure safer, clearer, and easier to change.
- Improve production debugging across AWS, containers, networking, Linux, and application-level issues.
- Improve our observability across the three pillars: metrics, logs, and traces.
- Create or improve runbooks, repo instructions, service maps, deployment guides, and operational documentation.
- Introduce agentic engineering workflows that help engineers diagnose issues, propose fixes, and validate changes before they reach production.
- Design safe guardrails for agent-assisted work: permissions, approval gates, auditability, sandboxing, rollback procedures, and human review.




