Job Closed
This listing is no longer active.
💻 We simplify access to legal information through technology
Senior Site Reliability Engineer – FinOps
Location
Brazil
Posted
75 days ago
Salary
Not specified
Seniority
Senior
Job Description
Jusbrasil
- Ensure the reliability, availability, and scalability of systems and services.
- Develop and implement monitoring and observability solutions focused on performance and cost.
- Create and maintain infrastructure as code using tools such as Terraform.
- Work closely with the Engineering Platform, SRE Partner, FP&A, and product teams.
- Help build and evolve a culture oriented toward financial efficiency (FinOps) and reliability (SRE).
- Actively contribute to the evolution of the Agentic Engineering Platform.
- Support continuous improvement initiatives for infrastructure, automation, and performance.
- Contribute to migrations of critical systems.
Job Requirements
- Experience with cloud environments, preferably GCP.
- Proficiency with observability tools and practices (Prometheus, Grafana, Metabase, Loki, Thanos, Elasticsearch, DORA metrics, etc.).
- Strong knowledge of infrastructure as code (IaC) and Terraform.
- Ability to analyze logs and distributed system performance.
- Solid knowledge of Linux and Kubernetes.
- Experience with cloud cost, pricing, and financial management.
- Knowledge of cloud resource tagging and cost governance.
- Knowledge of migrations for critical systems (databases, applications, environments).
- Data-driven mindset, viewing the infrastructure environment holistically and using data to support decisions.
Benefits
- Transforming the justice system with technology is not a trivial challenge.
- Jusbrasil positions itself as a company that values those who pursue deep mastery.
- We are building something big.
- We operate with intensity, focus, and excellence.