Job Closed

This listing is no longer active.

DEUNA

Unleash the power of your e-commerce

Site Reliability Engineer

DevOps Engineer | Full Time | Remote | Senior | Team 51-200 | H1B No Sponsor

Location

Mexico

Posted

59 days ago

Salary

0

Seniority

Senior

Bachelor Degree | English | AWS | Grafana | Prometheus | Go

Job Description

  • Design, define, and maintain observability and monitoring for our AWS infrastructure.
  • Define and track SLIs, SLOs, and SLAs for critical systems.
  • Improve system uptime, latency, and fault tolerance across the platform.
  • Provide internal libraries and toolsets to developers for diagnostics and debugging.
  • Manage scaling, performance, and resilience efforts related to system reliability.
  • Collaborate with technical teams on capacity planning, load testing, and scaling policies.
  • Improve production operations by defining and evolving deployment strategies and conducting disaster recovery (DR) testing.

Job Requirements

  • Excellent communication and collaboration skills.
  • Adaptability to thrive in dynamic, fast-paced environments.
  • Strong time management and task prioritization.
  • Proficiency in English.
  • Expertise with Prometheus, Grafana, OpenTelemetry, AWS CloudWatch, or other observability tools.
  • Experience designing dashboards, alerts, and log aggregation pipelines.
  • Deep understanding of AWS services: ECS, Lambda, RDS, CodePipeline.
  • Strong proficiency in Go programming language.
  • Skilled at defining SLIs, SLOs, error budgets, and improving Mean Time to Recovery (MTTR).
  • Experience conducting failure drills (e.g., Chaos Monkey, Gremlin) to ensure system resilience.

Benefits

  • Vacation and additional PTO 🏝️
  • Remote work from anywhere 💻
  • Financial support for health insurance, internet, and cell phone line 📱🌐
  • We all own DEUNA: we offer stock options 💸
  • Learning and development platform 📚
  • Multidisciplinary, diverse and dynamic team 🧡
  • Growth and career path 🚀
