Meaningful classroom learning for every student.
Senior Site Reliability Engineer
Location
Argentina
Posted
24 days ago
Seniority
Senior
Job Description
Senior Site Reliability Engineer
Newsela
- As a Senior Site Reliability Contractor, you will code infrastructure automation with Terraform and GitHub Actions CI/CD.
- Focus on improving our monitoring using Datadog, Sentry, and CloudWatch.
- Join an on-call rotation to respond to incidents that impact Newsela.com availability and provide support for developers during internal and external incidents.
- Maintain and assist in extending our infrastructure with Terraform, GitHub Actions CI/CD, Prefect, and AWS services.
- Build monitoring that alerts on symptoms rather than outages using Datadog, Sentry, and CloudWatch.
- Look for ways to turn repeatable manual actions into automations to reduce on-call toil.
- Improve operational processes (such as deployments, releases, and migrations) to make them run seamlessly with fault tolerance in mind.
- Design, build, and maintain core cloud infrastructure on AWS and GCP that enables scaling to support thousands of concurrent users.
- Debug production issues across services and levels of the stack.
- Provide infrastructure and architectural planning support as an embedded team member within a domain of Newsela’s application developers.
- Plan the growth of Newsela’s infrastructure.
Job Requirements
- Infrastructure as code: use Terraform and GitHub CI/CD for automation, containerize our environments (Docker, ECS), and leverage cloud technologies to meet our goals
- Systems: manage, configure, and troubleshoot operating system issues, storage (block and object), and networking (VPCs, proxies, and CDNs), and administer high-availability datastores (MySQL, Postgres, Neo4j) and Redis clusters
- Monitoring and instrumentation: implement metrics in Datadog, Sentry, log management and related systems, and Slack/JIRA integrations
- Engineering practices: availability, reliability and scalability, as well as disaster recovery
- Work in a variety of languages: Shell, IaC, Python, SQL
- Planning: familiarity with agile methodologies; use epics, issues to drive projects
- Organization: personal and team workload organization, OKR leadership
- Management: able to self-organize and accomplish tasks asynchronously
- Leading and contributing to scope and designs for issues, epics, and OKRs
- Contributing to Newsela architecture diagrams, process diagrams and runbook documentation
- Completing Root Cause Analysis (RCA) investigations and performing readiness reviews
- Improving team practices through code reviews, handoffs of work and incidents
- Knowledge sharing, mentoring within SRE and developer teams
- Self-awareness, handling conflict in the team, providing and receiving feedback
- Maintaining good relationships with other engineering teams at Newsela that help improve the product
- Accountability: willing to proactively step in and do the right thing while providing candid and constructive feedback.
Benefits
- Comprehensive medical benefits with employer contribution to premiums and to HSA accounts. Additional benefits such as gym reimbursement, pet insurance, free access to the Calm app, Rocket Lawyer and more to help you stay healthy: mind, body, and soul.
- We are a fully remote company. We provide a monthly tech stipend to support your WFH needs!
- Inclusive benefits to support you and your family, including parental leave, fertility support, adoption, and more!
- Invest in your future with our 401(k) plan, which includes an employer match to help you build long-term financial security.
- Flexible PTO, paid sick time off, company holidays plus winter break (Dec 24th - Jan 1st).
- Newsela offers an annual learning and development allowance to employees to attend external training sessions, classes, workshops, conferences, and educational materials to foster professional growth within their current role and career aspirations at Newsela.
- No matter your role or department, the work you do each day helps shape the future of education and improves the lives of students and teachers.
More DevOps Engineer Jobs
Role Description
You will be part of the team that is the engine of change for transforming society and improving people's lives. That is why we develop innovative solutions that offer comprehensive proposals for architecture modernization and application management. We drive the evolution of key sectors and connect talent, technology, and business to generate real impact. At Minsait we build a better future for everyone: a sustainable, secure, and connected future.
- You will take part in projects migrating CI/CD pipelines to GitHub Actions.
- You will design and implement optimized continuous integration and delivery workflows.
- You will manage container orchestration with Kubernetes.
- You will collaborate with various technical and development teams to define, build, and maintain efficient DevOps architectures.
- You will contribute technical vision to standardize and improve deployment processes across different client environments.
Qualifications
- Advanced experience in a DevOps role, especially in CI/CD environments.
- Mastery of GitHub Actions, both in configuration and in building complex pipelines.
- Expert knowledge of Kubernetes and its management in real projects.
- Technical vision, adaptability, and a collaborative way of working.
- Experience migrating from other CI/CD orchestrators (such as Jenkins, GitLab CI…) is a plus.
Benefits
- Stability and future: long-term projects at a leading technology company with more than 50,000 professionals, with financial security.
- Innovative, high-reach projects: you will work with cutting-edge technologies, with impact both nationally and internationally.
- Close and transparent environment: you will enjoy direct, fluid communication with managers and colleagues in a collaborative, open setting.
- Autonomy and flexibility: you will be free to organize your work, with a genuine work-life balance adapted to your pace.
- A career plan tailored to you: designed to drive your professional growth and development.
- Continuous training at Open University and Udemy for Business (more than 6,000 courses to specialize in!).
- Exclusive discounts for your well-being: enjoy perks at gyms, restaurants, shops, leisure, and much more as an Indra employee.
- Competitive pay and flexible compensation plans suited to your needs.
Selection Process
- Profile review: We evaluate your experience and skills to determine whether you fit what we are looking for.
- First contact (5-10 min): If you receive a call from an unknown number, it's our team! It will be a quick conversation to get to know you and answer any questions.
- Technical interview: You will meet the team, who will explain the project and the day-to-day tasks. We will also explore your technical knowledge. In addition, we will give you short psychological-competency and English tests (if necessary).
- Interview with the talent attraction team: We want you to get to know us better as a company (values, flexibility, career model...) so that both you and our talent team can assess whether there is a match.
- Offer and welcome: If all goes well, you join our team and we begin this new stage!
Site Reliability Engineer III
Rent the Runway
Since its launch in 2009, Rent the Runway has championed a new market within the fashion industry. The company offers online clothing and accessories rentals for women across all 5
Title: Site Reliability Engineer III
Location: Galway, Ireland
Job Description:
About Rent the Runway
Founded in 2009, Rent the Runway is disrupting the trillion-dollar fashion industry and changing the way women get dressed through the Closet in the Cloud, the world’s first and largest shared designer closet. RTR’s mission has remained the same since its founding: powering women to feel their best every day. Through RTR, customers can subscribe, rent items a-la-carte and shop resale from hundreds of designer brands. The Closet in the Cloud offers a wide assortment of millions of items for every occasion, from evening wear and accessories to ready-to-wear, workwear, denim, casual, maternity, outerwear, blouses, knitwear, loungewear, jewelry, handbags, activewear, ski wear, home goods and kidswear. RTR has built a two-sided discovery engine, which connects deeply engaged customers and differentiated brand partners on a powerful platform built around its brand, data, logistics and technology. Under CEO and Co-Founder Jennifer Hyman’s leadership, RTR has been named to CNBC’s “Disruptor 50” five times in ten years, and has been placed on Fast Company’s Most Innovative Companies list four times, while Hyman herself has been named to the “TIME 100: Most Influential People in the World” and as one of People Magazine’s “Women Changing the World.”
Galway Office
Rent The Runway established its European Technology Hub in Galway in April 2019. Based in the historic Claddagh area of the city, the growing team in Galway tackles core technology challenges and influences the next generation of services critical to Rent The Runway’s success and continued growth. The Galway office is Rent the Runway's first international office outside the US and enables the company to significantly expand its Software Engineering, Product Development, Machine Learning Engineering and Data Science footprint. Rent The Runway’s Galway-based employees have the opportunity to grow their careers across several roles and career paths in Technology.
About the Team:
Our Platform Engineering team is smart, pragmatic, and entrepreneurial. We are reliability-focused and relentlessly passionate about making the closet-in-the-cloud a reality for our customers. We drive the operational capability of creating and advocating best practices to support largely distributed, fault-tolerant systems in the cloud that serve our customers every day. We practice continuous improvement and process management techniques to put quality into everything we do. We cross-functionally service the Rent the Runway business and support multiple departments across IT, Engineering, Product, Security, Compliance and the Business.
About the Job:
As a Site Reliability Engineer (SRE) you will have the opportunity to contribute to technology initiatives in the realm of cloud infrastructure, software delivery and observability. You will be responsible for building and developing tooling, policies, and processes to advance Rent The Runway to higher levels of scale and performance. You will have the opportunity to lead assigned projects and be responsible for the overall delivery of these initiatives. You will be part of a high-impact engagement with the Platform Engineering team, delivering operational excellence through system automation, self-service and developer tooling that empowers the entire organisation to deliver exceptional results for our customers.
What You’ll Do:
- Utilise technologies and languages like Terraform, Helm, Python, Go, Container Orchestration services including Docker and Kubernetes, and a variety of GCP and services to drive service reliability.
- Implement software development practices to build observability, alerting, tracing, automation, and self-healing capabilities to maintain the highest levels of platform availability.
- Coordinate end to end across platforms: support, identify, respond to, and report issues, then promptly escalate to the respective teams for remediation.
- Develop maintenance and operations automation through CI/CD.
About You:
- Passion for CI/CD: Demonstrated enthusiasm for developing and improving Continuous Integration/Continuous Deployment processes.
- Orchestration Technology: 2 years of hands-on experience with orchestration tools such as Kubernetes and/or Helm.
- Coding and Scripting: Proficiency in Terraform, Ansible, or Helm, with an understanding of CI/CD tools like GitHub, GitLab, and Artifactory.
- Monitoring Solutions Expertise: Practical experience with monitoring, alerting, and logging tools, including Splunk and GCP Monitoring.
- Production Environment Support: 2 years of experience in maintaining production environments across cloud platforms like GCP, AWS, or Azure.
- Software Development: Some experience in developing and delivering products using programming languages such as Bash, Python, Golang, or Java is desirable.
- System Optimisation: Track record of contributions to enhance existing systems, building robust infrastructure, and automating processes to reduce workload.
- Agile Methodology: Experience working within Agile teams, adhering to sprint cadences and delivery timelines.
- Problem-Solving Skills: Ability to effectively triage issues and conduct root-cause analyses when necessary.
- Team Collaboration: Strong team player with the ability to work collaboratively within diverse groups.
- On-Call Duties: Willingness to participate in an on-call rotation, troubleshoot production issues, perform Root Cause Analyses, and share insights with the Engineering and Operations teams.
Benefits:
At Rent the Runway, we’re committed to the happiness and wellbeing of our employees, and aim to create a workplace that fosters both personal and professional growth. Our inclusive benefits include, but are not limited to:
- Generous Paid Time Off including annual leave, paid bereavement, and family sick leave - every employee needs time to take care of themselves and their family.
- Universal Paid Parental Leave for both parents + flexible return to work program - because we know your newest family member(s) deserve your undivided attention.
- Paid Sabbatical after 5 years of continuous service - unplug, recharge, and have some fun.
- Competitive Stakeholder Pension - taking care of your future.
- Comprehensive health, dental care and dependents care from day 1 of employment - your health comes first and we’ve got you covered.
- Company-wide events and outings - our team spirit is no joke - we know how to have fun!
- Hybrid Work - This is a hybrid role based in our Galway, Ireland, office 2-3 days per week.
Rent the Runway is an equal opportunity employer. In accordance with applicable law, we prohibit discrimination against any applicant or employee on any legally-recognised basis, including, but not limited to: gender, marital status, family status, age, disability, sexual orientation, race, religion, and membership of the Traveller community.
Senior DevOps, Platform Engineer
Kaseware
One platform for everything from case management to case closed.
- Design, build, and operate cloud infrastructure across Azure and AWS commercial environments supporting Kaseware’s customer deployments
- Administer and operate Kubernetes clusters (AKS, EKS), including upgrades, networking, RBAC, storage, and platform-level tooling
- Author, modify, and maintain Infrastructure as Code using Terraform, including modules, state management, and multi-environment deployment patterns
- Implement and operate GitOps continuous delivery using ArgoCD; build and maintain workflows with Argo Workflows where applicable
- Author and maintain Helm charts: templating, dependencies, and release management
- Build, maintain, and enhance CI/CD pipelines and automation across the development and deployment lifecycle
- Instrument, monitor, and troubleshoot cloud and Kubernetes workloads using Datadog and other observability tooling (Prometheus, Grafana, Elastic stack, Fluentd, Istio, etc.)
- Support customer installations and consult on deployment topology for both cloud and on-premises environments
- Administer development environments and developer tooling that keep engineering teams productive
- Participate in production incident response and on-call rotations across distributed time zones, supporting any Kaseware system other than U.S. federal government protected environments
- Contribute new ideas, tools, and patterns to the Kaseware technical stack; we encourage every team member to innovate
Intermediate Site Reliability Engineer
Dev.Pro
Software Development Partner. Result-driven. Quality-obsessed.
- Maintain the company’s on-premises and cloud server infrastructure
- Manage, monitor, and optimize network systems, websites, and related services
- Automate CI/CD processes and infrastructure with Ansible and Terraform
- Monitor systems, troubleshoot issues, and resolve alerts from monitoring tools to address vulnerabilities
- Create and maintain clear and comprehensive documentation
- Collaborate with internal teams (e.g., process automation, AI) to support infrastructure, deployment, and automation tasks
- Provide similar support with infrastructure, deployment, and automation tasks for client projects
- Participate in daytime shifts on weekdays, assisting clients with any monitoring alerts or issues




