Job Closed
This listing is no longer active.
Creating Digital Leaders. Digital Transformation Consultancy Services and Solutions
Senior AWS DevOps Engineer
Location
Romania
Posted
117 days ago
Salary
0
Seniority
Senior
Job Description
Xebia
- building and supporting the tools, processes and infrastructure empowering the faster delivery and scaling of software iterations
- ensuring availability, reliability and scalability of application infrastructure
- building and supporting continuous integration/delivery and release tools
- ensuring the right metrics are collected and monitored
Job Requirements
- 5+ years of experience working with DevOps practices and Continuous Delivery
- practical knowledge of AWS services, infrastructure and networking
- solid experience with Kubernetes (ideally EKS on AWS) and container orchestration
- Python knowledge
- hands-on with GitOps practices, preferably with ArgoCD
- strong skills in Terraform and Helm
- proficiency in Bash and PowerShell scripting
- experience with CI/CD pipelines and tooling (GitLab CI/CD, GitHub Actions, or similar)
- experience with monitoring, observability, and logging tools, such as Prometheus, Grafana, AppDynamics, and OpenSearch
- security awareness (OWASP, encryption, secrets management)
- very communicative and collaborative, with a strong sense of ownership
- upper intermediate/advanced English (B2/C1)
Benefits
- professional development budgets — for both tech and soft skills
- culture that actively supports your growth
- community support
- meetups (Software Talks, Data Tech Talks)
More DevOps Engineer Jobs
- Monitor systems performance and work with the team to identify opportunities to increase reliability
- Design a framework that allows for rolling releases within hours of approval from QA to production
- Standardize automation for creating ephemeral testing servers for both internal and external customers
- Review deployment strategy and create a plan to standardize configuration in a single repository
- Guide and mentor the team on a comprehensive infrastructure as code (IaC) strategy
- Reduce MTTR to less than 1 hour for all deployment-related incidents
- Automate 90% of the team's repetitive work so they can focus on high-value engineering details
- Own the hiring, performance management, and professional development of the DevOps engineering team

- Oversee all cloud infrastructure and resources, including provisioning, performing regular patch management, and proactive capacity planning
- Establish comprehensive system observability and maintain alerting infrastructure; serve as the escalation point for major incidents, drive resolution, and champion thorough Root Cause Analysis (RCA)
- Define and maintain a robust security posture by enforcing Identity & Access Management (IAM), completing security audits, ensuring data encryption, and managing audit logs for regulatory compliance
- Actively track cloud spend against budgets, direct the team in performing right-sizing and waste elimination, and optimize rates through reserved instances and savings plans (FinOps strategy)
- Direct the implementation and regular testing of comprehensive disaster recovery and business continuity plans, including backup management and maintaining a High Availability (HA) architecture across multiple zones

- Own the health, performance, and availability of Air's PostgreSQL Aurora infrastructure.
- Proactively optimize database parameters, indexes, and query patterns to maintain sub-100ms p95 response times.
- Uplevel migration practices and tooling to ensure zero-downtime schema changes as the platform scales.
- Establish and maintain comprehensive backup, recovery, and disaster recovery procedures with documented RTO/RPO targets.
- Partner with backend engineers to implement database best practices in application code (connection pooling, query optimization, caching strategies).
- Develop multi-quarter roadmap to scale Air's database infrastructure to support 10x growth in asset volume and user activity.
- Collaborate with backend engineers and product leadership to model data growth patterns and anticipate scaling inflection points.
- Evaluate and implement horizontal scaling strategies (read replicas, sharding, partitioning) aligned with business needs.
- Continuously assess AWS Aurora capabilities, PostgreSQL ecosystem innovations, and emerging database technologies for strategic advantage.
- Design and implement database architecture that supports Air's AI-powered features and real-time creative workflows.
- Create comprehensive monitoring, alerting, and reporting systems to maintain database reliability and inform data-driven infrastructure decisions.
- Implement detailed instrumentation for database performance metrics (query latency, connection pool utilization, replication lag, disk I/O).
- Build automated alerting for anomalies in query performance, connection patterns, and resource utilization.
- Create executive-level dashboards showing database health trends, capacity utilization, and cost efficiency.
- Develop regular database health review cadence with engineering leadership to surface insights and drive continuous improvement.




