B2B marketing platform built for the modern data stack
SDE2, DevOps
Location
India
Posted
20 days ago
Salary
0
Seniority
Senior
Job Description
Inflection.io
• Manage and optimize production infrastructure on AWS, ensuring scalability and reliability.
• Deploy and orchestrate containerized applications using Kubernetes.
• Implement and maintain infrastructure as code (IaC) using Terraform.
• Set up and manage CI/CD pipelines using tools like Jenkins or GitHub Actions to streamline deployment processes.
• Troubleshoot and resolve infrastructure issues to ensure high availability and performance.
• Collaborate with cross-functional teams to define technical requirements and deliver solutions.
Job Requirements
- 3+ years of experience with AWS, including practical exposure to its services in production environments.
- Demonstrated expertise in Kubernetes for container orchestration.
- Proficiency in using Terraform for managing infrastructure as code.
- Exposure to at least one CI/CD tool such as Jenkins or GitHub Actions.
- Nice-to-have: Experience managing queueing systems such as SQS or Kafka.
- Comfortable working US night shifts from India to support US teams.
More DevOps Engineer Jobs
• Drive long-term networking strategy in the public cloud
• Automate networking-related tasks and build self-service tools
• Maintain an enterprise-scale SASE solution
• Run and maintain the network: debug, troubleshoot, tune, support
• Participate in organizational objective planning sessions
• Manage relationships with third-party vendors
• Cross-train and mentor teammates

• Design, implement and support operational and reliability aspects of a large-scale Observability & Telemetry collection platform with a focus on performance at scale, real-time monitoring, logging and alerting
• Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation and refinement
• Support services before they go live through activities such as system design consulting, developing software tools, platforms and frameworks, capacity management and launch reviews
• Maintain services once they are live by measuring and monitoring availability, latency and overall system health
• Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity
• Practice sustainable incident response and blameless postmortems
• Be part of an on-call rotation to support production systems

• Design and document a hub-and-spoke network topology (SmartNet as hub, AF bases as spokes) connecting to AWS commercial
• Set up site-to-site VPN tunnels / secure connections for hardware panels at each facility (2 ports/connections per site)
• Configure firewall rules and port settings for bidirectional communication between AWS and on-prem panels
• Work side-by-side with Steven (their one internal tech resource) to build, document, and make the process repeatable
• Produce documentation suitable for Air Force RMF (Risk Management Framework) / ATO submission
• Build with future scalability in mind: CCTV/video management, additional servers, failover/redundancy

Role Description
We are seeking a skilled DevOps / Cloud Engineer to join the team responsible for managing the core network platform (OSS). In this role, you will design, deploy, and operate cloud and hybrid infrastructure, build and maintain CI/CD pipelines, and take ownership of running third-party vendor software in our environment. You will bridge the gap between infrastructure and application operations, ensuring our systems are scalable, secure, highly available, and cost-efficient.
- Design, deploy, and manage AWS cloud and hybrid infrastructure solutions using Infrastructure as Code (IaC) tools.
- Deploy, operate, and maintain cloud environments across multi-account and multi-region AWS architectures.
- Deploy, configure, and operate vendor-supplied software within our cloud/hybrid environment, serving as the operational owner for these applications.
- Coordinate with vendors on installation, upgrades, patching, and configuration changes, translating vendor requirements into infrastructure and deployment solutions.
- Ensure vendor applications meet our availability, performance, and security standards through our own monitoring and incident management processes.
- Own the security, availability, and reliability of infrastructure, applying best practices for IAM, encryption, and vulnerability management.
- Build, maintain, and improve CI/CD pipelines that automate testing, security scanning, and deployment workflows.
- Automate infrastructure provisioning, configuration management, and operational tasks to eliminate manual toil.
- Implement and maintain monitoring, alerting, and observability tooling to enable proactive issue detection and resolution.
- Analyze cloud performance metrics and resource utilization to continuously optimize system efficiency and control costs.
- Partner closely with internal and vendor teams to align on cloud infrastructure deployment and integration practices, bridging code and underlying infrastructure.
- Provide technical guidance and mentoring to teammates, driving engineering excellence.
Qualifications
- Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent professional experience.
- At least 3 years of hands-on experience in cloud infrastructure, DevOps, or site reliability engineering roles.
- Public Cloud Expertise (AWS, Azure or GCP): Proven hands-on experience with public cloud services including compute, networking, databases, K8s, IAM, and monitoring.
- Infrastructure as Code: Proficiency in Terraform or AWS CDK for building and managing infrastructure at scale.
- CI/CD: Demonstrated experience designing and maintaining CI/CD pipelines (e.g., GitHub Actions, GitLab CI, Jenkins, ArgoCD).
- Containerization & Orchestration: Solid experience with Docker and K8s for application deployment and management.
- Hybrid & Multi-Environment Operations: Experience operating workloads across cloud and on-premises or hybrid environments.
- Third-Party Software Operations: Experience deploying and managing vendor-provided applications in cloud environments, including coordination of upgrades, patching, and configuration.
- Databases: Hands-on experience with SQL (e.g., PostgreSQL, MySQL/RDS) and NoSQL (e.g., DynamoDB, Redis/ElastiCache) databases.
- Messaging & Queuing: Familiarity with message-driven systems such as SQS, RabbitMQ or Kafka.
- Scripting & Automation: Proficiency in at least one scripting language (Python, Bash, or similar) for automation tasks.
Requirements
- AWS or Azure/GCP certification (e.g., AWS Certified DevOps Engineer, Solutions Architect).
- Skills in remote debugging and troubleshooting distributed systems.
- Familiarity with security and compliance frameworks (e.g., SOC 2, ISO 27001).
- Experience with observability platforms (e.g., Prometheus/Grafana, ELK).
- English proficiency at B2 level or above; able to collaborate effectively with global, cross-functional teams.
- Experience in software development is a plus.
- Background in telecom, satellite, or other high-availability, mission-critical environments is a plus.
Soft Skills
- Strong problem-solving mindset with a bias toward automation and operational efficiency.
- Collaborative and communicative, comfortable working in a globally distributed team.
- Ownership mentality: take responsibility for end-to-end reliability of systems under your care.
- Adaptable and self-directed, with the ability to manage competing priorities in a fast-paced environment.
- Meticulous attention to detail in documentation, change management, and operational procedures.
Technology Stack
- Cloud: AWS (EC2, EKS, RDS, DynamoDB, ElastiCache, S3, Route53, VPC networking, IAM, CloudWatch).
- IaC: Terraform, AWS CDK.
- Containers & Orchestration: Docker, K8s.
- CI/CD: GitHub Actions, GitLab CI, Jenkins, ArgoCD (or similar).
- Messaging: SQS, Kafka.
- Scripting: Python, Bash.
- Databases: PostgreSQL, MySQL, DynamoDB, Redis.
- Monitoring & Observability: Prometheus, Grafana, CloudWatch.
- Version Control: Git.
Physical Requirements
- Ability to work in a standard office or remote home-office environment and use a computer for extended periods.
- Ability to participate in occasional after-hours incident response actions.




