Knock - Knockaway, Inc.

Knock, also known as Knockaway, Inc., has developed an online home-selling platform that allows homeowners to sell or trade their houses quickly and easily for

Infrastructure Engineer

Location

United States

Posted

22 days ago

Salary

0

Seniority

Mid Level

Job Description

Role Description

We're looking for an infrastructure engineer to join our small but growing platform team. The platform team at Knock is responsible for building, scaling, and maintaining the core services and infrastructure that run Knock. You will have a high degree of ownership and autonomy in improving the Knock platform, starting with our foundational infrastructure.

We’re an engineer-led team that obsesses over the reliability and availability of our service. We care deeply about building a team and culture that is inclusive and equitable for people of all backgrounds and experiences, and believe firmly that the best teams are diverse. We particularly encourage people from underrepresented communities to apply.

Last thing: you can be a great fit even if you don't perfectly match what's described below. We know there's a lot we don't know and haven't thought of yet, and we're looking for teammates who can tell us what those things are. If that's you, don't hesitate to apply and tell us about yourself!

What you’ll be doing in this role

- Adopting a Terraform-backed EKS cluster, modernizing & maintaining it for elastic scale, reliability, performance, security, etc.
- Going deep into troubleshooting Postgres performance, queues of every shape and size, and coming out the other side with a plan for scaling another 10x to 100x.
- Identifying and correcting scaling issues before they affect our customers by relying on and improving our telemetry and traces in Datadog, AWS Cloudwatch, and Honeycomb.
- Maintaining and improving upon our >99.95% uptime track record.
- Supporting our product engineering team in moving fast to deliver customer value. Improving the day-to-day developer experience through canaries, faster cycle time, blue/green deploys, etc.
- Joining on-call rotations on a schedule with the rest of the engineering team.
- This position is both high autonomy and high accountability: you will have a lot of room to work and raise our existing standards, while also communicating those changes and bringing the rest of the team along for the ride, often in the form of runbooks & internal documentation.

Qualifications

- 4+ years of experience as a DevOps engineer or similar in a startup or mid-sized company working with complex systems that operate at scale.
- Experience working in and on production Kubernetes clusters using infrastructure as code (we use Terraform, but others like Pulumi or Cloudformation are fine too).
- Experience working on complex AWS deployments (multi-account, complex VPC structure to support EKS, EKS experience).
- Experience operating and scaling different database technologies. We use Aurora Postgres, Mongo, and ClickHouse, so significant experience with at least one of these is a must.
- Some past experience or familiarity operating and scaling different queues and streams across SQS, Kinesis, Kafka or similar.
- Strong problem-solving skills with a focus on reliability, scalability, and performance.
- Strong communication skills, with the ability to work in a fully distributed, remote-first team.

A note on AI at Knock

We’re a team that has fully embraced AI tools to help us in our day-to-day. We use these tools to accelerate us, but remain clear-eyed about where they shine and where the pitfalls lie. We’re not overly prescriptive about the tools you can use, and we encourage experimentation as we embrace this new method of working. We have a collaborative culture of figuring out together what works and what doesn’t: sharing what we’ve learned, comparing notes, and iterating on our workflows as the tooling landscape evolves.

As a member of the Knock team, we expect you to be familiar with tools like Cursor, Claude Code, Codex, or similar to assist you in your job. You’ll be allowed to use these tools in some parts of your interview loop, but there will be times where we’ll ask that you refrain.

More Infrastructure Engineer Jobs

Cloud Infrastructure Specialist, GCP

Periferia IT Group

We transform projects and businesses driven by agile methodologies, taking them to the next level.

Full Time | Remote | Team 1,001-5,000 | Since 2007 | H1B No Sponsor

• Design, implement, and optimize cloud infrastructure in GCP environments, ensuring availability and scalability.
• Administer and secure access, IAM roles, and cybersecurity controls in line with best practices and compliance requirements (PCI).
• Automate and manage deployments using IaC (Terraform) and CI/CD pipelines in Azure DevOps, including containerized environments.

Colombia
Job Closed

Infrastructure Specialist, MySQL

LWSA

Integrating solutions & driving business

Full Time | Remote | Team 1,001-5,000 | Since 1998 | H1B No Sponsor

• Design and evolve distributed, resilient, and scalable database architectures on AWS, focusing on high availability and disaster recovery (DR).
• Administer and perform deep tuning of MySQL/MariaDB engines and TiDB clusters, ensuring performance in high-concurrency scenarios.
• Lead architectural design reviews with Product and Engineering teams, ensuring data design supports business growth.
• Build robust automation using Python, Go, or Bash for provisioning, maintenance, and self-healing of infrastructure (IaC).
• Implement and maintain an advanced observability strategy that goes beyond basic monitoring to include distributed tracing and structured logs for rapid anomaly detection.
• Mentor analysts and engineers, raising the team's technical bar in modern database and cloud practices.
• Ensure data security by design, implementing hardening, encryption (in transit and at rest), access management (IAM with least-privilege principle) and compliance with network policies.
• Lead critical incident responses (War Rooms), conducting post-mortems focused on preventing recurrence.

Brazil

Platform Infrastructure Engineer, Containers

Menlo Security Inc.

Menlo Security protects productivity online with a one-of-a-kind, isolation-powered cloud security platform.

Full Time | Remote | Team 201-500 | H1B No Sponsor

• Design, deploy, and maintain VM and Kubernetes infrastructure on GCP and AWS across dozens of clusters spanning development, staging, and production environments in multiple regions.
• Coordinate with your peers in your direct team as well as across teams to ensure that the tasks you’re working on are going to solve the problems that we need them to solve.
• Build and maintain Infrastructure as Code (IaC) using Terraform modules, managing resources through Spacelift or equivalent Terraform Automation and Collaboration Software (TACOS). Provision cloud infrastructure including networking, compute, storage, and security components primarily on GCP, with secondary AWS support.
• Implement and manage workflows with sophisticated multi-layer configuration management.
• Build and maintain comprehensive observability solutions using Grafana Cloud, Prometheus/Mimir, and OTel collectors. Design Grafana dashboards, configure alerting rules, and ensure visibility across all platform components.
• Manage certificate lifecycle, DNS automation, ingress controllers, and service mesh networking with Cilium.
• Partner with Engineering, Product, Compliance, and Security teams to design resilient, scalable systems. Consult on capacity planning, disaster recovery, and architectural decisions for cloud-native applications.
• Identify and eliminate toil through automation. Write scripts, develop tools, and build CI/CD pipelines to improve operational efficiency and reduce manual work.
• Participate in a 24x7 on-call rotation as part of a globally distributed team, responding to incidents and driving post-incident reviews.

Canada
Full Time | Remote | Team 1,001-5,000 | H1B Sponsor

• Provide critical technical recommendations to resolve complex issues and guide strategic technology decisions for the PMO and ESMC
• Conduct comprehensive analyses of alternatives (AoA) for data analytics, visualization, and AI/ML platforms, advocating for Commercial-off-the-Shelf (COTS) or Government-off-the-Shelf (GOTS) solutions where practical
• Lead long-term strategy, technology roadmap planning, and capacity planning to ensure the infrastructure meets future business goals
• Continuously monitor technical trends through independent research, reporting on new products and services that align with strategic goals
• Translate complex technical concepts into clear, actionable insights for non-technical stakeholders through briefings, whitepapers, and reports
• Perform periodic technical assessments of programs and projects, evaluating status, progress, work product quality, and schedule and cost management
• Design, implement, and manage a secure, scalable multi-cloud and hybrid hosting infrastructure, ensuring high availability and performance
• Provide expertise on enhancing application architecture to leverage Cloud Native solutions (e.g., ElasticSearch, Kibana, containerization) and optimize for performance and cost on AWS GovCloud
• Assist in the sizing of application components, usage estimates, and technical support configurations
• Identify system component incompatibilities and recommend solutions for seamless integration during technology upgrades
• Collaborate with the Cybersecurity team to implement robust security protocols, manage access controls, conduct vulnerability assessments, and respond to incidents
• Ensure all infrastructure, technical specifications, and designs meet and maintain compliance with DoD, DLA, and other relevant government security standards
• Evaluate technical specifications (i.e., design, database components, interface designs) to provide expert guidance and ensure industry best practices are met

Virginia
Job Closed