DevOps Engineer
Location
Alaska | California | Hawaii | Washington
Posted
103 days ago
Salary
$110K - $170K / year
Seniority
Mid Level
Job Description
Patriot Software
- You’ll help design, build, and run reliable, scalable infrastructure with a strong platform engineering and SRE mindset.
- Manage critical platform components (including cloud networking).
- Automate manual tasks to reduce operational toil.
- Collaborate on technical decisions.
- Partner closely with application and security teams.
- Continuously work to improve reliability, performance, and security across the platform.
- Contribute to platform engineering initiatives using Kubernetes (EKS), Helm, and Infrastructure as Code (IaC).
- Maintain and improve CI/CD platforms and deployment pipelines.
- Build and support observability foundations, including metrics, logging, alerting, and dashboards tied to service health.
- Provision, configure, and operate scalable AWS infrastructure.
- Support cloud networking and connectivity architecture.
- Configure and manage Cloudflare for edge security and CDN performance.
- Actively participate in incident response and post-mortems.
- Implement security and compliance best practices across infrastructure.
- Monitor and track service reliability using established metrics.
Job Requirements
- 2+ years of experience in DevOps, Platform Engineering, or SRE roles
- Hands-on experience with:
  - AWS infrastructure and core services
  - Kubernetes (EKS)
  - Infrastructure as Code (AWS CDK, Terraform, or similar)
  - CI/CD platforms (GitHub Actions preferred)
- Working knowledge of Cloudflare or a similar CDN/edge platform (handling CDN, WAF, and DNS).
- Scripting skills in Python, Bash, or similar languages for automation and operational tooling.
- Experience supporting production systems with high availability, performance, and security requirements.
- Solid understanding of cloud networking fundamentals (routing, load balancing, security groups, NACLs).
- Familiarity with SRE principles and operational excellence (such as service health monitoring, logging, and alerting).
- Strong communication skills and the ability to collaborate effectively across application and security teams.
- Consistent, reliable high-speed internet access.
- Dedicated, quiet workspace free from distractions.
Benefits
- Comprehensive suite of benefits that help our team thrive inside and outside of work
More DevOps Engineer Jobs
- Lead the initial setup of our DevOps and platform engineering practices, defining standards, tooling, and infrastructure strategy
- Design and deliver an internal platform for personal or feature environments to empower developers and improve velocity
- Build and maintain AWS-based infrastructure to ensure optimal performance, scalability, and security
- Develop CI/CD pipelines to streamline deployments and automate release processes
- Implement observability tooling (logging, monitoring, alerting) to detect and resolve issues proactively
- Collaborate closely with developers to identify bottlenecks, improve delivery, and enhance system reliability
- Ensure high availability and uptime through effective monitoring, incident response, and recovery strategies
- Promote infrastructure automation and Infrastructure as Code best practices
- Document systems, configurations, and incident processes for team-wide clarity and consistency
- Participate in on-call rotations or coordinate time zone coverage to ensure 24-hour reliability across regions
Backend/DevOps Engineer
Nick AI
We are building an AI Agent Trading Platform. Create your Agent, customize strategy & trade on your favorite exchanges.
- Design and manage infrastructure deployments using Docker, Kubernetes, and AWS/GCP.
- Develop secure key management systems for API keys and wallet abstraction.
- Implement monitoring, logging, and incident handling for execution nodes.
- Set up CI/CD pipelines and streamlined developer workflows (GitHub Actions).
- Optimize infrastructure for reliability, security, and scalability.
- Collaborate with backend engineers to support execution and receipts systems.
Database Reliability Engineer
WorkOS
WorkOS is an internet company providing a developer platform that helps app-builders sell their apps to enterprise customers with only a few lines of code. Founded in 2019, the com
- Own the reliability, performance, and scalability of WorkOS's PostgreSQL infrastructure.
- Analyze and implement best practices for our database clusters, including replication, connection pooling, high availability, and disaster recovery.
- Build and maintain observability for database metrics (query performance, replication lag, connection saturation, storage growth) and ensure we meet our database SLOs.
- Provide database expertise to product engineering teams through migration reviews, query optimization guidance, and schema design consultation.
- Develop automation and self-service tooling that enables engineers to safely interact with databases without bottlenecking on the DBRE team.
- Participate in on-call rotations and lead incident response for database-related production issues, performing root cause analysis and implementing permanent fixes.
- Plan and manage database capacity, forecasting growth and ensuring our infrastructure can handle increased workloads.
- Collaborate with SREs to roll out infrastructure changes to production environments, with a focus on minimizing risk to the data layer.
- Document operational procedures, runbooks, and architectural decisions so learnings become repeatable actions and eventually automation.
- Drive improvements to backup and recovery strategies, regularly testing and validating disaster recovery procedures.
Site Reliability Engineer
WorkOS
WorkOS is an internet company providing a developer platform that helps app-builders sell their apps to enterprise customers with only a few lines of code. Founded in 2019, the com
- Design and evolve the systems, tooling, and processes that improve the reliability and performance of WorkOS
- Collaborate with product and infrastructure teams to ensure services are production-ready, observable, and resilient to failure
- Define and measure SLIs/SLOs to guide reliability improvements
- Write and optimize backend systems (in TypeScript) with a focus on performance, maintainability, and graceful degradation
- Improve our incident response process, lead postmortems, and drive follow-through on reliability risks
- Develop internal tools and automations that make it easier to operate and scale our systems
- Participate in our on-call rotation: responding to, resolving, and learning from production incidents
- Contribute to design and architecture discussions with a focus on operability and long-term sustainability
- Document systems, share learnings, and help grow a reliability-minded engineering culture



