AudioEye

A cloud-based digital accessibility platform helping businesses of all sizes build inclusive and compliant websites.

Staff Engineer, Platform Engineering – Operational Health

Platform Engineer, Full Time, Remote, Lead, Team 51-200, Since 2015, H1B No Sponsor

Location

United States

Posted

2 days ago

Salary

$170K - $200K / year

Seniority

Lead

Job Description

  • Conduct comprehensive audits of infrastructure, deployment processes, incident patterns, and on-call burden
  • Identify foundational issues causing operational pain: fragile systems, deployment friction, poor observability, and architectural weaknesses
  • Establish baseline metrics for system health and operational efficiency
  • Prioritize improvements systematically based on impact to on-call burden and reliability
  • Design and implement solutions that address the root causes of incidents
  • Eliminate single points of failure in critical paths
  • Implement patterns for graceful degradation and rapid recovery
  • Build comprehensive observability, logging, metrics, and tracing infrastructure
  • Identify and automate repetitive manual work that burdens operational staff
  • Establish organizational standards and expectations for reliability
  • Design and maintain runbooks, playbooks, and incident response processes
  • Create feedback loops from incidents to systemic improvements
  • Support CI/CD pipelines that enable safe, frequent deployments
  • Develop operational tooling (dashboards, alerts, automation)
  • Reduce friction in how engineers interact with infrastructure and deploy code
  • Scale infrastructure responsibly as the organization grows
  • Optimize platform reliability, performance, and cost through capacity planning, workload tuning, and architectural tradeoffs
  • Own the technical strategy and roadmap for infrastructure, reliability, and operational posture
  • Mentor engineers across the organization on operational thinking and reliability engineering
  • Lead architecture reviews and establish technical standards
  • Build organizational practices and knowledge that outlast any single engineer

Job Requirements

  • 5–8+ years of software engineering experience with demonstrated expertise in infrastructure, operations, reliability engineering, or SRE
  • Staff-level technical depth with ownership of significant platform decisions at scale
  • Proven ability to manage complex trade-offs and defend technical direction
  • Experience carrying the pager; understanding of on-call burden from lived experience
  • Demonstrated pattern recognition; ability to design systemic solutions rather than one-off fixes
  • Expert-level infrastructure-as-code (Terraform); comfortable with complex multi-environment deployments
  • Docker and container orchestration expertise (Kubernetes or similar); experience designing or improving deployment pipelines
  • Deep familiarity with Node.js and TypeScript in production environments
  • Substantial AWS experience (EC2, RDS, Lambda, CloudWatch, VPC, load balancers, auto-scaling, managed services)
  • Understanding of AWS architecture patterns and cost optimization
  • Proficiency in scripting/automation languages (Python, Go, Bash); ability to write maintainable, scalable automation
  • Hands-on experience designing and implementing comprehensive monitoring, logging, metrics collection, and distributed tracing
  • Ability to instrument systems for visibility and rapid diagnosis
  • Experience optimizing systems at scale and handling growth challenges
  • Strategic thinking about capacity planning, resource requirements, and cost management
  • Understanding of relational databases, including performance considerations and optimization
  • Knowledge of networking concepts, caching patterns, and load balancing strategies
  • Understanding of distributed systems concepts and failure modes
  • Comfortable with AI/LLM tools (Claude, ChatGPT, Cursor, etc.) and uses them effectively in day-to-day work
  • Ability to validate AI/LLM output and use it as a jumping-off point (ask better questions, iterate, and verify with real evidence)
  • Experience experimenting with AI for infrastructure automation, code generation, and problem-solving
  • Proactive thinking about how AI can accelerate platform engineering work
  • Experience leading the design of resilient systems and conducting architecture reviews
  • Proven ability to elevate other engineers' capabilities and establish technical standards
  • Strong communication skills; able to explain technical complexity to diverse audiences
  • Ability to write clear ADRs and technical documentation that guides future decisions
  • Demonstrated ability to influence technical direction without formal authority

Benefits

  • Work with a talented but humble team
  • Competitive compensation and equity
  • Weekly paid family meal
  • 401k, medical, dental, and vision insurance
  • Flexible PTO Policy
  • 15.5 company-paid holidays, including Juneteenth, MLK Day, and a 1-week company shutdown
  • To support remote work conditions, AudioEye provides each employee a one-time stipend of $300

More Platform Engineer Jobs

Contract, Remote, Team 1-10, Since 2007, H1B No Sponsor

  • Solution Design & Development: Design, develop, and deploy custom applications and integrations using Microsoft Dynamics 365 and Power Platform tools
  • Customization & Configuration: Customize Dynamics 365 modules (Sales, Service, Marketing, etc.) and build Power Apps/Power Automate workflows
  • Integration: Integrate Microsoft Dynamics 365 with other enterprise systems using web services, APIs, and connectors
  • Data Analysis & Reporting: Leverage Power BI to create dashboards, reports, and data visualizations
  • Collaboration with Stakeholders: Work with business analysts, project managers, and team members
  • Quality Assurance & Testing: Ensure the quality and performance of applications by performing testing
  • Coding and Testing: Write and deploy code to ensure applications meet quality standards
  • Best Practices & Documentation: Develop and maintain technical documentation
  • Data Management: Oversee data migration, integration, and data quality assurance processes
  • Project Management: Participate in project planning and delivery
  • Mentorship/Training: Mentor fellow developers on Dynamics related issues and provide support

United States

Senior Cloud Security & Platform Engineering Lead

TMS LLC

All your information will be kept confidential according to EEO guidelines.

Contract, Remote, Team 51-200

Role Description

We are seeking a highly experienced Senior Cloud Security & Platform Engineering Lead to lead the design, implementation, and governance of secure and scalable cloud-native platforms across enterprise environments. The ideal candidate should have strong expertise in cloud security, platform engineering, Kubernetes, Zero Trust architecture, and software supply chain security.

Qualifications

  • 10+ years of experience in Cloud Security, Platform Engineering, DevSecOps, or Infrastructure Security
  • Deep expertise in AWS, Azure, and GCP
  • Strong hands-on experience with Kubernetes, containers, and cloud-native infrastructure
  • Experience implementing:
      • SPIFFE / SPIRE enterprise federation
      • In-Toto pipeline enforcement
      • Tekton Chains production attestation
      • OPA / Gatekeeper governance platforms
      • eBPF runtime detection engineering
  • Strong understanding of Zero Trust Architecture and Cryptographic Workload Identity
  • Experience implementing Software Supply Chain Security frameworks (SLSA, Provenance, Attestation)
  • Expertise in Infrastructure-as-Code and Policy-as-Code
  • Strong programming/scripting experience in Go, Python, Bash, Terraform, YAML
  • Experience leading enterprise cloud modernization initiatives

Requirements

  • AWS / Azure / GCP
  • Kubernetes
  • SPIFFE / SPIRE
  • Tekton Chains
  • In-Toto
  • SLSA
  • eBPF
  • OPA / Gatekeeper
  • Zero Trust
  • Terraform
  • DevSecOps
  • Cloud Security
  • Platform Engineering

Benefits

  • All your information will be kept confidential according to EEO guidelines.

United States

Technical Platform & Infrastructure Lead

Pickle

FYXER recognises the benefits of a diverse workforce and strives to be an inclusive organisation. We are committed to treating everyone with dignity and respect regardless of race, culture, gender, disability, age, sexual orientation, religion or belief and we promote diversity of thought.

Role Description

We are partnering with a digital transformation and AI-enabled learning business seeking a highly technical Platform & Infrastructure Lead to support the design, scalability, and operational performance of modern AI-enabled platforms and technical ecosystems. This role requires genuine hands-on infrastructure and platform experience, with a strong understanding of modern technical environments, scalability, integration, operational resilience, and AI-enabled systems. Initially, this will begin as a freelance / contract engagement with the potential to evolve into a longer-term role as the wider programme grows.

You will take ownership of platform and infrastructure considerations across a growing AI-enabled programme, helping ensure systems, integrations, environments, and technical operations are scalable, secure, reliable, and fit for future growth. The successful candidate will be highly practical, technically credible, and comfortable operating across infrastructure, cloud environments, systems architecture, operational workflows, and platform optimisation.

Key Responsibilities

  • Lead technical platform and infrastructure planning across AI-enabled systems
  • Support platform scalability, reliability, and operational performance
  • Advise on cloud infrastructure, integrations, and technical environments
  • Collaborate with technical and operational stakeholders to define platform requirements
  • Support infrastructure-related decision-making across delivery and implementation
  • Evaluate technical risks, dependencies, and scalability considerations
  • Help establish operational best practices for platform management and resilience
  • Contribute to system optimisation, automation, and infrastructure improvements
  • Support troubleshooting, issue resolution, and technical problem-solving where required
  • Work closely with cross-functional teams across product, operations, and delivery

Qualifications

  • Strong hands-on infrastructure and platform engineering experience
  • Experience working within modern cloud-based technical environments
  • Exposure to AI-enabled products, systems, or workflows
  • Strong understanding of scalability, integrations, system reliability, and operational resilience
  • Experience supporting complex technical ecosystems or platforms
  • Ability to operate in fast-paced and evolving environments
  • Strong troubleshooting and technical problem-solving capability
  • Excellent stakeholder communication and collaboration skills

Essential Experience

  • Experience supporting AI-enabled learning, knowledge, or digital transformation platforms
  • Exposure to automation, DevOps, or platform optimisation initiatives
  • Experience working within consultancy, freelance, or transformation programmes
  • Familiarity with enterprise platform environments and integrations

Benefits

  • Flexible remote working environment
  • Opportunity to grow into greater project ownership
  • Exposure to multiple projects and delivery teams
  • Collaborative, fast-paced environment with strong development potential

Contract Details

  • Freelance / contract engagement
  • Approx. 160 hours per month
  • UK time zone alignment preferred
  • Potential for extension or longer-term opportunity

Diversity & Inclusion

FYXER recognises the benefits of a diverse workforce and strives to be an inclusive organisation. We are committed to treating everyone with dignity and respect regardless of race, culture, gender, disability, age, sexual orientation, religion or belief and we promote diversity of thought.

United Kingdom

Principal Software Engineer (Networking) - Platform

Elastic

Self-described as the leading platform for search-powered solutions, Elastic helps organizations, their customers, and their employees find what they need faster while protecting a

Role Description

As part of the Platform Engineering department, the Traffic team is crafting, building, and improving the multi-cloud platform at scale for Elastic Cloud Hosted and Serverless. We grow and mature our distributed network services and solutions for multiple cloud service provider platforms. We are built on Kubernetes, Go/Scala, and custom orchestration architectures.

In your daily life with us, you will participate in:

  • Coding and innovating technical designs
  • Crafting solutions and improving resilience
  • Prioritizing security, bug fixes, and features
  • Debugging Azure Networking for Elastic Cloud Serverless

Qualifications

  • 10+ years in Software Engineering with product success in delivering Cloud network solutions
  • Experience in public cloud, Go, and managed Kubernetes services is advantageous
  • Success and lessons from striving for 'progress not perfection' in Platform reliability
  • Passion for developing solutions with inclusive communication methods
  • Examples of working in distributed teams or working remotely are desirable

Requirements

  • Designed and built a SaaS product in a public cloud using Infrastructure-as-Code tooling such as Crossplane or Terraform
  • Built Kubernetes-at-scale infrastructure across multiple cloud providers
  • Written product features or functions in Golang or other programming languages
  • Worked with containerized services (such as Docker)
  • Proven results in leading and improving cross-team engineering initiatives
  • Experience in system administration with professional skills in Linux on distributed systems at scale
  • Diagnosed, designed, implemented, and created solutions with the Elastic Stack
  • Experienced in self-organizing and sharing in a globally distributed team environment
  • Strengthened team members through coaching and mentoring

Benefits

  • Competitive pay based on the work you do
  • Health coverage for you and your family in many locations
  • Flexible locations and schedules for many roles
  • Generous number of vacation days each year
  • Matching up to $2000 for financial donations and service
  • Up to 40 hours each year for volunteer projects
  • A minimum of 16 weeks of parental leave

Canada
C$174K - C$219.7K / year