Job Closed

This listing is no longer active.

Senior Platform Engineer - Cloud Infrastructure

Location

United States + 29 more

All locations: United States | Canada | Brazil | Colombia | Argentina | Chile | Venezuela | Bolivia | Ecuador | French Guiana | Guyana | Paraguay | Peru | Suriname | Uruguay | Mexico | Costa Rica | El Salvador | Guatemala | Honduras | Nicaragua | Panama | Dominican Republic | Puerto Rico | Bahamas | Guadeloupe | Haiti | Jamaica | Martinique | Montserrat

Posted

88 days ago

Salary

0

Seniority

Senior

Job Description

Senior Platform Engineer - Cloud Infrastructure

Qdrant

This description is a summary of our understanding of the job description.

Role Description

As a Senior Platform Engineer - Cloud Infrastructure, you will work on the automation and internal systems that operate Qdrant Cloud reliably at scale. Rather than manually managing infrastructure, your work will focus on building tools, services, and automation that allow the platform to operate itself. This role is best suited for engineers who enjoy writing code, automating infrastructure, and building platform systems.

Location: This role is fully remote but restricted to candidates located in the Americas (North, Central, or South America) to ensure time-zone alignment with the team.

Responsibilities

  • Build internal tools and services that power Qdrant Cloud infrastructure.
  • Develop automation and platform components using Go and Python.
  • Design systems for cluster provisioning, lifecycle management, and infrastructure automation.
  • Improve Kubernetes automation through controllers, operators, and infrastructure tooling.
  • Design solutions that reduce operational toil and eliminate manual infrastructure work.
  • Improve reliability, scalability, and observability of the cloud platform.
  • Collaborate with platform and infrastructure teams on system architecture and automation.
  • Participate in incident response and implement improvements to prevent recurrence.
  • Continuously improve the internal platform used to operate Qdrant Cloud.

Job Requirements

  • 5+ years of experience in SRE, platform engineering, or infrastructure software engineering.
  • Strong programming skills (Go preferred, Python acceptable).
  • Experience building automation in production.
  • Hands-on experience operating Kubernetes in production.
  • Experience working with AWS, GCP, or Azure.
  • Strong understanding of Linux systems and networking fundamentals.
  • Experience improving reliability through automation and systems design.
  • Comfortable participating in on-call rotations.
  • Experience building Kubernetes controllers or operators (Nice to Have).
  • Experience with Terraform or infrastructure-as-code tools (Nice to Have).
  • Experience with observability stacks such as Prometheus, Grafana, or OpenTelemetry (Nice to Have).
  • Experience operating large-scale SaaS infrastructure (Nice to Have).
  • Experience working with database or data infrastructure systems (Nice to Have).

Benefits

  • Competitive salary.
  • Fully remote work environment.
  • Flexible working hours.
  • Opportunity to work on mission-critical infrastructure powering AI workloads.
  • Strong collaboration with experienced infrastructure and platform engineers.
  • Ownership of systems operating a globally distributed cloud platform.

More Platform Engineer Jobs

Other | Remote | Team 5,001-10,000 | Since 2011 | H1B Sponsor

  • Administer and maintain endpoint management systems, including Jamf Pro for macOS and Microsoft SCCM for Windows
  • Drive vulnerability and patch management efforts for all client endpoints
  • Implement and enforce endpoint security policies and compliance standards
  • Work closely with Cybersecurity, IT Operations, and Support teams
  • Troubleshoot and resolve complex issues on user endpoints
  • Develop scripts and automated workflows using tools like PowerShell, Bash, or Python
  • Continuously monitor endpoint performance, security alerts, and compliance dashboards
  • Create and maintain documentation for endpoint management processes and standards

United States
$125K - $180K / year
Job Closed

Senior Platform Engineer

BlackLine

BlackLine is a leading global provider of cloud software that controls and automates accounting and finance processes for businesses and organizations of all si

Get to Know Us

It's fun to work in a company where people truly believe in what they're doing! At BlackLine, we're committed to bringing passion and customer focus to the business of enterprise applications. Since being founded in 2001, BlackLine has become a leading provider of cloud software that automates and controls the entire financial close process. Our vision is to modernize the finance and accounting function to enable greater operational effectiveness and agility, and we are committed to delivering innovative solutions and services to empower accounting and finance leaders around the world to achieve Modern Finance.

Being a best-in-class SaaS company, we understand that bringing in new ideas and innovative technology is mission-critical. At BlackLine we are always working with new, cutting-edge technology that encourages our teams to learn something new and to expand their creativity and technical skill set in ways that accelerate their careers. Work, Play and Grow at BlackLine!

Make Your Mark

We're looking for a Full Stack Platform Engineer who will play a crucial role in developing and refining our overall product roadmap. You'll report directly to our VP of Engineering and work closely with our engineering and product teams to build, implement and refine various features and products that drive our platform's core functionalities and user experience.

You'll Get To

  • Design, develop, and maintain both front-end and back-end components of our fin-tech AI platform
  • Implement responsive and intuitive user interfaces that effectively present complex financial data
  • Develop and optimize server-side logic, APIs, and database structures
  • Integrate AI and machine learning models into the platform's architecture
  • Collaborate with cross-functional teams to define and implement new features
  • Ensure high performance, responsiveness, and security of the platform
  • Participate in code reviews and contribute to technical documentation

What You'll Bring

  • Java Language and JVM Mastery: 3+ years of experience with deep expertise in the Java programming language, including a strong understanding of its ecosystems, Object-Oriented Programming (OOP) principles, and the Java Virtual Machine (JVM) internals (memory management, concurrency).
  • Spring Framework and Spring Boot: 3+ years of proven ability in building scalable enterprise applications using the Spring Framework. Expertise in Spring Boot for creating stand-alone, production-grade microservices is essential.
  • Cloud and DevOps Principles: 3+ years of experience with a major cloud platform (AWS, Azure, or GCP) and a strong understanding of DevOps culture. This includes hands-on involvement in designing, building, and maintaining CI/CD pipelines with tools like Jenkins or GitLab.
  • Containerization and Orchestration: 3+ years of hands-on experience with container technologies like Docker for packaging applications and container orchestration using Kubernetes for deployment, scaling, and management.
  • Database and Data Management: 3+ years of experience working with both relational (SQL) and NoSQL databases is required. You must be proficient in writing efficient SQL queries and have experience with ORM frameworks like Hibernate or Spring Data JPA.
  • API Design and Microservices Architecture: 3+ years of proficiency in designing and implementing RESTful APIs and a strong grasp of microservices patterns. You will be responsible for building and maintaining scalable, independent services that form a cohesive system.

We're Even More Excited If You Have

  • Infrastructure as Code (IaC): Experience with tools like Terraform or Ansible to automate infrastructure provisioning and management. This demonstrates an ability to create repeatable and consistent environments, reducing manual configuration and errors.
  • Advanced Observability and Monitoring: Familiarity with modern monitoring stacks such as Prometheus, Grafana, and distributed tracing tools (like Jaeger or Zipkin). This goes beyond basic logging to provide deep insights into application performance and system health.
  • Performance Tuning and JVM Internals: Advanced knowledge of Java performance tuning, including garbage collection optimization, memory profiling, and troubleshooting complex performance bottlenecks at the JVM level. This skill is crucial for ensuring high-performance and low-latency applications.

Thrive at BlackLine Because You Are Joining

  • A technology-based company with a sense of adventure and a vision for the future. Every door at BlackLine is open. Just bring your brains, your problem-solving skills, and be part of a winning team at the world's most trusted name in Finance Automation!
  • A culture that is kind, open, and accepting. It's a place where people can embrace what makes them unique, and the mix of cultural backgrounds and varying interests cultivates diverse thought and perspectives.
  • A culture where BlackLiners' continued growth and learning is empowered. BlackLine offers a wide variety of professional development seminars and inclusive affinity groups to celebrate and support our diversity.

BlackLine is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to sex, gender identity or expression, race, ethnicity, age, religious creed, national origin, physical or mental disability, ancestry, color, marital status, sexual orientation, military or veteran status, status as a victim of domestic violence, sexual assault or stalking, medical condition, genetic information, or any other protected class or category recognized by applicable equal employment opportunity or other similar laws.

BlackLine recognizes that the ways we work and the workplace itself have shifted. We innovate in a workplace that optimizes a combination of virtual and in-person interactions to maximize collaboration and nurture our culture. Candidates who live within a reasonable commute to one of our offices will work in the office at least 2 days a week.

Salary Range: USD $156,000.00/Yr. - USD $196,000.00/Yr.

Pay Transparency Statement: Placement within this range depends upon several factors, including the applicant's prior relevant job experience, skill set, and geographic location. In addition to base pay, BlackLine also offers short-term and long-term incentive programs, based on eligibility, along with a robust offering of benefit and wellness plans.

Accommodations: BlackLine is committed to creating an inclusive and accessible experience for all candidates. If you require a reasonable accommodation that would better enable your success during the application or interview process, please complete this form.

New York
$156K - $196K / year
Job Closed

Platform Engineer, AI Trainer – Freelance

10x.Team

Built for Humans. Powered by AI. The AI Recruiter that takes over first interviews — fast, fair, and compliant.

Contract | Remote | Team 11-50 | Since 2023 | H1B No Sponsor

  • Review and refine AI-generated responses, workflows, and technical explanations related to platform engineering
  • Evaluate outputs for architectural accuracy, technical validity, and real-world applicability
  • Draft realistic engineering scenarios involving cloud infrastructure, deployment pipelines, system reliability, and DevOps best practices
  • Assess AI reasoning in areas such as system scaling, performance optimization, security, and automation
  • Identify gaps, unrealistic solutions, or methodological flaws in AI outputs
  • Create scenario variations from the perspective of different stakeholders, such as engineer, operations, or management

United Kingdom
€84 - €150 / hour

Platform Reliability Engineer (Agentic AI)

Search Atlas

The all-in-one Agentic SEO and AI Visibility platform - Get found everywhere people search

Other | Remote | Team 51-200 | H1B No Sponsor

The Mission: Building the Autonomous Nervous System

Search Atlas is moving beyond suggestions to full execution. Our agent, Atlas Brain, handles SEO, AEO, Google Ads, and AI Content Generation autonomously, with zero manual intervention. While Platform Engineers build self-service tools for developers, you ensure those tools enable autonomous AI execution with 99.99% reliability. You're not keeping dashboards alive; you're building the engine that allows an AI Agent to replace manual marketing execution. If the platform is reliable, the agent is unstoppable.

What You Will Do

Architect the Autonomous Backbone

Design and maintain the Kubernetes-based platform (EKS/GKE) that hosts Atlas Brain and its distributed agentic workers, handling millions of requests across SEO crawling, content generation, and ad optimization pipelines.

Engineer for Zero-Touch

Automate every aspect of infrastructure using Terraform, ArgoCD, and Go/Python. If you have to do it twice, it must be a script. Enable true "zero manual execution" at the infrastructure level.

Scale Agentic Workflows

  • Optimize ML inference pipelines for real-time agent decision-making
  • Architect high-concurrency crawling systems that feed Atlas Brain's intelligence
  • Ensure sub-second latency for agent task execution (SEO, Content, AI Builder)
  • Handle high-frequency data pipelines: real-time bidding, SERP monitoring, content generation at scale

Define Radical Reliability for AI

Establish SLOs/SLIs specifically for AI execution success rates and agent task completion, not just "uptime." Design self-healing systems that preemptively resolve failures before they impact autonomous workflows.

Observability for Agent Decisions

Build distributed tracing and monitoring for complex agentic interactions: trace agent decision trees across SEO/AEO/Ads workflows, enabling rapid diagnosis of "why the agent made that choice." Implement OpenTelemetry, Prometheus, and Grafana for full visibility into autonomous execution.

Safety & Guardrails

Implement guardrails and safety controls for autonomous agent execution in marketing contexts, ensuring AI actions align with business rules, budget constraints, and compliance requirements. Design human-in-the-loop escalation paths for edge cases.

Cost & Performance Governance

Proactively optimize cloud spend and resource allocation (Karpenter/KEDA) as we scale to thousands of agencies. Balance performance with cost efficiency for unpredictable AI workloads.

Technical Requirements

Experience: 6+ years in Platform Engineering, SRE, or Infrastructure roles within high-growth SaaS environments, with proven experience supporting AI/ML systems at scale.

Infrastructure as Code: Mastery of Terraform, ArgoCD, and GitOps workflows.

Container Orchestration: Expert-level Kubernetes (EKS/GKE) networking, scaling, security, and multi-tenancy patterns.

MLOps for Agents (Must-Have):

  • Hands-on experience with MLOps pipelines for autonomous agents
  • Model versioning and deployment strategies for continuous agent improvement
  • Prompt management and A/B testing of agent behaviors
  • Guardrails for safe tool execution and decision boundaries
  • Scaling AI inference services (LLMs, embeddings, classification models)

Languages: Proficiency in Python for building custom platform tools and automation.

Observability: Deep expertise in distributed tracing and monitoring for complex, event-driven systems, specifically for debugging AI agent decision chains.

Data-Intensive Systems: Experience with high-frequency data pipelines, web crawling at scale, real-time processing, and low-latency requirements.

Why This Is Different

Unlike traditional SRE roles focused on keeping services up, you're building the infrastructure that enables autonomous AI to execute business-critical marketing tasks. Every millisecond of latency you eliminate, every self-healing mechanism you deploy, directly impacts whether Atlas Brain can truly replace manual agency work. This is not traditional SRE: you're building the autonomous nervous system for AI execution.

What Success Looks Like

  • Atlas Brain executes millions of marketing tasks daily with <0.1% failure rate
  • Zero infrastructure-related incidents requiring manual intervention during business hours
  • Platform scales from hundreds to thousands of agency clients without reliability degradation
  • Complete observability into agent behavior: "We know not just that the agent acted, but why"

Ready to build the platform that makes autonomous marketing execution a reality?

United States
$70K - $120K / month
Job Closed