Collibra

United by Data™

Senior Platform Engineer

Platform Engineer | Full Time | Remote | Senior | Team 1,001-5,000 | Since 2008 | H1B Sponsor

Location

United States

Posted

3 days ago

Salary

$168K - $210K / year

Seniority

Senior

Job Description

  • Developer Enablement: Develop controllers and automations, work with development teams on refinements to platform capabilities
  • Platform Contribution: Contribute to the overall architecture of the platform infrastructure, collaborating with other infrastructure engineers using GitOps, IaC and Kubernetes
  • Operational Excellence: Participate in on-call rotations, troubleshoot complex service issues, implement security best practices, and maintain clear documentation (architecture, procedures)
  • Continuous Improvement: Stay current with platform engineering trends and infrastructure automation, identifying and implementing improvements

Job Requirements

  • 3+ years of experience in Platform Engineering, SRE, or infrastructure-focused roles with a Bachelor's degree in Computer Science or a related technical field, OR equivalent practical experience
  • Proven experience designing, building, and managing production services using Kubernetes and gitops / IaC at a scale of between tens and hundreds of Kubernetes clusters
  • Experience managing production workloads and infrastructure on major cloud platforms (AWS, GCP, Azure)
  • Hands-on experience operating Kubernetes clusters and managing containerized services in production
  • Demonstrable experience writing and maintaining Infrastructure as Code (IaC), preferably with Terraform, and proficiency in Golang or Python for automation
  • Preferred skills: CKA / CKAD, Istio, ArgoCD, deep experience with networks, linux and Kubernetes, experience with monitoring/logging tools and observability spans / traces (e.g., Datadog, Grafana, Honeycomb)
  • Demonstrated proficiency in leveraging AI tools (e.g., Claude, Gemini, ChatGPT, Copilot) to solve real-world business challenges, drive measurable outcomes, or streamline workflows
  • Must be eligible to work in the USA without requiring sponsorship
  • Because this role supports the US government, it is required that this candidate be a US citizen who resides on US soil
  • A bachelor’s degree or equivalent related working experience is required

Benefits

  • equity ownership at every level
  • bonus potential
  • a Flex Fund monthly stipend
  • pension/401k plans

More Platform Engineer Jobs

Power Platform Developer

Sourcefit

Making Outsourcing in the Philippines Work for You

Full Time | Remote | Team 1,001-5,000 | H1B No Sponsor

  • Design, develop, and support solutions using the Microsoft Power Platform.
  • Build and maintain Canvas Apps and Model-Driven Apps.
  • Develop automated workflows using Power Automate.
  • Create dashboards and reports using Power BI.
  • Configure and manage Dataverse environments.
  • Gather and analyse business requirements from stakeholders.
  • Translate requirements into technical designs and functional solutions.
  • Integrate Power Platform solutions with Microsoft 365, Azure, SharePoint, Teams, Dynamics 365, and third-party applications.
  • Troubleshoot, diagnose, and resolve application issues.
  • Maintain technical documentation and solution designs.
  • Participate in testing, deployment, and release management activities.
  • Ensure solutions adhere to governance, security, and best practice standards.
  • Identify opportunities for process improvement and automation.
  • Provide support and training to end users where required.

Philippines

Role Description

Our jobs involve developing, maintaining and integrating state-of-the-art, mission-critical solutions for core insurance and record keeping systems, in an international context for a leading European Insurance Group. The Lead Developer for the Agentic AI Platform owns the technical implementation and framework development of the Agentic AI Platform, with a strong focus on Python-based platform components, reusable agentic AI patterns, and developer enablement. The role ensures that the platform codebase is modular, maintainable, secure, extensible, and supports efficient delivery of agentic AI use cases across the organization.

  • Develop and maintain Python-based platform capabilities, reusable components, SDKs, templates, and developer tools
  • Design and implement reusable framework capabilities for agents, tools, MCP servers, orchestration, memory, RAG, model access, and evaluation
  • Implement MCP servers end-to-end, including schema design, validation logic, action execution, error handling, and interface contracts
  • Extend agent runtime capabilities, including planning, tool-use, routing, reasoning support, context handling, and orchestration patterns
  • Design and evolve RAG and knowledge access logic, including chunking, retrieval, ranking, generation, and evaluation patterns
  • Implement agent memory and context management logic, including semantic, vector-based, and session-related memory concepts
  • Build test frameworks, evaluation harnesses, and enablement tooling that allow developers to test, validate, and improve agentic AI solutions
  • Ensure clean separation between shared platform/framework capabilities and use case-specific implementation logic
  • Evaluate, integrate, and extend open-source AI and agentic AI frameworks in line with the platform architecture
  • Drive code quality, modularity, refactoring, secure coding, documentation, and maintainability of the platform codebase

Qualifications

  • Strong hands-on Python development experience
  • Experience developing reusable frameworks, SDKs, libraries, platform components, or developer tools
  • Deep understanding of open-source AI and agentic AI frameworks such as LangChain, LangGraph, Semantic Kernel, AutoGen, LlamaIndex, CrewAI, or comparable frameworks
  • Strong understanding of LLM-based agent architectures, including planning, tool use, routing, orchestration, reasoning patterns, and agent runtime design
  • Hands-on experience with MCP-style servers, action or skill servers, schema design, validation logic, and execution interfaces
  • Solid knowledge of RAG systems, including retrieval logic, chunking, ranking, generation, grounding, and evaluation
  • Good understanding of agent memory concepts, including vector memory, semantic memory, session context, and long-term memory patterns
  • Strong skills in API design, schema modeling, interface contracts, modular software design, and distributed systems
  • Experience with test automation, evaluation harnesses, quality gates, CI/CD, secure coding, and code quality practices
  • Familiarity with knowledge representation concepts such as ontologies, taxonomies, and knowledge graphs is beneficial
  • Fluent in English (both spoken and written)

Requirements

  • Fluent in German (both spoken and written)
  • Knowledge of agile methodologies (Scrum, SAFe)
  • Experience in the Insurance industry

Benefits

  • Very stable work environment, as part of a large Insurance Multinational Group
  • Flexible work program, respecting your own private time
  • Full-time, unlimited employment contract
  • Attractive remuneration package and lunch tickets
  • Christmas bonus and performance bonus
  • Special Days vouchers (Women’s Day, Children's Day, Christmas, Easter)
  • Private pension and private health insurance contribution
  • Extra vacation days
  • Time off when holidays fall on a weekend (up to 3 days/year)
  • Exchange of experience and training with international professionals, as a premise for personal development

Romania

Role Description

We are seeking a PLM Platform Engineer with deep experience operating either PTC Windchill or Siemens Teamcenter (preferably both) in large enterprise environments. In this role you will own the technical operation of the PLM platform (installation, configuration, performance tuning, upgrades, integrations, and high availability) and partner with functional, engineering, and manufacturing teams to deliver a reliable, performant, and secure PLM ecosystem. The ideal candidate will bring strong PLM administration fundamentals, hands-on experience with PLM upgrades and migrations, and a measurement-driven approach to platform reliability.

Key Responsibilities

  • Install, configure, and operate Windchill or Teamcenter environments across development, test, and production.
  • Lead PLM upgrades, patches, and platform migrations with minimal disruption.
  • Manage PLM application servers, web servers, database connectivity, and method servers.
  • Operate file vaults, replication services, and CAD data management subsystems.
  • Implement and tune HA/DR strategies for PLM environments, applying disciplined engineering practices and partnering closely with stakeholders to ensure outcomes are durable, well-documented, and aligned with broader team and platform standards.
  • Optimize PLM performance through query tuning, caching, indexing, and JVM tuning.
  • Manage user provisioning, security configurations, and audit support, applying disciplined engineering practices and partnering closely with stakeholders to ensure outcomes are durable, well-documented, and aligned with broader team and platform standards.
  • Operate PLM integration brokers and middleware connectors, applying disciplined engineering practices and partnering closely with stakeholders to ensure outcomes are durable, well-documented, and aligned with broader team and platform standards.
  • Develop automation scripts using shell, Python, or Ansible to reduce operational toil.
  • Monitor PLM health using native tooling and integrated observability platforms.
  • Provide hands-on post-go-live and hypercare support, working closely with operations teams to triage incidents quickly, identify root causes, and drive durable fixes that improve long-term system stability.
  • Maintain comprehensive, current technical documentation, including architecture diagrams, design decisions, configuration references, runbooks, and operational procedures, so that the system remains supportable, auditable, and easy to onboard new engineers onto over time.
  • Mentor and coach junior and mid-level engineers through code review, design review, pair programming, and structured knowledge sharing, helping the broader team grow in technical maturity and confidence over time.
  • Drive continuous improvement of the PLM platform, applying disciplined engineering practices and partnering closely with stakeholders to ensure outcomes are durable, well-documented, and aligned with broader team and platform standards.

Qualifications

  • Bachelor’s degree in Computer Science, Engineering, or a related technical discipline.
  • Five or more years of PLM platform administration experience.
  • Hands-on experience with either PTC Windchill or Siemens Teamcenter in production.
  • Strong experience with PLM upgrades and migrations.
  • Working knowledge of Oracle and SQL Server database administration.
  • Strong Linux/Unix administration skills.
  • Experience operating HA/DR for PLM environments.
  • Familiarity with PLM integration brokers and middleware.
  • Scripting skills in shell, Python, or Ansible.
  • Excellent troubleshooting and documentation skills.

Preferred Qualifications

  • Experience operating PLM on cloud platforms (AWS, Azure, OCI).
  • Exposure to infrastructure-as-code for PLM environments.
  • Familiarity with CI/CD patterns for PLM change management.
  • PTC or Siemens PLM certifications.
  • Experience with CAD integration patterns at scale.

How to Apply

Would you like to know more about this opportunity? For immediate consideration, please send your resume to [email protected] or contact us at (908) 676-4399. Learn more about Bright Vision Technologies at www.bvteck.com.

United States
$100K - $150K / year

Role Description

We are seeking a skilled Azure Cloud Engineer to design, deploy, and operate large-scale, secure, and resilient cloud platforms on Microsoft Azure. In this role you will own the end-to-end cloud engineering lifecycle, including:

  • Architecture
  • Infrastructure-as-code
  • Automation
  • Security hardening
  • Cost optimization
  • Observability
  • Ongoing operational excellence for production workloads

The ideal candidate will combine deep technical expertise across Azure services with strong DevOps engineering practices, and will partner closely with application development, security, and SRE teams to deliver cloud-native solutions that meet demanding business requirements for scalability, reliability, and compliance.

Qualifications

  • Bachelor’s degree in Computer Science, Engineering, or a related technical discipline.
  • Five or more years of cloud engineering experience, with at least three years focused on Microsoft Azure in production environments.
  • Strong hands-on experience with Azure core services, including compute, storage, networking, identity, and platform-as-a-service offerings.
  • Production-level experience with infrastructure-as-code tools such as Terraform, Bicep, or ARM templates.
  • Solid experience designing and operating Azure Kubernetes Service (AKS) clusters at scale.
  • Hands-on experience with Azure DevOps or GitHub Actions for CI/CD across infrastructure and applications.
  • Strong scripting skills in PowerShell, Bash, and Python, with the ability to write maintainable automation code.
  • Deep understanding of cloud security principles, identity management, and compliance frameworks.
  • Experience implementing monitoring, alerting, and observability strategies across distributed workloads.
  • Strong troubleshooting, communication, and documentation skills.

Requirements

  • Design and implement enterprise-grade Azure cloud architectures spanning compute, networking, storage, identity, and data services.
  • Develop, maintain, and continuously improve infrastructure-as-code using Terraform, Bicep, or ARM templates.
  • Configure and manage Azure landing zones, virtual networks, subnets, route tables, and network security groups.
  • Implement secure identity, access management, and governance controls using Azure Active Directory.
  • Architect and operate Azure Kubernetes Service (AKS) clusters.
  • Deploy, scale, and tune Azure data and analytics platforms.
  • Build and operate comprehensive CI/CD pipelines using Azure DevOps or GitHub Actions.
  • Design and implement robust observability practices using Azure Monitor, Log Analytics, and Application Insights.
  • Drive Azure cost optimization initiatives.
  • Implement disaster-recovery and business-continuity strategies.
  • Strengthen security posture by integrating Microsoft Defender for Cloud and routinely remediating findings from compliance scans.
  • Collaborate closely with application teams to architect cloud-native solutions.
  • Develop automation scripts and tooling in PowerShell, Bash, and Python.
  • Mentor junior engineers and lead architecture reviews.

Benefits

  • Competitive base salary commensurate with experience.
  • Full-time, direct W2 employment with Bright Vision Technologies.
  • 100% remote work opportunity.
  • Long-term, multi-year engagement aligned to the Bright Vision SOW delivery roadmap.

United States
$100K - $150K / year