Site Reliability Engineer
Location
EST (UTC-5)
Posted
5 days ago
Salary
$150K / year
Seniority
Mid Level
Job Description
Site Reliability Engineer
TransFICC
Role Description
TransFICC is hiring a Site Reliability Engineer to provide high-performance services to our customers. We develop an integration service product that gives our clients a flexible, hosted service, so their internal resources do not have to respond to connectivity challenges across trading venues.
You will be joining our SRE team and contributing to TransFICC’s automation culture. We are a multi-disciplinary team covering everything from desktop and laptop support to data centre provisioning of servers and vendor network connectivity. We all have areas we are stronger in, but we all share responsibility for the entire environment. We are seeking someone like-minded who can contribute to our team while also learning from us.
About the Role
Your mission will be:
- Building out the automated provisioning of TransFICC's servers and networks in both physical and cloud environments, e.g. AWS and GCP.
- Evolving our Continuous Delivery pipeline for provisioning servers and switches, and deploying software.
- Interacting with hardware vendors, telecom providers, and financial institutions.
Qualifications
- Ability to code. Basic programming in your language of choice is required.
- Experience with a software automation tool like Ansible.
- Experience as a sys admin or network engineer with a reasonable understanding of both.
- Constructive, open-minded, and self-motivated.
- Appreciate autonomy and be able to take the initiative.
- Experience managing 3rd parties via email and phone, negotiating with vendors, purchasing hardware, and enabling remote work.
Requirements
- Authentication protocols, e.g. SAML, OAuth2, AD, Kerberos (not essential but preferred).
- Microsoft/Azure Active Directory (not essential but preferred).
- Database HA: clustering/replication, e.g. Postgres (not essential but preferred).
Benefits
- Up to $150,000 + Shares + Benefits.
More DevOps Engineer Jobs
Role Description
We’re looking for a Senior Site Reliability Engineer (SRE) to help operate, harden and mature our production OKD / Kubernetes platforms. This is a hands-on engineering role focused on reliability, automation, observability, GitOps, CI/CD and secure platform operations. You’ll work across the full stack, from bare-metal and virtualisation through to Kubernetes control plane operations, ingress, identity, monitoring, developer platform tooling and application delivery.
The role will play a key part in improving the operational maturity of our platform estate, supporting the migration from VMware to KVM, strengthening GitOps and CI/CD practices, and helping ensure our platforms remain secure, scalable and aligned to the needs of regulated customer environments. You’ll work closely with platform, application, AI, networking, security, QA and architecture teams to build reliable foundations that enable other engineering teams to deliver safely and at pace.
This is not a ticket-handling role. It is a senior engineering position where you’ll be expected to own problems, drive improvements, and help shape how TIG operates critical cloud-native infrastructure.
Qualifications
- Strong experience running production Kubernetes environments, not just consuming or deploying into them.
- Strong Linux fundamentals, including systemd, networking, storage and performance troubleshooting.
- Experience with at least one Kubernetes distribution such as OKD, OpenShift, vanilla Kubernetes, Rancher, EKS, AKS or GKE.
- Solid infrastructure as code experience, including Ansible plus Terraform or equivalent, alongside tools such as Helm and Kustomize.
- GitOps and CI/CD experience managing full application and component lifecycles, using tools such as Argo CD, Flux, GitHub Actions or similar.
- Observability experience with tools such as Prometheus, Grafana, Elastic Stack / LGTM, OpenTelemetry or similar.
- Experience working with identity and access technologies such as OIDC, SAML, SCIM or Keycloak.
- Experience with virtualisation or infrastructure platforms such as KVM, libvirt or VMware.
- Scripting or tooling experience using Go, Python, shell scripting or similar.
- Strong troubleshooting, problem-solving and analytical skills.
- Experience working in secure, regulated or enterprise-scale environments.
- Strong communication skills, with the ability to produce clear documentation, runbooks, post-mortems and technical guidance.
- Eligible to hold UK SC clearance.
Requirements
- Operate, harden and extend production OpenShift / OKD / Kubernetes clusters across on-premises and hybrid environments.
- Support the migration from VMware to KVM, helping modernise the underlying compute and storage layer.
- Own and improve CI/CD processes across the full lifecycle of platform and application components.
- Work with platform and application engineers to support cloud-native delivery using tools such as Helm and Kustomize.
- Develop and mature GitOps deployment practices using tools such as Argo CD or Flux.
- Maintain and improve core platform services including identity, ingress, observability, certificate management, service mesh and container registry capabilities.
- Build and operate observability across logs, metrics, traces, alerting, SLOs and error budgets.
- Improve platform hardening in line with secure and regulated environment requirements, including network policy, SELinux, image provenance, secret management and audit.
- Automate repeatable operational tasks using tools such as Ansible, Terraform, Helm, Kustomize, Go, Python or equivalent technologies.
- Lead incident response activity, support blameless post-mortems and drive systemic fixes.
- Partner with networking and security teams on platform integration, segmentation, load balancing and accreditation evidence.
- Create and maintain clear technical documentation, runbooks, design notes and operational guidance.
- Mentor other engineers and act as a senior technical authority across cloud and Kubernetes operations.
- Participate in an on-call rota, with appropriate compensation.
Benefits
- Private Medical
- Health Cash Plan
- 4x Life Assurance
- Inclusive Culture: Enjoy an inclusive culture and environment.
- Holiday: Generous holiday allowance.
- Learning: Access to continuous learning and development opportunities.
- Bonus Potential: Bonus potential based on performance and business-related factors.
- Discounts: Discounts on a wide range of products and services.
- Pension: Pension scheme contributions.
- EV Car Scheme
- Regular Pay Reviews
- More Benefits: Explore additional benefits on our career site.
Role Description
We’re looking for a Senior Cloud / Kubernetes SRE to help operate, harden and mature our production OKD / Kubernetes platforms. This is a hands-on engineering role focused on reliability, automation, observability, GitOps, CI/CD and secure platform operations. You’ll work across the full stack, from bare-metal and virtualisation through to Kubernetes control plane operations, ingress, identity, monitoring, developer platform tooling and application delivery.
The role will play a key part in improving the operational maturity of our platform estate, supporting the migration from VMware to KVM, strengthening GitOps and CI/CD practices, and helping ensure our platforms remain secure, scalable and aligned to the needs of regulated customer environments. You’ll work closely with platform, application, AI, networking, security, QA and architecture teams to build reliable foundations that enable other engineering teams to deliver safely and at pace.
This is not a ticket-handling role. It is a senior engineering position where you’ll be expected to own problems, drive improvements, and help shape how TIG operates critical cloud-native infrastructure.
Qualifications
- Strong experience running production Kubernetes environments, not just consuming or deploying into them.
- Strong Linux fundamentals, including systemd, networking, storage and performance troubleshooting.
- Experience with at least one Kubernetes distribution such as OKD, OpenShift, vanilla Kubernetes, Rancher, EKS, AKS or GKE.
- Experience with infrastructure as code and automation, such as Ansible, Terraform, Helm or Kustomize.
- Experience using GitOps tooling such as Argo CD or Flux in production environments.
- Experience building or operating CI/CD pipelines for platform or application components.
- Strong observability experience across logs, metrics and traces, using tools such as Prometheus, Grafana, Elastic Stack, OpenTelemetry or similar.
- Experience working with identity and access technologies such as OIDC, SAML, SCIM or Keycloak.
- Experience with virtualisation or infrastructure platforms such as KVM, libvirt or VMware.
- Scripting or tooling experience using Go, Python, shell scripting or similar.
- Strong troubleshooting, problem-solving and analytical skills.
- Experience working in secure, regulated or enterprise-scale environments.
- Strong communication skills, with the ability to produce clear documentation, runbooks, post-mortems and technical guidance.
- Eligible to hold UK SC clearance.
Requirements
- Operate, harden and extend production OKD / Kubernetes clusters across on-premises and hybrid environments.
- Support the migration from VMware to KVM, helping modernise the underlying compute and storage layer.
- Own and improve CI/CD processes across the full lifecycle of platform and application components.
- Work with platform and application engineers to support cloud-native delivery using tools such as Helm and Kustomize.
- Develop and mature GitOps deployment practices using tools such as Argo CD or Flux.
- Maintain and improve core platform services including identity, ingress, observability, certificate management, service mesh and container registry capabilities.
- Build and operate observability across logs, metrics, traces, alerting, SLOs and error budgets.
- Improve platform hardening in line with secure and regulated environment requirements, including network policy, SELinux, image provenance, secret management and audit.
- Automate repeatable operational tasks using tools such as Ansible, Terraform, Helm, Kustomize, Go, Python or equivalent technologies.
- Lead incident response activity, support blameless post-mortems and drive systemic fixes.
- Partner with networking and security teams on platform integration, segmentation, load balancing and accreditation evidence.
- Create and maintain clear technical documentation, runbooks, design notes and operational guidance.
- Mentor other engineers and act as a senior technical authority across cloud and Kubernetes operations.
- Participate in an on-call rota, with appropriate compensation.
Benefits
- Private Medical
- Health Cash Plan
- 4x Life Assurance
- Inclusive Culture: Enjoy an inclusive culture and environment.
- Holiday: Generous holiday allowance.
- Learning: Access to continuous learning and development opportunities.
- Bonus Potential: Bonus potential based on performance and business-related factors.
- Discounts: Discounts on a wide range of products and services.
- Pension: Pension scheme contributions.
- EV Car Scheme
- Regular Pay Reviews
- More Benefits: Explore additional benefits on our career site.
• mentor, develop and provide technical support to the DevOps / SRE team
• build and develop DevOps and SRE practices within the team
• take part in shaping the team's technical strategy and developing its engineering culture
• develop, maintain and strategically improve the gaming platform
• develop, scale and maintain monitoring and observability systems
• organise and take part in paid on-call shifts, and improve incident management processes
• automate infrastructure and routine processes
• work with development teams on reliability, scalability, and improving change delivery processes and CI / CD
• mentor engineers and teach Kubernetes-native approaches
• take part in architectural decision-making and the development of cloud infrastructure
