Job Closed
This listing is no longer active.
A leading provider of risk and compliance solutions, DFIN - Donnelley Financial Solutions offers data insights, industry expertise, and insightful technology to
Principal Site Reliability Engineer - Remote
Location: United States
Posted: 99 days ago
Salary: $0
Seniority: Lead
Job Description
DFIN - Donnelley Financial Solutions
Join a dynamic team at the pulse of global markets, where we deliver innovative software and service solutions for essential financial reporting and capital markets transactions. At DFIN, we are a values-driven organization that empowers you to build a fulfilling career while bringing your authentic self to work every day. Our "Win as One" mentality ensures that our team's success is directly linked to Client, Shareholder and Employee Satisfaction. Recognized as one of AMERICA'S MOST LOVED WORKPLACES® for five consecutive years and a Built In Best Places to Work for six years, we are committed to our employees' total well-being. Enjoy competitive compensation, a flexible workplace, comprehensive benefits, and opportunities for professional growth. Bring your passion and talents to DFIN - because being YOU thrives here.

Summary:
We are looking for technical team members at all levels who want to push themselves to deliver best-in-market SaaS solutions. We offer a challenging environment where you will have to grow, adapt and use your skills consistently. Our customers rely on us in the moments that matter. Engineering delivers on that promise.

The Principal Site Reliability Engineer - Cloud is responsible for designing, building, securing, monitoring and maintaining our SaaS product cloud infrastructure so it is fast, cost effective, stable and optimized for our customers. SREs at DFIN take on availability, performance, change management, monitoring and response, and are guardians of non-functional requirements. You either have a SaaS cloud infrastructure background in Azure or AWS with a programmatic, automated mindset, or you come from a software engineering background with SaaS cloud infrastructure experience in Azure or AWS. The SRE goal is to build automated systems that reduce or eliminate manual work to keep our products up and running and performing optimally.
We are looking for someone who thrives on collaboration within the team and across other groups and can lead colleagues independently to deliver solutions to complex problems.

Responsibilities:
- Champion and implement a culture to maintain performant, reliable, secure, cost-effective platform cloud infrastructure in DFIN SaaS products based on operationalized processes you define
- Champion security of our cloud infrastructure, collaborating with Security and Governance teams and using static and dynamic tooling
- Champion and implement application and cloud infrastructure monitoring and alerting to prevent client-impacting issues by ensuring system availability, performance and scalability to maintain SLOs and SLAs
- Optimize cloud infrastructure and application performance at scale while maintaining effective cost controls
- Automate cloud infrastructure buildout and maintenance, including system operational runbooks
- Dive deep into technology and stay on the forefront of the latest tools, technologies, and strategies; help evaluate, prototype, and integrate them into operationalized work processes
- Perform with broad independence and deliver on project milestones and tasks you define on schedule while communicating progress regularly
- Build strong relationships with SRE team members and software engineering teams to hold each other accountable for quality expectations
- Learn continuously and apply lessons learned
- Evangelize best practices, eliminate bottlenecks, and improve process
- Participate in on-call duties 24/7/365 and lead the triage and RCA of production incidents

Qualifications:
- 8+ years of experience designing, building, securing, monitoring and maintaining cloud infrastructure in Azure or AWS
- 5+ years of experience creating, configuring, maintaining and monitoring Kubernetes clusters (AKS or EKS) in cloud infrastructure to optimize application performance and reliability
- 5+ years of experience building and deploying Infrastructure as Code with Terraform or similar technology
- 5+ years of experience with common cloud networking, firewall and load balancing configuration
- 5+ years of experience writing software in any modern software language such as C# .NET, Java
- 5+ years of experience creating automated deployments with tools such as Harness, Azure DevOps, Ansible or Jenkins to manage Infrastructure as Code and software build and deployment in a continuous integration (CI) / continuous delivery (CD) environment
- 5+ years of experience implementing production performance, availability, and scalability monitoring and alerting using a tool such as New Relic, Dynatrace, Datadog or AppDynamics
- 5+ years of experience supporting public, client-facing, revenue-generating systems
- Experience monitoring and preventing issues with databases and database queries (SQL) using tools like SolarWinds Database Performance Analyzer, Idera SQL Diagnostic Manager, or Redgate SQL Monitor
- Experience planning, coordinating, developing and executing all stages of post-deployment verification test scripts
- Experience securing Windows or Linux systems in a 24x7 production environment
- BS in Computer Science or equivalent work experience

It is the policy of Donnelley Financial Solutions to select, place, and manage all its employees without discrimination based on race, color, national origin, gender, age, religion, actual or perceived disability, veteran status, actual or perceived sexual orientation, genetic information or any other protected status.

If you are a qualified individual with a disability or a disabled veteran, you have the right to request a reasonable accommodation if you are unable or limited in your ability to use or access jobs.dfinsolutions.com as a result of your disability. You can request a reasonable accommodation by sending an email to talentacquisition@dfinsolutions.com.

At DFIN, protecting your identity is a top priority. Please be aware of scammers impersonating DFIN recruiters. DFIN recruiters will never request personal information via email or text. You will only receive a text from us if you've already been in contact. All automated messages will come from talentacquisition@dfinsolutions.com. If you ever have doubts about the legitimacy of any communication from us, please do not hesitate to reach out for verification via talentacquisition@dfinsolutions.com (this email is for general TA questions and is not used for updates on your application status).
Benefits
- 401(K), 401(K) matching, Adoption Assistance, Childcare benefits, Commuter benefits, Company equity, Company-sponsored outings, Customized development tracks, Dedicated diversity and inclusion staff, Dental insurance, Disability insurance, Diversity manifesto, Documented equal pay policy, Volunteer in local community, Employee stock purchase plan, Family medical leave, Fitness stipend, Flexible Spending Account (FSA), Flexible work schedule, Generous parental leave, Generous PTO, Company-sponsored happy hours, Health insurance, Highly diverse management team, Job training & conferences, Open door policy, Life insurance, Mentorship program, Online course subscriptions available, Open office floor plan, Paid holidays, Paid industry certifications, Paid sick days, Onsite office parking, Partners with nonprofits, Performance bonus, Promote from within, Lunch and learns, Remote work program, Return-to-work program post parental leave, Team based strategic planning, OKR operational model, Continuing education available during work hours, Mandated unconscious bias training, Unlimited vacation policy, Vision insurance, Wellness programs, Some meals provided, Mental health benefits, Diversity employee resource groups, Hiring practices that promote diversity, Employee resource groups, Employee-led culture committees, Quarterly engagement surveys, Hybrid work model, Employee awards, Diversity recruitment program, Wellness days, Personal development training, Apprenticeship programs, Flexible time off, Floating holidays, Bereavement leave benefits, Hardship benefits
More DevOps Engineer Jobs
Site Reliability Engineer
Devsu
Devsu is a technology agency that provides software development services, IT augmentation and staffing.
We are seeking a Site Reliability Engineer (SRE) with deep expertise in monitoring, observability, and reliability engineering to support systems running across on-premises infrastructure and Google Cloud Platform (GCP). This role is primarily responsible for designing, operating, and improving monitoring, alerting, and observability platforms, with a strong focus on Grafana and Kubernetes environments. As a secondary responsibility, this role provides backup coverage for the Application Support team during periods of resource constraints or major incidents, offering L2/L3 technical support when required.

Responsibilities

Monitoring & Observability (Core Focus)
- Own and operate the monitoring and observability stack across on-prem and GCP environments
- Design, build, and maintain Grafana dashboards for infrastructure, Kubernetes, and applications
- Define, tune, and maintain alerts to ensure high signal-to-noise ratio
- Establish observability standards and best practices across teams
- Improve visibility into system health, performance, and reliability

Site Reliability Engineering
- Apply SRE principles to improve availability, performance, and resilience
- Define and track SLIs, SLOs, and error budgets
- Participate in on-call rotations and SEV incident response
- Lead or contribute to incident investigations and root cause analysis (RCA)
- Drive preventative actions to reduce repeat incidents

Kubernetes & Platform Reliability
- Support and monitor Kubernetes environments (GKE and on-prem clusters)
- Monitor cluster health, capacity, and resource utilization
- Troubleshoot platform-level issues impacting application reliability
- Collaborate with Platform and Engineering teams on reliability improvements

Secondary Responsibilities (Backup Application Support)
These responsibilities are activated as needed and are not part of day-to-day operations.
- Provide L2/L3 application support coverage during:
  - Support team resource shortages
  - High-severity incidents (SEVs)
  - Peak support periods or escalations
- Triage and troubleshoot application issues using existing runbooks and dashboards
- Collaborate with Application Support and Engineering teams during incidents
- Ensure all actions, findings, and resolutions are documented in ServiceNow (SNOW)

- Strong experience as a Site Reliability Engineer or Reliability Engineer
- Deep hands-on expertise with Grafana (dashboards, alerting, troubleshooting)
- Solid experience with monitoring and observability systems
- Production experience operating Kubernetes environments
- Experience supporting systems in GCP and on-prem environments
- Strong Linux systems and troubleshooting skills
- Fluent English (written and spoken)
- Ability to work in PST time zone
- Ability to participate in an on-call rotation that includes coverage for one weekend day. Time worked during the weekend is compensated with one day off during the week, in accordance with the established work schedule.

Technology Stack:
- Observability: Grafana, Prometheus, logging platforms
- Containers: Kubernetes (GKE and on-prem)
- Cloud: Google Cloud Platform (GCP)
- Operations: Linux, networking, infrastructure monitoring
- Incident Tools: PagerDuty, ServiceNow, Slack (or equivalents)

Nice to have:
- Experience supporting application teams during SEV incidents
- Knowledge of capacity planning and performance tuning
- Scripting skills (Python, Bash, etc.)
- Experience with hybrid infrastructure environments

At Devsu, we believe in creating an environment where you can thrive both personally and professionally. By joining our team, you’ll enjoy:
- A stable, long-term contract with opportunities for career growth
- Private health insurance
- A remote-friendly culture that promotes work-life balance
- Continuous training, mentorship, and learning programs to keep you at the forefront of the industry
- Free access to AI training resources and state-of-the-art AI tools to elevate your daily work
- A flexible Paid Time Off (PTO) policy as well as paid holiday days
- Challenging, world-class software projects for clients in the US and LatAm
- Collaboration with some of the most talented software engineers in Latin America and the US, in a diverse work environment

Join Devsu and discover a workplace that values your growth, supports your well-being, and empowers you to make a global impact.
Senior DevOps Engineer
ChowNow
The only fair-for-all food ordering marketplace: no commissions for restaurants and no hidden fees for diners.
• As a Senior DevOps Engineer at ChowNow, you will be specifically responsible for building, improving, and growing our technology infrastructure.
• You will help design and implement reproducible processes in the enterprise environment as well as support the application production environment.
• You will own and support engineering user-facing technology as well as share responsibility for supporting the production operations.
Senior DevOps Engineer (Exol)
Exol
Symbotic is an automation technology leader reimagining the supply chain with its end-to-end, AI-powered robotic and software platform. Symbotic reinvents the warehouse as a strategic asset for the world’s largest retail, wholesale, and food & beverage companies, applying next-gen technology, high-density storage and machine learning to solve today's complex distribution challenges, and transforms the flow of goods and the economics of supply chain for its customers.
Who we are
With its A.I.-powered robotic technology platform, Symbotic is changing the way consumer goods move through the supply chain. Intelligent software orchestrates advanced robots in a high-density, end-to-end system – reinventing warehouse automation for increased efficiency, speed and flexibility.

What we need
Exol is seeking an experienced and security-focused Senior DevOps Engineer to join our growing team. In this role, you will bridge the gap between software development and IT operations, ensuring our infrastructure and software delivery pipelines are efficient, scalable, and secure. You will be the subject matter expert of our cloud infrastructure, heavily focused on Google Cloud Platform. You will work directly with software developers to automate software deployment workflows and infrastructure as code (IaC) pipelines, and to ensure high availability for our software services. This is a fast-paced environment where agility is key. We need someone who can not only write clean, modular Terraform code but also strategize on cloud architectures and operational excellence.

What we do
Exol* is pioneering fulfillment-as-a-service, offering outsourced warehousing operations and specializing in automated warehousing solutions. Our focus is on the efficient movement of goods in cases and pallets across all sectors, such as CPG, food and beverage, wholesale, and retail.
*Exol is an independently managed joint venture between Symbotic and SoftBank.

What you’ll do
- Infrastructure as Code (IaC): Design, build, and maintain production-grade cloud infrastructure using Terraform. You will be responsible for state management, module development, and ensuring our delivery pipelines are efficient, repeatable, and scalable.
- Cloud Architecture: Architect and deploy secure, scalable solutions on GCP (GKE, Cloud Run, Compute Engine, Cloud SQL, VPCs, etc.).
- CI/CD Implementation: Build and optimize CI/CD pipelines (e.g., GitHub Actions, GitLab CI, or Jenkins) to enable seamless code deployment from development to production.
- Multi-Cloud Strategy: Leverage your experience with other cloud providers (AWS or Azure) to assist with integrations, migrations, or disaster recovery strategies.
- Reliability & Monitoring: Implement robust monitoring, logging, and alerting solutions (e.g., Prometheus, Grafana, Google Cloud Operations Suite) to ensure system health.
- Collaboration: Act as an embedded consultant for the software development team, helping them containerize applications (Docker/Kubernetes) and troubleshoot issues.

What we need
- Bachelor’s degree in computer science or a related field preferred.
- Minimum 8 years of DevOps or Cloud Engineering experience, with multiple years working in GCP.
- Terraform Expertise: Deep proficiency in Terraform is non-negotiable. You must have experience writing custom modules, managing remote state, and preventing infrastructure drift.
- Cloud Versatility: Demonstrated experience with multiple cloud providers is required.
- Containerization: Strong experience with Docker or Kubernetes or GKE specifically.
- Scripting: Proficiency in Python, Go, or Bash for automation tasks.
- Environment: Proven track record working in a fast-paced software start-up environment; ability to context-switch and manage competing priorities effectively.

Preferred Qualifications
- GCP Professional Cloud Architect or DevOps Engineer certification.
- Experience with "GitOps" workflows (e.g., ArgoCD, Helm Charts).
- Proven knowledge of security compliance frameworks (SOC 2, ISO 27001) and deploying secure infrastructure.
- Experience deploying and managing database platforms and storage lifecycles.

Our Environment
- Travel could be up to 10% of the time. Employee must have a valid driver’s license and the ability to drive and/or fly to client and other customer locations.
- The employee is responsible for owning a credit card and managing expenses personally to be reimbursed on a bi-weekly basis.
- This is an in-warehouse role; you’ll spend time on the floor as well as in the office.
- Flexibility to work multiple shifts (day, swing, night) or be on call depending on operational demands.
- Ability to walk/stand for extended periods, climb stairs/ladders, and tolerate warehouse environmental conditions (temperature variations, noise, etc.).

About Symbotic
Symbotic is an automation technology leader reimagining the supply chain with its end-to-end, AI-powered robotic and software platform. Symbotic reinvents the warehouse as a strategic asset for the world’s largest retail, wholesale, and food & beverage companies. Applying next-gen technology, high-density storage and machine learning to solve today's complex distribution challenges, Symbotic enables companies to move goods with unmatched speed, agility, accuracy and efficiency. As the backbone of commerce, the Symbotic platform transforms the flow of goods and the economics of supply chain for its customers. For more information, visit www.symbotic.com.

We are a community of innovators, collaborators and pioneers who embrace our differences, because we know unique perspectives make us stronger and smarter. Every perspective matters. We depend on the collective voices of our employees, customers and community to help guide us as we build a better place to work – for you and the world. That’s why we’re proud to be an equal opportunity employer. We do not discriminate based on race, color, ethnicity, ancestry, religion, sex, national origin, sexual orientation, age, citizenship status, marital status, disability, gender identity, gender expression, veteran status, or genetic information.

The base range for this position in the posted location is $147,000.00 - $202,400.00; however, base pay offered may vary depending on job-related knowledge, skills, and experience. The compensation package includes medical, dental, vision, disability, 401K, PTO and/or other benefits.
Junior Dev Ops Engineer
BlueVoyant
BlueVoyant is a cloud-native cyber defense platform that delivers positive security outcomes that drive business results. The company converges external and int
Position: Junior Dev Ops Engineer
Location: Remote, US or Canada
Work Authorization: U.S. Citizenship required for all applicants (regardless of location)

About the Position:
BlueVoyant is seeking a Junior DevOps Engineer to join our Infrastructure Engineering team, responsible for building, operating, and scaling our multi-cloud, multi-region SaaS platform. In this role, you’ll support the systems and infrastructure that enable our services, from development through production, while growing your skills across the stack. You’ll gain hands-on experience with cloud infrastructure, CI/CD pipelines, observability tooling, Kubernetes, and software engineering. This role is ideal for someone early in their career who brings strong fundamentals, curiosity, and a willingness to learn. You’ll work alongside engineers from both software and systems backgrounds, receiving training, mentorship, and support as you develop.

About You:
You are an early-career engineer with a passion for automation, infrastructure, and improving operational reliability. You enjoy solving technical problems, learning new tools, and working collaboratively with experienced engineers. You ask good questions, take initiative, and thrive in environments where you can learn by doing. You don’t need to be an expert in everything; curiosity, strong fundamentals, and willingness to learn are most important. You are comfortable working across multiple domains: cloud services, CI/CD, container technologies, performance monitoring, and basic software engineering. You bring a customer-minded approach, strong fundamentals, and a desire to contribute to reliable, scalable systems.

Responsibilities:
- Reduce operational workload by automating repeatable tasks.
- Assist with deploying, supporting, and troubleshooting services in production.
- Improve CI/CD pipelines using GitLab and Helm.
- Contribute to cloud infrastructure using Terraform.
- Support Kubernetes clusters and containerized workloads.
- Create and maintain alerts and runbooks.
- Contribute to observability across logs, metrics, and traces.
- Read and write code to support internal tooling, applications, and services.
- Participate in an on-call rotation with full training and team support.

Qualifications:
- Bachelor’s degree in Computer Science or equivalent practical experience. (Candidates without degrees encouraged to apply.)
- 1+ year of experience working with production or production-like systems (projects, internships, co-ops accepted).
- Familiarity with at least one programming or scripting language (Python, Go, Java preferred; others acceptable).
- Working knowledge of Linux/Unix fundamentals and networking basics (DNS, TLS/SSL, HTTP).
- Some exposure to Kubernetes, Docker, and/or at least one major cloud provider (AWS, GCP, or Azure).
- Basic familiarity with Infrastructure-as-Code concepts and tools (Terraform).
- Ability to learn quickly, follow runbooks, collaborate effectively, and ask the right questions.
- Customer-focused approach to delivering reliable, scalable systems.

Preferred Qualifications:
- Experience with SQL databases and related data services such as PostgreSQL, Redis, Elasticsearch, or RabbitMQ.
- Working knowledge of AWS, Azure, or GCP networking (PrivateLink, transit gateways, VPC peering, firewalls).
- Familiarity with OpenTelemetry or other application performance monitoring tools.
- Understanding of Unix system internals.
- Relevant certifications (AWS/Azure/GCP, Kubernetes, Linux, networking, security) are beneficial but not required.

About BlueVoyant
At BlueVoyant, we recognize that effective cyber security requires active prevention and defense across both your organization and supply chain. Our proprietary data, analytics, and technology, coupled with deep expertise, work as a force multiplier to secure your full ecosystem. Accuracy! Actionability! Timeliness! Scalability!

Led by CEO Jim Rosenthal, BlueVoyant’s highly skilled team includes former government cyber officials with extensive frontline experience in responding to advanced cyber threats on behalf of the National Security Agency, Federal Bureau of Investigation, Unit 8200, and GCHQ, together with private sector experts. BlueVoyant services utilize large real-time datasets with industry-leading analytics and technologies. Founded in 2017 by Fortune 500 executives, including Executive Chairman Tom Glocer, and former government cyber officials, BlueVoyant is headquartered in New York City and has offices in Maryland, Tel Aviv, San Francisco, London, Budapest, and Latin America.

BlueVoyant uses AI-assisted tools within our applicant tracking system to help identify candidates whose experience and skills best match the requirements of a role. This technology provides hiring teams with additional insights to support fair and efficient hiring decisions. Please note that all applications are reviewed by a member of our hiring team, and final hiring decisions are made by humans, not AI. By submitting your application, you acknowledge that AI tools may assist in the evaluation of your resume as part of the recruitment process. For more information on how we process your personal data, please review our Candidate Privacy Notice available at https://www.bluevoyant.com/candidate-privacy-notice.

All employees must be authorized to work in the United States. BlueVoyant provides equal employment opportunities to all employees and applicants for employment without regard to race, color, religion, sex, national origin, age, disability or genetics. In addition to federal law requirements, BlueVoyant complies with applicable state and local laws governing non-discrimination in employment in every location in which the company has facilities.

Disclaimer: Please note that pursuant to contractual requirements and applicable law, in order for employees to perform work on some of the company’s federal contracts, U.S. citizenship is required. Accordingly, an employee’s ability to perform work on such contracts is contingent upon the company’s verification of the employee’s citizenship status. Furthermore, individuals may be subject to additional background checks and fingerprinting.

BlueVoyant Candidate Privacy Notice
To understand how we secure and manage your personal data upon submitting a job application, please see our Candidate Privacy Notice.



