Job Closed
This listing is no longer active.
Workflow automation and experience capture platform.
Platform Engineer – Cloud Infrastructure, Reliability
Location
United States
Posted
137 days ago
Salary
$125K - $135K / year
Seniority
Senior
Job Description
PacerPro
• Gain a solid understanding of our deployment and network architecture.
• Build infrastructure for various environments (Heroku, AWS) using a variety of tools.
• Build observability tools that help our development team better understand the system, identify bottlenecks, and assess and mitigate risks and vulnerabilities.
• Automate operational tasks to reduce repetitive manual work and human error.
• Contribute to the migration of our applications and infrastructure from Heroku to AWS, with a focus on reliability, performance, and smooth transitions.
• Participate in an on-call rotation to respond to production incidents, reduce MTTR, and achieve SLA/SLO/SLI targets.
• Participate in incident postmortems and perform root cause analysis.
• Work with cross-department stakeholders to implement IT and Engineering policies for SOC2 compliance.
• Adhere to best practices in coding standards, code test coverage, code reviews, CI/CD, and documentation.
Job Requirements
- BS in Computer Science, Computer Engineering, Information Technology, or another computer-related discipline.
- Experience in system design, iterative development, and deployment of new services.
- Experience using one or more IaaS/PaaS solutions (e.g., Heroku/AWS/GCP/Azure).
- Proficiency in one or more scripting languages (e.g., Ruby/Python/Go).
- Knowledge of Linux OS capabilities.
- Exposure to one or more Infrastructure as Code technologies (e.g., Terraform/Ansible/Salt).
- Experience with one or more observability technologies (e.g., Logentries/New Relic/Rollbar/Prometheus/Grafana/Datadog/Zabbix).
- Experience with CI/CD systems (e.g., CircleCI/Jenkins/GitHub/GitLab CI).
- Ability to effectively communicate and collaborate with both technical and non-technical team members.
- Flair for a startup work style.
- Experience with relational databases at scale is a plus.
- Experience with Docker and Kubernetes clusters is a plus.
More Cloud Engineer Jobs
Intern, Cloud Engineer
Southern Poverty Law Center
The SPLC is a catalyst for racial justice in the South and beyond
• Assist in provisioning and configuring virtual machines, storage, and network resources.
• Gain hands-on experience with Azure, including compute, storage, and networking.
• Help maintain cloud infrastructure, including monitoring and troubleshooting.
• Provide assistance in managing Active Directory.
• Create and read reports, and present findings.
• Learn cloud security and how it relates to cyber security.
Staff Software Engineer, Cloud Infrastructure
MyFitnessPal
Unlock your healthy and find your happy with MyFitnessPal.
• Lead major technical initiatives, researching and guiding the implementation of high-quality technical solutions that improve MyFitnessPal’s developer and user experience
• Work collaboratively with cross-functional peers to select for and solve the most impactful problems for the larger organization
• Identify requirements for building an infrastructure that supports rapid, autonomous, and secure delivery of production changes
• Mature the Infrastructure team’s capabilities through automation and efficient architectural design
• Coach team members to grow technical skills and increase the success of the team
• Provide thought leadership on industry best practices around design, testing, security, and deployment
• Turn big ideas or complex problems into simple, elegant solutions
• Assess complex environments quickly to determine issues or potential issues
• Excellent attention to detail while staying focused on the big picture (a successful project)
• Effectively and professionally communicate with all levels of an organization
• Motivated to work efficiently and resolve issues as they arise
• A desire to improve skillset by earning industry certifications (we pay)
• Good communicator (from interdepartmental to client C-level to end user)
• Top-notch troubleshooting skills
• Great documentation
• Takes ownership of projects to completion
• Ability to manage multiple priorities and follow through on projects to completion
• Design, implement, and automate components of large-scale distributed cloud systems.
• Implement and support PAM solutions primarily on OpenStack, ensuring secure and reliable access management.
• Build tools, automation, and workflows to improve availability, scalability, latency, and operational efficiency.
• Work closely with engineering and delivery teams to deploy high-quality software in a fast-paced environment.
• Monitor production and development environments and implement preventive and corrective measures to ensure platform reliability.
• Participate in incident response, debugging, and root cause analysis for production issues.
• Collaborate across teams to deliver consistent and reliable solutions aligned with.
• Document designs, operational procedures, and troubleshooting guides clearly and effectively.
• Contribute to improvements in reliability metrics such as availability, MTTD, and MTTR.




