Job Closed
This listing is no longer active.
Powering Change
Senior DevOps Engineer II
Location
United States
Posted
134 days ago
Salary
$133K - $147K / year
Seniority
Senior
Job Description
MetroStar
- Working in an agile environment as part of a full design and development team, participating in agile ceremonies, and interacting with and supporting our customer
- Continuously improving the client infrastructure and platform for users
- Supporting developer teams in deploying, maintaining, and troubleshooting web applications
- Deploying new CI/CD workflows and improving the performance of existing ones
- Serving as the subject matter expert for containerization of Node.js applications and DevSecOps, and providing relevant advice to team members and customers
Job Requirements
- 7+ years of DevOps experience
- Fluent in developing server-side web applications in Node.js using frameworks like Next.js, Express, or Meteor.js
- Experience integrating web applications with large-scale content management APIs like Drupal or Contentful in decoupled or headless architectures
- Experience with deploying containerized applications
- Experience deploying applications with Kubernetes-based deployment patterns on platforms such as EKS, AKS, VMware Tanzu, Red Hat OpenShift, or Rancher Government Services RKE2
- Experience with CI/CD and automation technologies such as GitLab CI, Jenkins or Bitbucket Pipelines
- Experience managing, storing, and updating artifacts in artifact repositories
- Experience deploying services on at least one of the three major cloud providers (AWS, Azure, GCP)
- Experience working with systems like Datadog or Elastic Cloud for monitoring application and infrastructure uptime and metrics
- Hands-on experience administering Linux operating systems (configuring services, cron, bash scripting, hardening, etc.)
- Must be able to obtain and maintain a Public Trust
Benefits
- Health, dental, and vision insurance
- 401(k) retirement plan with company match
- Paid time off (PTO) and holidays
- Parental leave and dependent care
- Flexible work arrangements
- Professional development opportunities
- Employee assistance and wellness programs
More DevOps Engineer Jobs
Senior Site Reliability Engineer – Infrastructure
Underdog Fantasy
Underdog Fantasy describes itself as one of the fastest-growing sports companies on the market, bringing "fun, approachable contests and games to the masses."
- Own and maintain the incident response process, including defining procedures, tools, and best practices
- Guide teams in establishing and monitoring Service Level Objectives (SLOs), including setting up alerts and reporting systems
- Lead capacity planning initiatives, focusing on both short and long-term scalability while optimizing costs
- Develop and implement disaster recovery plans, including regular testing and regulatory compliance
- Collaborate with teams on architecture decisions to ensure high availability and scalability
- Manage launch and event planning for high-traffic occasions, focusing on infrastructure preparation and capacity management (a.k.a. Launch Readiness)
- Act as an internal expert and consultant for monitoring tools like Datadog and PagerDuty and infrastructure like AWS and Kubernetes
- Emphasis on automation and tooling to scale our workload
- Contribute across codebases in Ruby, Python, Go, TypeScript, Swift, and Kotlin as needed to support the initiatives described above.

- Design, build, and operate reliable and scalable systems by defining and monitoring SLOs/SLIs
- Work directly on production infrastructure
- Collaborate closely with software engineers on system design and reliability improvements
- Actively develop automation for infrastructure and operational workflows to eliminate toil and reduce MTTR
- Participate in and lead incident response
- Drive blameless post-incident reviews with concrete follow-ups implemented in code and tooling
- Continuously analyze and optimize system performance and cost
- Provide data, insights, and recommendations to inform capacity planning
- Support security best practices through hands-on vulnerability remediation and threat mitigation
Senior Site Reliability Engineer
Hashgraph
Hashgraph, formerly Swirlds Labs, is a software company home to some of the brightest minds in web3.
- Help design, build, and integrate key product features for enterprise businesses built on Hiero, for our private distributed ledger technology
- Leverage distributed systems engineering experience, software development skills, and understanding of industry standard SRE and DevOps practices to deliver core platform services
- Contribute to a highly scalable, mission-critical infrastructure product used by some of the largest companies in finance, supply chain, and healthcare industries.
DevOps Engineer / Site Reliability Engineer
TWO95 International, Inc
Recruitment and Staffing Solution
**Job Title: Lead SRE (Site Reliability Engineer)**
**Location: Remote Work**
**Type: 6+ Month Contract to hire**
**Rate: $Open /hr.**

Please forward your updated resume to **deivy.malli@two95intl.com** and include your rate requirement, your contact details, and a suitable time when we can reach you.

**Responsibilities**
- Own uptime, SLAs, and overall reliability of cloud infrastructure and kiosks platform.
- Lead incident response, root-cause analysis, and drive actionable postmortems.
- Automate infrastructure, deployments, and operational tasks using modern IaC and scripting in collaboration with the Platform Engineering team.
- Maintain and improve monitoring, alerting, and observability (Grafana, Prometheus, New Relic, etc.).
- Manage, operate and recommend improvement of mo
- Execute and continuously improve disaster recovery and business continuity plans.
- Partner with platform engineering, QA, and development teams to ensure operational readiness.
- Establish and maintain runbooks, operational standards, and reliability best practices.
- Provide leadership, mentorship, and clear communication during both normal operations and incidents.
- Optimize cloud and Kubernetes environments for reliability, performance, and scalability.




