Job Closed
This listing is no longer active.
Hardware collaboration platform 🤖 Inspired by software development principles
Senior Infrastructure Engineer
Location
California | Massachusetts
Posted
108 days ago
Seniority
Senior
Job Description
Senior Infrastructure Engineer
AllSpice
- Work closely with a customer success manager to manage enterprise deployments and SSO integrations
- Work closely with application developers to deploy infrastructure solutions to product problems
- Automate new software deployments, data backups, and data recovery
- Monitor and improve the performance and availability of our cloud infrastructure
- Take part in an on-call rotation and incident response
- Manage process for SOC 2 compliance & pentesting.
Job Requirements
- 6+ years of cloud infrastructure experience
- Bachelor's degree or higher in a technology-related field
- Ability to lead and collaborate with varied stakeholders, from engineers to customers, to move projects forward
- Can think in terms of the big picture but deliver on the details
- Ability to manage ambiguity gracefully, with the autonomy and confidence to be self-directed.
Benefits
- Supportive and smart colleagues
- Flexible work
- Opportunity to make a big impact
- Competitive salary & equity
- Health
- Dental
- Vision
- Generous PTO
- Home office stipend.
More Infrastructure Engineer Jobs
Here at Scout Motors, we're carrying forward the heritage of one of the most iconic American vehicles in history. A vehicle dating back to 1960. One that forged the path for future generations of rugged SUVs and trucks and will do so once again. But Scout is more than just a brand, it’s a legacy steeped in a culture of exploration, caretaking, and hard work. The Scout brand is all about respect. Respect for the past and the future by taking an iconic American brand that hasn’t been around for a while, electrifying it, digitizing it, and loading it with American innovation. Respect for communities by creating a company that stands for its people and its customers. Respect for both work and play, with vehicles that are equally at home at a camp site, a job site, or on a Tuesday commute. And respect for our customers by developing two powertrains that meet their requirements: an all-electric powertrain as well as the Harvester™ range extender powertrain, which includes a built-in gas-powered generator with an estimated 500 miles of combined range.
What you’ll do
Become part of an iconic brand that is set to revolutionize the electric pick-up truck & rugged SUV marketplace by achieving the following:
- Install, configure, and maintain AI platforms and tools.
- Design, deploy, and operate infrastructure supporting ML and LLM workloads.
- Build, containerize, and deploy AI services using Docker.
- Build and maintain Machine Learning CI/CD pipelines for training, testing, and deployment.
- Deploy and manage AI services across cloud and on-prem environments.
- Automate infrastructure provisioning using Terraform and Infrastructure-as-Code best practices.
- Monitor system performance, reliability, and scalability of AI platforms.
- Implement observability for agentic AI systems to ensure reliability, transparency, and debuggability.
- Perform regular security updates, patching, access control, and vulnerability remediation.
- Integrate AI tools with APIs, data platforms, and internal systems.
- Research, evaluate, and prototype new AI, LLM, and MLOps tools.
- Troubleshoot infrastructure, data, and deployment issues.
- Research, test, and prototype emerging AI technologies.
- Document system architecture, deployment processes, and operational runbooks.
Location & Travel Expectations
This role will be based out of the Scout Motors corporate headquarters in Charlotte, NC. This role may be remote to start but will transition to an in-office setting at the headquarters within 3-6 months of start date. This role is not eligible for remote work in New York City. This role requires 4-5 days per week in the office, with regular in-person meetings and events. Applicants should expect that the role will require the ability to convene with Scout colleagues in person and travel to participate in events on behalf of the company from time to time.
What you’ll bring
We expect all Scout employees to have integrity, curiosity, and resourcefulness, and to strive to exhibit a positive attitude as well as a growth mindset. You’ll be comfortable with change and flexible in a fast-paced, high-growth environment. You’ll take a collaborative approach to achieve ambitious goals. Here's what else you'll bring:
- Bachelor's degree in computer science, information technology, or related field or equivalent work experience.
- 7+ years of practical experience with LLM platforms, LLMOps, and MLOps.
- Experience deploying and maintaining AI or data platforms in production.
- Hands-on experience with AI/automation tools.
- Experience designing and orchestrating agentic automation workflows.
- Strong experience with Terraform and Infrastructure-as-Code.
- Proven experience deploying applications using Docker (Dockerfiles, image optimization, container lifecycle).
- Expertise with cloud platforms such as AWS.
- Expertise with CI/CD tooling (GitHub Actions, GitLab CI).
- Strong knowledge of Kubernetes and orchestration.
- Solid understanding of networking.
- Experience supporting high-availability, production AI systems.
- Scripting skills (Python, PySpark, Bash).
- Knowledge of security best practices, patch management, and access controls.
- Collaborate with engineering, data, and product teams to support AI initiatives.
- Excellent problem-solving and troubleshooting skills. When a problem occurs, you run towards it, not away.
- Effective communication and collaboration skills. You treat colleagues with respect.
- You have a desire for clean implementations but are also humble in discussing alternative solutions and options.
- A teaching and coaching approach to guiding engineers and teams in approaches.
- Minimum of High School Diploma, GED or equivalent required for all roles at Scout Motors, Inc.
What you'll gain
The benefits of joining Scout include the chance to build products and a company from the ground up. This is a chance to create something new and lasting, with an iconic brand at its foundation. In addition, Scout provides competitive compensation and benefits to support your physical, mental, and financial wellbeing. Program specifics are detailed in company policies and employee benefit guides. Select highlights:
- Competitive insurance including: Medical, dental, vision and income protection plans
- 401(k) program with: An employer match and immediate vesting
- Generous Paid Time Off including: 20 days planned PTO, as accrued; 40 hours of unplanned PTO; and 14 company or floating holidays, annually
- Up to 16 weeks of paid parental leave for biological and adoptive parents of all genders
- Paid leave for circumstances related to bereavement, jury duty, voting time, or military leave
Pay Transparency
This is a full-time, exempt position eligible to receive a base salary and to participate in an annual performance bonus program. Final salary offered will be determined based on factors including but not limited to the candidate's skills and experience. The annual performance bonus program is preset and not candidate dependent.
Initial base salary range = $120,000.00 - $145,000.00
Internal leveling code: IC8
Notice to applicants:
- Residing in San Francisco: Pursuant to the San Francisco Fair Chance Ordinance, Scout Motors will consider for employment qualified applicants with arrest and conviction records.
- Residing in Los Angeles: Scout Motors will consider for employment qualified applicants with criminal histories in a manner consistent with the Los Angeles Fair Chance Initiative for Hiring Ordinance.
- Residing in New York City: This role is not eligible for remote work in New York City.
Equal Opportunity
Scout Motors is committed to employing a diverse workforce and is proud to be an Equal Opportunity Employer. Qualified applicants will receive consideration without regard to race, color, religion, sex, national origin, age, sexual orientation, gender identity, gender expression, veteran status, disability, pregnancy, or any other characteristics protected by law. Scout Motors is committed to compliance with all applicable fair employment practice laws. If you require reasonable accommodation to complete a job application, pre-employment testing, or a job interview or to otherwise participate in the hiring process, please contact ScoutAccommodations@scoutmotors.com.
Senior Infrastructure Engineer
Pinpoint Applicant Tracking System
Pinpoint is the ATS that makes complex hiring simpler.
- Improve monitoring and alerting across infrastructure and application layers
- Diagnose and reduce production instability, including load spikes and database bottlenecks
- Strengthen our use of Datadog, particularly logging quality and alert signal-to-noise ratio
- Improve capacity planning through testing, monitoring, and forecasting
- Ensure new features ship with appropriate production metrics and reliability safeguards
- Make pragmatic, risk-aware infrastructure decisions that prioritise stability and customer impact
- Implement and maintain best practices across infrastructure security, compliance, and vulnerability management
- Participate in on-call rotations, incident response, and post-incident analysis
- Maintain clear and up-to-date infrastructure documentation
- Improve our CI/CD pipeline and overall infrastructure performance
Lead Site Reliability Engineer - Infrastructure
Milestone Systems A/S
At Milestone Systems, we are dedicated to making the world see. As a leading provider of data-driven video technology software, we empower people, businesses, and societies with innovative solutions that enhance security, efficiency, and insight.
We are seeking a Lead Site Reliability Engineer (Infrastructure) to join our fast-moving VSaaS engineering organization. This role carries responsibility for technical leadership and operational execution of the Infrastructure SRE team. You will own the reliability, scalability, and operability of our shared platform and production systems, while shaping how reliability engineering and SRE practices are applied across the organization and mentoring senior and staff engineers. You will work closely with product engineering and platform teams to ensure a seamless developer experience, while setting standards, driving priorities, and leading by example during incidents and high-impact operational work. This role requires a strong technical background in cloud infrastructure, distributed systems, CI/CD, and GitOps, along with hands-on development experience in Golang and/or Python, to improve developer workflows, automation, and long-term system reliability.
This is a remote role in the United States.
Role Overview
Site Reliability Engineer - Infrastructure
The Infrastructure team provides leadership, direction, and accountability for platform architecture, system design, and end-to-end implementation to meet and exceed product non-functional requirements, including quality, security, reliability, availability, and performance. Site Reliability Engineers enable Product Development teams to ship features with reliable velocity by owning the stability, scalability, and operability of the underlying infrastructure and shared services.
What You Will Do:
As a Lead Site Reliability Engineer, you will:
- Operate and evolve large-scale distributed systems, anticipating failure modes and proactively mitigating risks across production environments, while owning day-to-day production operations, including monitoring, alert triage, incident response, post-incident analysis, and critical incident coordination and documentation.
- Lead the design, build, and implementation of automation, orchestration, and operational tooling to improve efficiency, reliability, and signal-to-noise ratio and to reduce recurring issues, minimizing service-impacting events.
- Set technical direction and influence platform strategy by defining platform architecture, system design, and documentation to guide development, testing, deployment, and long-term maintenance of complex distributed systems.
- Establish and enforce standards, operational rigor, and best practices for deploying, monitoring, managing, and operating cloud-native and distributed infrastructure environments.
- Lead the adoption and execution of modern CI/CD, GitOps, and cloud-native infrastructure practices, ensuring reliable, scalable, and traceable software and infrastructure releases.
- Mentor and develop senior and staff engineers, reinforcing SRE principles, DevOps practices, accountability, and operational excellence across the Infrastructure SRE team.
- Collaborate closely with product and engineering stakeholders, advocating for an SRE mindset and system-level thinking to maximize reliability, performance, availability, security, and scalability across shared platforms and services.
- Other duties as assigned are absorbed into the above ownership and operational responsibilities.
What You Have:
- 10+ years of experience in site reliability engineering, infrastructure, or systems engineering, with deep ownership of large-scale production systems and demonstrated leadership of SRE or infrastructure teams, including setting technical direction and mentoring senior engineers.
- Strong hands-on experience designing and building automation and operational tooling using Golang and/or Python, with expert-level proficiency in Linux/Unix systems, shell scripting, and production troubleshooting.
- Advanced expertise in cloud-native and IaaS architectures, distributed systems, and container orchestration in production environments, including compliance, security, and network considerations.
- Expertise in architecting modular Terraform frameworks and Infrastructure-as-Code (IaC) design patterns.
- Deep understanding of SRE and DevOps principles, including incident management, SLA/SLO ownership, automation, reliability engineering practices, and leading incident response with post-incident analysis and preventive improvements.
- Strong experience with CI/CD pipelines, GitOps workflows, release tooling, and modern cloud-native infrastructure practices, ensuring reliable and traceable software and infrastructure changes.
- Hands-on experience operating Docker and Kubernetes environments, observability platforms (logging, monitoring, alerting), and SQL/NoSQL databases (e.g., Postgres, MongoDB, Graph DB), including performance tuning and operational troubleshooting.
Skills / Training Desired
- Subject matter expertise in Google Cloud preferred; experience with other public cloud providers is also valuable.
- Demonstrated expertise in microservices lifecycle management, including integration, testing, deployment, and operational best practices, supported by advanced knowledge of software release tooling and CI/CD platforms such as GitLab, Jenkins, Cloud Build, ArgoCD, and Spinnaker.
- Deep understanding of the Docker and Kubernetes ecosystem, including orchestration, cluster management, and image lifecycle optimization.
- Strong experience with observability, logging, and monitoring tools such as ELK Stack, Prometheus, Stackdriver, Datadog, New Relic, or Dynatrace.
- Hands-on experience with algorithms, data structures, complexity analysis, and software/system design for large-scale distributed environments.
- Experience driving automation for operational efficiency, signal noise reduction, recurring issue mitigation, performance testing, capacity planning, and system optimization in production environments.
- Experience implementing security best practices and compliance considerations in infrastructure and platform design, along with the ability to influence cross-functional teams, evangelize SRE and DevOps practices, and foster a culture of reliability and operational excellence.
Why Milestone?
Milestone offers not only great benefits but also great culture. Employees here have flexible work environments, opportunities for further education, and the ability to effect change in our organization directly.
The annual salary for this position ranges from $160,000 to $180,000. Pay is based on the level, location, complexity, responsibility, and job duties of the specific position and is just one component of Milestone’s total compensation package. Additionally, we offer an attractive benefits package that includes medical/dental benefits, FSA or HSA, 401k with 6% Safe Harbor employer match, paid parental leave, generous PTO (20 days' vacation, 10 days paid sick time, and 12 company holidays), fully paid Short Term disability policy, fully paid Long Term disability policy, and Life Insurance. If you are selected for an interview, please feel welcome to speak to our Talent Partner about our compensation philosophy.
All employees must complete a background check. Employees in fiscal roles are also required to undergo a credit check. All information obtained during these checks is handled confidentially and shared only with authorized personnel.
Milestone is committed to creating a diverse and inclusive workplace and is proud to be an equal opportunity employer.
Contact and application
Please apply at our website: www.milestonesys.com
We are looking forward to receiving your application.
Our Vision: Machines Will Be Our Future Workforce
At MachineFi Lab, we're not just envisioning the future; we're actively building it today. We power the new reward economy by fostering a fairer, safer, and more rewarding Internet of Things (IoT). Central to our mission is the concept of Decentralized Physical Infrastructure Networks (DePIN), a paradigm shift leveraging blockchain technology for capital formation and human coordination on a global scale. By enabling contributions to real-world infrastructure, spanning wireless, mobility, compute, energy, storage, and beyond, we empower individuals to invest in and shape the foundation of our future society. Leveraging our cutting-edge blockchain infrastructure, a robust suite of DePIN Modules, and expertise in crafting blockchain-integrated devices, MachineFi stands at the forefront of the DePIN revolution.
Are you a maverick? A digital renegade? Are you someone who challenges the status quo, believing, against all odds, that you can change the world? If so, MachineFi is for you. Join us, and be part of the movement shaping the infrastructure of tomorrow.
As an Engineering Manager, you will be responsible for the team’s productivity, happiness, and growth. You will be setting and overseeing the team’s goals with the technical lead and contributing to the overall infrastructure strategy. In addition, we would expect you to have prior experience with decentralized systems, blockchain, Golang, cloud infrastructure, and Kubernetes.
WHAT YOU’LL ACHIEVE:
- You will help set the team objectives and make sure that the team is clear on what the objectives are and why
- You will provide leadership, management, and technical vision for the software engineering team
- You will work with engineers on their growth and development, making sure that everyone receives timely feedback and has a clear growth plan
- You will continuously remove blockers and obstacles, both internal and cross-functional, so that everyone can do their best work
WHAT YOU’LL NEED TO BE SUCCESSFUL:
- 5+ years of experience in project or program management, ideally in startups; proven track record of managing highly successful projects strongly preferred
- 2+ years of experience leading or managing SWE/SRE teams and 5+ years of engineering experience overall
- Be able to lead and mentor both junior and senior engineers
- Experience with modern infrastructure, having worked as an SWE/SRE engineer before moving into management
- Experience managing remote teams and strong experience coordinating across time zones. Our team is fully remote and highly distributed geographically, and so are our stakeholders
- Strong passion for blockchain, MachineFi, and the overall future of Web3. It will be hard to be successful in this role if you don't understand what we are doing and why.
Our Stack
Golang, TypeScript, Solidity, Postgres, GitHub, Kubernetes, GCP
About MachineFi and Our Culture:
MachineFi Lab, IoTeX’s core developer, is a leading tech provider for Decentralized Physical Infrastructure Networks (DePIN), a Web3 category predicted to become a multi-trillion-dollar economy powered by billions of smart devices and trillions of sensors. Its team of over 60 research scientists and engineers released W3bstream, the world's first decentralized off-chain compute framework for smart devices and real-world data. It aims to provide advanced middleware and tools for Web2 businesses connecting to Web3 token incentives with real-world activity confirmed by user-owned smart devices, unlocking new business opportunities through its Proof-of-Anything technology, which can be used with several data sets, such as location, activity, and humanity.
MachineFi Lab offers easy-to-use tools for the creation of X-and-earn scenarios, such as play-and-earn, walk-and-earn, or sleep-and-earn, and of community-owned machine networks, such as smart cities, public utilities, and other physical infrastructure. Backed by nearly 20 prominent VCs, including Samsung Next, Jump Crypto, Draper Dragon, Xoogler Ventures, IOSG, Wemade, and Escape Velocity, MachineFi Lab is building advanced technology to bring the metaverse into the real world, and vice versa.


