Job Closed

This listing is no longer active.

TYLsemi

Silicon for AI Infrastructure

IT & Infrastructure Engineer

Infrastructure Engineer, Full Time, Remote, Mid Level, Team 11-50, Since 2026

Location

India

Posted

89 days ago

Seniority

Mid Level

AI Observability/Monitoring AWS Linux AI/ML Python Shell NFS

Job Description

Role Description

We are looking for a hands-on IT & Infrastructure Engineer to support and operate the compute, network, and EDA environments required for complex SoC design across digital and analog domains. This role will work closely with the IT & Infrastructure Architect to ensure reliable day-to-day operations while building scalable systems for EDA workflows, cloud infrastructure, and AI-enabled engineering environments.

Key Responsibilities

EDA & Engineering Support
- Install, configure, and maintain EDA tools and environments (Synopsys, Cadence, Siemens/Mentor)
- Support engineers with:
  - Tool setup issues
  - Environment/debug problems
  - Flow execution challenges
- Assist in EDA license management:
  - Monitoring usage
  - Basic forecasting inputs
  - Troubleshooting license issues

Compute & Systems Operations
- Manage and maintain compute servers, clusters, and storage systems
- Monitor system health, performance, and utilization
- Support job schedulers (LSF, Slurm, etc.) and ensure smooth execution of workloads
- Assist in managing cloud infrastructure (AWS or similar):
  - Instance setup and scaling
  - Basic cost tracking and optimization
- Execute tasks related to cloud vs on-prem workloads under guidance

Network & IT Operations
- Support network configuration and troubleshooting
- Manage:
  - Linux systems and user environments
  - Access control and permissions
  - Backup and data management processes
- Ensure uptime and responsiveness of infrastructure for engineering teams

AI Infrastructure Support
- Assist in deployment and maintenance of AI/ML tools and platforms
- Help manage:
  - API access and token usage
  - Resource allocation for AI workloads
- Support implementation of AI usage policies and guardrails

Automation & Tooling
- Write scripts (Python/Bash) to:
  - Automate routine tasks
  - Improve system efficiency
  - Simplify engineering workflows
- Contribute to building repeatable and scalable infrastructure processes

Qualifications
- Bachelor’s degree in Computer Science, IT, Electronics, or related field
- 3–7 years of experience in IT systems, infrastructure, or DevOps roles
- Strong working knowledge of:
  - Linux system administration
  - Basic networking concepts
  - Scripting (Python, Bash, or similar)
- Exposure to:
  - Compute clusters or server environments
  - Cloud platforms (AWS preferred)
- Strong problem-solving and debugging skills

Preferred Qualifications
- Exposure to EDA environments (even at a basic level)
- Familiarity with job schedulers (LSF, Slurm)
- Experience supporting engineering teams or technical workloads
- Basic understanding of AI/ML infrastructure or tools
- Knowledge of storage systems (NFS, NAS, etc.)

Key Attributes
- Strong execution focus and willingness to get hands dirty
- High responsiveness and support mindset toward engineering teams
- Eagerness to learn EDA and semiconductor workflows
- Attention to detail and reliability
- Ability to work in a fast-paced startup environment

Success Metrics
- Fast resolution of infrastructure and tool issues
- High system uptime and reliability
- Smooth execution of EDA workflows and regressions
- Improved efficiency through automation
- Strong support satisfaction from engineering teams

Growth Path

This role is designed to grow into Senior Infrastructure Engineer, or Infrastructure/Platform Architect, with deeper ownership of EDA, cloud strategy, and AI platforms.

More Infrastructure Engineer Jobs

ML Infrastructure Engineer

Later

Headquartered in Vancouver, British Columbia, Canada, Later is a visual content marketing solutions firm dedicated to helping clients create successful campaign

Infrastructure Engineer, 89 days ago

Full Time, Remote

Title: ML Infrastructure Engineer
Location: Los Angeles, California; Boston, MA; Vancouver, BC; Chicago, IL; and Vancouver, WA.
Work Type: Remote

Job Description: Later is the world's most intelligent influencer marketing company, built to give brands the confidence to create unforgettable campaigns. By combining real creator relationships, trusted intelligence, and expert guidance, Later removes fear and guesswork from one of marketing's most visible investments. Built on a native, AI-powered platform and more than a decade of proprietary data (including billions of social interactions, impressions, and $2.4B+ in verified influencer-driven purchases), Later helps teams understand what will work before they launch. By combining trusted insight with expert guidance, Later removes guesswork from influencer marketing, enabling brands to choose the right creators, execute fully managed campaigns, and drive meaningful growth across awareness, engagement, and revenue. Trusted by leading enterprise brands including Nike, Wayfair, Unilever, and Southwest Airlines, Later bridges creativity and performance so campaigns don't just look good; they deliver results. Learn more at later.com.

About this position: We're looking for a Machine Learning Infrastructure Engineer to join our growing Data & Platform team and build the foundation that powers our AI and machine learning capabilities across Later's product portfolio. As our first dedicated ML Infrastructure Engineer, you will own the systems that support model experimentation, training, deployment, and monitoring at scale. This role is critical to accelerating our data science initiatives and enabling future AI innovation. You'll design and operate reliable, secure, and scalable ML infrastructure that empowers data scientists and engineers to ship high-impact models with confidence. If you're excited about building robust ML systems in a fast-moving environment, and want to define the standard for ML Ops at Later, this is your opportunity.

What you'll be doing:

Strategy
- Define and own the long-term ML infrastructure roadmap, ensuring it supports both current experimentation needs and future AI initiatives.
- Establish best practices for model lifecycle management, deployment standards, monitoring, and governance.
- Identify infrastructure gaps and proactively design scalable solutions to enable high-velocity ML development.
- Contribute to cross-functional technical planning, ensuring ML systems align with product and platform strategy.

Technical / Execution
- Design, build, and maintain production-grade model deployment and inference systems using CI/CD pipelines, containerized services (Docker), and API frameworks (e.g., Flask).
- Automate end-to-end ML lifecycle workflows including training pipelines, model validation, registry management, deployment, and rollback strategies.
- Implement robust monitoring systems for model performance, latency, drift detection, and infrastructure health using tools such as CloudWatch, Prometheus, and Grafana.
- Operate across AWS and GCP environments to manage training and inference workloads, including GPU-based infrastructure and BigQuery datasets.
- Develop and maintain infrastructure-as-code (Terraform, CloudFormation) to ensure scalable, repeatable, and secure cloud environments.
- Implement and optimize CI/CD workflows (e.g., GitHub Actions, GitLab CI, Bitbucket Pipelines) for ML and infrastructure automation.

Team / Collaboration
- Partner closely with Data Scientists, Analysts, Platform Engineers, and Product Engineers to support end-to-end ML workflows.
- Translate data science experimentation needs into production-ready infrastructure solutions.
- Serve as the technical bridge between ML experimentation and productized deployment.
- Share knowledge and best practices to elevate ML maturity across teams.

Research / Best Practices
- Stay current on emerging ML Ops practices, tools, and frameworks to continuously improve system reliability and efficiency.
- Evaluate and implement model-serving frameworks (e.g., TorchServe, Seldon, TensorRT) where appropriate.
- Contribute to governance, reproducibility, and auditability standards for ML systems.
- Experiment with new tooling and workflows to improve reproducibility, performance, and developer velocity.

What success looks like:
- ML models move from experimentation to production quickly and reliably, with minimal manual intervention.
- CI/CD pipelines enable safe, repeatable deployments with clear rollback strategies.
- Model performance, drift, and infrastructure health are proactively monitored and observable.
- Infrastructure supports scalable GPU training and real-time inference without bottlenecks.
- Data scientists report improved velocity, reproducibility, and confidence in deploying models.
- ML systems are secure, compliant, and aligned with evolving product and AI strategy.

What you bring:
- 4+ years of experience in ML Ops, ML infrastructure, backend engineering, or related roles supporting production ML systems.
- Experience working in cloud-native environments (AWS and/or GCP) with hands-on deployment of ML workloads.
- Proven track record designing and implementing CI/CD pipelines for ML systems.
- Strong experience with Amazon SageMaker, Docker, Flask-based APIs, and infrastructure automation tools.
- Hands-on experience with ML lifecycle tooling such as MLflow, SageMaker Studio, or Weights & Biases.
- Experience managing container orchestration platforms (Kubernetes, EKS, or GKE).
- Strong programming experience in Python (additional experience in Go, Java, or Scala is a plus).
- Experience working with infrastructure-as-code tools such as Terraform or CloudFormation.
- Familiarity with observability tools such as CloudWatch, Prometheus, Grafana, Datadog, or centralized logging platforms.
- Experience managing GPU-based workloads and scaling training/inference systems.
- Familiarity with data infrastructure tools such as BigQuery and cloud-native data pipelines.
- Bonus: Experience supporting LLMs or generative AI pipelines, distributed training systems, feature stores (e.g., Feast), real-time inference systems, or ML governance frameworks.
- A mindset focused on automation, reliability, performance, and continuous improvement in fast-scaling environments.

How you work:
- Driven by Impact: You deliver results that matter, prioritizing high-value work, meeting deadlines, and adapting quickly while keeping outcomes clear.
- Strategic & Customer-Centric: You anticipate risks and opportunities, connect decisions to long-term growth, and build trust through proactive insights.
- Curious & Growth-Oriented: You seek knowledge, ask sharp questions, and apply learnings fast, challenging the status quo with a mindset of improvement.
- Collaborative & Resilient: You thrive in change by staying resourceful, solution-focused, and positive, removing roadblocks, sharing insights, and keeping morale high.
- Accountable & Honest: You own your work, hold yourself and others to a high bar, and use transparent feedback to drive growth.
- Emotionally Intelligent: You build trust through empathy and collaboration, foster inclusion, and inspire others with grit, optimism, and integrity.

Our approach to compensation: We take a market-based & data-driven approach to compensation. We leverage data from trusted third-party compensation sources to help us understand the market value of a role based on function, level, geographic location, and scope. We evaluate compensation bi-annually, including performance and market-related factors. Our salaries are benchmarked against market Total Cash Compensation for the geographic location of our job posting. Compensation for some roles is structured as On Target Earnings (OTE = base + commission/variable) while for others it is structured as Salary only. To comply with local legislation and ensure transparency, we share salary ranges on all job postings. Skills, experience and other factors help determine the final salary we offer, which may vary from the original range posted. Additionally, all permanent team members are eligible to participate in various benefits plans as part of their overall compensation package.

Salary Range: $145,000 - $165,000

Where we work: We have offices in Boston, MA; Vancouver, BC; Chicago, IL; and Vancouver, WA. For select positions, we are open to hiring fully remote candidates. We post our positions in the location(s) where we are open to having the successful candidate be located.

California + 4 more

$145K - $165K / year

Software Engineer - Controls Infrastructure

Apptronik

Established in 2016 as an offshoot of the Human Centered Robotics Lab at the University of Texas at Austin, Apptronik focuses on creating general-purpose humano

Infrastructure Engineer, 89 days ago

Full Time, Hybrid

Title: Software Engineer – Controls Infrastructure
Location: Austin, TX

Job Description: Apptronik is a human-centered robotics company developing AI-powered robots to support humanity in every facet of life. Our flagship humanoid robot, Apollo, is built to collaborate thoughtfully with people, starting with critical industries such as manufacturing and logistics, with future applications in healthcare, the home, and beyond. We operate at the cutting edge of embodied AI, applying our expertise across the full robotics stack to solve some of society's most important problems. You will join a team dedicated to bringing Apollo to market at scale, tackling complex challenges like safety, commercialization, and mass production to change the world for the better.

Employer: Apptronik, Inc.
Location: 11701 Stonehollow Drive, Suite 150, Austin, TX 78758 (May work from home on a hybrid schedule within commuting distance of the Austin, TX office. Requires up to 10% domestic and international travel to various unanticipated client sites.)

Duties:
- Deliver state-of-the-art planning and controls algorithms on high performance humanoid robot hardware.
- Characterize and improve the quality of robot locomotion and manipulation.
- Collaborate with Autonomy, Perception, and other teams to enable a broad range of robot behaviors.
- Implement rigorous unit and integration testing of control algorithms and implementation.
- Maintain comprehensive and accurate architecture and design documentation.
- Deliver reliable software through code reviews, continuous integration, and automated testing.

Minimum Requirements: Master’s degree (or foreign equivalent) in Electrical Engineering, Computer Engineering, Mechanical Engineering, Robotics Engineering, or a closely related field and two (2) years of experience in a related occupation with the following:
- implementing high-performance model-based or model-free controls of dynamic robots;
- troubleshooting hardware including high torque electric motors, cameras and LiDAR sensors for robot perception, end-effector for grasping and manipulation, as well as linear and rotary joints;
- applying robotics fundamentals including kinematics, dynamics, controls, and estimation;
- utilizing modern C++ and object-oriented programming in a Linux development environment;
- using standard CI tools such as Git while following rigorous documentation and testing standards; and
- working with common robotics and controls packages including ROS, URDF, MuJoCo, and Eigen.

To Apply: Please visit our careers page at https://apptronik.com/careers.

*This is a direct hire. Please, no outside Agency solicitations.

Apptronik provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.

AI Apollo CI/CD C++ Linux Git

Texas

IT Infrastructure Specialist

Trexon

Designed for Durability. Engineered for Excellence.

Infrastructure Engineer, 90 days ago

Full Time, Remote, Team 501-1,000, H1B No Sponsor

• Provide day-to-day IT support, including Service Desk coverage and resolution of escalated issues
• Troubleshoot hardware, software, network, and system issues
• Perform system upgrades, device imaging, and patch management
• Support onboarding/offboarding and end-user system setup
• Administer, maintain, and monitor servers, networks, and infrastructure systems
• Support Active Directory, virtualization (VMware), and Windows Server environments
• Perform network maintenance, monitoring, and troubleshooting
• Manage hybrid cloud environments (Azure/AWS)
• Develop, maintain, and support enterprise architecture standards and guidelines
• Translate business requirements into scalable, secure, and cost-effective IT solutions
• Evaluate and recommend emerging technologies for adoption
• Assist in long-term infrastructure planning and roadmap development
• Support solution design and implementation to ensure alignment with architectural standards
• Maintain architecture documentation, system diagrams, and technical standards
• Ensure systems meet performance, scalability, and disaster recovery requirements
• Participate in change management and governance processes
• Incorporate cybersecurity best practices into infrastructure and system design
• Support vulnerability management, patching, and endpoint protection
• Ensure compliance with regulatory and industry frameworks (ISO, NIST, ITAR, CMMC)
• Collaborate with internal teams and vendors on security initiatives

AWS Azure Cloud Cyber Security Google Cloud Platform TCP/IP VMware

United States

Network and Infrastructure Engineer, Senior

X-energy

Delivering the next generation of advanced nuclear energy technology.

Infrastructure Engineer, 90 days ago

Full Time, Remote, Team 201-500, Since 2009, H1B Sponsor

• Responsible for the network and on-premises virtual infrastructure development, security, and operations
• Integrate network/computer/storage management system for Cloud Service Providers and on-premises Infrastructure-as-a-Service (IaaS) solutions into full-lifecycle operational service delivery and project management platforms
• Design and deploy hardened and compliant office and enterprise Network Infrastructure supporting multi-tenant, virtualized and/or High-Performance Computing production environments
• Develop and maintain security, availability, performance, and scalability, continuous improvement, and monitoring for highly available and redundant network and virtual infrastructure
• Take ownership of project requirement development, deployment, and operational release management

Cloud Firewalls

Maryland

$145K - $204K / year
