LTG is a leader in corporate digital learning and talent management.
Infrastructure Engineer
Location
Colombia
Posted
110 days ago
Salary
0
Seniority
Senior
Job Description
Infrastructure Engineer
Learning Technologies Group plc
- Combine software and systems engineering to help build and run large-scale, distributed, and fault-tolerant systems
- Use automation and Infrastructure as Code (IaC) to continuously improve the reliability, scalability, and performance of services deployed on AWS
- Tune the performance and configuration of both Linux system and application parameters supporting highly concurrent web stacks
- Manage infrastructure through code using configuration management and IaC templating software such as Terraform and Puppet
- Document procedures and knowledge base articles throughout problem resolution and architecture development processes
- Monitor the availability, performance, and health of production systems in support of meeting service level objectives using monitoring systems such as Icinga, Prometheus, Grafana, CloudWatch, and Loki
- Participate in emergency incident response on-call rosters
- Practice blameless postmortems that lead to improvements in resiliency and reductions in alert fatigue
Job Requirements
- In-depth experience with AWS services (RDS, EC2, Auto Scaling groups, S3, Lambda deployment, Aurora PostgreSQL, WAF, NAT Gateway, ALB)
- Analytical problem-solving methodology with an outstanding ability to communicate and document effectively
- Minimum of three years in Linux system administration with experience automating system processes with a variety of scripting languages or equivalent skills
- Practical experience analyzing and troubleshooting large-scale, multi-region deployments in a public cloud (e.g. AWS)
- Practical experience with IaC, CI/CD, structured configuration formats such as JSON or YAML, and version control solutions (Git)
- Experience in cloud deployment and management tools (e.g. Terraform, Puppet, Chef, Ansible)
- Familiarity with one or more programming or scripting languages (Python/PHP)
- Experience with LAMP stack: Linux, Apache, MariaDB/PostgreSQL/Aurora MySQL
- Experience in database administration along with a fundamental understanding of structured query language
- Knowledge of standard network/application protocols like HTTPS, SMTP, DNS, VPN
- A BS in Computer Science or a related field such as engineering or mathematics and 3+ years of work experience in Information Technology, or 5+ years of work experience in Information Technology overall
- Fluency in written and spoken English.
Benefits
- Flexible work arrangements
- Professional development opportunities
More Infrastructure Engineer Jobs
Senior Infrastructure Engineer
EXANTE
Global prime broker backed by proprietary technology and dedicated service.
- Operate and maintain on-premises infrastructure based on bare-metal Debian Linux servers
- Manage OS-level configuration, networking, and system services
- Automate infrastructure provisioning and lifecycle management using Chef
- Manage infrastructure state via Git repositories following GitOps principles
- Standardize and continuously improve infrastructure deployment workflows
- Manage and operate hybrid connectivity across GCP, AWS, Megaport, and on-prem data centers
- Design and maintain VPC networking, routing, and firewall rules
- Operate hybrid connectivity (Direct Connect, Cloud Interconnect, VPN)
- Configure and support BGP routing between cloud and on-prem environments
- Ensure high availability, redundancy, and fault tolerance of network connectivity
- Troubleshoot network and connectivity issues across cloud and on-prem layers
- Join the Support duty rotation and, as a Senior, collaborate with Engineering on incidents and changes.
- Proactively improve dashboards, alerts, and runbooks to prevent repeat incidents.
- Contribute to knowledge sharing across Operations and Engineering, including training content, workshops, and PR reviews. Drive to upskill: better the team and yourself.
- Accurately record, update, manage, and resolve tickets using the call tracking system while keeping all parties (internal or external) informed of each ticket's progress via phone and email.
- Demonstrate a solid understanding of the underlying Platform to our customers and help them leverage the service and products.
- Respond to incoming monitoring alerts, resolving or escalating as required in accordance with priorities and agreed service levels.
- Take decisive actions, and calculated risks, on technically complex incidents and tasks to ensure business speed and efficiency.
- Lead by earning trust, speaking candidly, and benchmarking against the best to identify where we can improve.
- Disagree when appropriate and challenge the status quo. Commit wholly to decisions and plans once in motion. Be a technical expert, and drive the team to make the best decisions.
- Deliver project tasks, improvements, and technical assessments to the right quality and in a timely fashion.
- Handle escalated customer support issues, providing solutions aligned with business SLA requirements.
- Design and implement automation scripts and tools to optimize processes.
- Conduct root cause analysis for major incidents and recommend long-term fixes.
- Collaborate with cross-functional teams on service improvements.
- Respond to critical incidents outside business hours, and be on call as required.
**Job Summary:**
We are hiring a Lead Infrastructure & Cloud Engineer with a strong Wintel infrastructure foundation and current, hands-on capability in modern cloud infrastructure across Azure (primary) and AWS. This role exists to close a capability gap: we have deep on-prem expertise, and we need a leader who can define and drive modern cloud standards, guide technical direction, and uplift the team. You'll operate as a technical lead with an architecture mindset: creating reference designs, setting guardrails, making pragmatic trade-offs (security, resilience, cost), and leading delivery across infrastructure and hybrid cloud. This is not a DevOps role: you will collaborate with DevOps and engineers, but your focus is infrastructure/platform, governance, reliability, and technical leadership.

**Job Responsibilities:**

**Cloud & Hybrid Architecture (Azure & AWS)**
- Own the target-state hybrid cloud architecture and roadmap (12–24 months), aligning security, resilience, and cost requirements.
- Define reference architectures and standards: landing zones, network patterns, identity patterns, logging/monitoring, backup/DR, and environment separation.
- Lead design and implementation of secure cloud networking: VNets/VPCs, routing, VPN, ExpressRoute/Direct Connect, Private Link/Endpoints, load balancers, WAF where needed.
- Own cloud governance foundations: subscriptions/accounts, management groups, RBAC, naming/tagging, logging, budgets, and policy guardrails.

**Modern Cloud Operations (Hands-on Leadership)**
- Ensure cloud platforms, services, and workloads remain on supported, secure versions; implement drift detection and lifecycle management.
- Establish platform observability: Azure Monitor/Log Analytics/App Insights, CloudWatch, OpenTelemetry where used; improve alert quality and operational readiness.
- Build and maintain backup/DR posture with tested RTO/RPO, runbooks, and regular restore/DR exercises.
- Drive FinOps discipline: cost allocation, tagging compliance, rightsizing, reservations/savings plans, and cost anomaly detection.

**Security, Governance & Incident Readiness**
- Ensure security controls are in place and effective (least privilege, secure baselines, encryption, key management, vulnerability/patch posture).
- Log & telemetry onboarding: own onboarding of data/log sources and integration with the SIEM (e.g., Microsoft Sentinel/Splunk) in partnership with Security.
- Lead incident response for infrastructure/cloud events: triage, investigation, reporting, RCA, and implementation of preventative controls and guardrails.
- Manage, document, and audit configuration changes; champion "repeatable by design" changes and reduce configuration drift.

**Wintel & Core Infrastructure Leadership**
- Provide technical leadership across core infrastructure services: Windows Server, AD DS, DNS/DHCP, certificates/PKI, and integration with Entra ID.
- Guide virtualisation/storage teams (VMware/Hyper-V, SAN/storage) towards cloud-aligned standards for resilience, security, and lifecycle.

**Leadership and Uplift**
- Act as the technical authority for infrastructure and hybrid cloud; lead technical decisions and drive outcomes.
- Mentor and upskill engineers on modern cloud infrastructure practices; run knowledge sessions and codify standards into reusable patterns.
- Provide input during design and architectural discussions with DevOps and software teams; unblock delivery with clear, pragmatic guidance.
- Build data pipelines that scrub PII, create research datasets, and power the research portal for educational AI studies
- Architect the path toward self-hosted and on-device model deployments for privacy and global accessibility
- Design and implement model orchestration systems that intelligently route requests across multiple AI providers (OpenAI, Anthropic, AWS Bedrock, open-source models)
- Build cost optimization infrastructure: implement conversation compression, prompt caching, and smart model selection to keep AI accessible
- Create comprehensive observability systems for ML operations: track costs, latency, quality, and usage patterns across thousands of applications
- Design and implement infrastructure for fine-tuning and deploying custom models
- Build monitoring and alerting systems that help us maintain reliability as AI interactions scale




