Bright Vision Technologies is a forward-thinking software development company dedicated to building innovative solutions that help businesses automate and optimize their operations. We leverage cutting-edge technologies to create scalable, secure, and user-friendly applications.

Site Reliability Engineer

Engineer Full Time Remote Senior

Location

Alabama + 48 more

Posted

64 days ago

Salary

Seniority

Senior

Bachelor Degree Distributed Systems Observability/Monitoring Prometheus Grafana OpenTelemetry Datadog Python Shell Kubernetes CI/CD Mode Java Linux AWS Azure GCP Istio Linkerd Consul

Job Description

Title: Site Reliability Engineer (SRE)
Location: Remote

Job Description:

Bright Vision Technologies is a forward-thinking software development company dedicated to building innovative solutions that help businesses automate and optimize their operations. We leverage cutting-edge technologies to create scalable, secure, and user-friendly applications. As we continue to grow, we’re looking for a skilled Site Reliability Engineer (SRE) to join our dynamic team and contribute to our mission of transforming business processes through technology.

This is a fantastic opportunity to join an established and well-respected organization offering tremendous career growth potential.

Site Reliability Engineer (SRE)

Job Title: Site Reliability Engineer (SRE)
Location: 100% Remote (Continental United States)
Position Type: In-house Bright Vision Technologies SOW engagement (no third-party client or vendor)
Experience: 5+ years
Sponsorship: No new H1B sponsorship available. H1B transfers welcomed for qualified candidates.
Employment Type: Full-time, direct W2 with Bright Vision Technologies (no C2C, no 1099, no third-party)
Engagement: Long-term, multi-year, aligned to the Bright Vision SOW delivery roadmap
Compensation: Competitive base salary commensurate with experience, plus benefits.

Employment Terms & Visa Policy

This is a 100% remote, full-time, direct W2 position with Bright Vision Technologies. This role is part of Bright Vision Technologies’ in-house Statement of Work (SOW) engagement. The client, end customer, and employer for this position is Bright Vision Technologies; there is no third-party client, vendor, or implementation partner involved.

We do not engage in C2C, 1099, or third-party arrangements for this role. STRICTLY NO C2C/1099/3RD PARTY COMPANIES. ALL OUR ROLES ARE W2, AND NO 3RD PARTY BROKERING PLEASE. Candidates must be willing to work directly as a full-time W2 employee of Bright Vision Technologies and contribute to our in-house SOW deliverables.

No new H1B sponsorship is available for this role. However, candidates who are currently on a valid H1B visa and require a transfer are welcome to apply. We will support H1B transfers for qualified candidates.

For every role, a technical coding assessment is mandatory. Please apply only if you are confident in your technical abilities and hands-on experience.

Job Summary

We are seeking an experienced Site Reliability Engineer to ensure the availability, performance, and operational excellence of large-scale distributed systems in production. As an SRE you will live at the boundary between development and operations, applying strong software engineering principles to infrastructure and operations problems, and continually pushing the platform toward higher reliability with lower operational toil. The ideal candidate will combine deep systems knowledge with strong programming skills, a measurement-driven mindset, and the discipline to design, automate, and operate complex services so that reliability becomes a first-class engineering deliverable rather than a reactive concern.

Key Responsibilities

- Define, instrument, and continually refine service-level objectives (SLOs), service-level indicators (SLIs), and error budgets for critical services, and use those measures to drive concrete engineering and prioritization decisions.
- Lead incident response and resolution for production issues, acting as a calm and effective incident commander when needed, and ensuring high-quality post-incident reviews that drive lasting improvements.
- Design and implement comprehensive monitoring, logging, and tracing strategies using Prometheus, Grafana, OpenTelemetry, ELK/EFK, Datadog, or similar tooling so that operators have rich, actionable visibility into system behavior.
- Build and maintain robust on-call processes, runbooks, and escalation paths that reduce mean time to detect and mean time to resolve while protecting the well-being of the engineers on rotation.
- Automate operational toil aggressively by writing production-grade tooling in Python, Go, Bash, or similar languages, replacing manual workflows with reliable, auditable automation.
- Architect and operate large-scale Kubernetes clusters and container-based workloads, including autoscaling, capacity planning, network policy, and integration with service meshes.
- Design CI/CD pipelines that promote safe, frequent, and observable releases, supported by automated testing, canary deployments, feature flags, and progressive rollout strategies.
- Lead capacity planning and performance engineering activities, building models that predict growth and stress, and validating those models through load testing and chaos experiments.
- Partner closely with application development teams to embed reliability practices early in design, including failure-mode analyses, graceful degradation patterns, and dependency hardening.
- Strengthen the platform’s resiliency through chaos engineering, fault injection, dependency isolation, retries, timeouts, circuit breakers, and well-tested failover paths.
- Drive continuous improvement of security posture in collaboration with security teams, including patch management, vulnerability remediation, and secure-by-default platform defaults.
- Contribute to the technical roadmap for reliability tooling, observability platforms, and developer-experience improvements that reduce friction and improve outcomes for engineering teams.
- Mentor engineers across the organization on SRE practices and foster a strong, blameless culture of operational excellence.

Required Qualifications

- Bachelor’s degree in Computer Science, Engineering, or a related technical discipline.
- Five or more years of SRE, DevOps, or production engineering experience supporting large-scale distributed systems.
- Strong programming skills in at least one of Python, Go, or Java, with the ability to build robust automation and tooling.
- Deep, hands-on experience operating Linux at scale, including networking, performance tuning, and systems-level troubleshooting.
- Production experience operating Kubernetes and container-based workloads.
- Strong working knowledge of observability tooling such as Prometheus, Grafana, OpenTelemetry, ELK/EFK, or commercial equivalents.
- Hands-on experience designing and operating CI/CD pipelines for both infrastructure and applications.
- Solid understanding of distributed system design, including consistency models, partitioning, and failure semantics.
- Demonstrated experience leading incident response and conducting effective post-incident reviews.
- Excellent communication and documentation skills.

Preferred Qualifications

- Experience defining and operationalizing SLOs and error budgets in real production environments.
- Exposure to chaos engineering practices and tools such as Chaos Monkey, Gremlin, or Litmus.
- Hands-on experience with at least one major cloud platform (AWS, Azure, or GCP).
- Background in capacity planning, performance engineering, or large-scale load testing.
- Familiarity with service mesh technologies such as Istio, Linkerd, or Consul.

How to Apply

Would you like to know more about this opportunity? For immediate consideration, please send your resume to [email protected] or contact us at (908) 676-4399. Learn more about Bright Vision Technologies at www.bvteck.com.

We recognize that our people are our strength, and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law.

We also make reasonable accommodations for applicants’ and employees’ religious practices and beliefs, as well as mental health or physical disability needs. Bright Vision Technologies is an Equal Opportunity Employer, including Disability/Veterans. Position offered by “No Fee Agency.”

Equal Employment Opportunity (EEO) Statement

Bright Vision Technologies (BV Teck) is committed to equal employment opportunity (EEO) for all employees and applicants without regard to race, color, religion, sex, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, veteran status, or any other protected status as defined by applicable federal, state, or local laws. This commitment extends to all aspects of employment, including recruitment, hiring, training, compensation, promotion, transfer, leaves of absence, termination, layoffs, and recall. BV Teck expressly prohibits any form of workplace harassment or discrimination. Any improper interference with employees' ability to perform their job duties may result in disciplinary action up to and including termination of employment.

More Engineer Jobs

Lead Full Stack Engineer

Addison Group

Addison Group was founded as a temporary staffing firm in 1999 by a group of “visionary industry leaders." The founders sought out to recruit the best candida

Engineer 64 days ago

Full Time Remote

Role Description

The Lead Full Stack Engineer will help lead a cross-functional team to design, implement, and maintain the entire application stack, working in TypeScript from the back-end (serverless Node.js) through the front-end (ReactJS). The ideal candidate has significant experience in creating mature, supportable systems, as well as strong collaboration and leadership skills.

- Be a primary contributor to the development of a full-stack TypeScript application
- Implement, maintain, and support a back-end using Azure Functions
- Implement a secured data abstraction layer using GraphQL, SQL, and NoSQL
- Assist with creating the application front-end using ReactJS, building reusable components
- Define error handling and alerting patterns to maximize the supportability of the application
- Collaborate with the DevSecOps team to continuously improve observability and optimize software delivery
- Author unit tests, maintaining a high code coverage standard
- Work with the product team and key business partners to create and refine user stories
- Translate designs and wireframes into clean, scalable, and secure high-quality code, meeting functional requirements and architectural direction
- Develop software documentation and maintain architectural decision records for developed source code
- Conduct peer reviews of code for standard pull request activity
- Aid in support of production systems while driving continuous improvement to increase system resiliency and security
- Tackle complex technical implementations
- Provide guidance and mentorship for team members
- Be a key contributor to architectural planning and maintain associated documentation
- Influence technical strategy and own implementation and delivery against roadmap
- Help establish and champion code and pull request standards

Qualifications

- 10+ years of overall web development experience
- 3+ years of ReactJS development, with a thorough understanding of React and its core principles
- Strong proficiency in JavaScript and substantial experience with TypeScript
- Solid experience and expertise with Node.js
- Significant experience selecting and employing resiliency patterns as well as error handling and logging strategies
- Substantial experience driving elements of resiliency, observability, and scalability
- Professional experience with GraphQL
- SQL experience (queries, schemas, experience with ORMs, etc.)
- Experience designing event-driven solutions (service bus, event hubs, etc.)
- Experience implementing and supporting third-party integrations
- Experience building and shipping reliable/performant software, including proactively planning and supporting effective alerting and monitoring
- Experience in Application Performance Management (APM)
- Understanding of and capability to drive security best practices
- Expertise in web application debugging, including security and performance
- Experience working with cloud platforms, Azure a plus
- Demonstrated aptitude at creating practical unit tests
- Strong Git source control experience
- Experience working with DevOps teams, processes, and tools to deliver software frequently and reliably
- Proficiency and desire to understand the business use cases that drive software features
- High School Diploma required. Possess a bachelor's degree in computer science or related discipline, or equivalent experience.
- Demonstrate self-motivation, a strong work ethic, and the ability to prioritize tasks and collaborate effectively with multiple teams to achieve efficient outcomes.

Requirements

- Must be authorized to work in the United States
- Ten years of overall web development experience
- React: 3+ years
- GraphQL: professional experience
- SQL/MySQL or PostgreSQL
- Unit Testing
- Cloud experience
- Third-Party Integration (API)
- Backend is flexible: .NET is OK if no Node
- TypeScript
- Resiliency, error handling, logging, observability, scalability, security
- Highly effective communicator and lead of a junior development team

Benefits

- This role is eligible for medical, dental, vision, 401(k), and 6 weeks of PTO

TypeScript Node.js React Azure Functions GraphQL SQL NoSQL Observability/Monitoring JavaScript Git MySQL PostgreSQL .NET

United States

$150K - $175K / year

Job Closed

Sr. Energy Storage Commissioning Engineer

ThinkBAC Consulting

"Linking the BEST with the BEST"

Engineer 64 days ago

Full Time Remote Team 1-10 H1B No Sponsor

Role Description

This is a remote position.

Sr. Energy Storage Commissioning Engineer, Grid Infrastructure

- Join a trailblazing BESS (Battery Energy Storage) company backed by top-tier private equity.
- Lead the charge in utility-scale energy storage development, execution, and operations.
- Own the technical engineering, construction oversight, and commissioning execution for utility-scale BESS projects.
- Oversee the design, construction, and commissioning of BESS projects in partnership with third-party engineering firms and EPC contractors.
- Review and negotiate interconnection agreements alongside the Transmission and Interconnection teams.
- Lead RFQ/RFP development with procurement.
- Work directly with utilities and consultants.
- Travel to project sites throughout the construction and commissioning lifecycle (~30% travel).

If you've built high-voltage systems for renewable energy grid infrastructure projects and want to apply that experience to manifest the future of the grid for generations to come, this role was built for you.

They are committed to creating more grid infrastructure solutions and are offering comprehensive compensation packages, including:

- Competitive base salary
- Open PTO policy
- Flex work hours
- Health benefits
- Opportunity to work with a transparent Executive Leadership Team

Qualifications

- B.S. or M.S. in Electrical Engineering (Power Engineering preferred)
- 5-7+ years in project engineering, EPC, construction management, or commissioning in the power industry
- Hands-on experience with high-voltage utility substations, switchyards, and transmission systems
- Strong understanding of battery energy storage fundamentals and value proposition
- Familiarity with ISO/RTO commissioning compliance and LGIA/PPA structures
- Proficiency in AutoCAD, Bluebeam, MS Project, and SharePoint
- Ability to travel ~30% across project sites
- Comfort with AI-enabled tools for engineering productivity
- A practical, data-driven mindset for ambiguous, fast-evolving market environments
- Ideal candidates have worked in roles such as Sr. Project Engineer, Commissioning, Commissioning Manager, High Voltage Commissioning Engineering Supervisor, HV Commissioning Engineer, Lead High Voltage Commissioning Engineer, BESS Commissioning Engineer, or similar

Requirements

- Technical deliverables for design, construction, and commissioning of utility-scale BESS projects
- High-voltage substation, switchyard, and transmission system commissioning oversight
- Interconnection agreement (LGIA) negotiation and execution support
- RFQ/RFP packages for EPC contractors and engineering consultants
- Cross-functional partnership with Transmission, Interconnection, Procurement, and Development teams
- On-site presence at projects across the country (~30% travel)
- Stakeholder participation in ERCOT, MISO, SPP, and PJM queue reform and large load policy discussion

Benefits

- Comprehensive compensation packages
- Competitive base salary
- Open PTO policy
- Flex work hours
- Health benefits
- Opportunity to work with a transparent Executive Leadership Team

Company Description

This is an opportunity to join a rapidly expanding national footprint with a double-digit GW portfolio, including one of the largest fleets of grid-connected standalone battery storage assets in the country.

AutoCAD AI

United States

$155K / year

Senior Engineer

Stellantis

Engineer 64 days ago

Full Time Remote

Role Description

As a Node.js Developer, you will play a key role in the migration and redesign of several legacy Erlang programs, transforming them into modern backend APIs built with Node.js and TypeScript to serve frontend applications and downstream services.

Your mission will focus on:

- Analyzing existing Erlang-based services exposing APIs and processing data stored in Riak.
- Redesigning and rewriting these services as scalable, well-structured APIs (REST and/or event-driven) in Node.js / TypeScript.
- Migrating data and access patterns from Riak to more standard AWS-managed databases such as RDS or DocumentDB, including data modeling and performance optimization.
- Ensuring backward compatibility where required, while improving reliability, maintainability, and observability.
- Building APIs optimized for frontend consumption, with clear contracts, performance constraints, and security best practices.

You will work within a cloud-native AWS environment, combining synchronous APIs with asynchronous messaging to communicate with other services, using technologies such as Kafka, SQS, or RabbitMQ. This role is central to the modernization of the platform, reducing technical debt and enabling future scalability.

Qualifications

- Master’s Degree in Computer Science (or equivalent).
- 5-10 years of experience in backend development, with strong expertise in Node.js and TypeScript.
- Solid experience designing APIs for frontend applications (REST, BFF, contract-first approaches).
- Experience working on legacy system migration or large refactoring projects; knowledge of Erlang is a plus but not mandatory.
- Strong understanding of data modeling and database migration, ideally from NoSQL systems to RDS and/or DocumentDB.
- Hands-on experience with messaging systems (Kafka, SQS, RabbitMQ or similar).
- Good knowledge of AWS services and cloud-native architectures.
- Strong awareness of backend security concerns (OAuth, JWT, IAM, API security).
- Strong testing culture: unit, integration, and migration testing.
- Fluent in English (mandatory if working remotely or outside France); French is a plus.
- You are comfortable working in a transformation context, pragmatic, quality-driven, and enjoy collaborating in small, autonomous teams.

Requirements

- Analyze existing Erlang-based services (APIs, data flows, Riak usage) to understand business logic and technical constraints.
- Design and develop backend APIs for frontend consumption using Node.js and TypeScript (REST and/or BFF patterns).
- Lead the migration and refactoring of legacy Erlang programs into maintainable, well-tested Node.js services.
- Redesign data models and access patterns, and migrate data from Riak to AWS-managed databases such as RDS or DocumentDB.
- Implement asynchronous communication with other services using Kafka, SQS, or RabbitMQ, depending on use cases.
- Ensure API performance, reliability, and scalability in a cloud-native AWS environment.
- Implement security best practices (authentication, authorization, API contracts, rate limiting).
- Write and maintain unit, integration, and migration tests to guarantee functional parity and safe deployments.
- Improve observability (logging, metrics, tracing) and operational readiness.
- Actively contribute to architecture decisions, code reviews, and technical documentation.
- Participate in agile ceremonies and collaborate closely with frontend, product, and platform teams.

Benefits

- Opportunity to work on cutting-edge technologies.
- Collaborative and agile work environment.
- Professional development and growth opportunities.
- Flexible working arrangements.

Node.js Erlang API TypeScript AWS Amazon RDS Performance Optimization Observability/Monitoring Apache Kafka Amazon SQS RabbitMQ NoSQL OAuth JWT Amazon IAM

India

Job Closed

Senior Data Engineer

Deduction

Engineer 64 days ago

Full Time Remote

Role Description

We’re looking for a senior engineer with strong data instincts. This role sits at the intersection of backend systems, data pipelines, and AI-powered workflows. You’ll have broad ownership to design and build systems that ingest, transform, and operationalize data, powering both user-facing features and internal intelligence systems. This is not a narrow “data engineer” role. You’ll work across the stack, but with a deep bias toward data modeling, pipelines, and system design.

- Turn ambiguous product and operational workflows into well-structured systems combining data pipelines, backend services, and AI-driven logic
- Design and build agentic AI systems that reason, take actions, and orchestrate multi-step workflows (not just simple LLM integrations)
- Develop and operate production-grade data pipelines powering both user-facing features and internal systems
- Build backend services primarily in TypeScript, with pragmatic use of Python where appropriate
- Design APIs and system boundaries that enable modular, composable AI and data workflows
- Partner with product and operations to translate real-world processes into AI-augmented, scalable systems
- Use AI-assisted development as a default, from prototyping to debugging to system design
- Improve reliability, observability, and performance across services
- Own architectural decisions within your domain, balancing speed and long-term maintainability

Qualifications

- Proven experience owning and operating data-heavy systems in production
- Strong experience with data pipelines (ETL/ELT, event-driven, or workflow-based systems)
- Proficiency in TypeScript (primary language); other typed language experience is a plus
- Generalist mindset with strong data modeling and system design instincts
- Ability to go from zero → one → scale with minimal guidance
- Experience in fintech, tax, or regulated environments a strong plus
- Familiarity with security and compliance considerations (sensitive data, auditability, access controls) a strong plus

Requirements

- You actively use AI tools as a core part of your development workflow (not optional)
- You believe the software engineer’s job is fundamentally changing, and you lean into that shift

AI LLM TypeScript Python Observability/Monitoring ETL

United States
