Site Reliability Engineer

Location

All locations: Alabama | Alaska | Arizona | Arkansas | California | Colorado | Connecticut | Delaware | Florida | Georgia | Idaho | Illinois | Indiana | Iowa | Kansas | Kentucky | Louisiana | Maine | Maryland | Massachusetts | Michigan | Minnesota | Mississippi | Missouri | Montana | Nebraska | Nevada | New Hampshire | New Jersey | New Mexico | New York | North Carolina | North Dakota | Ohio | Oklahoma | Oregon | Pennsylvania | Rhode Island | South Carolina | South Dakota | Tennessee | Texas | Utah | Vermont | Virginia | Washington | West Virginia | Wisconsin | Wyoming

Posted

6 days ago

Salary

0

Seniority

Senior

Job Description

Site Reliability Engineer

Bright Vision Technologies

Bright Vision Technologies is a forward-thinking software development company dedicated to building innovative solutions that help businesses automate and optimize their operations. We leverage cutting-edge technologies to create scalable, secure, and user-friendly applications. As we continue to grow, we’re looking for a skilled Site Reliability Engineer (SRE) to join our dynamic team and contribute to our mission of transforming business processes through technology. This is a fantastic opportunity to join an established and well-respected organization offering tremendous career growth potential.

Job Title: Site Reliability Engineer (SRE)
Location: 100% Remote (Continental United States)
Position Type: In-house Bright Vision Technologies SOW engagement (no third-party client or vendor)
Experience: 5+ years
Sponsorship: No new H1B sponsorship available. H1B transfers welcomed for qualified candidates.
Employment Type: Full-time, direct W2 with Bright Vision Technologies (no C2C, no 1099, no third-party)
Engagement: Long-term, multi-year, aligned to the Bright Vision SOW delivery roadmap
Compensation: Competitive base salary commensurate with experience, plus benefits.

Employment Terms & Visa Policy

This is a 100% remote, full-time, direct W2 position with Bright Vision Technologies. This role is part of Bright Vision Technologies’ in-house Statement of Work (SOW) engagement. The client, end customer, and employer for this position is Bright Vision Technologies; there is no third-party client, vendor, or implementation partner involved. We do not engage in C2C, 1099, or third-party arrangements for this role. Strictly no C2C, 1099, or third-party companies: all our roles are W2, and we do not accept third-party brokering. Candidates must be willing to work directly as a full-time W2 employee of Bright Vision Technologies and contribute to our in-house SOW deliverables.

No new H1B sponsorship is available for this role. However, candidates who are currently on a valid H1B visa and require a transfer are welcome to apply. We will support H1B transfers for qualified candidates.

For every role, a technical coding assessment is mandatory. Please apply only if you are confident in your technical abilities and hands-on experience.

Job Summary

We are seeking an experienced Site Reliability Engineer to ensure the availability, performance, and operational excellence of large-scale distributed systems in production. As an SRE you will live at the boundary between development and operations, applying strong software engineering principles to infrastructure and operations problems, and continually pushing the platform toward higher reliability with lower operational toil. The ideal candidate will combine deep systems knowledge with strong programming skills, a measurement-driven mindset, and the discipline to design, automate, and operate complex services so that reliability becomes a first-class engineering deliverable rather than a reactive concern.

Key Responsibilities

- Define, instrument, and continually refine service-level objectives (SLOs), service-level indicators (SLIs), and error budgets for critical services, and use those measures to drive concrete engineering and prioritization decisions.
- Lead incident response and resolution for production issues, acting as a calm and effective incident commander when needed, and ensuring high-quality post-incident reviews that drive lasting improvements.
- Design and implement comprehensive monitoring, logging, and tracing strategies using Prometheus, Grafana, OpenTelemetry, ELK/EFK, Datadog, or similar tooling so that operators have rich, actionable visibility into system behavior.
- Build and maintain robust on-call processes, runbooks, and escalation paths that reduce mean time to detect and mean time to resolve while protecting the well-being of the engineers on rotation.
- Automate operational toil aggressively by writing production-grade tooling in Python, Go, Bash, or similar languages, replacing manual workflows with reliable, auditable automation.
- Architect and operate large-scale Kubernetes clusters and container-based workloads, including autoscaling, capacity planning, network policy, and integration with service meshes.
- Design CI/CD pipelines that promote safe, frequent, and observable releases, supported by automated testing, canary deployments, feature flags, and progressive rollout strategies.
- Lead capacity planning and performance engineering activities, building models that predict growth and stress, and validating those models through load testing and chaos experiments.
- Partner closely with application development teams to embed reliability practices early in design, including failure-mode analyses, graceful degradation patterns, and dependency hardening.
- Strengthen the platform’s resiliency through chaos engineering, fault injection, dependency isolation, retries, timeouts, circuit breakers, and well-tested failover paths.
- Drive continuous improvement of security posture in collaboration with security teams, including patch management, vulnerability remediation, and secure-by-default platform defaults.
- Contribute to the technical roadmap for reliability tooling, observability platforms, and developer-experience improvements that reduce friction and improve outcomes for engineering teams.
- Mentor engineers across the organization on SRE practices and foster a strong, blameless culture of operational excellence.

Required Qualifications

- Bachelor’s degree in Computer Science, Engineering, or a related technical discipline.
- Five or more years of SRE, DevOps, or production engineering experience supporting large-scale distributed systems.
- Strong programming skills in at least one of Python, Go, or Java, with the ability to build robust automation and tooling.
- Deep, hands-on experience operating Linux at scale, including networking, performance tuning, and systems-level troubleshooting.
- Production experience operating Kubernetes and container-based workloads.
- Strong working knowledge of observability tooling such as Prometheus, Grafana, OpenTelemetry, ELK/EFK, or commercial equivalents.
- Hands-on experience designing and operating CI/CD pipelines for both infrastructure and applications.
- Solid understanding of distributed system design, including consistency models, partitioning, and failure semantics.
- Demonstrated experience leading incident response and conducting effective post-incident reviews.
- Excellent communication and documentation skills.

Preferred Qualifications

- Experience defining and operationalizing SLOs and error budgets in real production environments.
- Exposure to chaos engineering practices and tools such as Chaos Monkey, Gremlin, or Litmus.
- Hands-on experience with at least one major cloud platform (AWS, Azure, or GCP).
- Background in capacity planning, performance engineering, or large-scale load testing.
- Familiarity with service mesh technologies such as Istio, Linkerd, or Consul.

How to Apply

Would you like to know more about this opportunity? For immediate consideration, please send your resume to [email protected] or contact us at (908) 676-4399. Learn more about Bright Vision Technologies at www.bvteck.com.

We recognize that our people are our strength, and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. We also make reasonable accommodations for applicants’ and employees’ religious practices and beliefs, as well as mental health or physical disability needs.

Bright Vision Technologies is an Equal Opportunity Employer, including Disability/Veterans. Position offered by “No Fee Agency.”

Equal Employment Opportunity (EEO) Statement

Bright Vision Technologies (BV Teck) is committed to equal employment opportunity (EEO) for all employees and applicants without regard to race, color, religion, sex, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, veteran status, or any other protected status as defined by applicable federal, state, or local laws. This commitment extends to all aspects of employment, including recruitment, hiring, training, compensation, promotion, transfer, leaves of absence, termination, layoffs, and recall. BV Teck expressly prohibits any form of workplace harassment or discrimination. Any improper interference with employees' ability to perform their job duties may result in disciplinary action up to and including termination of employment.

More Engineer Jobs

Full Time | Remote | Team 1-10 | H1B No Sponsor

Role Description

This is a remote position.

Sr. Energy Storage Commissioning Engineer, Grid Infrastructure

- Join a trailblazing BESS (Battery Energy Storage) company backed by top-tier private equity.
- Lead the charge in utility-scale energy storage development, execution, and operations.
- Own the technical engineering, construction oversight, and commissioning execution for utility-scale BESS projects.
- Oversee the design, construction, and commissioning of BESS projects in partnership with third-party engineering firms and EPC contractors.
- Review and negotiate interconnection agreements alongside the Transmission and Interconnection teams.
- Lead RFQ/RFP development with procurement.
- Work directly with utilities and consultants.
- Travel to project sites throughout the construction and commissioning lifecycle (~30% travel).

If you've built high-voltage systems for renewable energy grid infrastructure projects and want to apply that experience to manifest the future of the grid for generations to come, this role was built for you.

They are committed to creating more grid infrastructure solutions and are offering comprehensive compensation packages, including:

- Competitive base salary
- Open PTO policy
- Flex work hours
- Health benefits
- Opportunity to work with a transparent Executive Leadership Team

Qualifications

- B.S. or M.S. in Electrical Engineering (Power Engineering preferred)
- 5-7+ years in project engineering, EPC, construction management, or commissioning in the power industry
- Hands-on experience with high-voltage utility substations, switchyards, and transmission systems
- Strong understanding of battery energy storage fundamentals and value proposition
- Familiarity with ISO/RTO commissioning compliance and LGIA/PPA structures
- Proficiency in AutoCAD, Bluebeam, MS Project, and SharePoint
- Ability to travel ~30% across project sites
- Comfort with AI-enabled tools for engineering productivity
- A practical, data-driven mindset for ambiguous, fast-evolving market environments
- Ideal candidates have worked in roles such as Sr. Project Engineer, Commissioning, Commissioning Manager, High Voltage Commissioning Engineering Supervisor, HV Commissioning Engineer, Lead High Voltage Commissioning Engineer, BESS Commissioning Engineer, or similar

Requirements

- Technical deliverables for design, construction, and commissioning of utility-scale BESS projects
- High-voltage substation, switchyard, and transmission system commissioning oversight
- Interconnection agreement (LGIA) negotiation and execution support
- RFQ/RFP packages for EPC contractors and engineering consultants
- Cross-functional partnership with Transmission, Interconnection, Procurement, and Development teams
- On-site presence at projects across the country (~30% travel)
- Stakeholder participation in ERCOT, MISO, SPP, and PJM queue reform and large load policy discussion

Benefits

- Comprehensive compensation packages
- Competitive base salary
- Open PTO policy
- Flex work hours
- Health benefits
- Opportunity to work with a transparent Executive Leadership Team

Company Description

This is an opportunity to join a rapidly expanding national footprint with a double-digit GW portfolio, including one of the largest fleets of grid-connected standalone battery storage assets in the country.

United States
$155K / year

Role Description

As a Node.js Developer, you will play a key role in the migration and redesign of several legacy Erlang programs, transforming them into modern backend APIs built with Node.js and TypeScript to serve frontend applications and downstream services. Your mission will focus on:

- Analyzing existing Erlang-based services exposing APIs and processing data stored in Riak.
- Redesigning and rewriting these services as scalable, well-structured APIs (REST and/or event-driven) in Node.js / TypeScript.
- Migrating data and access patterns from Riak to more standard AWS-managed databases such as RDS or DocumentDB, including data modeling and performance optimization.
- Ensuring backward compatibility where required, while improving reliability, maintainability, and observability.
- Building APIs optimized for frontend consumption, with clear contracts, performance constraints, and security best practices.

You will work within a cloud-native AWS environment, combining synchronous APIs with asynchronous messaging to communicate with other services, using technologies such as Kafka, SQS, or RabbitMQ. This role is central to the modernization of the platform, reducing technical debt and enabling future scalability.

Qualifications

- Master’s Degree in Computer Science (or equivalent).
- 5-10 years of experience in backend development, with strong expertise in Node.js and TypeScript.
- Solid experience designing APIs for frontend applications (REST, BFF, contract-first approaches).
- Experience working on legacy system migration or large refactoring projects; knowledge of Erlang is a plus but not mandatory.
- Strong understanding of data modeling and database migration, ideally from NoSQL systems to RDS and/or DocumentDB.
- Hands-on experience with messaging systems (Kafka, SQS, RabbitMQ or similar).
- Good knowledge of AWS services and cloud-native architectures.
- Strong awareness of backend security concerns (OAuth, JWT, IAM, API security).
- Strong testing culture: unit, integration, and migration testing.
- Fluent in English (mandatory if working remotely or outside France); French is a plus.
- You are comfortable working in a transformation context, pragmatic, quality-driven, and enjoy collaborating in small, autonomous teams.

Requirements

- Analyze existing Erlang-based services (APIs, data flows, Riak usage) to understand business logic and technical constraints.
- Design and develop backend APIs for frontend consumption using Node.js and TypeScript (REST and/or BFF patterns).
- Lead the migration and refactoring of legacy Erlang programs into maintainable, well-tested Node.js services.
- Redesign data models and access patterns, and migrate data from Riak to AWS-managed databases such as RDS or DocumentDB.
- Implement asynchronous communication with other services using Kafka, SQS, or RabbitMQ, depending on use cases.
- Ensure API performance, reliability, and scalability in a cloud-native AWS environment.
- Implement security best practices (authentication, authorization, API contracts, rate limiting).
- Write and maintain unit, integration, and migration tests to guarantee functional parity and safe deployments.
- Improve observability (logging, metrics, tracing) and operational readiness.
- Actively contribute to architecture decisions, code reviews, and technical documentation.
- Participate in agile ceremonies and collaborate closely with frontend, product, and platform teams.

Benefits

- Opportunity to work on cutting-edge technologies.
- Collaborative and agile work environment.
- Professional development and growth opportunities.
- Flexible working arrangements.

India

Role Description

We’re looking for a senior engineer with strong data instincts. This role sits at the intersection of backend systems, data pipelines, and AI-powered workflows. You’ll have broad ownership to design and build systems that ingest, transform, and operationalize data, powering both user-facing features and internal intelligence systems. This is not a narrow “data engineer” role. You’ll work across the stack, but with a deep bias toward data modeling, pipelines, and system design.

- Turn ambiguous product and operational workflows into well-structured systems combining data pipelines, backend services, and AI-driven logic
- Design and build agentic AI systems that reason, take actions, and orchestrate multi-step workflows (not just simple LLM integrations)
- Develop and operate production-grade data pipelines powering both user-facing features and internal systems
- Build backend services primarily in TypeScript, with pragmatic use of Python where appropriate
- Design APIs and system boundaries that enable modular, composable AI and data workflows
- Partner with product and operations to translate real-world processes into AI-augmented, scalable systems
- Use AI-assisted development as a default, from prototyping to debugging to system design
- Improve reliability, observability, and performance across services
- Own architectural decisions within your domain, balancing speed and long-term maintainability

Qualifications

- Proven experience owning and operating data-heavy systems in production
- Strong experience with data pipelines (ETL/ELT, event-driven, or workflow-based systems)
- Proficiency in TypeScript (primary language); other typed language experience is a plus
- Generalist mindset with strong data modeling and system design instincts
- Ability to go from zero → one → scale with minimal guidance
- Experience in fintech, tax, or regulated environments a strong plus
- Familiarity with security and compliance considerations (sensitive data, auditability, access controls) a strong plus

Requirements

- You actively use AI tools as a core part of your development workflow (not optional)
- You believe the software engineer’s job is fundamentally changing, and you lean into that shift

United States

SIEM Engineer

TEKsystems

We're partners in transformation. We help clients activate ideas and solutions to take advantage of a new world of opportunity. We are a team of 80,000 strong, working with over 6,000 clients, including 80% of the Fortune 500, across North America, Europe and Asia.

Engineer | 6 days ago
Contract | Remote | Team 10,001 | H1B No Sponsor

Role Description

We are seeking an experienced Coralogix SIEM Engineer to serve as the hands-on technical owner. The engineer will plan, implement, configure, and maintain the instance within a multi-tenant Coralogix organization shared across multiple SOCs. This role must be fluent in both Coralogix platform administration and federal regulatory constraints. Beyond Coralogix platform ownership, this role will contribute to the broader SecOps technology stack strategy, helping the SOC evolve its security operations capabilities across detection, incident management, and platform integration.

Responsibilities

- Coralogix Platform Administration: Full Platform Administrator within the shared multi-tenant SOC organization.
- Enterprise Log Collection Pipeline Architecture & Operations: Design, implement, and maintain log collection pipelines for multiple networks with distinct architectural constraints.
- Detection Engineering.
- Incident Management & SLA Instrumentation.
- SecOps Technology Stack Contribution.

Qualifications

- 10+ years of hands-on cybersecurity engineering experience, with at least 5 years in SIEM platform engineering, administration, or log management.
- Demonstrable, hands-on Coralogix experience, including platform administration, DataPrime query language, alert development (threshold, anomaly, flow, ratio), Parsing Rules engineering, TCO Optimizer configuration, and log pipeline design.
- Proven experience architecting and managing enterprise-scale logging pipelines, including OpenTelemetry Collector (OTEL) deployment in agent/gateway models.
- Experience onboarding and integrating diverse log sources: cloud-native APIs (AWS CloudTrail, VPC Flow Logs, S3/SNS/SQS), Kubernetes/EKS workloads, Windows/Linux endpoints, and network/security appliances (Palo Alto, Check Point, NetScaler, Citrix).
- Experience designing log pipelines with data masking, field redaction, or sensitive data handling requirements.

Requirements

- Coralogix: DataPrime, GROK/regex Parsing Rules, alert types (threshold/anomaly/flow/ratio/metric), TCO Optimizer, Subsystem/Scope/RBAC administration, SSO/SAML configuration, API key management, Cases, SLO configuration, Olly AI agent, Streama ML.
- Log collection: OpenTelemetry Collector, Fluentd, Fluent Bit, or equivalent; reverse proxy architectures (Caddy 2, Nginx) for constrained-network log forwarding.
- cx_security log normalization schema and Coralogix Integration/Extension Package deployment.
- AWS logging architecture: CloudTrail, VPC Flow Logs, CloudWatch, S3-based log delivery, SNS/SQS event pipelines.
- Endpoint telemetry: Windows Event Logs (Sysmon, WEF), Linux auditd, EDR log integration.
- Network/security appliance log sources: Palo Alto (PAN-OS), Check Point, NetScaler/Citrix.
- Scripting and automation: Python, Bash, or equivalent for pipeline tooling, API integrations, and operational scripting.
- Federal logging requirements: OMB M-21-31 logging tiers, NIST 800-53 AU controls, audit log management.
- Experience operating in federal or regulated environments with multi-tenant data isolation requirements.
- Understanding of NIST RMF, ATO processes, and ISSO collaboration in federal cybersecurity programs.

Desired Qualifications

- Experience with SOAR platforms and webhook-based alert orchestration integrated with Coralogix (ServiceNow, PagerDuty, Jira, Slack).
- Familiarity with AWS GovCloud logging architecture, cross-account log aggregation, and FedRAMP-compliant configurations.
- Experience with UEBA platforms (e.g., Exabeam) and integrating behavioral analytics output with SIEM normalization pipelines.
- Knowledge of MITRE ATT&CK framework and its application to detection coverage mapping and gap analysis.
- Experience supporting ATO/RMF processes, security control assessments, or security authorization activities.
- Prior experience in DoED, DoD, Federal HVA, or IRS/FTI-regulated environments.
- Relevant certifications such as:
  - Coralogix Certified Engineer or equivalent platform certification.
  - GIAC GCED, GCIH, GCIA, or similar security operations certifications.
  - AWS Security Specialty or equivalent cloud security certification.
  - CISSP, CISM, or Security+ (supplementary).
- Demonstrated ability to communicate technical platform decisions to non-technical stakeholders and drive adoption across a matrixed program organization.

Job Type & Location

This is a Contract to Hire position based out of Herndon, VA.

Pay and Benefits

The pay range for this position is $70.00 - $81.00/hr. Eligibility requirements apply to some benefits and may depend on your job classification and length of employment. Benefits are subject to change and may be subject to specific elections, plan, or program terms. If eligible, the benefits available for this temporary role may include the following:

- Medical, dental & vision.
- Critical Illness, Accident, and Hospital.
- 401(k) Retirement Plan: Pre-tax and Roth post-tax contributions available.
- Life Insurance (Voluntary Life & AD&D for the employee and dependents).
- Short and long-term disability.
- Health Spending Account (HSA).
- Transportation benefits.
- Employee Assistance Program.
- Time Off/Leave (PTO, Vacation or Sick Leave).

Workplace Type

This is a fully remote position.

Application Deadline

This position is anticipated to close on May 26, 2026.

United States
$70 - $81 / hour