Job Closed
This listing is no longer active.
Innodata, with over 35 years of expertise, is a trusted leader in data solutions and AI innovation. The company specializes in training and deploying generative
Senior Data Engineer – Real-Time & Distributed Systems, GCP
Location
New Jersey
Posted
113 days ago
Salary
0
Seniority
Senior
Job Description
Innodata
• Design, build, and optimize scalable data pipelines for batch and real-time processing
• Develop and maintain event-driven architectures for high-throughput systems
• Ensure data reliability, performance, and low-latency processing across distributed environments
• Collaborate with data scientists and application teams to enable analytics and AI use cases
• Implement best practices in performance tuning, monitoring, and cost optimization
Job Requirements
- Advanced proficiency in Python for backend and large-scale data processing
- Strong experience building and managing big data pipelines in production environments
- Hands-on expertise with workflow orchestration tools such as Airflow or Google Cloud Composer
- Proven experience in batch and streaming data processing using Apache Spark and Apache Beam (Dataflow)
- Experience designing and operating event-driven systems using Pub/Sub
- Strong understanding of distributed systems architecture and scalability patterns
- Experience managing globally distributed, low-latency datasets
- Hands-on experience with NoSQL databases and/or Google Cloud Spanner
- Strong knowledge of system reliability, fault tolerance, and performance optimization
More Data Engineer Jobs
• Design and develop our Teamified AI data architecture and API using Python (Django framework) and other relevant Python libraries
• Manage our RDBMS [SQL], NoSQL [MongoDB], and vector [Pinecone] databases, and conduct proper data governance and research
• Collaborate with our leadership, product experts, and software architect to integrate AI solutions into our Teamified Platform
• Optimize AI augmentation models, context builders, and prompt builders for performance, scalability, and efficiency
• Conduct research and stay up to date on the latest advancements in AI and data engineering
Data Migration Analyst
Mark43
Mark43 is a trusted leader in public safety technology, providing innovative solutions to help law enforcement and public safety agencies save time, ensure comp
• Lead pre-migration discovery sessions with public safety agencies.
• Guide agencies through completion of migration questionnaires and data documentation.
• Serve as the primary migration point of contact for agency stakeholders.
• Clearly explain differences between legacy systems and Mark43 data structures and functionality.
• Bridge gaps between how data existed historically and how it will function within Mark43.
• Partner closely with the Mark43 Data Migration Engineering team to translate agency requirements into technical migration deliverables.
• When applicable, collaborate with third-party migration contractors to support accurate data mapping, transformation, and execution.
• Ensure alignment between customer expectations and technical implementation.
• Support project planning and track migration milestones alongside project Program Managers.
• Identify data risks, inconsistencies, and timeline concerns early in the process.
• Facilitate structured migration validation rounds with agencies.
• Track validation feedback and drive issue resolution in coordination with engineers and contractors.
• Participate in user acceptance testing and quality assurance efforts.
• Contribute to improving migration questionnaires, documentation, and validation workflows.
• Support scalable and repeatable migration processes within Professional Services.
Data Center Engineer - Remote
EVOTEK
Today’s Emerging Technology will be Tomorrow’s Competitive Advantage
Join EVOTEK: North America’s Premier Digital Business Enabler
As North America's premier enabler of secure digital business, we integrate cutting-edge technical expertise across data center, network, security, cloud, and communications domains. By delivering cohesive digital solutions, we help businesses drive measurable impact and accelerate their transformation.
Our award-winning culture is the cornerstone of everything we do. Recognized multiple times by Inc. Magazine as a "Best Place to Work", we’re proud to create an environment where innovation and collaboration thrive. Locally, we’ve been honored by The San Diego Business Journal as a "Best Place to Work" more than seven times, and our excellence is reflected in accolades like CRN's "Solution Provider 500", "Tech Elite 250", and "Top 150 Growth Companies". We’ve also earned a spot among CRN’s "Triple Crown" award winners. If you’re ready to be part of a team that values innovation, culture, and business impact, EVOTEK is the place for you.
Role Summary
The Data Center Engineer will be responsible for the design, implementation, support, and migration of enterprise infrastructure across on-premises, hybrid, and cloud environments. This role focuses on virtualization, containerization, and automation to deliver secure, resilient, and scalable infrastructure solutions.
Responsibilities
- Deliver billable professional services engagements across hybrid data center, virtualization, and cloud environments
- Execute implementation, migration, upgrade, and modernization projects for enterprise clients
- Deploy and support VMware, hyperconverged, storage, and Kubernetes platforms
- Lead technical workstreams during project lifecycle phases (plan, design, implement, optimize, document)
- Troubleshoot and resolve complex infrastructure issues across compute, storage, and data center networking domains
- Develop client-facing deliverables including implementation documentation, cutover plans, validation reports, and knowledge transfer materials
- Provide regular engagement status updates to project managers and client stakeholders
- Ensure solutions are deployed according to scope, best practices, and performance expectations
- Contribute to repeatable delivery methodologies, standards, and service offerings
- Desire and commitment to achieve and maintain vendor and industry related certifications
• Design, build, and maintain production-grade data pipelines using Airflow and AWS services such as Lambda and DynamoDB.
• Own data ingestion from internal systems and third-party integrations (e.g., Google, Bing, external APIs).
• Manage data storage and movement across S3, Snowflake, Snowpipe, and DynamoDB.
• Write and maintain custom Python code that runs reliably in production.
• Work across dev, staging, and production environments with proper deployment and rollback practices.
• Partner with analytics, data science, and product teams to design reliable, usable data models.
• Review code, mentor junior engineers, and help establish best practices for data quality, reliability, and observability.
• Identify and fix performance, cost, or reliability issues in existing pipelines.



