At Cloudera, we believe that data can make what is impossible today, possible tomorrow.
Senior Curriculum Developer – GenAI Applications, Data Engineering
Location
India
Posted
2 days ago
Salary
0
Seniority
Senior
Job Description
Cloudera
- Design and develop technical training focused on Generative AI, Agentic AI workflows, and AI application development
- Create instructor-led guides, lab manuals, Jupyter notebook-based exercises, and on-demand video content
- Develop technical assessments, certification questions, and demo environments
- Lead and conduct technical ML/AI and Agent development workshops directly for customers and internal teams
- Build and maintain labs using Python, AI/ML frameworks, Vector databases, and NVIDIA-accelerated data pipelines
- Partner with Product, Engineering, and Solution Architecture teams to ensure training reflects the latest platform capabilities
- Iterate on course content based on learner feedback, instructor input, and emerging industry trends like LLMOps
Job Requirements
- Total 6+ years of data and application development experience
- 4+ years in curriculum development, technical training, AI/ML/Data Engineering, or MLOps
- Proven ability to conduct hands-on workshops
- Ability to simplify complex AI concepts for technical and non-technical audiences
- Expertise in Agent frameworks (CrewAI, LangChain, NVIDIA AI-Q), Vector databases, RAG pipelines, and Git/version control
- Familiarity with the OpenAI v1 API
- Experience with NVIDIA AI Agent & Data Engineering stack
Experience & Competencies
- Advanced knowledge of Data Lakehouse architectures, Apache Spark, Kafka, and Airflow
- Familiarity with AI governance, Responsible AI, and NVIDIA OpenShell for secure agent runtimes
- Exposure to Cloudera environments, LMS platforms, and DevOps/CI/CD practices
- Experience with certification development and video recording/editing tools
Benefits
- Generous PTO Policy
- Support for work-life balance with Unplugged Days
- Flexible WFH Policy
- Mental & Physical Wellness programs
- Phone and Internet Reimbursement program
- Access to Continued Career Development
- Comprehensive Benefits and Competitive Packages
- Paid Volunteer Time
- Employee Resource Groups
More Data Engineer Jobs
- Lead the migration to Microsoft Fabric.
- Rebuild existing pipelines, workflows, and workspaces from our Azure environment.
- Expand and scale the platform to create value for users.
- Design and build ingestion for new sources, create and validate transformation code, and surface data in intuitive semantic layers.
- Maintain and improve data trust.
- Work in a medallion architecture.
- Put engineering quality first.
- Connect data to consumption.
- Consider storage and lifecycle.
Data Engineer
SuperStaff
Comprehensive BPO, RPO, and Call Center Outsourcing Solutions for Growing Businesses
Role Description
The Data Engineer will be part of the Professional Services team, working directly on customer implementations for our B2B sales intelligence platform, helping wholesale distributors get the most out of their data. This is a hands-on role where you'll build data pipelines, configure customer instances, troubleshoot data issues, and collaborate with Customer Success Managers to deliver high-quality solutions.
- Data Ingestion & ETL: Build Python/SQL pipelines to ingest invoices, orders, and catalogs. Perform historical backfills via SFTP/API and manage Airflow DAGs.
- Instance Configuration: Set up custom fields, product filtering logic, and sales workflows. Manage SSO and user provisioning for new rollouts.
- AI-Augmented Engineering: Leverage AI coding assistants (Copilot, Cursor) and LLMs to accelerate Python/SQL script generation, data mapping, and debugging. Build ETL pipelines and manage Airflow DAGs.
- Customer Communication & Projects: Act as a technical point of contact. Translate complex data issues into clear updates for customers. Own project milestones from kickoff to "Go Live."
- Integration & Automation: Build Workato recipes and connect customer ERPs via APIs/webhooks to ensure real-time data flow.
- QA & Troubleshooting: Triage HubSpot support tickets, debug data discrepancies in large data sets, and deploy production fixes.
- Documentation: Maintain customer data mappings and internal technical runbooks.
Qualifications
- Bachelor’s degree in Computer Science, Information Technology, or related field (or equivalent experience).
- At least 3 years of experience in data engineering, ETL, or related roles.
- Strong SQL skills for data analysis and validation.
- Proficiency in Python scripting for data processing.
- Familiarity with cloud platforms, preferably GCP (e.g., Cloud Storage, Cloud Functions).
- Understanding of REST APIs, JSON/CSV data formats, and Git workflows.
- Strong communication skills with the ability to explain technical concepts to non-technical stakeholders.
Requirements
- Experience with workflow orchestration tools (e.g., Apache Airflow).
- Familiarity with integration platforms (e.g., Workato, Zapier).
- Exposure to CRM tools (e.g., HubSpot, Asana).
- Knowledge of eCommerce or sales data models (e.g., products, orders, opportunities).
Technical Environment
- Data & Analytics: Python, SQL, BigQuery, Looker.
- Pipeline & Orchestration: Airflow, GCP (Cloud Storage).
- Automation & Integration: Workato, webhooks, APIs.
- Infrastructure: GCP, Kubernetes, GitHub Actions.
- Systems & Tools: HubSpot, Salesforce, SFTP/FTP, ERP systems.
- Project Management: Asana, GitHub.
Benefits
- Competitive Salary: COP $6,000,000.
- Stability: Indefinite contract with all mandatory Colombian legal benefits.
- Flexibility: 100% Work From Home (Remote only in Colombia).
- Growth: Continuous training, development programs, and clear opportunities for professional advancement.
- Culture: A collaborative, supportive, and diverse international team environment.
- Map, extract, cleanse, and transform data (patients, inventory, claims, orders, payers, etc.)
- Lead and manage full-cycle data migrations from legacy DME or HME systems into NikoHealth’s platform
- Collaborate with implementation consultants and product teams to ensure data accuracy and integrity
- Troubleshoot and resolve data discrepancies during migration
Data Engineer
DataSpring
DataSpring is the trusted data connector at the core of healthcare. For more than 25 years, we have powered the industry with the largest and most complete healthcare data foundation in the U.S., including more than 4.8 million provider data records sourced directly from providers and member data representing 75% of covered lives supplied by health plans. By improving how essential information flows across the system, DataSpring helps healthcare operate more efficiently, accurately, and with greater confidence.
Role Description
The Data Engineer will support the design, development, and maintenance of data pipelines and data models that power DataSpring's analytics and data platforms. This role focuses on implementing scalable data solutions, ensuring data quality, and collaborating with senior engineers and stakeholders to deliver reliable data for reporting and analysis. This position provides an opportunity to build hands-on experience with Databricks, Azure SQL, and modern data engineering practices while contributing to enterprise data initiatives and discussions. The Data Engineer is a full-time, remote, exempt position and reports to the Sr. Director, Data Engineering & Architecture.
Qualifications
- 1–3 years of experience in a data engineering or analytics engineering role, including internships or academic projects.
- Demonstrated success contributing to data modernization or migration initiatives in cloud environments.
- Prior experience working with healthcare or other regulated data environments is highly desirable.
- Bachelor’s degree in Computer Science, Information Systems, Data Engineering, or a related field.
- Azure Data Engineer Associate or related certification (preferred).
- Coursework or certification in AI/ML (preferred but not required).
Requirements
- Build and maintain ETL/ELT pipelines across Databricks, Azure SQL, and downstream gold-layer models supporting priority projects.
- Support development and enhancement of enriched data models, including field-level enrichment logic, recency rules, and provider-level enrichment flags.
- Assist in maintaining data logic, including reconciliation between source and target data sources and resolution of duplication and data discrepancies.
- Assist in implementing medallion architecture patterns (bronze → gold), ensuring data quality, traceability, and performance at scale.
- Support identification and resolution of systemic data quality issues, including null handling, soft deletes, authorization flags, and incorrect organizational mappings.
- Support implementation of rules for data in collaboration with product, governance, and engineering stakeholders.
- Assist in maintaining documentation (Confluence, mapping workbooks) that serves as a single source of truth for enrichment logic and data behavior.
- Support collaboration with vendors and partners by providing detailed queries, validation logic, and corrective guidance on upstream data issues.
- Collaborate with product owners and engineering teams to ensure data models align with product-defined use cases.
- Support UAT and release readiness by preparing data, validating counts, and resolving last-mile data defects under tight timelines.
Benefits
- Competitive compensation and a comprehensive benefits package for full-time employees.
- Medical, dental, and vision coverage.
- 401(k) with company contributions and matching.
- Paid parental leave.
- Tuition assistance.
- Generous paid time off.
- Commitment to investing in our people and supporting professional growth over time.



