Naman is a Data Engineer with 9+ years of experience in developing and deploying ETL pipelines and managing data solutions. He has a strong background in data ingestion, processing, and optimization across various cloud platforms.
Led development efforts for ETL pipelines handling over 1 billion rows of data.
Achieved a 40% reduction in query processing time through performance tuning in Snowflake.
Managed Kubernetes deployments for critical data infrastructure, ensuring high availability.
Developed a client upgrade system capable of upgrading over 350 clients weekly.
Configured automated ETL jobs with comprehensive monitoring and alert systems.
Overview: Developed an application mediation layer to ingest and transform large volumes of data from telecom network sites for efficient storage and analytical use.
Responsibilities:
Developed ETL pipelines from scratch using Python, PySpark, Airflow, and Cassandra, and deployed them in Snowflake (a sketch of one pipeline stage follows below).
Managed Kubernetes deployments for PySpark and Cassandra.
Created data models and conducted POCs for data storage technologies (NoSQL or NewSQL).
Configured ODI's scheduling framework for automated ETL jobs with monitoring and alert systems.
Key outcomes:
Successfully ingested over 1 billion rows of telecom network data.
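A minimal sketch of what one batch stage of such a pipeline could look like, assuming a PySpark job that normalizes raw telecom site records before loading; the paths, column names, and partitioning scheme are illustrative, not taken from the project:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("telecom_mediation").getOrCreate()

    # Read one day of raw site records (path is a placeholder).
    raw = spark.read.json("s3://telecom-raw/sites/2024-01-01/")

    # Normalize: drop records without a timestamp, derive a date
    # column, and de-duplicate on an assumed natural key.
    curated = (
        raw.filter(F.col("event_ts").isNotNull())
           .withColumn("event_date", F.to_date("event_ts"))
           .dropDuplicates(["site_id", "event_ts"])
    )

    # Partitioning by date keeps queries over billions of rows selective.
    curated.write.mode("append").partitionBy("event_date").parquet(
        "s3://telecom-curated/sites/"
    )

In a setup like this, Airflow would schedule the job and Snowflake would consume the curated output.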
Overview: Contributed to a Security Information and Event Management (SIEM) tool that collects logs from hundreds of sources and analyzes them to detect cyber-attacks and vulnerabilities.
Responsibilities:
Developed and containerized an AWS CloudTrail connector/ingester agent using Docker for easy deployment (a sketch follows below).
Created Python-based connectors to fetch and ETL data from multiple input sources.
Developed a client upgrade system using Ansible and Jenkins, capable of upgrading over 350 clients weekly.
Implemented performance tuning for ODI mappings and robust error handling for ETL processes.
Key outcomes:
Enabled upgrades of over 350 clients in a single week.
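A minimal sketch of the fetch side of such a connector, assuming boto3 and a 15-minute polling window; the output format and field selection are assumptions, and the real agent ran containerized with Docker:

    import json
    from datetime import datetime, timedelta, timezone

    import boto3

    client = boto3.client("cloudtrail")
    end = datetime.now(timezone.utc)
    start = end - timedelta(minutes=15)

    # Page through recent management events and emit JSON lines
    # for the downstream SIEM ingest stage.
    paginator = client.get_paginator("lookup_events")
    for page in paginator.paginate(StartTime=start, EndTime=end):
        for event in page["Events"]:
            print(json.dumps({
                "event_id": event["EventId"],
                "event_name": event["EventName"],
                "event_time": event["EventTime"].isoformat(),
            }))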
Overview: Led a critical initiative to implement a real-time data integration solution using Oracle Data Integrator (ODI) and Oracle GoldenGate for synchronizing data between operational systems and a central data repository.
Responsibilities:
Designed and implemented a real-time ETL pipeline streaming transactional data from multiple source systems (ERP, CRM) with minimal latency.
Incorporated data validation rules and quality checks to ensure data accuracy and consistency (sketched below).
Optimized ODI mappings and GoldenGate processes for high throughput and minimal latency, and set up continuous monitoring dashboards.
Key outcomes:
Achieved a 40% reduction in query processing time through performance optimization.
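ODI and GoldenGate are configured through their own tooling rather than application code, but the kind of validation rule described above can be illustrated in Python; the field names, rules, and sample records are assumptions, not the project's actual schema:

    # Quality rules of the kind applied to streamed transactional
    # records before load (all names are illustrative).
    REQUIRED = ("txn_id", "account_id", "amount", "txn_ts")

    def validate(record: dict) -> list[str]:
        """Return the list of rule violations for one record."""
        errors = [f"missing {f}" for f in REQUIRED if record.get(f) is None]
        if record.get("amount") is not None and record["amount"] < 0:
            errors.append("negative amount")
        return errors

    # Tiny sample batch standing in for one replication micro-batch.
    incoming_batch = [
        {"txn_id": 1, "account_id": "A1", "amount": 250.0, "txn_ts": "2024-01-01T00:00:00"},
        {"txn_id": 2, "account_id": None, "amount": -5.0, "txn_ts": None},
    ]

    accepted, rejected = [], []
    for rec in incoming_batch:
        (rejected if validate(rec) else accepted).append(rec)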
Overview: Designed and implemented a scalable cloud-based data warehouse solution on Snowflake to support business intelligence and analytics.
Responsibilities:
Designed and deployed the Snowflake data warehouse architecture, optimizing for cost and scalability.
Developed and optimized ETL pipelines using Snowflake, Kafka, and Airflow for data ingestion and transformation.
Created and maintained star and snowflake schemas for effective querying.
Optimized query performance using clustering keys, materialized views, and partitioning, achieving a 40% reduction in query processing time (sketched below).
Key outcomes:
Achieved a 40% reduction in query processing time through performance optimization.
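A minimal sketch of tuning statements of the kind described above, issued through the Snowflake Python connector; the connection details, table, and column names are placeholders, not the project's actual objects:

    import snowflake.connector

    conn = snowflake.connector.connect(
        account="my_account", user="etl_user", password="...",
        warehouse="ANALYTICS_WH", database="DW", schema="MARTS",
    )
    cur = conn.cursor()

    # Clustering key so the optimizer can prune micro-partitions
    # on the most common filter columns.
    cur.execute("ALTER TABLE fact_sales CLUSTER BY (sale_date, region)")

    # Materialized view to pre-aggregate a hot query path.
    cur.execute("""
        CREATE OR REPLACE MATERIALIZED VIEW mv_daily_sales AS
        SELECT sale_date, region, SUM(amount) AS total_amount
        FROM fact_sales
        GROUP BY sale_date, region
    """)

    cur.close()
    conn.close()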