Yuvraj is a Data Engineer with 5+ years of experience in big data processing and analytics, specializing in AWS and MLOps. He has a proven track record in designing real-time data pipelines and optimizing machine learning workflows.
Designed and implemented multiple real-time financial data pipelines on AWS
Developed automated PySpark workflows for analyzing large-scale credit card transaction data
Successfully transitioned data processing strategies from SQL queries to AWS Glue jobs, significantly improving ETL processes
Overview: This project delivered a real-time financial data pipeline built on the Finnhub API. Responsibilities: Designed and implemented the end-to-end pipeline on AWS, using Databricks for scalable data ingestion and processing to support fintech applications such as algorithmic trading.
Key outcomes:
Successfully designed and implemented a real-time financial data pipeline on AWS
Created a scalable data engineering solution for real-time financial market data
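To illustrate the ingestion step, here is a minimal sketch of polling Finnhub's /quote REST endpoint; the `parse_quote` helper and the record field names are illustrative assumptions, not the project's actual code.

```python
import json
import urllib.parse
import urllib.request

FINNHUB_QUOTE_URL = "https://finnhub.io/api/v1/quote"

def parse_quote(symbol, payload):
    # Flatten a Finnhub /quote response into a pipeline-friendly record.
    return {
        "symbol": symbol,
        "price": payload["c"],       # current price
        "open": payload["o"],
        "high": payload["h"],
        "low": payload["l"],
        "prev_close": payload["pc"],
        "ts": payload["t"],          # UNIX timestamp of the quote
    }

def fetch_quote(symbol, api_key):
    # One poll of the REST endpoint; a real-time pipeline would loop on a
    # schedule or use Finnhub's websocket feed instead.
    query = urllib.parse.urlencode({"symbol": symbol, "token": api_key})
    with urllib.request.urlopen(f"{FINNHUB_QUOTE_URL}?{query}", timeout=5) as resp:
        return parse_quote(symbol, json.load(resp))
```

Records in this shape can then be written to a stream or landing bucket for downstream Databricks processing.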
Overview: This project migrated an organization's data from on-premises systems to cloud-based platforms. Responsibilities: Developed automated PySpark workflows for analyzing large-scale credit card transaction data stored in AWS S3, and optimized storage and processing strategies to take full advantage of cloud-based analytics.
Key outcomes:
Successfully migrated an organization's data from on-premises to cloud platforms
Developed automated PySpark workflows for large-scale credit card transaction data
Overview: This project extracted, processed, and analyzed credit card transaction data using PySpark. Responsibilities: Created comprehensive data pipelines for banking operations, using Databricks on AWS to process and transform data from multiple sources.
Key outcomes:
Created comprehensive data pipelines specifically for banking operations
Developed PySpark programs to efficiently extract and analyze delta data from S3
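The delta-extraction step can be sketched in plain Python; in the actual job this comparison would run as a PySpark join over two S3 snapshots, and the `txn_id` key field is a hypothetical name.

```python
def extract_delta(previous, current, key="txn_id"):
    # Return records in `current` that are new or changed relative to
    # `previous` -- an in-memory analogue of the snapshot-comparison join
    # a PySpark job would run over S3 data.
    prev_by_key = {rec[key]: rec for rec in previous}
    return [rec for rec in current if prev_by_key.get(rec[key]) != rec]
```

Downstream stages then only need to process the (usually much smaller) delta instead of the full snapshot.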
Overview: This project developed a comprehensive set of data pipelines for various aspects of banking operations, including employee incentives and performance ranking. Responsibilities: Implemented data pipelines using PySpark and Airflow, with Nutanix Object-store as the data lake.
Key outcomes:
Implemented data pipelines for calculating customer delinquency rates for credit cards and loans
Successfully processed and consolidated data from multiple sources for banking operations
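The delinquency-rate calculation reduces to a simple ratio of overdue accounts to total accounts. A minimal sketch follows; the 30-day threshold and the `dpd` (days-past-due) field are assumptions for illustration, not the project's actual definitions.

```python
def delinquency_rate(accounts, days_past_due=30):
    # Share of accounts more than `days_past_due` days overdue.
    # `accounts` is a list of dicts carrying a days-past-due field `dpd`.
    if not accounts:
        return 0.0
    overdue = sum(1 for acct in accounts if acct["dpd"] > days_past_due)
    return overdue / len(accounts)
```

In a pipeline this ratio would be computed per product (credit cards, loans) and per reporting period.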
Overview: This project enhanced data transformation strategies by transitioning from traditional SQL queries to AWS Glue jobs. Responsibilities: Migrated transformation logic from SQL queries on Redshift to PySpark-based AWS Glue jobs.
Key outcomes:
Successfully transitioned the data processing strategy from SQL queries to AWS Glue jobs, resulting in improved ETL performance
Enhanced data processing efficiency through the development of PySpark jobs for transformation and aggregation
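The transformation-and-aggregation step can be illustrated without Spark as a plain group-by roll-up; the field names are illustrative, and in a Glue job the same logic would be expressed as a PySpark `groupBy(...).agg(...)` over DataFrames.

```python
from collections import defaultdict

def aggregate_by_customer(transactions):
    # Roll up transaction amounts per customer -- a pure-Python analogue
    # of the groupBy/agg aggregation a Glue PySpark job runs at scale.
    totals = defaultdict(float)
    for txn in transactions:
        totals[txn["customer_id"]] += txn["amount"]
    return dict(totals)
```

The gain from Glue comes from running this kind of aggregation distributed across workers rather than rewriting it as hand-tuned SQL.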