Yogeshwaran is a Data Engineer with 4+ years of full-stack development experience, specializing in data engineering and cloud technologies. He has a strong background in building scalable data pipelines and integrating AI/ML models into workflows.
Developed and maintained REST APIs using Flask for various functionalities.
Implemented robust CI/CD pipelines with Jenkins and Azure DevOps for automated testing.
Designed and built scalable data pipelines on Azure, leveraging Apache Spark for efficient data processing.
Integrated ChatGPT API for automated table data extraction from PDFs.
Managed user access to Azure environments through roles and access policies.
Trust Capital (Python Developer): buying and selling bonds, IPOs, and other financial services; REST API built with Flask, PostgreSQL, and Azure Cloud.
Digital Surgery Platform (Data Engineer): device tenants engage with DSP to process and analyze data; React UI using a file search API, with a Python backend.
PlagBot (Data Engineer): plagiarism detection tool using ML models trained on existing articles, with data collected from multiple file formats (PDF, Word, text).
Retriever (Data Engineer): PDF data extraction and comprehensive data management, using the ChatGPT API for table data extraction and automated workflows.
Overview: Trust Capital involves buying and selling bonds, IPOs, and other financial services.
Responsibilities: Developed a REST API using Flask for server-side programming.
Key outcomes:
Developed REST API for financial services
Managed financial data using PostgreSQL
Overview: Device tenants engage with the Digital Surgery Platform (DSP) to process and analyze data from various sources.
Responsibilities: Created a React-based user interface that uses a file search API, with a Python backend, to give clinical users access to ingested files.
Key outcomes:
Developed interactive UI and video streaming platform
Ensured API reliability through comprehensive unit and integration testing
Implemented automated CI/CD pipelines for efficient deployments
Overview: PlagBot is a tool that detects and prevents plagiarism using machine learning models.
Responsibilities: Collected data from various file formats, parsed and extracted the text, and cleaned and preprocessed the extracted data.
Key outcomes:
Built plagiarism detection tool with ML models
Processed diverse file formats for data extraction
Performed data cleaning, preprocessing, and loading
Overview: Retriever focuses on data extraction from PDFs and comprehensive data management using AI and cloud data services.
Responsibilities: Used the ChatGPT API to extract table data from PDFs and developed workflows for accurate data extraction.
Key outcomes:
Integrated ChatGPT API for automated data extraction from PDFs
Built robust data cleaning pipelines using Databricks and Spark
Managed structured data in Azure Data Warehouse
Yogeshwaran
Data Engineer