Pallavi is a Data Engineer with 5+ years of hands-on experience across AWS, GCP, and Azure. She specializes in building robust data pipelines and optimizing ETL processes using Snowflake, Python, and DBT.
Led the adoption of DBT to streamline end-to-end data engineering workflows.
Transitioned data processing from Redshift SQL queries to AWS Glue jobs, improving ETL efficiency.
Implemented CI/CD pipelines for data warehousing and automated data processing using GitHub Actions.
Developed robust integration pipelines on Azure Data Factory and implemented PySpark jobs on Azure Databricks.
Migrated merchandising data from Hadoop to Google Cloud Platform (GCP), optimizing processes with Jenkins and Cloud Composer (managed Apache Airflow), reducing processing time by 25%.
Implemented Business Intelligence solutions involving data migration from SQL Server to GCP BigQuery, ensuring data integrity through rigorous quality checks.
Overview: This project transitioned data processing from Redshift SQL queries to AWS Glue jobs to improve ETL efficiency for a retail company.
Responsibilities: Migrated data processing from SQL queries on Redshift to AWS Glue jobs. Developed PySpark jobs for transformation and aggregation reports, improving data processing efficiency. Built integration pipelines to various data sources using AWS Glue.
Key outcomes:
Moved ETL workloads from Redshift SQL queries to AWS Glue jobs, improving pipeline efficiency.
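The transformation-and-aggregation step of such a Glue job can be sketched as below. The field names (store_id, sale_date, amount) and the daily-revenue rollup are hypothetical stand-ins for the actual job logic; the sketch is written in plain Python for portability, where the production job would use PySpark (e.g. `df.groupBy("store_id", "sale_date").agg(F.sum("amount"))`).

```python
from collections import defaultdict

def aggregate_daily_revenue(rows):
    """Roll raw sale records up to (store_id, sale_date) totals.

    Mirrors the groupBy/sum step a Glue PySpark job would perform;
    all column names here are illustrative, not from the real pipeline.
    """
    totals = defaultdict(float)
    for row in rows:
        totals[(row["store_id"], row["sale_date"])] += row["amount"]
    return [
        {"store_id": s, "sale_date": d, "revenue": round(v, 2)}
        for (s, d), v in sorted(totals.items())
    ]

sales = [
    {"store_id": "S1", "sale_date": "2024-01-01", "amount": 19.99},
    {"store_id": "S1", "sale_date": "2024-01-01", "amount": 5.01},
    {"store_id": "S2", "sale_date": "2024-01-01", "amount": 12.50},
]
print(aggregate_daily_revenue(sales))
# → [{'store_id': 'S1', 'sale_date': '2024-01-01', 'revenue': 25.0},
#    {'store_id': 'S2', 'sale_date': '2024-01-01', 'revenue': 12.5}]
```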
Overview: This project optimized data processing and engineering workflows for a retail company using DBT.
Responsibilities: Led data processing initiatives using DBT, overseeing the complete end-to-end data engineering workflow. Wrote ETL code in Snowflake, moving data through raw, transformed, and aggregated layers.
Key outcomes:
Streamlined end-to-end data engineering processes by leading the adoption of DBT.
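The raw → transformed → aggregated layering described above can be illustrated as chained SQL views, which is essentially what DBT models compile to in the warehouse. This runnable sketch uses SQLite in place of Snowflake, with hypothetical table and column names; Snowflake SQL and DBT materialization settings would differ in detail.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Raw layer: data as it lands from source systems (names are illustrative).
con.execute("CREATE TABLE raw_orders (order_id INT, amount TEXT, status TEXT)")
con.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, "10.00", "complete"), (2, "7.50", "complete"), (3, "3.00", "cancelled")],
)
# Transformed layer: typed and filtered, akin to a dbt staging model.
con.execute("""
    CREATE VIEW stg_orders AS
    SELECT order_id, CAST(amount AS REAL) AS amount
    FROM raw_orders WHERE status = 'complete'
""")
# Aggregated layer: business-level rollup, akin to a dbt mart model.
con.execute("""
    CREATE VIEW fct_revenue AS
    SELECT COUNT(*) AS orders, SUM(amount) AS revenue FROM stg_orders
""")
print(con.execute("SELECT orders, revenue FROM fct_revenue").fetchone())
# → (2, 17.5)
```

In DBT each layer would live in its own model file and be wired together with `ref()`, but the data flow is the same.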
Overview: This project streamlined data integration and automated processing tasks for a retail company using Azure services.
Responsibilities: Developed integration pipelines in Azure Data Factory, ensuring smooth data flow. Implemented PySpark jobs for data transformation on Azure Databricks, improving processing efficiency.
Key outcomes:
Developed robust integration pipelines on Azure Data Factory for smooth data flow.
Overview: This project implemented a merchandising solution for a retail company, migrating data from Hadoop to Google Cloud Platform (GCP) via Hive-to-BigQuery conversion.
Responsibilities: Migrated merchandising data from Hadoop to GCP, converting Hive tables and queries for BigQuery.
Key outcomes:
Migrated merchandising data from Hadoop to Google Cloud Platform (GCP) via Hive-to-BigQuery conversion.
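One mechanical piece of any Hive-to-BigQuery conversion is translating Hive column types into BigQuery standard SQL types. The sketch below is a simplified mapping covering only common scalar types; it is not the project's actual tooling, and real migrations also have to handle complex types, parameterized decimals, and partitioning.

```python
# Simplified Hive -> BigQuery scalar type mapping (illustrative subset only).
HIVE_TO_BQ = {
    "string": "STRING",
    "int": "INT64",
    "bigint": "INT64",
    "smallint": "INT64",
    "double": "FLOAT64",
    "float": "FLOAT64",
    "boolean": "BOOL",
    "timestamp": "TIMESTAMP",
    "date": "DATE",
}

def convert_columns(hive_columns):
    """Map (name, hive_type) pairs to BigQuery schema entries."""
    return [
        {"name": name, "type": HIVE_TO_BQ[hive_type.lower()]}
        for name, hive_type in hive_columns
    ]

print(convert_columns([("sku", "string"), ("qty", "bigint"), ("price", "double")]))
# → [{'name': 'sku', 'type': 'STRING'}, {'name': 'qty', 'type': 'INT64'},
#    {'name': 'price', 'type': 'FLOAT64'}]
```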
Overview: This project implemented a Business Intelligence (BI) solution for a retail company, migrating data from SQL Server to Google Cloud Platform (GCP) using BigQuery.
Responsibilities: Migrated data from SQL Server to BigQuery and verified data integrity with quality checks.
Key outcomes:
Migrated data from SQL Server to GCP BigQuery for the BI solution, verifying integrity with quality checks.
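Data-integrity checks for a migration like this typically include source/target reconciliation. The minimal sketch below compares row counts and a numeric column total between two extracts; the table data is hypothetical, standing in for the actual SQL Server and BigQuery query results, and real checks would usually add per-key diffs and null/duplicate audits.

```python
def reconcile(source_rows, target_rows, sum_column):
    """Compare row counts and a numeric column total between source and target.

    In the real migration the two inputs would come from a SQL Server query
    and the matching BigQuery query; here they are plain lists of dicts.
    """
    checks = {
        "row_count_match": len(source_rows) == len(target_rows),
        "sum_match": abs(
            sum(r[sum_column] for r in source_rows)
            - sum(r[sum_column] for r in target_rows)
        ) < 1e-9,
    }
    checks["passed"] = all(checks.values())
    return checks

src = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 5.5}]
tgt = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 5.5}]
print(reconcile(src, tgt, "amount"))
# → {'row_count_match': True, 'sum_match': True, 'passed': True}
```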