Mohit Bansal

Data Engineer
Gurgaon, IN.

About

Data Engineer with 2 years of experience developing and optimizing large-scale ETL pipelines using tools like Redshift, Databricks, Snowflake, and Hadoop. Proven ability to reduce processing times and automate complex data workflows using PySpark and cloud-based services. Adept at designing scalable, efficient data systems that enhance business decision-making.

Work

ZS Associates

Technology Associate

Highlights

Designed and implemented an end-to-end sales crediting system for 340+ U.S. sales territories.

Designed and executed an incentive program based on sales performance, enhancing motivation among sales teams by rewarding top performers.

Developed and orchestrated ETL pipelines using Apache Airflow, reducing data refresh times by 80%.

Optimized data processing with Spark techniques like caching and broadcast joins.

Migrated legacy scripts to an event-driven AWS architecture using Glue, Lambda, and DynamoDB.

Automated business logic in Amazon Redshift with scheduled jobs (daily, weekly, quarterly), reducing manual intervention by 90%.

Design and Innovation Center, MHRD, Govt. of India

Data Science Intern

Highlights

Analyzed 500,000+ global health records using Python and machine learning to identify trends in disease prevalence.

Developed 3 predictive models (Random Forest, XGBoost, Logistic Regression), improving accuracy by 15%.

Increased engagement by 40% by deploying a Tableau dashboard with 10+ interactive visualizations.

Education

UIET Panjab University

Bachelor of Technology

Information Technology

Grade: 8.67/10

Government Model Senior Secondary School

Degree

Class XII

Grade: 83.4/100

Awards

PySpark End to End Developer Course

Awarded By

Udemy

Solved 900+ coding problems on LeetCode, strengthening algorithmic and problem-solving skills.

Presented a research paper at the IEMIS 2022 International Conference.

Orchestrated a flawless global go-live for a top pharmaceutical client under aggressive timelines, earning recognition from the ZS Chairman for exceptional execution.

Skills

Languages

SQL, Python, PySpark, C++.

Data Processing

Spark, Hive, EMR, Airflow, Databricks.

Data Storage & Warehousing

Amazon S3, Amazon Redshift, Hadoop, Snowflake.

Data Analysis

NumPy, Pandas, Matplotlib, Tableau, MS Excel, MS Access, PowerPoint.

Cloud & Tools

AWS Glue, Lambda, DynamoDB, EC2.

Projects

Early Parkinson Disease Detection Using Audio Signal Processing