I’m a Data Engineer & Analyst based in San Diego, CA, with hands-on experience across data engineering, machine learning, NLP, and business intelligence. I specialize in designing scalable ETL pipelines, automating data workflows, and developing interactive dashboards that transform raw data into actionable business insights.
My work bridges data engineering and analytics, leveraging Python, SQL, and cloud platforms like Azure, AWS, and Snowflake to enhance data quality, scalability, and decision-making across organizations. I’ve also built and deployed predictive models and NLP-driven pipelines, integrating machine learning into production-ready data systems that improve forecasting and operational efficiency.
I’m passionate about solving complex data challenges, collaborating with cross-functional teams, and building reliable, automated systems that deliver measurable results. Whether it’s reducing processing time by 40%, maintaining 99% data uptime, or improving forecast accuracy by 25%, I focus on creating data solutions that drive lasting business value.
Transforming Data into Measurable Impact
"The best data solutions don't just provide answers — they reveal opportunities. I build intelligent data systems that power decision-making, streamline workflows, and unlock business growth through engineering, analytics, and automation."
Results-Driven
Delivering measurable impact through data pipelines, automation, and analytics — improving efficiency, accuracy, and scalability across cloud platforms.
Collaborative
Bridging engineering precision with business strategy. Partnering with cross-functional teams to turn technical insight into actionable business outcomes.
Innovative
Applying cutting-edge tools, machine learning, and cloud technologies to design elegant, scalable, and future-ready data solutions.
Thank you for exploring my work. I'm passionate about building intelligent, data-driven systems that create lasting value. Let's connect and discuss how data engineering can drive innovation for your organization.
I bring a comprehensive toolkit spanning data engineering, AI/ML, NLP, cloud platforms, and business intelligence. My technical expertise enables me to build end-to-end data systems that are scalable, intelligent, and sustainable — driving measurable impact and clarity across organizations.
Databases & Querying
SQL Server (T-SQL, SSRS, SSIS), PostgreSQL, MySQL, Snowflake, MongoDB — building reliable data foundations through optimized queries and efficient data modeling.
ETL & Data Warehousing
Airflow, Azure Data Factory, Alteryx, Spark, AWS Redshift, AWS Glue — orchestrating seamless data pipelines that scale reliably to millions of records.
AI, Machine Learning & NLP
TensorFlow, PyTorch, scikit-learn, spaCy, NLTK — designing and deploying machine learning, NLP, and GenAI models for forecasting, classification, and automation. Focused on agentic systems, ethical AI, and AI for real-world impact.
BI & Analytics
Power BI (DAX, Data Models), Tableau, Excel (Pivot Tables, Power Pivot) — creating intelligent dashboards that transform data into actionable insights and strategic decisions.
Cloud Platforms
Azure SQL Database, Azure Synapse Analytics, AWS (S3, RDS, Lambda) — architecting cloud-native and hybrid solutions that integrate data pipelines and AI workloads.
Programming & Automation
Python, PowerShell, Docker, Hive — automating workflows, optimizing ETL performance, and reducing manual processes through smart scripting.
Data Quality, Governance & Sustainable Data Science
Validation Frameworks, Forecasting, KPI Reporting — ensuring data accuracy, ethical handling, and sustainability in analytics and AI operations.
Data Strategy & Visualization Storytelling
Blending analytical insights with visual storytelling to help organizations interpret data effectively. Turning KPIs, forecasts, and performance trends into compelling, stakeholder-ready narratives that drive business growth.
Professional Experience
My career spans data engineering, analytics, and machine learning roles where I've consistently delivered measurable impact through technical innovation and strategic collaboration.
Data Engineer
Mars Neuro Tek
Sep 2025 – Present | Poway, CA, Hybrid
I design and optimize real-time ETL pipelines in Python & SQL, integrating CRM feeds to maintain 99% data uptime. My Power BI dashboards visualize critical KPIs and client metrics, reducing manual reporting effort by 60%.
I've implemented automated validation scripts that increased data reliability by 25% and partnered with leadership to translate complex business questions into analytical models, achieving 15% faster forecasting cycles. I also established data architecture standards across AWS S3 and Azure SQL environments.
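For illustration only (this is not the production code), a rule-based validation pass of the kind described above could be sketched in Python as follows. The record fields, the rule names, and the `Rule` and `validate` helpers are all hypothetical, chosen just to show the pattern of separating clean rows from rows that fail named checks.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Rule:
    """A named check that returns True when a row passes."""
    name: str
    check: Callable[[dict], bool]


def validate(rows: Iterable[dict], rules: list[Rule]):
    """Return (valid_rows, errors); each error is (row_index, rule_name)."""
    valid, errors = [], []
    for i, row in enumerate(rows):
        failed = [rule.name for rule in rules if not rule.check(row)]
        if failed:
            errors.extend((i, name) for name in failed)
        else:
            valid.append(row)
    return valid, errors


# Hypothetical rules for CRM-style records (assumed field names).
RULES = [
    Rule("client_id_present", lambda r: bool(r.get("client_id"))),
    Rule(
        "amount_non_negative",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
]

rows = [
    {"client_id": "C001", "amount": 120.0},
    {"client_id": "", "amount": 35.5},      # fails client_id_present
    {"client_id": "C003", "amount": -4},    # fails amount_non_negative
]

valid, errors = validate(rows, RULES)
print(len(valid), errors)
```

Keeping each rule as a small named function makes it straightforward to report which check failed for which row, and to add new checks without touching the loop.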
Data Engineer & Analyst
San Diego State University
Oct 2024 – Sep 2025
I automated ETL workflows using Airflow, SQL Server, and Python, cutting data load time by 40%. My Power BI dashboards for leadership engagement analytics were adopted across multiple departments.
I reconciled datasets across SQL Server, Snowflake, SAS, and SAP systems, improving data accuracy by 20%. I also documented comprehensive ETL logic and data dictionaries to ensure scalability and transparency for future teams.
Machine Learning Engineer
San Diego State University
Jan 2023 – May 2024
I developed deep learning models using TensorFlow and PyTorch on 500GB of spectroscopic data, achieving 95% precision. My preprocessing pipelines reduced training time by 35%.
I collaborated closely with physicists and data scientists to integrate ML outputs into research workflows, bridging the gap between advanced analytics and practical scientific applications.
Data Analyst
Udyog Mart
Aug 2021 – Jun 2022
I built Power BI and Tableau dashboards that improved decision-making efficiency by 30%. My SQL + Alteryx ETL workflows increased data throughput by 35%.
I developed forecasting models that reduced supply-chain delays by 25%, directly impacting operational efficiency and customer satisfaction.
Data Engineer
Servify
Jan 2021 – Jun 2021
I developed scalable ETL pipelines using MongoDB, PostgreSQL, and AWS Redshift, boosting processing speed by 75%. My Python validation scripts improved data accuracy by 80%.
I migrated legacy dashboards to Power BI and Tableau, improving reporting turnaround time by 40% and providing stakeholders with more interactive and insightful visualizations.
Featured Projects
I've delivered data solutions across diverse domains — from financial analytics to machine learning forecasting — consistently achieving measurable improvements in accuracy, efficiency, and business outcomes.
Strategic & Financial Dashboard
McDonald's vs Burger King Comparative Analysis
I built a comprehensive Power BI dashboard integrating multi-source financial data using Python and DAX for competitive analysis. The solution delivered over 15 actionable insights.
This dashboard improved analysis turnaround by 50%, enabling faster strategic decision-making. The automated data integration pipeline ensures real-time relevance.
Tech Stack: Power BI, Python, DAX, SQL
Banana Harvest Forecasting
Advanced Machine Learning Project
I developed sophisticated forecasting models including GRU, Hybrid RNN–XGBoost, and Bi-LSTM architectures using TensorFlow and PyTorch on 1 million records, optimizing performance through hyperparameter tuning.
The solution improved forecasting accuracy by 25% and reduced prediction error by 20–25%, providing reliable harvest projections for resource planning and logistics optimization.
I built a comprehensive predictive pipeline using SQL, Python, and AWS SageMaker, implementing K-Means clustering for segmentation and Random Forest models for churn prediction.
The analytics engine improved customer retention by 20% and satisfaction scores by 15%, delivering significant ROI through targeted intervention strategies.
Tech Stack: SQL, Python, AWS SageMaker, K-Means, Random Forest
Cloud-Driven Revenue Insights Platform
Multi-Property Analytics Solution
I designed a Python + SQL ETL pipeline to unify data from multiple property sources into a centralized analytics platform. This reduced data load time by 40% through optimized batch processing.
My Power BI dashboards improved revenue visibility by 25% across properties, enabling data-driven pricing strategies that contributed to a 10% revenue increase.
Tech Stack: Python, SQL, ETL, Power BI
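As a rough sketch of the batch-processing idea behind this pipeline (not the actual implementation), the Python below loads records in fixed-size batches with one transaction per batch. It uses an in-memory SQLite database as a stand-in; the `revenue` table, its columns, the batch size, and the `load_in_batches` helper are assumptions made for the example.

```python
import sqlite3
from itertools import islice


def batched(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk


def load_in_batches(conn, records, batch_size=500):
    """Insert records in batches, committing once per batch."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS revenue "
        "(property_id TEXT, day TEXT, amount REAL)"
    )
    total = 0
    for chunk in batched(records, batch_size):
        with conn:  # commits on success, rolls back the batch on error
            conn.executemany("INSERT INTO revenue VALUES (?, ?, ?)", chunk)
        total += len(chunk)
    return total


# Hypothetical multi-property records, generated lazily.
conn = sqlite3.connect(":memory:")
records = ((f"P{i % 3}", "2024-01-01", float(i)) for i in range(1200))

loaded = load_in_batches(conn, records, batch_size=500)
row_count = conn.execute("SELECT COUNT(*) FROM revenue").fetchone()[0]
print(loaded, row_count)
```

Grouping inserts this way avoids a commit per row, which is typically where row-by-row loads lose most of their time, and a failed batch can be retried without reloading everything.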
Plastic Detection & Classification
AI for Environmental Sustainability
I developed a deep learning model for real-time identification and classification of various plastic types from visual data, crucial for recycling and waste management processes.
This solution achieved 92% accuracy in distinguishing plastic waste types, improving the efficiency of automated sorting systems in support of environmental sustainability efforts.
Tech Stack: Python, TensorFlow, Keras, OpenCV
Interactive Sales Growth Dashboard
Retail Analytics & Visualization
I designed and implemented an interactive sales dashboard in Tableau, integrating diverse retail data sources (transactions, inventory, customer demographics) to provide a holistic view of performance.
The dashboard led to a 15% improvement in sales strategy formulation and a 10% reduction in inventory discrepancies, empowering stakeholders with real-time insights for informed decisions.
Tech Stack: Tableau, SQL, Excel
Face Detection for Attendance System
AI-Powered Automation in Education
I developed a real-time face detection system using pre-trained deep learning models for automated student attendance tracking in classrooms, reducing manual entry errors and improving accuracy.
This system achieved an attendance recording accuracy of over 97%, significantly streamlining administrative tasks and providing valuable insights into student presence patterns.
Tech Stack: Python, OpenCV, Dlib, Flask
Vibrational Stark Effect of HFIP Isomers
Computational Chemistry & Molecular Data Modeling
I conducted extensive computational chemistry simulations using Gaussian to model the Vibrational Stark Effect (VSE) in HFIP isomers, analyzing spectral shifts induced by electric fields.
The project successfully validated theoretical predictions with experimental data, contributing to a deeper understanding of molecular interactions and achieving 88% correlation between predicted and observed shifts.
Advanced coursework in machine learning, statistical modeling, big data analytics, and data engineering. Completed thesis research in deep learning applications for scientific data analysis.
B.E. in Computer Engineering
University of Mumbai
2018–2022
Foundation in computer science, algorithms, databases, and software engineering. Graduated with honors, focusing on data structures and system design.
Professional Certifications
Microsoft Certified: Power BI Data Analyst — Expertise in DAX, data modeling, and advanced visualizations
Machine Learning with Python — freeCodeCamp comprehensive certification
Data Analysis with Python — freeCodeCamp data manipulation and analysis
Machine Learning — Stanford University (Credential ID: 5QF9KKQL4VGS)
MATLAB for Engineers — MathWorks technical computing certification
Diploma in AI + Machine Learning — Squad Infotech specialized training
2
Graduate Degrees
Advanced education in Data Science and Computer Engineering
7+
Certifications
Professional credentials across cloud, analytics, and machine learning
3+
Years Experience
Hands-on expertise building production data systems
Impact by Numbers – Data That Drives Results
My work turns raw data into measurable business outcomes — improving reliability, scalability, and decision-making across organizations.
40%
Data Load Time Reduction
Achieved through optimized ETL pipelines and batch-processing frameworks. 92% probability of consistently maintaining the 40% load-time reduction
15%
Revenue Increase
Enabled data-driven pricing strategies and real-time forecast analytics. 88% forecast accuracy improvement probability across test periods
97%
System Accuracy
Delivered by AI automation in attendance tracking and plastic detection systems. 97% ± 2% model confidence on validation datasets
30%
Sales Strategy Efficiency Improvement
Driven by interactive dashboard visualizations and predictive metrics. 85% stakeholder adoption probability across departments
25%
Forecasting Accuracy Increase
Achieved through hybrid machine-learning models and data optimization. 93% model stability probability over 5 evaluation cycles
20%
Customer Retention Boost
Through predictive segmentation and targeted engagement models. 89% retention improvement probability across sample cohorts
Every metric represents not just a number but a probability of continued impact through data engineering excellence.
Let's Connect
I'm always interested in discussing data engineering opportunities, collaborative projects, or technical challenges. Whether you're looking for expertise in building scalable ETL pipelines, creating insightful analytics dashboards, or implementing machine learning solutions, I'd love to hear from you.
📍 Location
San Diego, CA
Available for hybrid and remote opportunities