Data Scientist | Business Analyst | Tableau Developer
Driving business growth & bridging gaps through data-driven insights
I am a highly skilled and experienced Data Scientist and Business Analyst with expertise in strategic planning, data analytics and visualization, ML/AI, and project management. With a proven track record of developing and executing strategic initiatives in the logistics, manufacturing, and semiconductor industries, I specialize in transforming complex data into actionable insights that drive business growth.
Outside of work, I enjoy deep diving into published research, personal data/ML projects, hiking, camping, and home DIY projects.
Bellevue University, Bellevue NE, 2025
Fitchburg State University, Fitchburg MA, 2015
January 2018 - August 2018
May 2019 - April 2022
April 2022 - Oct 2024
This project examines the surge in U.S. airport complaints since 2020, focusing on Orlando International Airport (MCO), where CLEAR’s Expedited Passenger Screening Program has driven dissatisfaction, unlike at airports such as Dallas Love Field. Using Department of Transportation data, a PowerPoint presentation to MCO board members proposes that decommissioning CLEAR could cut complaints, boost morale, and raise profits. One limitation is that complaints that were never filed leave gaps in the data.
A machine learning model that detects phishing URLs using XGBoost, achieving 99.5% accuracy on a balanced dataset of over 200,000 URLs. The model relies on 48 engineered features, such as "suspicious keywords" and whether the URL uses "https" rather than "http". Python code was written to enable real-world deployment in a browser extension to enhance cybersecurity.
An exploration of customer sentiment about Nike in Reddit discussions from 2022 to 2025. The analysis applies sentiment analysis and topic modeling to Reddit comments to understand the root of customer perception issues. The results will help inform Nike’s product strategy, branding, and investor relations by identifying key pain points and areas of customer enthusiasm.
An exploration of the link between rising U.S. pedestrian deaths and increasing SUV sales. Building on a 2004 study showing that SUVs are over twice as deadly to pedestrians as cars, exploratory data analysis (EDA) confirms that SUV sales correlate strongly with pedestrian deaths, outperforming cell phone subscriptions as a predictor in regression models. The project did not analyze handheld device bans in depth and lacked data on causes of death (e.g., distracted driving). Initial assumptions about passenger car sales were disproven, and data inconsistencies posed challenges.
A data pipeline built with NiFi, HDFS, Hive, Spark, Spark MLlib, and HBase to process and predict solar flare magnitudes (Max PFU) using NOAA Space Weather Prediction Center data from 1976 to 2024. Solar flares, massive bursts of solar radiation, can disrupt Earth’s technology, making accurate, real-time forecasts vital for risk management. The pipeline ingests data via NiFi from a GitHub-hosted CSV, stores it in HDFS, processes it with Hive and PySpark (cleaning the data and transforming features such as latitude), and trains a Decision Tree Regression model in Spark MLlib, achieving an R² of 0.75 and an RMSE of 50.23, with latitude as the top predictor. Performance metrics are stored in HBase, demonstrating a scalable framework for real-time solar flare prediction.