Additional Data Science Projects

(2024 IronViz Challenge) Optimizing Confectionery Restocking using Forecasted Revenue in the US

This project aims to help CMU Dining Services optimize confectionery restocking decisions by forecasting revenue trends for different confectionery types. By analyzing historical revenue data and projected demand, we provide insights on which confectionery items (e.g., chocolate, ice cream, pastries) should be prioritized, especially for late-night availability in on-campus stores and vending machines. This helps ensure that students have access to convenient, affordable snacks during late study hours while minimizing waste and overstock.

Key Goals

  • Student Well-being: Ensure that essential confectionery items are available late at night, providing a quick energy boost and affordable snack options for students studying late.
  • Inventory Optimization: Use projected revenue data to adjust stock levels, balancing demand with cost efficiency to prevent waste and financial losses.
  • Support for Dining Services: Provide actionable insights to CMU Dining Services, helping them align inventory and promotional efforts with student demand patterns.
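
As a rough illustration of the inventory-optimization goal, the sketch below splits a fixed number of shelf slots across confectionery types in proportion to projected revenue. The proportional rule, the `allocate_slots` helper, and all revenue figures are assumptions made for this example, not the method or data used in the project.

```python
from typing import Dict


def allocate_slots(projected_revenue: Dict[str, float], total_slots: int) -> Dict[str, int]:
    """Split total_slots across items in proportion to projected revenue.

    Uses the largest-remainder method so the result always sums to total_slots.
    """
    total = sum(projected_revenue.values())
    if total <= 0:
        raise ValueError("projected revenue must sum to a positive value")

    # Exact (fractional) share of slots for each item.
    raw = {item: rev * total_slots / total for item, rev in projected_revenue.items()}
    # Start from the floor of each share.
    slots = {item: int(share) for item, share in raw.items()}
    # Hand out leftover slots to the items with the largest fractional parts.
    leftover = total_slots - sum(slots.values())
    by_remainder = sorted(raw, key=lambda item: raw[item] - slots[item], reverse=True)
    for item in by_remainder[:leftover]:
        slots[item] += 1
    return slots


# Hypothetical projected revenue per confectionery type (made-up numbers).
projected = {"chocolate": 500.0, "ice cream": 300.0, "pastries": 200.0}
plan = allocate_slots(projected, total_slots=10)
print(plan)
```

A real restocking plan would also need to account for shelf life, unit margins, and time-of-day demand, none of which this sketch models.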

View Deployed App | Watch on YouTube | View on GitHub

(2024 Fall) Predicting DonorsChoose Project Funding Success Based on Poverty Levels

This project aims to improve funding success for underfunded classroom projects on DonorsChoose.org, particularly those from low-income areas. Using machine learning, we plan to develop a model that predicts which projects are unlikely to reach their funding goals within four months. Identifying these projects lets DonorsChoose.org prioritize promoting them to potential donors, working towards reducing educational inequality and ensuring that classrooms in high-need areas receive essential resources.

Key Goals

  • Educational Equity: Prioritize projects from high-poverty areas so that resources are allocated efficiently and equitably.
  • Machine Learning Model: Build a model to predict which projects are unlikely to be funded, helping DonorsChoose.org prioritize and promote them.
  • Stakeholder Impact: Focus on teachers, students, donors, and educational policymakers to enhance transparency, optimize donations, and reduce educational inequality.
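
To make the prediction target concrete, here is a minimal sketch of how a "not funded within four months" label could be derived from posting and funding dates. The 120-day approximation of four months, the `missed_funding_window` helper, and the sample records are all assumptions for illustration; the actual label definition in the ML plan may differ.

```python
from datetime import date
from typing import Optional

# Assumption: "four months" is approximated as 120 days.
FUNDING_WINDOW_DAYS = 120


def missed_funding_window(posted: date, fully_funded: Optional[date],
                          window_days: int = FUNDING_WINDOW_DAYS) -> int:
    """Return 1 if the project was NOT fully funded within the window, else 0."""
    if fully_funded is None:
        # Never fully funded counts as a miss.
        return 1
    return int((fully_funded - posted).days > window_days)


# Hypothetical records: (project_id, date posted, date fully funded or None).
projects = [
    ("p1", date(2024, 1, 10), date(2024, 2, 1)),   # funded after 22 days
    ("p2", date(2024, 1, 10), None),               # never funded
    ("p3", date(2024, 1, 10), date(2024, 7, 1)),   # funded after 173 days
]

labels = {pid: missed_funding_window(posted, funded) for pid, posted, funded in projects}
print(labels)
```

A label of 1 marks the projects a classifier would try to flag early so they can be promoted to donors.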

Collaborators: Karissa Dunkerley, Sen Feng

Download ML Plan View on GitHub

(2024 Spring) Music Recommendation for Mental Health

This project explored whether music can support mental health. It combined Exploratory Data Analysis (EDA), data preprocessing, feature selection, and machine learning on a dataset of music preferences among individuals with different mental health conditions. The project was conducted at Carnegie Mellon University (January 2023 - May 2024).

  • Data Analysis and Preprocessing: Performed exploratory data analysis, data cleaning, and feature selection to uncover patterns in music preferences across various mental health conditions.
  • Machine Learning for Prediction: Applied machine learning models to predict mental health conditions and explored the potential of personalized music therapy as a supplement to traditional treatments.
  • Key Insights: The project revealed that individuals reporting mental health improvements tended to listen more to K-pop and rap, while those with no improvements favored lofi and metal. Depression showed the highest prediction accuracy among the conditions studied.

Collaborators: Amy Deng

View on GitHub

(2024 Spring) Customer Spending Behavior Analysis

This project used time series analysis and Principal Component Analysis (PCA) to identify key predictors of customer spending. It also examined how education and marital status affected spending behavior, providing insights for targeted advertising strategies. The project was conducted at Carnegie Mellon University (January 2023 - May 2024).

Key Features

  • Time Series Analysis and PCA: Applied time series analysis and PCA to uncover key predictors of customer spending behavior.
  • Demographic Insights: Analyzed the effects of education and marital status on spending habits, offering actionable insights for targeted advertising.
  • Predictive Analytics for Marketing: The findings highlighted opportunities for data-driven marketing strategies based on customer spending behavior and demographics.

Collaborators: Xinfei Cen, Yuting Wang, Camellia Wang

View Project Submission

(2023 Fall) Social Media User Relationship Modeling

This project involved building and fine-tuning a PostgreSQL database to model social media user relationships. Advanced SQL and Python scripting were used to ensure robust data management and application functionality. The project was conducted at Carnegie Mellon University (November 2023 - December 2023).

Key Features

  • PostgreSQL Database Implementation: Designed and fine-tuned a PostgreSQL database to model complex social media user relationships.
  • Advanced SQL and Python Scripting: Employed advanced SQL queries and Python scripts to manage data efficiently and enhance application functionality.
  • Robust Data Management: Ensured effective data management and optimized performance for modeling user interactions and relationships on social media platforms.
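
One common way to model user relationships is a self-referencing "follows" table; the sketch below shows that pattern together with a query for mutual follows. It uses Python's built-in `sqlite3` module as a stand-in so it runs without a database server, whereas the project itself used PostgreSQL. The table names, columns, and sample users are hypothetical and are not taken from the project's schema.

```python
import sqlite3

# Assumption: SQLite in memory stands in for PostgreSQL; schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    handle  TEXT NOT NULL UNIQUE
);
CREATE TABLE follows (
    follower_id INTEGER NOT NULL REFERENCES users(user_id),
    followee_id INTEGER NOT NULL REFERENCES users(user_id),
    PRIMARY KEY (follower_id, followee_id),   -- no duplicate edges
    CHECK (follower_id <> followee_id)        -- no self-follows
);
""")

conn.executemany(
    "INSERT INTO users (user_id, handle) VALUES (?, ?)",
    [(1, "ana"), (2, "ben"), (3, "cy")],
)
conn.executemany(
    "INSERT INTO follows (follower_id, followee_id) VALUES (?, ?)",
    [(1, 2), (2, 1), (1, 3)],
)

# Mutual follows: an edge a -> b that also exists as b -> a.
# The a.follower_id < a.followee_id filter reports each pair once.
mutual_pairs = conn.execute("""
    SELECT a.follower_id, a.followee_id
    FROM follows AS a
    JOIN follows AS b
      ON a.follower_id = b.followee_id
     AND a.followee_id = b.follower_id
    WHERE a.follower_id < a.followee_id
    ORDER BY a.follower_id, a.followee_id
""").fetchall()
print(mutual_pairs)
```

The composite primary key doubles as an index for lookups by follower; in PostgreSQL, a second index on `followee_id` would typically be added for "who follows this user" queries.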

Collaborators: Eun Seok Kim

View on GitHub