Hey there, I am Stephen


About Me

Welcome! I'm dedicated to exploring how data can drive solutions to complex problems and empower decision-making. Here, I share insights and perspectives from my work in data and computer science, with a focus on making an impact through practical, data-driven solutions. Thanks for joining me on this journey.

In my free time, I enjoy basketball, chess, weightlifting, surfing, and skating. Please feel free to explore my website and reach out to me if you have any questions!

Resume Repo

Experience


Northrop Grumman

Jun 2022 - Sep 2024

During my internships as a Software Engineer at Northrop Grumman, I refined module-level testing for cutting-edge navigation systems. My contributions to theory and optimization achieved full accuracy and coverage.

MATLAB Simulink Website

UCLA Statistical Consulting

Mar 2023 - Jun 2023

I provided data-driven insights to MLB Medical Services to reduce pitcher UCL injuries. By addressing class imbalance with techniques like ROSE and SMOTE, I enhanced model predictions using XGBoost, logistic regression, and random forest.

Python R Supervised Learning Website

Tsiang Statistics Lab - UCLA

Sep 2023 - Dec 2023

At UCLA's Tsiang Lab, I wrote 'Drafting Success,' which uses collegiate statistics to predict NBA success. To gather the data, I developed a Python extraction algorithm. This effort culminated in robust machine learning models for predicting NBA outcomes.

R Supervised Learning Drafting Success Repo Website

Project Kairos

Mar 2021 - Dec 2021

As a financial data analyst, I provided data-centric guidance for retail arbitrage strategies, resulting in over $20k in returns. By quantitatively inferring optimal sell points based on seasonality, supply, and market volatility, I increased profits.

R Website

Electrum Homes

May 2021 - Sep 2021

As a Data Analyst Intern at Electrum Homes, I created a web-scraping algorithm, significantly enhancing the team's productivity. Considering many factors, I found deals for the company that had significant upside.

R Website

Projects


A Million Kirbys Fail at Walking

I deployed a reinforcement learning agent to complete multiple levels of a platformer game. By implementing a genetic algorithm, I enhanced the neural network’s performance, while utilizing sightlines to enable precise obstacle reactions, significantly speeding up level completion.

Python Reinforcement Learning Repo

K-Cluster Gaussian Hotspots

The project introduces an advanced supervised learning algorithm that skillfully employs clustering techniques, specifically Gaussian Mixture Models (GMM), to pinpoint hotspots within each class, significantly enhancing model interpretability. By integrating Expectation-Maximization (EM) for optimal parameter estimation and employing a robust bootstrapping algorithm for model selection, this method not only achieves a notable reduction in prediction costs but also provides deep probabilistic insights. These insights give a clearer understanding of the data's inherent distributions, setting a new benchmark for efficiency and analytical depth in predictive modeling.

Python Supervised Learning Unsupervised Learning Repo
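
For readers curious how per-class mixture "hotspots" can work, here is a minimal sketch, not the project's actual code. It assumes one-dimensional features, a fixed number of components per class, and a simple quantile-based initialization; the function names (`fit_gmm_1d`, `classify`) are hypothetical, and the bootstrapped model selection mentioned above is not shown.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(xs, k, iters=50):
    """Fit a k-component 1-D Gaussian mixture with EM.

    Returns a list of (weight, mean, variance) tuples, one per component.
    """
    ordered = sorted(xs)
    n = len(ordered)
    # Assumption: start the means at evenly spaced quantiles of the data.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    spread = (ordered[-1] - ordered[0]) or 1.0
    variances = [(spread / k) ** 2] * k
    weights = [1.0 / k] * k

    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])

        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-6)  # floor to keep a component from collapsing

    return list(zip(weights, means, variances))


def mixture_density(x, components):
    """Density of x under a fitted mixture."""
    return sum(w * normal_pdf(x, m, v) for w, m, v in components)


def classify(x, class_models, priors):
    """Pick the class whose prior times mixture density is largest at x."""
    return max(class_models,
               key=lambda c: priors[c] * mixture_density(x, class_models[c]))
```

Each fitted component's mean marks a hotspot for its class, and its weight and variance say how much of the class sits there and how tightly.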

IMDB Sentiment Analysis

This project analyzes IMDB reviews to classify sentiments as positive or negative. Through preprocessing steps like data cleaning and TF-IDF, followed by dimensionality reduction with PCA, various machine learning models were evaluated, including Logistic Regression, KNN, LDA, QDA, and Random Forests. The study meticulously assessed model performance, revealing key insights into predictive accuracy and model selection for sentiment analysis, thereby providing a comprehensive approach to understanding and improving sentiment classification techniques.

R Natural Language Processing IMDB Sentiment Analysis Repo
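
To illustrate the TF-IDF preprocessing step, here is a minimal sketch. The project itself is written in R, so this Python version is only an illustration under stated assumptions: the tokenizer is deliberately crude, TF is relative frequency, and IDF is log(N / document frequency), which is one common variant among several. The function names `tokenize` and `tfidf` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()


def tfidf(docs):
    """Return (vocab, matrix), where matrix[i][j] is the TF-IDF of term j in doc i."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {tok: math.log(n_docs / doc_freq[tok]) for tok in vocab}

    matrix = []
    for toks in tokenized:
        counts = Counter(toks)
        length = len(toks) or 1  # guard against empty documents
        matrix.append([counts[tok] / length * idf[tok] for tok in vocab])
    return vocab, matrix


reviews = ["Great movie, great cast!", "Terrible movie."]
vocab, matrix = tfidf(reviews)
```

In this weighting, a word that appears in every review, such as "movie" here, gets a weight of zero, while a word that separates the reviews, such as "great", gets a positive weight. The resulting matrix is what a dimensionality reduction step and a classifier would consume.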

Cyborgs

In this project, the players find themselves in a 2D arena game where they have to survive against an array of killer cyborgs controlled via a partially damaged transmitter. The game intricately blends action and strategy, challenging players to use walls to their advantage, and smartly command cyborgs to outmaneuver threats. With its unique blend of character-based graphics and interactive gameplay, the project offers an immersive experience.

C++ Repo

Education


University of Pennsylvania (UPenn)

Overview

  • Master's: Computer & Information Technology

Math Coursework

  • Mathematical Foundations of Computer Science (Discrete Math)

Computer Science Coursework

  • Software Development
Transcript

University of California, Los Angeles (UCLA)

Overview

  • Major: Statistics & Data Science
  • GPA: 3.88/4.0

Statistics Coursework

  • STATS 10 - Introduction to Statistical Reasoning
  • STATS 100A - Introduction to Probability
  • STATS 100B - Introduction to Mathematical Statistics
  • STATS 100C - Linear Models
  • STATS 101A - Introduction to Data Analysis and Regression
  • STATS 101B - Introduction to Design and Analysis of Experiment
  • STATS 101C - Introduction to Statistical Models and Data Mining
  • STATS 102B - Introduction to Computation and Optimization for Statistics
  • STATS 102C - Introduction to Monte Carlo Methods
  • STATS 140XP - Practice of Statistical Consulting Part 1
  • STATS 141XP - Practice of Statistical Consulting Part 2
  • STATS 112 - Statistics: Window to Understanding Diversity

Math Coursework

  • Math 31A: Differential and Integral Calculus
  • Math 31B: Integration and Infinite Series
  • Math 32A: Calculus of Several Variables
  • Math 32B: Calculus of Several Variables
  • Math 33A: Linear Algebra and Applications

Computer Science Coursework

  • CS 31: Introduction to Computer Science I
  • CS 32: Introduction to Computer Science II (Data Structures and Algorithms)
  • CS M146: Introduction to Machine Learning
  • PIC 16A: Python with Applications I
  • PIC 16B: Python with Applications II
  • STATS 20 - Introduction to Statistical Programming with R
  • STATS 102A - Introduction to Computational Statistics with R
Transcript

Hobbies

Basketball

I've played basketball for as long as I can remember, and still play pickup basketball any chance I get.

Chess

I first got into chess through a program my elementary school ran, and I have played consistently since then, even participating in the Chess Club at UCLA.

Weightlifting

I began in high school when my friend invited me to the gym. Lifting is always a great addition to my day.

Surfing

I began this year, but have already had the time of my life catching waves (or usually wiping out) with my buddies.

Skateboarding

I began recently, and as you can tell there's a lot of room for improvement 😅.

GitHub LinkedIn

© 2024, Stephen Yu.