kevinyi901/README.md

About Me

I'm a Data Scientist and a graduate student in UC Berkeley's Master of Information and Data Science (MIDS) program, projected to graduate in 2025.
Currently, I work as a Data Scientist for National Capitol Contracting, helping deliver strategic insights and build automation for Naval Sea Systems Command at Pearl Harbor Naval Shipyard. Before moving into data science, I was an Army Infantry Officer and later managed multiple construction projects in Hawaii as a consulting Construction Manager/Project Engineer. I am a graduate of the United States Military Academy with an undergraduate degree in Civil Engineering.

Projects

In my time at MIDS, I have had the opportunity to work on several projects. Most work is ongoing, but here are some of the completed projects that I can share with you:

SQL and NoSQL for analyzing customer sales information and making recommendations for future expansion of food delivery distribution sites
  • Course: Data Engineering
  • Description: Project 1: Data wrangling to load sales data from a third-party sales channel, with preliminary analytics. Used AWS and a Docker cluster running Anaconda and PostgreSQL. Project 2: Created a Neo4j graph database of the Bay Area BART system to identify future distribution locations for a food delivery service. Used a graph path algorithm to find the shortest path from a central supply store to distribution nodes, a centrality algorithm to determine the most influential BART station for serving existing customers, and a community detection algorithm to identify BART station communities. Identified additional BART station locations for future store expansion.
  • Technology: SQL, Python, NoSQL Graph Database, Linux CLI, Docker Containers, Graph Path, Centrality, Community Detection Algorithms
  • Links to the repository: [https://github.com/kevinyi901/W205_DataEngineering]
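The shortest-path and centrality ideas mentioned in this project can be sketched in plain Python. This is only a minimal illustration, not the Neo4j queries used in the project: the station names and links below are a hypothetical toy network, breadth-first search stands in for the graph path step (fewest hops, no edge weights), and degree centrality is assumed as one simple choice of centrality measure.

```python
from collections import deque

# Hypothetical toy network: station names and links are illustrative only,
# not the actual BART topology or the project's data.
EDGES = [
    ("A", "B"), ("B", "C"), ("C", "D"),
    ("B", "E"), ("E", "D"), ("D", "F"),
]

def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def shortest_path(adj, start, goal):
    """Breadth-first search: return a fewest-hops path, or None if unreachable."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in sorted(adj[node]):  # sorted for a deterministic result
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None

def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}

adj = build_adjacency(EDGES)
path = shortest_path(adj, "A", "F")
centrality = degree_centrality(adj)
# Ties are broken alphabetically because the keys are sorted first.
most_connected = max(sorted(centrality), key=centrality.get)
```

A graph database would typically run equivalent algorithms server-side; the sketch only shows what the two measures compute on a small graph.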
Hypothesis Testing and Multiple Variable Large Sample Linear Regression
  • Course: Statistics for Data Science
  • Description: Project 1: A project exploring, visualizing, and conducting hypothesis testing on whether Republican or Democratic voters have more difficulty voting. Project 2: A project evaluating whether one's occupation affects the number of hours worked weekly, using general census data.
  • Technology: RStudio, t-Test, Classical Linear Model Assumption Testing, Multiple Linear Regression
  • Links to the repository: [https://github.com/kevinyi901/W203_Statistics]
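For readers unfamiliar with the t-test listed above, here is a rough Python sketch of a two-sample comparison. It is an assumption-laden illustration rather than the project's analysis: the project used R, the Welch (unequal variance) form is assumed here, and the two samples are made-up scores on a hypothetical 1 to 5 difficulty scale.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a), variance(sample_b)  # sample variances (n - 1)
    se_a, se_b = va / na, vb / nb  # squared standard errors of each mean
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (na - 1) + se_b ** 2 / (nb - 1))
    return t, df

# Made-up illustrative data, not taken from the project.
group_a = [2.0, 3.0, 3.0, 4.0, 3.0]
group_b = [1.0, 2.0, 2.0, 1.0, 2.0]
t_stat, dof = welch_t(group_a, group_b)
```

Turning the statistic into a p-value requires the t distribution's CDF, which a statistics package (R, or SciPy in Python) provides.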
Exploratory Data Analysis of Colon and Lung Cancer
  • Course: Introduction to Data Science Programming
  • Description: A project cleaning, exploring, and visualizing 2008-2019 data from the CDC to identify racial, geographical, and gender trends in lung and colon cancer in America.
  • Technology: Python, Pandas, Plotly, Matplotlib, Seaborn
  • Links to the repository: [https://github.com/kevinyi901/W200]
Experimental Research Design Report
  • Course: Research Design and Applications for Data Analysis
  • Description: A project developing a research design report that will produce valuable and actionable insight for predicting grocery product sales using existing prediction models and social media data.
  • Links to the repository: [https://github.com/kevinyi901/W201]
