data_science_bootcamp

A collection of programming assignments completed for Practicum's Data Scientist professional training program.

| Project Name | Notebook | Description | Dependencies | Sprint Number |
| --- | --- | --- | --- | --- |
| classifying_churn | classifying_churn.ipynb | Used machine learning and data balancing techniques to build a predictive churn model whose AUC-ROC exceeded the target (0.93 versus a goal of > 0.88). | NumPy, Pandas, matplotlib, seaborn, math, time, functools, re, IPython.display, sklearn, catboost, lightgbm, xgboost, random, sys | 15 (final) |
| computer_vision | computer_vision.ipynb | Use supplied photos to build and test a regression model that predicts age on a continuous scale. | Pandas, seaborn, matplotlib, tensorflow, keras | 14 |
| ml_for_text | ml_for_text.ipynb | Train a model that classifies reviews as positive or negative with an F1 score of at least 0.85. | NumPy, Pandas, matplotlib, seaborn, re, math, tqdm | 13 |
| time_series | time_series.ipynb | Use historical data on taxi orders to predict peak hours, using RMSE as the metric. | NumPy, Pandas, matplotlib, SciPy, seaborn, time, math, statsmodels, sklearn, IPython, sys, catboost, lightgbm, xgboost | 12 |
| numerical_methods | numerical_methods.ipynb | Build a model that predicts the value of a car from historical data (such as trims, prices, mileage, and technical specs). | NumPy, Pandas, matplotlib, seaborn, time, math, sklearn, random, sys, catboostregressor, decisiontree | 11 |
| linear_algebra | linear_algebra.ipynb | Use ML to categorize customers, identify customers likely to receive an insurance benefit, and apply data masking. | NumPy, Pandas, math, seaborn, matplotlib, sklearn, IPython, sys | 10 |
| ml_in_industry | ml_in_industry.ipynb | Find the ML model that best predicts the two target values for gold extraction from ore, given the available predictor variables. | NumPy, Pandas, math, seaborn, matplotlib, sklearn, random, sys | 9 |
| ml_in_business | ml_in_business.ipynb | Use machine learning and bootstrapping to select the region with the highest profit margin, given a selection of masked features. | NumPy, Pandas, math, seaborn, matplotlib, sklearn, SciPy, random, sys | 8 |
| supervised_ml | supervised_ml.ipynb | Predict customer churn for a bank. | NumPy, Pandas, math, matplotlib, sklearn, random, sys | 7 |
| machine_learning | machine_learning.ipynb | Create an ML model that recommends an appropriate plan based on the behavior of subscribers who have already switched (with accuracy > 75%). | NumPy, Pandas, sklearn, sys | 6 |
| sql | sql.ipynb | Identify the top neighborhoods in terms of drop-offs for a new ride-sharing company. | NumPy, Pandas, matplotlib, seaborn, SciPy | 5 |
| hypothesis_testing | hypothesis_testing.ipynb | Preliminary analysis of platform, genre, and ESRB ratings to determine any patterns that influence sales. | NumPy, Pandas, matplotlib, SciPy, seaborn | 4 |
| statistical_data_analysis | statistical_data_analysis.ipynb | Analysis of phone plans, revenue, and retention to produce recommendations for the marketing team. | NumPy, Pandas, matplotlib, SciPy | 3 |
| exploratory_data_analysis | exploratory_data_analysis.ipynb | Use EDA to study data collected over the last few years from online advertisements and determine which factors influence the price of a vehicle. | NumPy, Pandas, matplotlib | 2 |
| credit_scoring | 15_credit_scoring_sprint_1.ipynb | Create a credit score for potential loan customers, examining marital status and number of children as features. | NumPy, Pandas | 1 |
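
The ml_in_business project mentions bootstrapping as the way regions are compared. As a rough illustration of that general technique (not the code from the notebook), the sketch below resamples a list of profit values with replacement and reports a percentile confidence interval for the mean. The function name `bootstrap_mean_ci`, the sample figures, and the 95% level are all assumptions made for this example, and it uses only the Python standard library.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=1000, confidence=0.95, seed=0):
    """Percentile bootstrap for the mean (hypothetical helper, not from the notebook)."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    n = len(values)
    means = []
    for _ in range(n_resamples):
        # Draw n values with replacement and record the mean of the resample
        resample = rng.choices(values, k=n)
        means.append(statistics.fmean(resample))
    means.sort()
    alpha = (1 - confidence) / 2
    lower = means[int(alpha * n_resamples)]
    upper = means[min(int((1 - alpha) * n_resamples), n_resamples - 1)]
    return statistics.fmean(means), lower, upper


# Made-up profit figures, used only to exercise the function
profits = [12.5, -3.0, 8.1, 15.2, 4.4, -1.7, 9.9, 6.3, 11.0, 2.8]
mean, lower, upper = bootstrap_mean_ci(profits)
print(f"mean={mean:.2f}, interval=({lower:.2f}, {upper:.2f})")
```

With real data, the same interval computed per region would give one way to compare expected profit and downside risk across regions.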

Authors

Renee Raven

License

This project is licensed under the MIT License. See the LICENSE file for details.
