data
img
01_fetch_data.ipynb
02.1_clean_data.ipynb
02.2_create_sample_data.ipynb
02.3_feature_engineering.ipynb
02.4_descriptive_analysis.ipynb
03_select_base_model.ipynb
04_make_pipeline.ipynb
README.md
sample_model.joblib

README.md

Password Strength Prediction Project

This project predicts the strength of passwords using machine learning. The work is split across several notebooks, each covering one stage of the project. Each notebook is summarized below:

Notebook 01: Fetch data

In this notebook, we will fetch and preprocess the "rockyou.txt" dataset to analyze password strength. This dataset will serve as the foundation for our "Passwordometer" project.
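The loading step could look roughly like the sketch below. It assumes the dataset is already on disk as a plain text file with one password per line; the function name `read_passwords` and the `latin-1` encoding are illustrative choices, not taken from the notebook.

```python
from pathlib import Path
from typing import Iterator


def read_passwords(path: str, encoding: str = "latin-1") -> Iterator[str]:
    """Yield one password per line, without the trailing line ending."""
    # Assumption: latin-1 is used because it maps every byte to a character,
    # so decoding cannot fail on passwords that are not valid UTF-8.
    # newline="\n" makes the reader split on "\n" only.
    with Path(path).open("r", encoding=encoding, newline="\n") as handle:
        for line in handle:
            yield line.rstrip("\r\n")
```

Because the function is a generator, a large file can be streamed without being held in memory all at once.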

Notebook 02: Data Exploration

In this stage, we explore the dataset to gain insight into the data, using visualizations and statistical analysis to identify patterns and trends in the passwords. The work is split across Notebooks 02.1 to 02.4.

Notebook 02.1: Clean data

In this notebook, we will clean the password dataset fetched in Notebook 01. We will remove invalid passwords and perform basic data cleaning steps to ensure the quality and integrity of the data.
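A minimal sketch of such a cleaning pass follows. The README does not say what counts as an invalid password, so the rules here (length bounds, printable characters only, duplicates dropped) and the name `clean_passwords` are assumptions made for illustration.

```python
def clean_passwords(passwords, min_length=1, max_length=64):
    """Return unique, printable passwords within the length bounds, in first-seen order."""
    seen = set()
    cleaned = []
    for password in passwords:
        # Assumed rule: drop empty and overly long entries.
        if not (min_length <= len(password) <= max_length):
            continue
        # Assumed rule: drop entries with control characters such as tabs.
        if not password.isprintable():
            continue
        # Assumed rule: keep only the first occurrence of each password.
        if password in seen:
            continue
        seen.add(password)
        cleaned.append(password)
    return cleaned
```

Whether duplicates should be dropped depends on the analysis: if password frequency matters later, the counts would need to be kept instead.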

Notebook 02.2: Create Sample Data

In this notebook, we will create a stratified sample of the clean password dataset obtained in the previous notebook. Stratifying by password strength level ensures that every level is represented in the sample, which keeps the later analysis and modeling from being dominated by the most common levels.
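One way to draw a stratified sample is sketched below using only the standard library. It assumes each row already carries a strength label; the helper name `stratified_sample`, the `label_of` callback, and the rule of taking the same fraction from every stratum are illustrative assumptions rather than the notebook's method.

```python
import random
from collections import defaultdict


def stratified_sample(rows, label_of, fraction, seed=0):
    """Sample the same fraction from every stratum, keeping at least one row each."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    strata = defaultdict(list)
    for row in rows:
        strata[label_of(row)].append(row)

    sample = []
    for label in sorted(strata):  # assumes labels are mutually comparable
        members = strata[label]
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample
```

For example, `stratified_sample(rows, lambda row: row[1], 0.1)` would keep about 10% of the rows from each label when rows are `(password, label)` tuples.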

Notebook 02.3: Feature Engineering

In this notebook, we will perform feature engineering on the stratified sample data obtained in the previous notebook. Feature engineering involves creating new meaningful features from the existing data that can improve the performance of our password strength prediction model.
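As an illustration of what such features might be, the sketch below derives a few numeric descriptors from a raw password string. The feature names and the choice of features (length, character-class counts, per-character Shannon entropy) are assumptions for this example; the notebook may use different ones.

```python
import math
import string
from collections import Counter


def password_features(password: str) -> dict:
    """Compute simple numeric features from one password string."""
    counts = Counter(password)
    length = len(password)
    # Shannon entropy (bits per character) of the character distribution.
    if length:
        entropy = -sum((n / length) * math.log2(n / length) for n in counts.values())
    else:
        entropy = 0.0
    return {
        "length": length,
        "n_lower": sum(c.islower() for c in password),
        "n_upper": sum(c.isupper() for c in password),
        "n_digit": sum(c.isdigit() for c in password),
        "n_symbol": sum(c in string.punctuation for c in password),  # ASCII punctuation only
        "n_unique": len(counts),
        "char_entropy": entropy,
    }
```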

Notebook 02.4: Descriptive Analysis

In this notebook, we will perform descriptive analysis on the transformed sample data obtained in the previous notebook. Descriptive analysis involves exploring and summarizing the data to gain insights into the distribution, relationships, and patterns of the variables.
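A small example of this kind of summary, restricted to password length, is sketched here. The function name `describe_lengths` and the chosen statistics are assumptions for illustration, and the input is assumed to be non-empty.

```python
import statistics
from collections import Counter


def describe_lengths(passwords):
    """Summarize the distribution of password lengths (expects a non-empty list)."""
    lengths = [len(p) for p in passwords]
    return {
        "count": len(lengths),
        "mean": statistics.mean(lengths),
        "median": statistics.median(lengths),
        # The sample standard deviation needs at least two values.
        "stdev": statistics.stdev(lengths) if len(lengths) > 1 else 0.0,
        "most_common": Counter(lengths).most_common(3),
    }
```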

Notebook 03: Select Base Model

In this notebook, we will select a base model for password strength prediction. We will compare several regression models on Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), R-squared (R²), and training time (TT). The best-performing model will be chosen as the base model for further improvement.
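For reference, the error metrics named above can be computed as in the sketch below. It uses the standard definitions in plain Python; the function name `regression_metrics` is illustrative, and the inputs are assumed to be non-empty with targets that are not all equal, since R² is undefined otherwise.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R-squared for paired targets and predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    return {
        "MAE": mae,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "R2": 1.0 - ss_res / ss_tot,
    }
```

Training time can be measured separately by wrapping the model's fit call with `time.perf_counter()` before and after.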

Notebook 04: Make Pipeline

In this notebook, we will create a machine learning pipeline using scikit-learn. The pipeline will include data preprocessing steps and a decision tree regressor as the base model. The pipeline will be trained and evaluated on the password strength prediction task.
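A pipeline of this shape might be assembled as in the sketch below, which chains a preprocessing step and a `DecisionTreeRegressor` with scikit-learn's `Pipeline`. The feature function `to_features`, the tree settings, and the toy passwords and strength scores are all made up for illustration and are not taken from the notebook.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer
from sklearn.tree import DecisionTreeRegressor


def to_features(passwords):
    """Map raw password strings to a small numeric feature matrix (hypothetical features)."""
    return np.array(
        [
            [len(p), sum(c.isdigit() for c in p), sum(c.isupper() for c in p), len(set(p))]
            for p in passwords
        ],
        dtype=float,
    )


pipeline = Pipeline(
    steps=[
        ("features", FunctionTransformer(to_features)),  # preprocessing step
        ("model", DecisionTreeRegressor(max_depth=3, random_state=0)),
    ]
)

# Toy data: the targets are invented strength scores, used only to show the API.
passwords = ["123456", "password", "Tr0ub4dor&3", "correct horse battery staple"]
scores = [0.0, 1.0, 3.0, 4.0]

pipeline.fit(passwords, scores)
predictions = pipeline.predict(["letmein", "N3w-P@ssphrase-2024"])
```

Keeping the feature extraction inside the pipeline means the same transformation is applied at training and prediction time, so callers can pass raw strings directly.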

Note: The notebooks are organized sequentially. Work through them in the given order to follow the logic of the project and reproduce the results. Each notebook contains the details and code for its step.
