Data processing in Python
- numpy - n-dimensional arrays
- pandas - dataframes
- vaex - out-of-core DataFrames
- dask - multi-core and distributed parallel execution
- koalas - pandas API on Apache Spark
- faker - generate fake data
- missingno - visualize missing data
- impyute - missing data imputation
- imbalanced-learn - re-sampling for imbalanced data
- flanker - email address data parser
- pandas-profiling - profile reports from a pandas DataFrame
- thefuzz - fuzzy string matching (formerly fuzzywuzzy)
- pandera - data testing and validation
- great expectations - data validation
- Ray - scale Python workloads from a single machine to a cluster
- pandas-gbq - pandas interface to Google BigQuery
- polars - fast DataFrames, implemented in Rust
- kalman-and-bayesian-filters - book on Kalman and Bayesian filtering in Python
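
As a rough illustration of how the first two entries fit together, the sketch below builds a small numpy array, wraps it in a pandas DataFrame, counts missing values, and fills them with column means. The column names and sample values are made up for the example, and mean imputation is only one simple strategy; the dedicated libraries listed above (missingno, impyute) go further.

```python
import numpy as np
import pandas as pd

# numpy: a 2 x 3 n-dimensional array with one missing value (sample data, made up)
values = np.array([[1.0, 2.0, np.nan],
                   [4.0, 5.0, 6.0]])

# pandas: wrap the array in a DataFrame; the column names are illustrative
df = pd.DataFrame(values, columns=["a", "b", "c"])

# Count missing values per column
missing_per_column = df.isna().sum()

# Simple mean imputation: fill each gap with the mean of its column
filled = df.fillna(df.mean())

print(missing_per_column.to_dict())
print(filled)
```

Swapping `df.mean()` for `df.median()` gives median imputation with the same call pattern.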