A nonprofit organization that depends on fundraising is about to start contacting people to request donations. It has few collaborators, and its latest attempts have not produced the expected results. For this reason, the organization decided to carry out a data mining project whose success criterion is more accurate targeting of its fundraising. Knowing an individual's income helps because a percentage of the Income Tax can be redirected to donations through the FIA (Childhood and Adolescence Fund). With that information the organization can tailor its requests for support and collaboration, or decide whether it should contact the person at all.
The goal is to build a model that can predict whether an individual earns more than $50,000.
The output artifact will be a rating of which people to contact for donations: those predicted to earn over $50,000.
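As a minimal sketch of what that output could look like, the snippet below turns model scores into a contact list ordered from most to least likely to earn over $50,000. The `Prediction` type, the `contact_list` helper, the 0.5 threshold, and the sample scores are all hypothetical and only illustrate the idea; they are not taken from the project code.

```python
from typing import List, NamedTuple


class Prediction(NamedTuple):
    """Hypothetical record: one person and the model's score for them."""
    person_id: str
    prob_high_income: float  # estimated probability of earning > $50,000


def contact_list(predictions: List[Prediction], threshold: float = 0.5) -> List[Prediction]:
    """Keep people at or above the threshold, highest score first."""
    selected = [p for p in predictions if p.prob_high_income >= threshold]
    return sorted(selected, key=lambda p: p.prob_high_income, reverse=True)


# Made-up scores, for illustration only.
example = [
    Prediction("a", 0.91),
    Prediction("b", 0.12),
    Prediction("c", 0.67),
]
ranked = contact_list(example)
print([p.person_id for p in ranked])
```

Raising the threshold would shorten the list and concentrate outreach on the most confident predictions; lowering it would widen the campaign at the cost of more wasted contacts.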
Data source: UCI ML Repository, Census Income
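The sketch below shows one way to read records in that format and derive the binary target. It assumes the dataset is the UCI "Adult" (Census Income) file, which has no header row and uses comma-plus-space separators; the column names follow the dataset's published attribute list. The `parse_rows` helper and the two sample rows are made up for illustration.

```python
import csv
import io

# Attribute names as published for the UCI Adult / Census Income dataset.
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "income",
]


def parse_rows(text):
    """Parse header-less CSV text and add a boolean 'high_income' target."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    rows = []
    for fields in reader:
        if len(fields) != len(COLUMNS):
            continue  # skip blank or malformed lines
        row = dict(zip(COLUMNS, fields))
        # Labels in the test split carry a trailing period, so strip it.
        row["high_income"] = row["income"].rstrip(".") == ">50K"
        rows.append(row)
    return rows


# Two made-up rows in the file's format (not real records).
sample = (
    "40, Private, 100000, Bachelors, 13, Never-married, Sales, "
    "Not-in-family, White, Female, 0, 0, 40, United-States, <=50K\n"
    "\n"
    "50, Private, 200000, Masters, 14, Married-civ-spouse, Exec-managerial, "
    "Husband, White, Male, 0, 0, 45, United-States, >50K\n"
)
rows = parse_rows(sample)
print([r["high_income"] for r in rows])
```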
- Decision Tree (baseline)
- Support Vector Machine
- AdaBoost
- AdaBoost Tuning
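A rough sketch of how the four models listed above might be compared with scikit-learn follows. Several things here are assumptions rather than the project's actual setup: the data is a synthetic stand-in generated with `make_classification`, the parameter grid for the tuning step is a small hypothetical one, and accuracy and F1 are used as evaluation metrics only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the prepared census features (assumption).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.75, 0.25], random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42,
)

models = {
    "Decision Tree (baseline)": DecisionTreeClassifier(random_state=42),
    "Support Vector Machine": SVC(random_state=42),
    "AdaBoost": AdaBoostClassifier(random_state=42),
    # Tuning step: grid search over a small, hypothetical parameter grid.
    "AdaBoost Tuning": GridSearchCV(
        AdaBoostClassifier(random_state=42),
        param_grid={"n_estimators": [50, 100], "learning_rate": [0.5, 1.0]},
        scoring="f1",
        cv=3,
    ),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "f1": f1_score(y_test, pred),
    }

for name, scores in results.items():
    print(f"{name}: accuracy={scores['accuracy']:.3f}, f1={scores['f1']:.3f}")
```

Keeping the decision tree as the first entry makes it easy to read every other score relative to the baseline. After fitting, the grid search object exposes the selected settings through `best_params_`.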
Data Exploration, Data Cleansing, Feature Engineering, Modeling and Evaluation
This project is tested with:
Requisite | Version |
---|---|
Python | 3.9.7 |
Pip | 21.2.4 |
I recommend using Python venv.
pip install --require-hashes -r requirements.txt