The 79 DAPT SAO IH Hotel Booking describes two datasets with hotel demand data. One of the hotels is a resort hotel and the other is a city hotel. Both datasets share the same structure, and each observation represents one hotel booking. Both datasets cover bookings due to arrive between 1 July 2015 and 31 August 2017, including bookings that effectively arrived and bookings that were canceled.
Name | Data Type | Description |
---|---|---|
hotel | category | Type of hotel |
is_cancelled | binary | Target value indicating if the booking was canceled (1) or not (0) |
lead_time | number | Number of days that elapsed between the entering date of the booking into the PMS and the arrival date |
stays_in_weekend_nights | number | Number of weekend nights (Saturday or Sunday) the guest stayed or booked to stay at the hotel |
stays_in_week_nights | number | Number of week nights (Monday to Friday) the guest stayed or booked to stay at the hotel |
adults | number | Number of adults |
children | number | Number of children |
babies | number | Number of babies |
meal | category | Type of meal booked |
country | category | Country of origin |
market_segment | category | Market segment designation |
distribution_channel | category | Booking distribution channel |
is_repeated_guest | binary | Value indicating if the booking name was from a repeated guest (1) or not (0) |
previous_cancellations | number | Number of previous bookings that were cancelled by the customer prior to the current booking |
previous_bookings_not_canceled | number | Number of previous bookings not cancelled by the customer prior to the current booking |
reserved_room_type | category | Code of room type reserved |
assigned_room_type | category | Code for the type of room assigned to the booking |
booking_changes | number | Number of changes/amendments made to the booking |
deposit_type | category | Indication of whether the customer made a deposit to guarantee the booking |
agent | category | ID of the travel agency that made the booking |
company | category | ID of the company/entity that made the booking or is responsible for paying the booking. ID is presented instead of designation for anonymity reasons |
days_in_waiting_list | number | Number of days the booking was in the waiting list before it was confirmed to the customer |
customer_type | category | Type of booking |
adr | number | Average Daily Rate calculated by dividing the sum of all lodging transactions by the total number of staying nights |
required_car_parking_spaces | number | Number of car parking spaces required by the customer |
total_of_special_requests | number | Number of special requests made by the customer |
reservation_status_date | date | Date at which the last status was set |
arrival_date | date | Date of arrival |
id_booking | number | ID of booking |
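Two of the fields above, lead_time and adr, are derived quantities. The following is a minimal sketch of how they could be computed from raw values, following the descriptions in the table. The function names and the choice to return 0.0 for a booking with zero nights are assumptions made for illustration, not something taken from the dataset documentation.

```python
from datetime import date


def lead_time_days(booked_on: date, arrival: date) -> int:
    """Days elapsed between entering the booking and the arrival date."""
    return (arrival - booked_on).days


def average_daily_rate(lodging_total: float, weekend_nights: int, week_nights: int) -> float:
    """Sum of lodging transactions divided by the total number of staying nights."""
    nights = weekend_nights + week_nights
    if nights == 0:
        # Assumption: a booking with no staying nights gets an ADR of 0.0.
        return 0.0
    return lodging_total / nights


# A booking entered on 1 June 2015 for arrival on 1 July 2015,
# with 1 weekend night and 2 week nights for a total of 450.0.
example_lead_time = lead_time_days(date(2015, 6, 1), date(2015, 7, 1))
example_adr = average_daily_rate(450.0, weekend_nights=1, week_nights=2)
```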
Booking cancellations often directly impact a hotel's bottom line: in many cases the reserved room(s) are not booked again, resulting in lower occupancy and, subsequently, lower revenue. To hedge this risk, hotels often demand a booking deposit, usually calculated as a percentage of the reservation's full price. However, this practice can directly impact demand, as some customers might look for different hotels with no deposit (or a smaller one). Our goal is to build a model that, by predicting whether a booking will be cancelled or not, can be used by the hotel to implement different risk-aware strategies for calculating the deposit size.
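To make the idea of a risk-aware deposit concrete, here is a minimal sketch of one possible policy: the deposit percentage grows linearly with the predicted cancellation probability. The function names, the linear rule, and the 0% to 50% bounds are all assumptions chosen for illustration; the project itself does not prescribe a particular deposit strategy.

```python
def deposit_rate(p_cancel: float, min_rate: float = 0.0, max_rate: float = 0.5) -> float:
    """Map a predicted cancellation probability to a deposit percentage.

    Hypothetical policy: interpolate linearly between min_rate and max_rate.
    """
    if not 0.0 <= p_cancel <= 1.0:
        raise ValueError("p_cancel must be a probability between 0 and 1")
    return min_rate + (max_rate - min_rate) * p_cancel


def deposit_amount(full_price: float, p_cancel: float) -> float:
    """Deposit in currency units for a reservation with the given full price."""
    return round(full_price * deposit_rate(p_cancel), 2)


# A 200.0 reservation with a predicted 50% cancellation risk
# is charged 25% of the full price under this policy.
example_deposit = deposit_amount(200.0, p_cancel=0.5)
```

A hotel could replace the linear rule with tiers (for example, no deposit below some risk threshold) without changing how the model's prediction is consumed.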
The input folder contains only the cleaned dataset. For the original dataset, please refer to the notebook.
The notebooks folder contains notebook.ipynb, which covers:
- EDA
- Feature engineering
- Comparison of 8 models, namely:
  - Dummy (the basic model)
  - Decision Tree
  - Gaussian Naive Bayes
  - Random Forest
  - Histogram Gradient Boosting
  - XGBoost
  - Logistic Regression
  - KNeighbors
- Model optimization
- Fitting the best model(s)
- Evaluation of the best model(s)
- Feature importance and extraction of the most important features
- ROC curve
- Evaluation of the model(s) with the most important features
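As background for the ROC curve step, the area under the ROC curve can be read as the probability that a randomly chosen cancelled booking receives a higher score than a randomly chosen non-cancelled one. The sketch below computes it directly from that definition in plain Python. It is an illustration of the metric only, not the notebook's code, and the function name `roc_auc` is an assumption.

```python
def roc_auc(y_true, y_score):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts, over all (positive, negative) pairs, how often the positive
    example is scored higher; ties count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Four bookings: two cancelled (1) and two not cancelled (0).
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise loop is quadratic in the number of bookings, so it is only meant for small illustrations; a library implementation is the practical choice on the full dataset.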
The model is fitted on the cleaned data and saved as a binary file by the train.py script.
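Saving a fitted model "in binary mode" typically means serializing it to a file opened with the "wb" flag. The following is a minimal sketch using Python's pickle module, with a plain dictionary standing in for the fitted model. Whether train.py uses pickle, and the file name model.bin, are assumptions; the only hint in this repository is that the Dockerfile copies `*.bin` files.

```python
import pickle


def save_model(model, path):
    """Serialize the model object to a binary file."""
    with open(path, "wb") as f_out:
        pickle.dump(model, f_out)


def load_model(path):
    """Load a model object previously written by save_model."""
    with open(path, "rb") as f_in:
        return pickle.load(f_in)


if __name__ == "__main__":
    # Stand-in for a fitted estimator; a real script would pass the trained model.
    stand_in_model = {"name": "hist-gradient-boosting", "threshold": 0.5}
    save_model(stand_in_model, "model.bin")  # hypothetical file name
    print(load_model("model.bin"))
```

Only unpickle files you produced yourself, since loading a pickle can execute arbitrary code.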
I am using Flask/gunicorn on Linux (Ubuntu) to deploy the model. To serve the model with Flask/gunicorn, run:
pipenv run gunicorn --bind 0.0.0.0:9696 predict:app
I used pipenv for the virtual environment. To use the same setup, first install pipenv:
pip install pipenv
Then, to replicate the environment, run on your command line:
pipenv install numpy pandas scikit-learn flask gunicorn requests
Note: If you want to train the xgboost model instead of Histogram Gradient Boosting, add xgboost to the pipenv install command.
Note: To perform the following steps, you should log in to your DockerHub account (login and password).
I have built the model image and pushed it to dajebbar/hotel-booking-model:latest. To use it, just run:
docker pull dajebbar/hotel-booking-model:latest
Alternatively, to build on top of the image I published, replace
FROM python:3.9-slim
with
FROM dajebbar/hotel-booking-model:latest
in the Dockerfile.
If you choose to build the Docker image locally instead, here are the steps to do so:
- Create a Dockerfile as such:
FROM python:3.9-slim
ENV PYTHONUNBUFFERED=TRUE
RUN pip --no-cache-dir install pipenv
WORKDIR /app
COPY ["Pipfile", "Pipfile.lock", "./"]
RUN pipenv install --deploy --system && rm -rf /root/.cache
COPY ["predict.py", "*.bin", "./"]
EXPOSE $PORT
CMD gunicorn --workers=4 --bind 0.0.0.0:$PORT predict:app
This installs Python, pipenv and the project dependencies, copies our predict script and the model file into the image, and serves the model using Flask/gunicorn.
Alternatively, you can just use the Dockerfile in this repository.
- Build the Docker image with:
docker build -t hotel-booking-model .
- Run the Docker container with:
docker run -it --rm -e PORT=9696 -p 9696:9696 hotel-booking-model
Because the Dockerfile binds gunicorn to $PORT, the port has to be passed explicitly when running locally. Now we can use our model through
python test_pred.py
- Tag the Docker image with:
docker tag hotel-booking-model dajebbar/hotel-booking-model:latest
- Push it to the Docker registry with:
docker push dajebbar/hotel-booking-model:latest
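For the local test step above (`python test_pred.py`), a client only has to send one booking as JSON to the running container and read back the prediction. The sketch below shows one way to do that with the standard library. The `/predict` path, the payload fields, and the shape of the response are assumptions, since the contents of predict.py and test_pred.py are not shown here; the repository's own script may well use the requests package instead of urllib.

```python
import json
from urllib import request

# Assumption: the service listens on port 9696 and exposes a /predict route.
DEFAULT_URL = "http://localhost:9696/predict"


def build_request(booking: dict, url: str = DEFAULT_URL) -> request.Request:
    """Build a POST request carrying one booking as a JSON body."""
    body = json.dumps(booking).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(booking: dict, url: str = DEFAULT_URL) -> dict:
    """Send the booking to the service and decode the JSON response."""
    with request.urlopen(build_request(booking, url), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical booking using a few column names from the data dictionary.
    sample_booking = {"hotel": "City Hotel", "lead_time": 30, "adults": 2}
    print(predict(sample_booking))
```

Passing a different `url` lets the same helper target another deployment of the service.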
To deploy this model on the Heroku cloud, here are the steps to follow:
- Create a new .github folder in the parent folder, then inside it create a workflows folder. On Linux you can create both with one command:
mkdir -p .github/workflows
- In the workflows folder, create the main.yaml file, which is responsible for automatically building the Dockerfile and deploying it each time you push to GitHub. Here is its content:
# Your workflow name.
name: Deploy to heroku.
# Run workflow on every push to main branch.
on:
  push:
    branches: [main]
# Your workflows jobs.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      # Check-out your repository.
      - name: Checkout
        uses: actions/checkout@v2
      ### ⬇ IMPORTANT PART ⬇ ###
      - name: Build, Push and Release a Docker container to Heroku. # Your custom step name
        uses: gonuit/heroku-docker-deploy@v1.3.3 # GitHub action name (leave it as it is).
        with:
          # Below you must provide variables for your Heroku app.
          # The email address associated with your Heroku account.
          # If you don't want to use repository secrets (which is recommended) you can do:
          # email: my.email@example.com
          email: ${{ secrets.HEROKU_EMAIL }}
          # Heroku API key associated with provided user's email.
          # Api Key is available under your Heroku account settings.
          heroku_api_key: ${{ secrets.HEROKU_API_KEY }}
          # Name of the heroku application to which the build is to be sent.
          heroku_app_name: ${{ secrets.HEROKU_APP_NAME }}
          # (Optional, default: "./")
          # Dockerfile directory.
          # For example, if you have a Dockerfile in the root of your project, leave it as follows:
          dockerfile_directory: ./
          # (Optional, default: "Dockerfile")
          # Dockerfile name.
          dockerfile_name: Dockerfile
          # (Optional, default: "")
          # Additional options of docker build command.
          docker_options: "--no-cache"
          # (Optional, default: "web")
          # Select the process type for which you want the docker container to be uploaded.
          # By default, this argument is set to "web".
          # For more information look at https://devcenter.heroku.com/articles/process-model
          process_type: web
- Then go to GitHub, open the hotel_booking_project repository, click on Settings, go to Secrets, then click on Actions, and click the New repository secret button. In the Name field type HEROKU_EMAIL, and in the Secret field below it enter the email that serves as your identifier on Heroku. Confirm by clicking the Add secret button. Repeat the same steps, this time typing HEROKU_API_KEY as the name: on the Heroku platform, go to Account settings, scroll down to the API Key field, click the Reveal button, and copy-paste your API key into the Secret field on GitHub. Repeat once more, typing HEROKU_APP_NAME in the Name field and, in the Secret field, the name you gave to your application on Heroku; in my case it is called hotelbooking-api. Once saved, go back to your .github folder, open the console, and type:
git add .
git commit -m "your comment"
git push origin main
Now, when you refresh your GitHub page, a small yellow dot will appear next to the folder name. Click on commits to follow the automatic build of the Dockerfile with all its dependencies. Then return to the Heroku platform, press the Open app button, and voilà.
The project is now deployed on the Heroku cloud servers; to test it, just run the file heroku_pred.py.
- Fork 🍴 the repository and send PRs.
- ⭐ Star this repository if you like the content.
Connect with me: