NLP Chatbot project

Keywords: Python, TensorFlow, Deep Learning, Natural Language Processing, Chatbot, Movie Dialogues

1. Installation

This project was designed for:

Python 3.6
TensorFlow 1.12.0

Install the requirements and the project:

$ cd /path/to/project/
$ git clone https://github.com/filippogiruzzi/nlp_chatbot.git
$ cd nlp_chatbot/
$ pip3 install -r requirements.txt
$ pip3 install -e . --user --upgrade

2. Introduction

2.1 Goal

The purpose of this project is to design and implement a realistic chatbot based on Natural Language Processing (NLP).

2.2 Results

3. Project structure

The project nlp_chatbot/ has the following structure:

nlp/data_processing/: data processing, recording & visualization
nlp/training/: data input pipeline, model & training / evaluation / prediction operations
nlp/inference/: exporting trained model & inference

4. Dataset

Please download the Cornell Movie-Dialogs Corpus dataset, and extract all files to /path/to/cornell_movie_data/. The challenge description can be found on Kaggle.

The dataset consists of 220,579 conversational exchanges between 10,292 pairs of movie characters, involving 9,035 characters from 617 movies, which makes it well suited for realistic chatbot applications.
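For orientation, the corpus files are plain text with fields separated by the literal string ` +++$+++ `. The sketch below parses rows in the `movie_lines.txt` layout (line ID, character ID, movie ID, character name, utterance) into a dictionary. The function name `parse_movie_lines` and the inline sample rows are illustrative assumptions, not part of this project's code. The real file is usually opened with `encoding="iso-8859-1"`.

```python
from typing import Dict, Iterable

FIELD_SEP = " +++$+++ "  # literal field separator used in the corpus files


def parse_movie_lines(rows: Iterable[str]) -> Dict[str, str]:
    """Map each line ID (e.g. 'L1') to its utterance text."""
    id_to_text = {}
    for row in rows:
        parts = row.rstrip("\n").split(FIELD_SEP)
        if len(parts) != 5:
            continue  # skip rows that do not have the expected 5 fields
        line_id, _char_id, _movie_id, _name, text = parts
        id_to_text[line_id] = text.strip()
    return id_to_text


# Made-up sample rows in the same layout as movie_lines.txt.
sample = [
    "L1 +++$+++ u0 +++$+++ m0 +++$+++ ALICE +++$+++ Hi there.\n",
    "L2 +++$+++ u1 +++$+++ m0 +++$+++ BOB +++$+++ Hello, how are you?\n",
]
lines = parse_movie_lines(sample)
print(lines["L2"])
```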

5. Project usage

$ cd /path/to/project/nlp_chatbot/nlp/

5.1 Reformat the raw data .txt files

$ python3 data_processing/data_formatter.py --data-dir /path/to/cornell_movie_data/
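The internals of `data_formatter.py` are not shown here. As a rough sketch of what this kind of reformatting typically involves, the code below turns one conversation row (in the `movie_conversations.txt` layout, whose last field is a list of line IDs) into consecutive (input, response) pairs. The helper `conversation_to_pairs` and the sample data are hypothetical, and the project's formatter may work differently.

```python
import ast
from typing import Dict, List, Tuple

FIELD_SEP = " +++$+++ "  # literal field separator used in the corpus files


def conversation_to_pairs(
    row: str, id_to_text: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Pair each utterance in a conversation with the reply that follows it."""
    # The last field is a Python-style list literal of line IDs.
    line_ids = ast.literal_eval(row.strip().split(FIELD_SEP)[-1])
    pairs = []
    for first, second in zip(line_ids, line_ids[1:]):
        query = id_to_text.get(first, "")
        reply = id_to_text.get(second, "")
        if query and reply:  # drop pairs with a missing or empty side
            pairs.append((query, reply))
    return pairs


# Made-up utterances keyed by line ID, plus one conversation row.
id_to_text = {
    "L1": "Hi there.",
    "L2": "Hello, how are you?",
    "L3": "Fine, thanks.",
}
row = "u0 +++$+++ u1 +++$+++ m0 +++$+++ ['L1', 'L2', 'L3']"
pairs = conversation_to_pairs(row, id_to_text)
print(pairs)
```

Each utterance except the last serves once as an input, so a conversation of n lines yields at most n - 1 training pairs.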

5.2 Train the NLP Seq2Seq model

$ python3 training/train.py --data-dir /path/to/cornell_movie_data/
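Seq2Seq models operate on sequences of integer token IDs rather than raw text. As a minimal sketch of that preprocessing step (an assumption about the general approach, not this project's input pipeline), the code below builds a vocabulary from whitespace-tokenized sentences and encodes a sentence with end-of-sequence and padding tokens. The special tokens, `build_vocab`, `encode`, and `max_len` are all illustrative names.

```python
from collections import Counter
from typing import Dict, List

# Special tokens; SOS is typically prepended to decoder inputs during training.
PAD, SOS, EOS, UNK = "<pad>", "<sos>", "<eos>", "<unk>"


def build_vocab(sentences: List[str], min_count: int = 1) -> Dict[str, int]:
    """Assign an integer ID to every token seen at least min_count times."""
    counts = Counter(tok for s in sentences for tok in s.lower().split())
    vocab = {tok: i for i, tok in enumerate([PAD, SOS, EOS, UNK])}
    for tok, n in sorted(counts.items()):  # sorted for deterministic IDs
        if n >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode(sentence: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Convert a sentence to IDs, append EOS, then truncate or pad to max_len."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in sentence.lower().split()]
    ids = ids[: max_len - 1] + [vocab[EOS]]
    return ids + [vocab[PAD]] * (max_len - len(ids))


vocab = build_vocab(["hello there", "hello world"])
encoded = encode("hello unknown world", vocab, max_len=6)
print(encoded)  # [4, 3, 6, 2, 0, 0]
```

Fixed-length padding keeps batches rectangular; out-of-vocabulary words fall back to the `<unk>` ID instead of raising an error.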

5.3 Visualize predictions with trained model

$ python3 training/train.py --data-dir /path/to/cornell_movie_data/tfrecords/ \
                            --mode predict \
                            --model-dir /path/to/trained/model/dir/ \
                            --ckpt /path/to/trained/model/dir/

5.4 Chat with the Chatbot AI

$ python3 inference/export_model.py --model-dir /path/to/trained/model/dir/ \
                                    --ckpt /path/to/trained/model/dir/
$ python3 inference/inference.py --data_dir /path/to/cornell_movie_data/ \
                                 --exported_model /path/to/exported/model/

The trained model will be saved in /path/to/cornell_movie_data/tfrecords/models/seq2seq/. The exported model will be saved inside this directory.

6. Todo

7. Resources

This project was largely inspired by:

PyTorch chatbot tutorial, PyTorch website
PyTorch NLP tutorial, PyTorch website
TensorFlow NLP tutorial, TensorFlow website
Keras NLP tutorial, Towards Data Science (TDS)
Kaggle challenge, Kaggle
Sequence to Sequence Learning with Neural Networks, I. Sutskever, O. Vinyals, Q. V. Le, 2014, arXiv
A Neural Conversational Model, O. Vinyals, Q. Le, 2015, arXiv
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation, K. Cho, B. van Merrienboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, Y. Bengio, 2014, arXiv
Effective Approaches to Attention-based Neural Machine Translation, M-T. Luong, H. Pham, C. D. Manning, 2015, arXiv
Neural Machine Translation by Jointly Learning to Align and Translate, D. Bahdanau, K. Cho, Y. Bengio, 2014, arXiv
