a-mhamdi/nlp-workshop

An Introduction to NLP in Python

This repository contains slides and code samples for implementing common NLP tasks in Python.

Included Topics

The repository covers the following topics:

  1. Regular Expressions (RegEx)
  2. Text Tokenization
  3. Text Processing and Visualization
  4. Gensim Text Processing
  5. Named Entity Recognition (NER)
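
To give a flavor of the first two topics, here is a minimal sketch that tokenizes text with a regular expression and counts the most frequent tokens. It is not taken from the repository's notebooks: the names `TOKEN_PATTERN`, `tokenize`, and `top_tokens` are hypothetical, and it uses only the standard library rather than the NLP packages the workshop may rely on.

```python
import re
from collections import Counter

# Assumption: a "token" is a run of ASCII letters, optionally followed by
# an apostrophe and more letters (so "don't" stays one token).
TOKEN_PATTERN = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def tokenize(text: str) -> list[str]:
    """Return the lowercase word tokens found in `text`."""
    return [match.lower() for match in TOKEN_PATTERN.findall(text)]


def top_tokens(text: str, n: int = 3) -> list[tuple[str, int]]:
    """Return the `n` most frequent tokens with their counts."""
    return Counter(tokenize(text)).most_common(n)


if __name__ == "__main__":
    sample = "Don't panic: NLP is fun, and NLP is useful."
    print(tokenize(sample))
    print(top_tokens(sample, 2))
```

A regex tokenizer like this is deliberately crude (it ignores digits and non-ASCII letters); it is only meant to show how the pieces fit together.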

Prerequisites

Note

You can either follow the steps below for a local installation or use the provided Docker image for a containerized environment.

Installation Steps

These commands will set up an isolated environment and install all required packages for this project.

uv venv # creates a virtual environment: `.venv`
uv sync # installs all dependencies
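
After the commands above, it can be useful to confirm that Python is actually running from the virtual environment. The following is a small sketch using only the standard library; the function name `in_virtualenv` and the file name used in the usage note are hypothetical, not part of the repository.

```python
import sys


def in_virtualenv() -> bool:
    """Report whether the current interpreter runs inside a virtual environment.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a virtual environment:", in_virtualenv())
```

Saved as, say, `check_env.py`, it could be run with `uv run python check_env.py`; the interpreter path should then point inside `.venv`.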

Docker Setup

The code runs on top of a Docker image, ensuring a consistent and reproducible environment.

Attention: You need Docker installed on your machine. You can download it from the Docker website.

To run the code, first pull the Docker image:

docker pull abmhamdi/nlp

This may take a while, as the image bundles all necessary dependencies.

How to control the containers:

  • docker-compose up -d: starts the container in detached mode
  • docker-compose down: stops and removes the container

Start the services with docker-compose up. This launches Jupyter Lab at http://localhost:2468, where you can create a new Python notebook and run code. You can also start Marimo in parallel at http://localhost:1357.
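
If a page does not load, a quick way to see whether the services are listening is to probe the two ports mentioned above. This is a hedged sketch using only the standard library; the `SERVICES` mapping and the `is_port_open` helper are hypothetical names, and the port numbers are simply the ones stated in this README.

```python
import socket

# Ports as stated in this README; adjust if your compose file maps them differently.
SERVICES = {"Jupyter Lab": 2468, "Marimo": 1357}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "reachable" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} on http://localhost:{port}: {state}")
```

A successful TCP connection only shows that something is listening on the port, not that the service has finished starting up.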

License

This project is licensed under the MIT License - see the LICENSE file for details.