diff --git a/.gitignore b/.gitignore index fa8dd846..c51478ef 100644 --- a/.gitignore +++ b/.gitignore @@ -26,3 +26,4 @@ test_kpconv/ kernels/ **/.fuse* train_log/ +*.ipynb_checkpoints \ No newline at end of file diff --git a/docs/tutorial/notebook/Inference_on_a_custom_data.ipynb b/docs/tutorial/notebook/Inference_on_a_custom_data.ipynb new file mode 100644 index 00000000..11793198 --- /dev/null +++ b/docs/tutorial/notebook/Inference_on_a_custom_data.ipynb @@ -0,0 +1,225 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Running Semantic Segmentation inference on custom data" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this tutorial, we will cover how to run an inference on a pointcloud data in Open3D-ML. To accomplish that, we will take these steps:\n", + "\n", + "1. Download the data *weights* file;\n", + "2. Set up `torch` and `numpy` libraries;\n", + "3. Create a `dataset` object and extract a sample from its `'test'` split;\n", + "4. Create and initialize `model` and `pipeline` objects;\n", + "5. Restore the `model` with data from the *weights* file;\n", + "6. Convert the custom pointcloud data into the specified format;\n", + "7. Run an inference on the sample data.\n", + "\n", + "\n", + "> **Note:** We will be using a sample `RandLANet` `SemanticKITTI` weight file which we need to:\n", + ">\n", + "> 1. Download for either *PyTorch* or *TensorFlow* from links below:\n", + "> > a. For *PyTorch*: https://storage.googleapis.com/open3d-releases/model-zoo/randlanet_semantickitti_202201071330utc.pth\n", + "> >\n", + "> > b. For *TensorFlow*: https://storage.googleapis.com/open3d-releases/model-zoo/randlanet_semantickitti_202201071330utc.zip\n", + ">\n", + "> 2. Place the downloaded `randlanet_semantickitti_202201071330utc.pth` file into `'Open3D-ML/docs/tutorial/notebook/'` subdirectory, or any other place and change the `ckpt_path` accordingly.\n", + ">\n", + "> For other model/dataset weight files, please check out https://github.com/isl-org/Open3D-ML#semantic-segmentation-1\n", + "\n", + "\n", + "An inference predicts the results based on the trained model.\n", + "\n", + "> **Please see the [Training a semantic segmentation model using PyTorch](train_ss_model_using_pytorch.ipynb) and [Training a semantic segmentation model using TensorFlow](train_ss_model_using_tensorflow.ipynb) for training tutorials.**\n", + "\n", + "While training, the model saves the checkpoint files every few epochs, in the *logs* directory. 
We use these trained weights to restore the model for inference.\n", + "\n", + "Our first step in inference on a custom data implementation is to import `open3d.ml` and `numpy` libraries:\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import open3d.ml.torch as ml3d # just switch to open3d.ml.tf for tf usage\n", + "import numpy as np" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We then create a checkpoint path pointing to the weights file we downloaded (generated at the end of the Training stage):\n", + "\n", + "(You can download any other weights using a link from the model zoo (collection of weights for all combinations of model and dataset): https://github.com/isl-org/Open3D-ML#semantic-segmentation-1 )" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "weights_url = 'https://storage.googleapis.com/open3d-releases/model-zoo/randlanet_semantickitti_202201071330utc.zip'\n", + "ckpt_path = './randlanet_semantickitti_202201071330utc.pth'\n", + "# from urllib.request import urlretrieve\n", + "# urlretrieve(weights_url, filename=ckpt_path)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now, we define a `dataset`, `model`, and `pipeline` objects identical to how it was done in our previous *Training a semantic segmentation model* tutorials:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# We define dataset (similar to train_ss_using_pytorch tutorial)\n", + "dataset = ml3d.datasets.SemanticKITTI(dataset_path='SemanticKITTI/',\n", + " cache_dir='./logs/cache',\n", + " training_split=['00'],\n", + " validation_split=['01'],\n", + " test_split=['01'])\n", + "\n", + "# Initializing the model and pipeline\n", + "model = ml3d.models.RandLANet(in_channels=3)\n", + "pipeline = ml3d.pipelines.SemanticSegmentation(model)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we restore the model with our weights file with `pipeline.load_ckpt()` method:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Load checkpoint using `load_ckpt` method (restoring weights for inference)\n", + "pipeline.load_ckpt(ckpt_path=ckpt_path)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now, let us query the first pointcloud from the `test` split." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "test_data = dataset.get_split('test')\n", + "data = test_data.get_data(0)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's display what `data` contains:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "print(data)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For inference on custom data, you can convert your point cloud into this format:\n", + "\n", + "**Dictionary with keys {'point', 'feat', 'label'}**\n", + "\n", + "If you already have the *ground truth labels*, you can add them to data to get accuracy and IoU (Intersection over Union). Otherwise, pass labels as `None`.\n", + "\n", + "And now - the main topic of our tutorial - running inference on the test data. 
You can call the `run_inference()` method with your data, - it will print *accuracy per class* and *Intersection over Union (IoU)* metrics. The last entry in the list is *mean accuracy* and *mean IoU*:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Running inference on test data\n", + "results = pipeline.run_inference(data)\n", + "# prints per class accuracy and IoU (Intersection over Union). Last entry is mean accuracy and mean IoU.\n", + "# We get several `nan` outputs for missing classes in the input data." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The `results` object will return a dictionary of predicted labels and predicted probabilities per point:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Dictionary of predicted labels and predicted probabilities per class\n", + "results" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.4" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/docs/tutorial/notebook/reading_a_config_file.ipynb b/docs/tutorial/notebook/reading_a_config_file.ipynb new file mode 100644 index 00000000..7789b0d3 --- /dev/null +++ b/docs/tutorial/notebook/reading_a_config_file.ipynb @@ -0,0 +1,290 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Controlling training and inference with config files" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Look at the following snippet of code which creates a dataset:\n", + "\n", + "```py\n", + "# Read a dataset by specifying the path. We can pass other arguments like cache directory and training split.\n", + "\n", + "dataset = ml3d.datasets.SemanticKITTI(dataset_path='SemanticKITTI/',\n", + " cache_dir='./logs/cache',\n", + " training_split=['00'],\n", + " validation_split=['01'],\n", + " test_split=['01'])\n", + "```\n", + "\n", + "In the code above, the `dataset` object is created by explicitly passing dataset-specific parameters into `ml3d.datasets.SemanticKITTI()` method.\n", + "\n", + "Instead of passing a bunch of parameters to a function call, we can supply `dataset` information from a specific *config* file. Each *config* file contains parameters for a dataset, model and pipeline.\n", + "\n", + "\n", + ">***Config* files pass information into Open3D-ML in YAML format.**\n", + "\n", + "\n", + "In this example, we will:\n", + "\n", + "- Load a *config* `cfg_file` into a `Config` class object;\n", + "- Parse `dataset` dictionaries from the `Config` object;\n", + "- Access individual dictionaries in the `Config` object;\n", + "- Access individual elements from within dictionaries.\n", + "\n", + "\n", + "## Loading a *config* file" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from open3d.ml import utils\n", + "import open3d.ml.torch as ml3d" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Here, we import two modules:\n", + " \n", + " 1. 
`utils` - Open3D-ML utilities\n", + " 2. `ml3d` - Open3D-ML PyTorch API library\n", + "\n", + "Now, we'll create *config* object:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "cfg_file = \"../../../ml3d/configs/randlanet_semantickitti.yml\"\n", + "cfg = utils.Config.load_from_file(cfg_file)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The `cfg_file` holds the full path to the particular *config* file - `randlanet_semantickitti.yml`.\n", + "The `cfg` object is initialized by the Open3D-ML `utils.Config.load_from_file()` method to hold parameters that are read from the `cfg_file`.\n", + "\n", + "## Examining dataset dictionaries in the `cfg` object\n", + "\n", + "Let's examine the contents of the `cfg` object:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "vars(cfg)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`vars(cfg)` returns the three dictionaries: `dataset`, `model`, and `pipeline`.\n", + "\n", + "Now, let's explore them. The first one is the `cfg.dataset`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "cfg.dataset" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Accessing individual dictionary items\n", + "\n", + "These `cfg` dictionary items can be viewed as well as updated like in a standard Python dictionary. We can access individual items of the `cfg.dataset` dictionary like so: " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "cfg.dataset['name']" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## `cfg.model` and `cfg.pipeline` dictionaries\n", + "\n", + "We'll later revisit the `cfg.dataset`. 
Next, let's look at the `cfg.model` dictionary:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "cfg.model" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Just as in the case of `cfg.dataset`, we can access `cfg.model` dictionary items by referencing them individually:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "cfg.model['sub_sampling_ratio']" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "cfg.pipeline" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Likewise, individual dictionary items in `cfg.pipeline` can be accesed just like those of `cfg.model` and `cfg.dataset`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "cfg.pipeline['name']" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Initializing datasets from *config* files\n", + "\n", + "Next, we explicitly create the `dataset` object which will hold all information from the `cfg.dataset` dictionary:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "dataset = ml3d.datasets.SemanticKITTI(cfg.dataset)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we'll look at what properties the newly-created `dataset` object exposes with the Python `vars()` function:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "vars(dataset)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can reference any property of the dataset by using *`object.property`* syntax. For example, to find out what value the `num_classes` property holds, we type in:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "dataset.num_classes" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Likewise, to extract information from a `label_to_names` property which maps labels to the object names, we call:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "dataset.label_to_names" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Experiment with other `dataset` properties to see how convenient it is to reference them." 
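+ ,
+ "\n",
+ "For instance, here is a minimal sketch (just an illustration, assuming the `dataset` object created above) that builds a reverse lookup from class names to numeric labels using the `label_to_names` mapping shown earlier:\n",
+ "\n",
+ "```py\n",
+ "# Invert the label -> name mapping so we can look up a label id by class name.\n",
+ "# 'car' is only an example; use any name present in dataset.label_to_names.\n",
+ "names_to_labels = {name: label for label, name in dataset.label_to_names.items()}\n",
+ "print(names_to_labels.get('car'))\n",
+ "```"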
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.4" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/docs/tutorial/notebook/reading_a_dataset.ipynb b/docs/tutorial/notebook/reading_a_dataset.ipynb new file mode 100644 index 00000000..21f0ada9 --- /dev/null +++ b/docs/tutorial/notebook/reading_a_dataset.ipynb @@ -0,0 +1,276 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Reading a dataset" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this tutorial, we will learn how to read Open3D-ML datasets.\n", + "\n", + "You can use any dataset available in the `ml3d.datasets` namespace. For this example, we will use the `SemanticKITTI` dataset. You can use any of the other datasets to load data. However, you must understand that the parameters may vary for each dataset.\n", + "\n", + "To read a dataset in this example, we will supply the following parameter variables:\n", + "\n", + "- Dataset path (`dataset_path`)\n", + "- Cache directory (`cache_dir`)\n", + "- Dataset splits (for training, validation, and testing)\n", + "\n", + "> **For more theoretical background information on dataset splitting, please refer to these articles:**\n", + ">\n", + "> https://machinelearningcompass.com/dataset_optimization/split_data_the_right_way/\n", + ">\n", + "> https://www.freecodecamp.org/news/key-machine-learning-concepts-explained-dataset-splitting-and-random-forest/\n", + "\n", + "## Creating a global dataset object\n", + "\n", + "First, we import the Open3D-ML PyTorch library:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# import torch\n", + "import open3d.ml.torch as ml3d" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We then create a `dataset` object, initializing it with *path, cache directory*, and *splits*. This `dataset` can read all the files inside `dataset_path` directory:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Read a dataset by specifying the path. We are also providing the cache directory and splits.\n", + "dataset = ml3d.datasets.SemanticKITTI(dataset_path='SemanticKITTI/',\n", + " cache_dir='./logs/cache',\n", + " training_split=['00'],\n", + " validation_split=['01'],\n", + " test_split=['01'])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "A couple of words regarding the *splits* variables: here, we isolate different portions of the `SemanticKITTI` dataset content and divide them into 3 different parts:\n", + "\n", + "1. `training_split` for data training. This part usually contains 70-75% of the global `dataset` content.\n", + "2. `validation_split` for data validation. This part accounts for 10-15% of the global `dataset` content;\n", + "3. `test_split` for testing. 
It contains test data and its size varies.\n", + "\n", + "Note the `SemanticKITTI` dataset folder structure:\n", + "\n", + "![dataset_structure](https://user-images.githubusercontent.com/93158890/162548755-28c541d3-3557-4903-a9a1-cc685d16dfc2.jpg)\n", + "\n", + "The three different *split* parameter variables instruct the Open3D-ML subsystem to reference the following folder locations:\n", + "\n", + "- `training_split=['00']` points to `'SemanticKITTI/dataset/sequences/00/'`\n", + "- `validation_split=['01']` points to `'SemanticKITTI/dataset/sequences/01/'`\n", + "- `test_split=['01']` points to `'SemanticKITTI/dataset/sequences/01/'`\n", + "\n", + "> Note: **dataset split directories usually contain numerous point cloud files.** In our example, we included only one point cloud file for speed and convenience.\n", + "\n", + "## Creating dataset split objects to query the data\n", + "\n", + "Next, we will create **dedicated** dataset split objects for specifying which split portion we would like to query.\n", + "\n", + "First, we create a `train_split` subset for training from the global `dataset` content we have initialized above using the `get_split()` method:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "# Split the dataset for 'training'. You can get the other splits by passing 'validation' or 'test'\n", + "train_split = dataset.get_split('training')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now, let's do the same for validation:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Similarly, get the validation split.\n", + "val_split = dataset.get_split('validation')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, we create a `test_split` subset for testing:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Get test split\n", + "test_split = dataset.get_split('test')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Determining the size of dataset splits\n", + "\n", + "Let's see how large our *split* portions are:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Get length of splits\n", + "print(len(train_split))\n", + "print(len(val_split))\n", + "print(len(test_split))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Above, Open3D-ML prints out the number of point cloud files it found in the `'SemanticKITTI/dataset/sequences/'` subdirectories `'/00/'` and `'/01/'` that we specified earlier in the `training_split=['00'], validation_split=['01'], test_split=['01']` variables for the `dataset`.\n", + "\n", + "## Querying dataset splits for data\n", + "\n", + "In this section, we are using the `train_split` dataset split object as an example.
The procedure would be identical for the other dataset splits, `val_split` and `test_split`.\n", + "\n", + "In order to extract the data from the `train_split`, we can iterate through it with the index `i` (ranging from `0` to `len(train_split) - 1`) using the `get_data()` method:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Query splits for data, index should be from `0` to `len(split) - 1`\n", + "for i in range(len(train_split)):\n", + " data = train_split.get_data(i)\n", + " print(data)\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Each `data` object from the above `for` loop is a dictionary of points (`'point'`), features (`'feat'`), and labels (`'label'`), as we will see below:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "data = train_split.get_data(0) # Dictionary of `point`, `feat`, and `label`\n", + "print(data.keys())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "- The **`'point'`** key value contains a set of 3D points/coordinates (X, Y, and Z):\n", + "\n", + "![dataset_coordinates](https://user-images.githubusercontent.com/93158890/162549410-6369cbd0-b835-4216-ba54-945e3f591395.jpg)\n", + "\n", + "- The **`'feat'`** (*features*) key value contains RGB color information for each of the above points.\n", + "\n", + "- The **`'label'`** key value represents the class each point belongs to, e.g. *pedestrian*, *vehicle*, or *traffic light*.\n", + "\n", + "### Querying dataset splits for attributes\n", + "\n", + "We can also extract the corresponding point cloud attributes:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "attr = train_split.get_attr(0)\n", + "print(\n", + " attr\n", + ") # Dictionary containing information about the data e.g. name, path, split, etc." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The attributes returned are: `'idx'` (*index*), `'name'`, `'path'`, and `'split'`."
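+ ,
+ "\n",
+ "As a small illustrative sketch (assuming the `train_split` object from above), you could combine `get_attr()` with the split length to list every frame in the training split:\n",
+ "\n",
+ "```py\n",
+ "# List the name and file path of each point cloud in the training split\n",
+ "for i in range(len(train_split)):\n",
+ "    attr = train_split.get_attr(i)\n",
+ "    print(attr['name'], attr['path'])\n",
+ "```"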
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "#support of Open3d-ML visualizer in Jupyter Notebooks is in progress\n", + "#view the frames using the visualizer\n", + "#vis = ml3d.vis.Visualizer()\n", + "#vis.visualize_dataset(dataset, 'training',indices=range(len(train_split)))" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.4" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/docs/tutorial/notebook/train_ss_model_using_pytorch.ipynb b/docs/tutorial/notebook/train_ss_model_using_pytorch.ipynb new file mode 100644 index 00000000..28ce55ca --- /dev/null +++ b/docs/tutorial/notebook/train_ss_model_using_pytorch.ipynb @@ -0,0 +1,222 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "9bc05fec", + "metadata": {}, + "source": [ + "# Training a semantic segmentation model using PyTorch" + ] + }, + { + "cell_type": "markdown", + "id": "f25922b0", + "metadata": {}, + "source": [ + "In this tutorial, we will learn how to train a semantic segmentation model using PyTorch.\n", + "\n", + "Before you begin, ensure that you have *PyTorch* installed. To install a compatible version of PyTorch, use the requirement file:\n", + "\n", + "```sh\n", + "pip install -r requirements-torch-cuda.txt\n", + "```\n", + "\n", + "At a high level, we will:\n", + "\n", + "- Read a dataset and create a *'training'* split. For this example, we will use the `SemanticKITTI` dataset.\n", + "- Train a model. We will train a `RandLANet` model on the *'training'* split.\n", + "- Run a test on a *'test'* split to evaluate the model.\n", + "- Run an inference on a custom point cloud.\n" + ] + }, + { + "cell_type": "markdown", + "id": "76e89bac", + "metadata": {}, + "source": [ + "## Reading a dataset" + ] + }, + { + "cell_type": "markdown", + "id": "fdff079f", + "metadata": {}, + "source": [ + "Downloading scripts are available in: `Open3D-ML/scripts/download_datasets`\n", + "\n", + "You can use any dataset available in the `ml3d.datasets` dataset namespace. Here, we will use the `SemanticKITTI` dataset and visualize it. You can use any of the other datasets to load data. However, you must understand that the parameters may vary for each dataset.\n", + "\n", + "We will read the dataset by specifying its path and then get all splits." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "508129cd", + "metadata": {}, + "outputs": [], + "source": [ + "# Training Semantic Segmentation Model using PyTorch\n", + "\n", + "# import torch\n", + "import open3d.ml.torch as ml3d\n", + "\n", + "# Read a dataset by specifying the path. We are also providing the cache directory and training split.\n", + "dataset = ml3d.datasets.SemanticKITTI(dataset_path='SemanticKITTI/',\n", + " cache_dir='./logs/cache',\n", + " training_split=['00'],\n", + " validation_split=['01'],\n", + " test_split=['01'])\n", + "\n", + "# Split the dataset for 'training'. 
You can get the other splits by passing 'validation' or 'test'\n", + "train_split = dataset.get_split('training')\n", + "\n", + "# Support for the Open3D-ML visualizer in Jupyter Notebooks is in progress\n", + "# View the frames using the visualizer\n", + "#vis = ml3d.vis.Visualizer()\n", + "#vis.visualize_dataset(dataset, 'training',indices=range(len(train_split)))" + ] + }, + { + "cell_type": "markdown", + "id": "f34dca8f", + "metadata": {}, + "source": [ + "Now that you have loaded the dataset for training, let us train the model." + ] + }, + { + "cell_type": "markdown", + "id": "e2dc77ea", + "metadata": {}, + "source": [ + "## Training a model\n", + "\n", + "First, import the desired model from `open3d.ml.torch.models`.\n", + "\n", + "After you load a dataset, you can initialize any model and then train the model. The following example shows how you can train RandLANet:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b7c1ff9a", + "metadata": {}, + "outputs": [], + "source": [ + "# Training Semantic Segmentation Model using PyTorch\n", + "\n", + "# Import torch and the model to use for training\n", + "import open3d.ml.torch as ml3d\n", + "from open3d.ml.torch.models import RandLANet\n", + "from open3d.ml.torch.pipelines import SemanticSegmentation\n", + "\n", + "# Read a dataset by specifying the path. We are also providing the cache directory and training split.\n", + "# dataset = ml3d.datasets.SemanticKITTI(dataset_path='/Users/sanskara/data/SemanticKITTI/', cache_dir='./logs/cache',training_split=['00'], validation_split=['01'], test_split=['01'])\n", + "dataset = ml3d.datasets.SemanticKITTI(dataset_path='SemanticKITTI/',\n", + " cache_dir='./logs/cache',\n", + " training_split=['00'],\n", + " validation_split=['01'],\n", + " test_split=['01'])\n", + "\n", + "# Initialize the RandLANet model.\n", + "model = RandLANet(in_channels=3)\n", + "pipeline = SemanticSegmentation(model=model,\n", + " dataset=dataset,\n", + " max_epoch=3,\n", + " optimizer={'lr': 0.001},\n", + " num_workers=0)\n", + "\n", + "# Run the training\n", + "pipeline.run_train()" + ] + }, + { + "cell_type": "markdown", + "id": "1a0b0bc9", + "metadata": {}, + "source": [ + "The training checkpoints are saved in `pipeline.main_log_dir` (default path: `./logs/Model_Dataset/`). You can use them for testing and inference." + ] + }, + { + "cell_type": "markdown", + "id": "6f4fa94c", + "metadata": {}, + "source": [ + "## Running a test\n", + "\n", + "Next, we will evaluate the trained model on the test split by calling the `run_test()` method:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "80f461dd", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.run_test()" + ] + }, + { + "cell_type": "markdown", + "id": "3a385892", + "metadata": {}, + "source": [ + "## Running an inference\n", + "\n", + "Inference processes a point cloud and produces predictions based on the trained model. For this example, we will use a trained `RandLANet` model.\n", + "\n", + "This example gets the pipeline, model, and dataset based on our previous training example. It runs the inference on a sample from the \"test\" split and prints the results."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d39c5381", + "metadata": {}, + "outputs": [], + "source": [ + "# Get data from the SemanticKITTI dataset using the \"test\" split\n", + "train_split = dataset.get_split(\"test\")\n", + "data = train_split.get_data(0)\n", + "\n", + "# Run the inference\n", + "results = pipeline.run_inference(data)\n", + "\n", + "# Print the results\n", + "print(results)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d4039f9d", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.4" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/tutorial/notebook/train_ss_model_using_tensorflow.ipynb b/docs/tutorial/notebook/train_ss_model_using_tensorflow.ipynb new file mode 100644 index 00000000..b8070ba8 --- /dev/null +++ b/docs/tutorial/notebook/train_ss_model_using_tensorflow.ipynb @@ -0,0 +1,241 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "91802780", + "metadata": {}, + "source": [ + "# Training a semantic segmentation model using TensorFlow" + ] + }, + { + "cell_type": "markdown", + "id": "6c42d107", + "metadata": {}, + "source": [ + "In this tutorial, we will learn how to train a semantic segmentation model using `TensorFlow` in a Jupyter Notebook. We assume that you are familiar with Jupyter Notebook and have created a folder *notebooks* in a folder that is relative to *ml3d*.\n", + "\n", + "Before you begin, ensure that you have `TensorFlow` installed. To install a compatible version of `TensorFlow`, use the requirement file:\n", + "\n", + "```sh\n", + "pip install -r requirements-tensorflow.txt\n", + "```\n", + "\n", + "At a high level, we will:\n", + "\n", + "- Read a dataset and create a training split. Here, we will use the `SemanticKITTI` dataset.\n", + "- Train a model. We will train a `RandLANet` model on the training split.\n", + "- Run a test on a *'test'* split to evaluate the model.\n", + "- Run an inference on a custom point cloud." + ] + }, + { + "cell_type": "markdown", + "id": "f788f8d5", + "metadata": {}, + "source": [ + "## Reading a dataset\n", + "\n", + "Downloading scripts are available in: `Open3D-ML/scripts/download_datasets`\n", + "\n", + "You can use any dataset available in the `ml3d.datasets` dataset namespace. For this example, we will use the `SemanticKITTI` dataset and visualize it. You can use any of the other datasets to load data. However, you must understand that the parameters may vary for each dataset.\n", + "\n", + "We will read the dataset by specifying its path and then get all splits." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8714f23e", + "metadata": {}, + "outputs": [], + "source": [ + "# Training Semantic Segmentation Model using TensorFlow\n", + "\n", + "# import tensorflow\n", + "import open3d.ml.tf as ml3d\n", + "\n", + "# Read a dataset by specifying the path. We are also providing the cache directory and training split.\n", + "dataset = ml3d.datasets.SemanticKITTI(dataset_path='./SemanticKITTI/',\n", + " cache_dir='./logs/cache',\n", + " training_split=['00'])\n", + "\n", + "# Split the dataset for 'training'. 
You can get the other splits by passing 'validation' or 'test'\n", + "train_split = dataset.get_split('training')\n", + "\n", + "# View the first frame using the visualizer\n", + "# MyVis = ml3d.vis.Visualizer()\n", + "# MyVis.visualize_dataset(dataset, 'training',indices=range(1))" + ] + }, + { + "cell_type": "markdown", + "id": "9d46e91b", + "metadata": {}, + "source": [ + "Now that you have loaded the dataset for training, let us train the model." + ] + }, + { + "cell_type": "markdown", + "id": "cb4d73cd", + "metadata": {}, + "source": [ + "## Training a model\n", + "\n", + "`TensorFlow` maps nearly all of the GPU memory by default. This may result in an out-of-memory error if some ops allocate memory independently of TensorFlow. You may want to limit memory usage to what the process actually needs. Use the following code right after importing TensorFlow:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "90c55d8d", + "metadata": {}, + "outputs": [], + "source": [ + "import tensorflow as tf\n", + "gpus = tf.config.experimental.list_physical_devices('GPU')\n", + "if gpus:\n", + " try:\n", + " for gpu in gpus:\n", + " tf.config.experimental.set_memory_growth(gpu, True)\n", + " except RuntimeError as e:\n", + " print(e)" + ] + }, + { + "cell_type": "markdown", + "id": "9ef4c76d", + "metadata": {}, + "source": [ + "Refer to [this link](https://www.tensorflow.org/guide/gpu#limiting_gpu_memory_growth) for more details.\n", + "\n", + "First, import the desired model from `open3d.ml.tf.models`.\n", + "\n", + "After you load a dataset, you can initialize any model and then train the model. The following example shows how you can train RandLANet:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "66b92b2d", + "metadata": {}, + "outputs": [], + "source": [ + "# Training Semantic Segmentation Model using TensorFlow\n", + "\n", + "# Import tensorflow and the model to use for training\n", + "import open3d.ml.tf as ml3d\n", + "from open3d.ml.tf.models import RandLANet\n", + "from open3d.ml.tf.pipelines import SemanticSegmentation\n", + "\n", + "# Read a dataset by specifying the path. We are also providing the cache directory and training split.\n", + "dataset = ml3d.datasets.SemanticKITTI(dataset_path='SemanticKITTI/',\n", + " cache_dir='./logs/cache',\n", + " training_split=['00'],\n", + " validation_split=['01'],\n", + " test_split=['01'])\n", + "\n", + "# Initialize the RandLANet model with a 3-dimensional input (x, y, z).\n", + "model = RandLANet(dim_input=3, augment={})\n", + "pipeline = SemanticSegmentation(model=model,\n", + " dataset=dataset,\n", + " max_epoch=3,\n", + " optimizer={'learning_rate': 0.001})\n", + "\n", + "# Run the training\n", + "pipeline.run_train()" + ] + }, + { + "cell_type": "markdown", + "id": "0e99379b", + "metadata": {}, + "source": [ + "The training checkpoints are saved in `pipeline.main_log_dir` (default path: `./logs/Model_Dataset/`). You can use them for testing and inference." + ] + }, + { + "cell_type": "markdown", + "id": "001c40dc", + "metadata": {}, + "source": [ + "## Running a test\n", + "\n", + "Running a test is very similar to training the model.\n", + "\n", + "We can call the `run_test()` method, and it will run inference on the test split."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9b936b96", + "metadata": {}, + "outputs": [], + "source": [ + "# Run the test\n", + "pipeline.run_test()" + ] + }, + { + "cell_type": "markdown", + "id": "d0b482d8", + "metadata": {}, + "source": [ + "## Running an inference\n", + "\n", + "Inference processes a point cloud and produces predictions based on the trained model. For this example, we will use a trained `RandLANet` model.\n", + "\n", + "This example gets the pipeline, model, and dataset based on our previous training example. It runs the inference on a sample from the \"test\" split and prints the results." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c3c7dbc9", + "metadata": {}, + "outputs": [], + "source": [ + "# Get data from the SemanticKITTI dataset using the \"test\" split\n", + "test_split = dataset.get_split(\"test\")\n", + "data = test_split.get_data(0)\n", + "\n", + "# Run the inference\n", + "results = pipeline.run_inference(data)\n", + "\n", + "# Print the results\n", + "print(results)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "633dab8c", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.4" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/tutorial/notebook/train_ss_model_using_tensorflow.rst b/docs/tutorial/notebook/train_ss_model_using_tensorflow.rst index 8cd249dc..2c7b9108 100644 --- a/docs/tutorial/notebook/train_ss_model_using_tensorflow.rst +++ b/docs/tutorial/notebook/train_ss_model_using_tensorflow.rst @@ -1,7 +1,7 @@ .. _train_ss_model_using_tensorflow: Train a Semantic Segmentation Model Using TensorFlow -------------------------------------------------- +---------------------------------------------------- In this tutorial, we will learn how to train a semantic segmentation model using TensorFlow in a Jupyter Notebook. We assume that you are familiar with Jupyter Notebook and have created a folder `notebooks` in a folder that is relative to `ml3d`. Before you begin, ensure that you have TensorFlow installed. To install a compatible version of TensorFlow, use the requirement file: