siamrpn: Single Object tracking 2d (#367)
* SiamRPN learner + ROS node

* CRLF -> LF

* tests + custom dataset support

* changelog update

* Update src/opendr/perception/object_tracking_2d/siamrpn/siamrpn_learner.py

Co-authored-by: Kostas Tsampazis <27914645+tsampazk@users.noreply.github.com>

* Update docs/reference/rosbridge.md

Co-authored-by: Kostas Tsampazis <27914645+tsampazk@users.noreply.github.com>

* Update docs/reference/rosbridge.md

Co-authored-by: Kostas Tsampazis <27914645+tsampazk@users.noreply.github.com>

* Update src/opendr/perception/object_tracking_2d/siamrpn/README.md

Co-authored-by: Kostas Tsampazis <27914645+tsampazk@users.noreply.github.com>

* fix docs, rename ros1 node, rearrange learner + static methods

* fix yolov5 download bug

* Update src/opendr/perception/object_tracking_2d/siamrpn/README.md

Co-authored-by: Kostas Tsampazis <27914645+tsampazk@users.noreply.github.com>

* dependencies possible fix

* dependencies possible fix

* update tests to avoid running siamrpn test twice

* Some fixes in bridge doc

* Minor format fixes

* Some more

* More details

Co-authored-by: Kostas Tsampazis <27914645+tsampazk@users.noreply.github.com>
Co-authored-by: ad-daniel <daniel.dias@epfl.ch>
3 people authored Dec 12, 2022
1 parent 739c577 commit 58a9067
Showing 33 changed files with 2,973 additions and 1,300 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -7,6 +7,7 @@ Released on December, XX, 2022.
- Added YOLOv5 as an inference-only tool ([#360](https://github.com/opendr-eu/opendr/pull/360)).
- Added Continual Transformer Encoders ([#317](https://github.com/opendr-eu/opendr/pull/317)).
- Added AmbiguityMeasure utility tool ([#361](https://github.com/opendr-eu/opendr/pull/361)).
- Added SiamRPN 2D tracking tool ([#367](https://github.com/opendr-eu/opendr/pull/367)).
- Bug Fixes:
- Fixed `BoundingBoxList`, `TrackingAnnotationList`, `BoundingBoxList3D` and `TrackingAnnotationList3D` confidence warnings ([#365](https://github.com/opendr-eu/opendr/pull/365)).
- Fixed undefined `image_id` and `segmentation` for COCO `BoundingBoxList` ([#365](https://github.com/opendr-eu/opendr/pull/365)).
3 changes: 3 additions & 0 deletions docs/reference/index.md
@@ -54,6 +54,7 @@ Neither the copyright holder nor any applicable licensor will be liable for any
- object tracking 2d:
- [fair_mot Module](object-tracking-2d-fair-mot.md)
- [deep_sort Module](object-tracking-2d-deep-sort.md)
- [siamrpn Module](object-tracking-2d-siamrpn.md)
- object tracking 3d:
- [ab3dmot Module](object-tracking-3d-ab3dmot.md)
- multimodal human centric:
@@ -121,11 +122,13 @@ Neither the copyright holder nor any applicable licensor will be liable for any
- [centernet Demo](/projects/python/perception/object_detection_2d/centernet)
- [ssd Demo](/projects/python/perception/object_detection_2d/ssd)
- [yolov3 Demo](/projects/python/perception/object_detection_2d/yolov3)
- [yolov5 Demo](/projects/python/perception/object_detection_2d/yolov5)
- [seq2seq-nms Demo](/projects/python/perception/object_detection_2d/nms/seq2seq-nms)
- object detection 3d:
- [voxel Demo](/projects/python/perception/object_detection_3d/demos/voxel_object_detection_3d)
- object tracking 2d:
- [fair_mot Demo](/projects/python/perception/object_tracking_2d/demos/fair_mot_deep_sort)
- [siamrpn Demo](/projects/python/perception/object_tracking_2d/demos/siamrpn)
- panoptic segmentation:
- [efficient_ps Demo](/projects/python/perception/panoptic_segmentation/efficient_ps)
- semantic segmentation:
221 changes: 221 additions & 0 deletions docs/reference/object-tracking-2d-siamrpn.md
@@ -0,0 +1,221 @@
## SiamRPNLearner module

The *SiamRPN* module contains the *SiamRPNLearner* class, which inherits from the abstract class *Learner*.

### Class SiamRPNLearner
Bases: `engine.learners.Learner`

The *SiamRPNLearner* class is a wrapper of the
[GluonCV implementation](https://github.com/dmlc/gluon-cv/tree/master/gluoncv/model_zoo/siamrpn) of the SiamRPN tracker [[1]](#siamrpn-1).
It can be used to perform object tracking on videos (inference) as well as to train new object tracking models.

The [SiamRPNLearner](/src/opendr/perception/object_tracking_2d/siamrpn/siamrpn_learner.py) class has the following public methods:

#### `SiamRPNLearner` constructor
```python
SiamRPNLearner(self, device, n_epochs, num_workers, warmup_epochs, lr, weight_decay, momentum, cls_weight, loc_weight, batch_size, temp_path)
```

Parameters:

- **device**: *{'cuda', 'cpu'}, default='cuda'*\
Specifies the device to be used.
- **n_epochs**: *int, default=50*\
Specifies the number of epochs to be used during training.
- **num_workers**: *int, default=1*\
Specifies the number of workers to be used when loading datasets or performing evaluation.
- **warmup_epochs**: *int, default=2*\
Specifies the number of epochs during which the learning rate is annealed to **lr**.
- **lr**: *float, default=0.001*\
Specifies the initial learning rate to be used during training.
- **weight_decay**: *float, default=0*\
Specifies the weight decay to be used during training.
- **momentum**: *float, default=0.9*\
Specifies the momentum to be used by the optimizer during training.
- **cls_weight**: *float, default=1.0*\
Specifies the multiplier applied to the classification loss during training.
- **loc_weight**: *float, default=1.2*\
Specifies the multiplier applied to the localization loss during training.
- **batch_size**: *int, default=32*\
Specifies the batch size to be used during training.
- **temp_path**: *str, default=''*\
Specifies a path to be used for data downloading.
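
To make the roles of `warmup_epochs`, `lr`, `cls_weight` and `loc_weight` more concrete, the sketch below shows one common way such parameters are combined.
It is an illustration under assumptions, not the learner's actual code: the linear warm-up shape, the `start_factor` value and the helper names `warmup_lr` and `total_loss` are all hypothetical.

```python
def warmup_lr(epoch, lr=0.001, warmup_epochs=2, start_factor=0.1):
    """Hypothetical linear warm-up: ramp from start_factor * lr up to lr."""
    if warmup_epochs <= 0 or epoch >= warmup_epochs:
        return lr
    start_lr = start_factor * lr
    # Linear interpolation between start_lr (epoch 0) and lr (epoch == warmup_epochs).
    return start_lr + (lr - start_lr) * epoch / warmup_epochs


def total_loss(cls_loss, loc_loss, cls_weight=1.0, loc_weight=1.2):
    """Weighted sum of the classification and localization losses."""
    return cls_weight * cls_loss + loc_weight * loc_loss


if __name__ == "__main__":
    for epoch in range(4):
        print(epoch, warmup_lr(epoch))
    print(total_loss(cls_loss=0.5, loc_loss=0.25))
```

With the documented defaults, a larger `loc_weight` than `cls_weight` simply makes localization errors count slightly more in the combined objective.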


#### `SiamRPNLearner.fit`
```python
SiamRPNLearner.fit(self, dataset, log_interval, n_gpus, verbose)
```

This method is used to train the algorithm on a `DetectionDataset` or `ExternalDataset` dataset and also performs evaluation on a validation set using the trained model.
Returns a dictionary containing stats regarding the training process.

Parameters:

- **dataset**: *object*\
Object that holds the training dataset.
- **log_interval**: *int, default=20*\
The training loss is printed to stdout every `log_interval` iterations.
- **n_gpus**: *int, default=1*\
If CUDA is enabled, training can be performed on multiple GPUs as set by this parameter.
- **verbose**: *bool, default=True*\
If True, enables maximum verbosity.

#### `SiamRPNLearner.eval`
```python
SiamRPNLearner.eval(self, dataset)
```

Performs evaluation on a dataset. The OTB dataset is currently supported.

Parameters:

- **dataset**: *object*\
Object that holds the dataset to perform evaluation on.
Expected type is `ExternalDataset` with `otb2015` dataset type.

#### `SiamRPNLearner.infer`
```python
SiamRPNLearner.infer(self, img, init_box)
```

Performs inference on a single image.
If the `init_box` is provided, the tracker is initialized.
If not, the current position of the target is updated by running inference on the image.

Parameters:

- **img**: *object*\
Object of type engine.data.Image.
- **init_box**: *object, default=None*\
Object of type engine.target.TrackingAnnotation.
If provided, it is used to initialize the tracker.
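
The initialize-then-update calling pattern described above can be wrapped in a small loop.
The following is a minimal sketch: `track_sequence` and `StubTracker` are hypothetical names introduced only for illustration, and the stub merely stands in for a learner exposing `infer(img, init_box=None)`.

```python
class StubTracker:
    """Hypothetical stand-in for SiamRPNLearner, used only to exercise the loop."""

    def __init__(self):
        self.box = None
        self.calls = []

    def infer(self, img, init_box=None):
        # Record whether this call was an initialization or an update.
        self.calls.append(init_box is not None)
        if init_box is not None:
            self.box = init_box
        return self.box


def track_sequence(learner, frames, init_box):
    """Run a single-object tracker over an iterable of frames."""
    results = []
    for index, frame in enumerate(frames):
        if index == 0:
            # First frame: passing init_box initializes the tracker.
            results.append(learner.infer(frame, init_box))
        else:
            # Later frames: only the image is passed and the tracker updates its estimate.
            results.append(learner.infer(frame))
    return results


if __name__ == "__main__":
    tracker = StubTracker()
    print(track_sequence(tracker, ["frame0", "frame1"], (598, 312, 75, 200)))
```

Passing `init_box` again on a later frame would re-initialize the tracker, which may be useful when the target is lost.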

#### `SiamRPNLearner.save`
```python
SiamRPNLearner.save(self, path, verbose)
```

Saves a model in OpenDR format at the specified path.
The model name is extracted from the base folder in the specified path.

Parameters:

- **path**: *str*\
Specifies the folder where the model will be saved.
The model name is extracted from the base folder of this path.
- **verbose**: *bool, default=False*\
If True, enables maximum verbosity.
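
As an illustration of how a model name can be derived from the base folder of a path, consider the sketch below.
The helper `model_name_from_path` is hypothetical; how the learner actually extracts the name is not shown here.

```python
import os


def model_name_from_path(path):
    """Return the last folder component of path, ignoring trailing separators."""
    return os.path.basename(os.path.normpath(path))


if __name__ == "__main__":
    print(model_name_from_path("/path/to/models/siamrpn_custom/"))
```

Under this assumption, saving to `"siamrpn_custom"` and to `"/path/to/models/siamrpn_custom/"` would both yield the model name `siamrpn_custom`.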

#### `SiamRPNLearner.load`
```python
SiamRPNLearner.load(self, path, verbose)
```

Loads a model which was previously saved in OpenDR format at the specified path.

Parameters:

- **path**: *str*\
Specifies the folder where the model will be loaded from.
- **verbose**: *bool, default=False*\
If True, enables maximum verbosity.

#### `SiamRPNLearner.download`
```python
SiamRPNLearner.download(self, path, mode, verbose, url, overwrite)
```

Downloads data needed for the various functions of the learner, e.g., pre-trained models as well as test data.

Parameters:

- **path**: *str, default=None*\
Specifies the folder where data will be downloaded.
If *None*, the *self.temp_path* directory is used instead.
- **mode**: *{'pretrained', 'video', 'test_data', 'otb2015'}, default='pretrained'*\
If *'pretrained'*, downloads a pre-trained detector model.
If *'video'*, downloads a single video to perform inference on.
If *'test_data'*, downloads a dummy version of the OTB dataset for testing purposes.
If *'otb2015'*, attempts to download the OTB dataset (100 videos).
This download can take a long time.
- **verbose**: *bool, default=False*\
If True, enables maximum verbosity.
- **url**: *str, default=OpenDR FTP URL*\
URL of the FTP server.
- **overwrite**: *bool, default=False*\
If True, files will be re-downloaded if they already exist.
This can solve some issues with large downloads.

#### Examples

* **Training example using `ExternalDataset` objects**.
Training is supported solely via the `ExternalDataset` class.
See the [class README](/src/opendr/perception/object_tracking_2d/siamrpn/README.md) for a list of supported datasets and the expected data directory structure.
Example of training on the COCO detection dataset:
```python
from opendr.engine.datasets import ExternalDataset
from opendr.perception.object_tracking_2d import SiamRPNLearner

dataset = ExternalDataset("/path/to/data/root", "coco")
learner = SiamRPNLearner(device="cuda", n_epochs=50, batch_size=32,
                         lr=1e-3)
learner.fit(dataset)
learner.save("siamrpn_custom")
```

* **Inference and result drawing example on a test mp4 video using OpenCV.**
```python
import cv2
from opendr.engine.target import TrackingAnnotation
from opendr.perception.object_tracking_2d import SiamRPNLearner

learner = SiamRPNLearner(device="cuda")
learner.download(".", mode="pretrained")
learner.load("siamrpn_opendr")

learner.download(".", mode="video")
cap = cv2.VideoCapture("tc_Skiing_ce.mp4")

init_bbox = TrackingAnnotation(left=598, top=312, width=75, height=200, name=0, id=0)

frame_no = 0
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break

    if frame_no == 0:
        # first frame, pass init_bbox to infer function to initialize the tracker
        pred_bbox = learner.infer(frame, init_bbox)
    else:
        # after the first frame only pass the image to infer
        pred_bbox = learner.infer(frame)

    frame_no += 1

    # draw the predicted box on the frame (coordinates cast to int for OpenCV)
    left, top = int(pred_bbox.left), int(pred_bbox.top)
    right, bottom = left + int(pred_bbox.width), top + int(pred_bbox.height)
    cv2.rectangle(frame, (left, top), (right, bottom), (0, 255, 255), 3)
    cv2.imshow('Tracking Result', frame)
    cv2.waitKey(1)

# release the video and close the preview window
cap.release()
cv2.destroyAllWindows()
```


#### Performance evaluation

We have measured the performance on the OTB2015 dataset in terms of success and FPS on an RTX 2070.
```
------------------------------------------------
| Tracker name | Success | FPS |
------------------------------------------------
| siamrpn_alexnet_v2_otb15 | 0.668 | 132.1 |
------------------------------------------------
```
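
For readers unfamiliar with the success metric, it is commonly reported on OTB as the area under the success plot, i.e. the fraction of frames whose overlap (IoU) with the ground truth exceeds a threshold, averaged over thresholds in [0, 1].
The sketch below illustrates that idea for `(left, top, width, height)` boxes.
The function names, the number of thresholds and the strict inequality are assumptions, and this is not the evaluation code that produced the numbers above.

```python
def iou(box_a, box_b):
    """Intersection over union of two (left, top, width, height) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_auc(pred_boxes, gt_boxes, n_thresholds=21):
    """Mean success rate over overlap thresholds evenly spaced in [0, 1]."""
    overlaps = [iou(p, g) for p, g in zip(pred_boxes, gt_boxes)]
    if not overlaps:
        return 0.0
    thresholds = [i / (n_thresholds - 1) for i in range(n_thresholds)]
    # For each threshold, the success rate is the share of frames above it.
    rates = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(rates) / len(rates)


if __name__ == "__main__":
    predictions = [(0, 0, 10, 10), (5, 0, 10, 10)]
    ground_truth = [(0, 0, 10, 10), (0, 0, 10, 10)]
    print(success_auc(predictions, ground_truth))
```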

#### References
<a name="siamrpn-1" href="https://openaccess.thecvf.com/content_cvpr_2018/papers/Li_High_Performance_Visual_CVPR_2018_paper.pdf">[1]</a>
High Performance Visual Tracking with Siamese Region Proposal Network,
[PDF](https://openaccess.thecvf.com/content_cvpr_2018/papers/Li_High_Performance_Visual_CVPR_2018_paper.pdf).
56 changes: 45 additions & 11 deletions docs/reference/opendr-ros-bridge.md
@@ -63,7 +63,7 @@ Converts an OpenDRPose2D message into an OpenDR Pose.

Parameters:

- **ros_pose**: *ros_bridge.msg.OpenDRPose2D*\
- **ros_pose**: *opendr_bridge.msg.OpenDRPose2D*\
ROS pose to be converted into an OpenDR Pose.

#### `ROSBridge.to_ros_pose`
@@ -72,9 +72,9 @@ Parameters:
ROSBridge.to_ros_pose(self,
pose)
```
Converts an OpenDR Pose into a OpenDRPose2D msg that can carry the same information, i.e. a list of keypoints,
Converts an OpenDR Pose into a OpenDRPose2D msg that can carry the same information, i.e. a list of keypoints,
the pose detection confidence and the pose id.
Each keypoint is represented as an OpenDRPose2DKeypoint with x, y pixel position on input image with (0, 0)
Each keypoint is represented as an OpenDRPose2DKeypoint with x, y pixel position on input image with (0, 0)
being the top-left corner.

Parameters:
@@ -121,7 +121,7 @@ Converts a ROS ObjectHypothesis message into an OpenDR Category.

Parameters:

- **message**: *ros_bridge.msg.ObjectHypothesis*\
- **message**: *vision_msgs.msg.ObjectHypothesis*\
ROS ObjectHypothesis to be converted into an OpenDR Category.


@@ -136,7 +136,7 @@ Converts a ROS ObjectHypothesis message into an OpenDR Category.

Parameters:

- **message**: *ros_bridge.msg.ObjectHypothesis*\
- **message**: *vision_msgs.msg.ObjectHypothesis*\
ROS ObjectHypothesis to be converted into an OpenDR Category.

#### `ROSBridge.to_ros_face`
@@ -387,11 +387,45 @@ Parameters:
- **frame**: *int, default=-1*\
The frame index to assign to the tracking boxes.

#### `ROSBridge.to_ros_single_tracking_annotation`

```python
ROSBridge.to_ros_single_tracking_annotation(self, tracking_annotation)
```

Converts a `TrackingAnnotation` object to a `Detection2D` ROS message.
This method is intended for single object tracking methods.

Parameters:

- **tracking_annotation**: *opendr.engine.target.TrackingAnnotation*\
The box to be converted.

#### `ROSBridge.from_ros_single_tracking_annotation`

```python
ROSBridge.from_ros_single_tracking_annotation(self, ros_detection_box)
```

Converts a `Detection2D` ROS message object to a `TrackingAnnotation` object.
This method is intended for single object tracking methods.

Parameters:

- **ros_detection_box**: *vision_msgs.msg.Detection2D*\
The box to be converted.
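
A `TrackingAnnotation` is described by its top-left corner and size, whereas the `bbox` field of a `Detection2D` message is center-based (center plus `size_x` and `size_y`).
The sketch below illustrates the geometric conversion that this difference implies, using plain dataclasses instead of the real OpenDR and ROS types.
The classes `TopLeftBox` and `CenterBox` and both helper functions are hypothetical, and whether the bridge performs exactly this conversion is an assumption.

```python
from dataclasses import dataclass


@dataclass
class TopLeftBox:
    """Hypothetical stand-in for the geometry of a TrackingAnnotation."""
    left: float
    top: float
    width: float
    height: float


@dataclass
class CenterBox:
    """Hypothetical stand-in for a center-based box such as Detection2D.bbox."""
    center_x: float
    center_y: float
    size_x: float
    size_y: float


def to_center_box(box):
    """Convert a top-left/size box to a center/size box."""
    return CenterBox(box.left + box.width / 2.0, box.top + box.height / 2.0,
                     box.width, box.height)


def to_top_left_box(box):
    """Convert a center/size box back to a top-left/size box."""
    return TopLeftBox(box.center_x - box.size_x / 2.0, box.center_y - box.size_y / 2.0,
                      box.size_x, box.size_y)


if __name__ == "__main__":
    original = TopLeftBox(left=598, top=312, width=75, height=200)
    print(to_center_box(original))
    print(to_top_left_box(to_center_box(original)))
```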

## ROS message equivalence with OpenDR
1. `sensor_msgs.msg.Img` is used as an equivelant to `engine.data.Image`
2. `ros_bridge.msg.Pose` is used as an equivelant to `engine.target.Pose`
1. `sensor_msgs.msg.Img` is used as an equivalent to `engine.data.Image`
2. `opendr_bridge.msg.Pose` is used as an equivalent to `engine.target.Pose`
3. `vision_msgs.msg.Detection2DArray` is used as an equivalent to `engine.target.BoundingBoxList`
4. `vision_msgs.msg.Detection2D` is used as an equivalent to `engine.target.BoundingBox`
5. `geometry_msgs.msg.Pose` is used as an equivelant to `engine.target.Pose` for 3D poses conversion only.
6. `vision_msgs.msg.Detection3DArray` is used as an equivelant to `engine.target.BoundingBox3DList`.
7. `sensor_msgs.msg.PointCloud` is used as an equivelant to `engine.data.PointCloud`.
4. `vision_msgs.msg.Detection2D` is used as an equivalent to `engine.target.BoundingBox` and
to `engine.target.TrackingAnnotation` in single object tracking.
5. `geometry_msgs.msg.Pose` is used as an equivalent to `engine.target.Pose` for 3D poses conversion only.
6. `vision_msgs.msg.Detection3DArray` is used as an equivalent to `engine.target.BoundingBox3DList`.
7. `sensor_msgs.msg.PointCloud` is used as an equivalent to `engine.data.PointCloud`.

## ROS services
The following ROS services are implemented (`srv` folder):
1. `opendr_bridge.OpenDRSingleObjectTracking`: can be used to initialize the tracking process of single
object trackers by providing a `Detection2D` bounding box.
6 changes: 6 additions & 0 deletions projects/opendr_ws/src/opendr_bridge/CMakeLists.txt
@@ -21,6 +21,12 @@ add_message_files(
OpenDRPose2D.msg
)

add_service_files(
DIRECTORY srv
FILES
OpenDRSingleObjectTracking.srv
)

generate_messages(
DEPENDENCIES
std_msgs