Commit

Apply suggestions from code review
Co-authored-by: Kostas Tsampazis <27914645+tsampazk@users.noreply.github.com>
mthodoris and tsampazk authored Dec 21, 2022
1 parent 664d0bb commit 82313ba
Showing 6 changed files with 34 additions and 46 deletions.
18 changes: 9 additions & 9 deletions docs/reference/high-resolution-pose-estimation.md
@@ -8,12 +8,12 @@ Bases: `engine.learners.Learner`
The *HighResolutionLightweightOpenPose* class is an implementation for pose estimation in high resolution images.
This method creates a heatmap of a resized version of the input image.
Using this heatmap, the input image is cropped keeping the area of interest and then it is used for pose estimation.
Since the high resolution pose estimation method is based on Lightweight OpenPose algorithm,the models that could be used have to be trained with Lightweight OpenPose tool.
Since the high resolution pose estimation method is based on the Lightweight OpenPose algorithm, the models that can be used have to be trained with the Lightweight OpenPose tool.
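To make this concrete, the snippet below is a minimal, illustrative sketch of the resize-crop idea only, not the learner's actual code: the grayscale stand-in for the heatmap, the threshold value and the helper name are placeholders, whereas the real method obtains the heatmap from a first network pass.

```python
import cv2
import numpy as np


def resize_crop_sketch(img, first_pass_height=360, heat_threshold=0.1):
    """Illustrative only: downscale the frame, fake a heatmap, crop the area of interest."""
    h, w, _ = img.shape

    # First pass: work on a cheap, downscaled copy of the high resolution frame.
    kernel = max(int(h / first_pass_height), 1)
    small = cv2.resize(img, (w // kernel, h // kernel))

    # Placeholder heatmap; the actual learner runs the network on `small` instead.
    heatmap = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY).astype(np.float32) / 255.0

    # Keep only the area of interest suggested by the heatmap.
    ys, xs = np.where(heatmap > heat_threshold)
    if len(xs) == 0:
        return img  # nothing found, fall back to the full frame
    x_min, x_max = xs.min() * kernel, (xs.max() + 1) * kernel
    y_min, y_max = ys.min() * kernel, (ys.max() + 1) * kernel

    # The cropped high resolution region is what the second, full inference pass sees.
    return img[y_min:y_max, x_min:x_max]
```

In the actual learner the cropped region is then processed at the second pass resolution to produce the final poses.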

In this method there are two important variables which are responsible for the increase in speed and accuracy in high resolution images.
These variables are the *first_pass_height* and the *second_pass_height* that the image is resized in this procedure.
These variables are *first_pass_height* and *second_pass_height* which define how the image is resized in this procedure.
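A minimal usage sketch is given below, assuming that the two heights are exposed as `first_pass_height` and `second_pass_height` keyword arguments of the constructor and that the learner is importable from `opendr.perception.pose_estimation` like the other OpenDR pose estimation tools; the height values and the image path are example values only.

```python
from opendr.engine.data import Image
from opendr.perception.pose_estimation import HighResolutionPoseEstimationLearner

# Example values only; the keyword names are assumed from the description above.
pose_estimator = HighResolutionPoseEstimationLearner(device="cuda",
                                                     first_pass_height=360,
                                                     second_pass_height=540)
pose_estimator.download(path=".", verbose=True)  # pretrained Lightweight OpenPose weights
pose_estimator.load("openpose_default")

img = Image.open("high_resolution_image.jpg")  # hypothetical input image
poses = pose_estimator.infer(img)
for pose in poses:
    print(pose)
```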

The [HighResolutionPoseEstimationLearner](/src/opendr/perception/pose_estimation/hr_pose_estimation/HighResolutionLearner.py) class has the following public methods:
The [HighResolutionPoseEstimationLearner](/src/opendr/perception/pose_estimation/hr_pose_estimation/high_resolution_learner.py) class has the following public methods:

#### `HighResolutionPoseEstimationLearner` constructor
```python
@@ -134,7 +134,7 @@ Parameters:
HighResolutionPoseEstimationLearner.__first_pass(self, net, img)
```

This method is used for extracting from the input image a heatmap about human locations in the picture.
This method is used for extracting a heatmap of human locations from the input image.

Parameters:

Expand All @@ -148,8 +148,8 @@ Parameters:
HighResolutionPoseEstimationLearner.__second_pass(self, net, img, net_input_height_size, max_width, stride, upsample_ratio, pad_value, img_mean, img_scale)
```

On this method it is carried out the second inference step which estimates the human poses on the image that is inserted.
Following the steps of the proposed method this image should be the cropped part of the initial high resolution image that came out from taking into account the area of interest of heatmap generation.
In this method the second inference step is carried out, which estimates the human poses on the provided image.
Following the steps of the proposed method, this image should be the part of the initial high resolution image that was cropped according to the area of interest identified from the generated heatmap.

Parameters:

@@ -253,7 +253,7 @@ The experiments are conducted on a 1080p image.
| OpenDR - Full | 2.9 | 83.1 | 11.2 | 13.5 |


#### Lightweght OpenPoseWithout resizing
#### Lightweight OpenPose without resizing
| Method | CPU i7-9700K (FPS) | RTX 2070 (FPS) | Jetson TX2 (FPS) | Xavier NX (FPS) |
|-------------------|--------------------|-----------------|------------------|-----------------|
| OpenDR - Baseline | 0.05 | 2.6 | 0.3 | 0.5 |
Expand All @@ -270,7 +270,7 @@ The experiments are conducted on a 1080p image.
| HRPoseEstim - H+S | 8.2 | 25.9 | 3.6 | 5.5 |
| HRPoseEstim - Full | 10.9 | 31.7 | 4.8 | 6.9 |

As it is shown in the previous Table, OpenDR Lightweight OpenPose achieves higher FPS when it is resing the input image into 256 pixels.
As shown in the previous tables, OpenDR Lightweight OpenPose achieves higher FPS when it resizes the input image to 256 pixels.
That image is easier to process, but as shown in the next tables the method's accuracy collapses and it produces no detections.

We have evaluated the effect of using different inference settings, namely:
Expand All @@ -282,7 +282,7 @@ We have evaluated the effect of using different inference settings, namely:
- *HRPoseEstim - Full*, which refers to combining all three available optimizations.
was used as input to the models.

The average precision and average recall on the COCO evaluation split is also reported in the Table below:
The average precision and average recall on the COCO evaluation split are also reported in the tables below:


#### Lightweight OpenPose with resizing
@@ -6,6 +6,6 @@ More specifically, the applications provided are:

1. demos/inference_demo.py: A tool that demonstrates how to perform inference on a single high resolution image and then draw the detected poses.
2. demos/eval_demo.py: A tool that demonstrates how to perform evaluation using the High Resolution Pose Estimation algorithm on 720p, 1080p and 1440p datasets.
3. demos/benchmarking.py: A simple benchmarking tool for measuring the performance of High Resolution Pose Estimation in various platforms.
3. demos/benchmarking_demo.py: A simple benchmarking tool for measuring the performance of High Resolution Pose Estimation in various platforms.
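For reference, a condensed sketch of what the benchmarking demo (item 3 above) does is shown below; the learner construction and the sample image path mirror the demos in this commit, while the FPS bookkeeping after the timing loop is an assumption, since that part of the file is collapsed in this diff.

```python
import time
from os.path import join

import cv2
import numpy as np
from tqdm import tqdm

from opendr.perception.pose_estimation import HighResolutionPoseEstimationLearner

pose_estimator = HighResolutionPoseEstimationLearner(device="cuda")
pose_estimator.download(path=".", verbose=True)       # pretrained weights
pose_estimator.load("openpose_default")
pose_estimator.download(path=".", mode="test_data")   # sample 1080p image used by the demo

img = cv2.imread(join("temp", "dataset", "image", "000000000785_1080.jpg"))

fps_list = []
print("Benchmarking...")
for _ in tqdm(range(50)):
    start_time = time.perf_counter()
    pose_estimator.infer(img)                         # perform inference
    end_time = time.perf_counter()
    fps_list.append(1.0 / (end_time - start_time))    # assumed FPS bookkeeping

print(f"Average FPS: {np.mean(fps_list):.2f}")
```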


@@ -23,7 +23,6 @@

if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument("--onnx", help="Use ONNX", default=False, action="store_true")
parser.add_argument("--device", help="Device to use (cpu, cuda)", type=str, default="cuda")
parser.add_argument("--accelerate", help="Enables acceleration flags (e.g., stride)", default=False,
action="store_true")
Expand All @@ -32,7 +31,7 @@

args = parser.parse_args()

onnx, device, accelerate, base_height1, base_height2 = args.onnx, args.device, args.accelerate,\
device, accelerate, base_height1, base_height2 = args.device, args.accelerate,\
args.height1, args.height2

if device == 'cpu':
@@ -61,15 +60,11 @@
image_path = join("temp", "dataset", "image", "000000000785_1080.jpg")
img = cv2.imread(image_path)

if onnx:
pose_estimator.optimize()

fps_list = []
print("Benchmarking...")
for i in tqdm(range(50)):
start_time = time.perf_counter()
# Perform inference

poses = pose_estimator.infer(img)

end_time = time.perf_counter()
@@ -21,7 +21,6 @@

if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument("--onnx", help="Use ONNX", default=False, action="store_true")
parser.add_argument("--device", help="Device to use (cpu, cuda)", type=str, default="cuda")
parser.add_argument("--accelerate", help="Enables acceleration flags (e.g., stride)", default=False,
action="store_true")
Expand All @@ -30,7 +29,7 @@

args = parser.parse_args()

onnx, device, accelerate, base_height1, base_height2 = args.onnx, args.device, args.accelerate,\
device, accelerate, base_height1, base_height2 = args.device, args.accelerate,\
args.height1, args.height2

if accelerate:
Expand All @@ -50,9 +49,6 @@
pose_estimator.download(path=".", verbose=True)
pose_estimator.load("openpose_default")

if onnx:
pose_estimator.optimize()

# Download a sample dataset
pose_estimator.download(path=".", mode="test_data")

@@ -22,7 +22,6 @@

if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument("--onnx", help="Use ONNX", default=False, action="store_true")
parser.add_argument("--device", help="Device to use (cpu, cuda)", type=str, default="cuda")
parser.add_argument("--accelerate", help="Enables acceleration flags (e.g., stride)", default=False,
action="store_true")
Expand All @@ -31,7 +30,7 @@

args = parser.parse_args()

onnx, device, accelerate, base_height1, base_height2 = args.onnx, args.device, args.accelerate,\
device, accelerate, base_height1, base_height2 = args.device, args.accelerate,\
args.height1, args.height2

if accelerate:
Expand All @@ -57,9 +56,6 @@

img = Image.open(image_path)

if onnx:
pose_estimator.optimize()

poses = pose_estimator.infer(img)

img_cv = img.opencv()
@@ -159,7 +159,8 @@ def __second_pass(self, net, img, net_input_height_size, max_width, stride, upsa

return heatmaps, pafs, scale, pad

def __pooling(self, img, kernel): # Pooling on input image for dimension reduction
@staticmethod
def __pooling(img, kernel): # Pooling on input image for dimension reduction
"""This method applies a pooling filter on an input image in order to resize it in a fixed shape
:param img: input image for resizing
Expand All @@ -173,12 +174,15 @@ def __pooling(self, img, kernel): # Pooling on input image for dimension reduct
pool_img = pool_img.squeeze(0).permute(1, 2, 0).cpu().float().numpy()
return pool_img

def fit(self, dataset, val_dataset=None, logging_path='', silent=True, verbose=True):
def fit(self, dataset, val_dataset=None, logging_path='', logging_flush_secs=30,
silent=False, verbose=True, epochs=None, use_val_subset=True, val_subset_size=250,
images_folder_name="train2017", annotations_filename="person_keypoints_train2017.json",
val_images_folder_name="val2017", val_annotations_filename="person_keypoints_val2017.json"):
"""This method is not used in this implementation."""

raise NotImplementedError

def optimize(self, target_device):
def optimize(self, do_constant_folding=False):
"""This method is not used in this implementation."""

raise NotImplementedError
Expand All @@ -187,11 +191,11 @@ def reset(self):
"""This method is not used in this implementation."""
return NotImplementedError

def save(self, path):
def save(self, path, verbose=False):
"""This method is not used in this implementation."""
return NotImplementedError

def eval(self, dataset, silent=False, verbose=True, use_subset=True, subset_size=250, upsample_ratio=4,
def eval(self, dataset, silent=False, verbose=True, use_subset=True, subset_size=250, upsample_ratio=4,
images_folder_name="val2017", annotations_filename="person_keypoints_val2017.json"):
"""
This method is used to evaluate a trained model on an evaluation dataset.
@@ -222,7 +226,7 @@ def eval(self, dataset, silent=False, verbose=True, use_subset=True, subset_siz
:rtype: dict
"""

data = super(HighResolutionPoseEstimationLearner,
data = super(HighResolutionPoseEstimationLearner, # NOQA
self)._LightweightOpenPoseLearner__prepare_val_dataset(dataset, use_subset=use_subset,
subset_name="val_subset.json",
subset_size=subset_size,
@@ -287,13 +291,13 @@ def eval(self, dataset, silent=False, verbose=True, use_subset=True, subset_siz
max_width = w
kernel = int(h / self.first_pass_height)
if kernel > 0:
pool_img = HighResolutionPoseEstimationLearner.__pooling(self, img, kernel)
pool_img = self.__pooling(img, kernel)

else:
pool_img = img

# ------- Heatmap Generation -------
avg_pafs = HighResolutionPoseEstimationLearner.__first_pass(self, self.model, pool_img)
avg_pafs = self.__first_pass(self.model, pool_img)
avg_pafs = avg_pafs.astype(np.float32)

pafs_map = cv2.blur(avg_pafs, (5, 5))
@@ -345,11 +349,9 @@ def eval(self, dataset, silent=False, verbose=True, use_subset=True, subset_siz
h, w, _ = crop_img.shape

# ------- Second pass of the image, inference for pose estimation -------
avg_heatmaps, avg_pafs, scale, pad = \
HighResolutionPoseEstimationLearner.__second_pass(self,
self.model, crop_img,
self.second_pass_height, max_width,
self.stride, upsample_ratio)
avg_heatmaps, avg_pafs, scale, pad = self.__second_pass(self.model, crop_img,
self.second_pass_height, max_width,
self.stride, upsample_ratio)
total_keypoints_num = 0
all_keypoints_by_type = []
for kpt_idx in range(18):
@@ -396,7 +398,7 @@ def eval(self, dataset, silent=False, verbose=True, use_subset=True, subset_siz
if self.visualize:
for keypoints in coco_keypoints:
for idx in range(len(keypoints) // 3):
cv2.circle(img, (int(keypoints[idx * 3]+offset), int(keypoints[idx * 3 + 1])+offset),
cv2.circle(img, (int(keypoints[idx * 3] + offset), int(keypoints[idx * 3 + 1]) + offset),
3, (255, 0, 255), -1)
cv2.imshow('keypoints', img)
key = cv2.waitKey()
@@ -461,12 +463,12 @@ def infer(self, img, upsample_ratio=4, stride=8, track=True, smooth=True,

kernel = int(h / self.first_pass_height)
if kernel > 0:
pool_img = HighResolutionPoseEstimationLearner.__pooling(self, img, kernel)
pool_img = self.__pooling(img, kernel)
else:
pool_img = img

# ------- Heatmap Generation -------
avg_pafs = HighResolutionPoseEstimationLearner.__first_pass(self, self.model, pool_img)
avg_pafs = self.__first_pass(self.model, pool_img)
avg_pafs = avg_pafs.astype(np.float32)
pafs_map = cv2.blur(avg_pafs, (5, 5))

@@ -517,10 +519,9 @@ def infer(self, img, upsample_ratio=4, stride=8, track=True, smooth=True,
h, w, _ = crop_img.shape

# ------- Second pass of the image, inference for pose estimation -------
avg_heatmaps, avg_pafs, scale, pad = \
HighResolutionPoseEstimationLearner.__second_pass(self, self.model, crop_img,
self.second_pass_height,
max_width, stride, upsample_ratio)
avg_heatmaps, avg_pafs, scale, pad = self.__second_pass(self.model, crop_img,
self.second_pass_height,
max_width, stride, upsample_ratio)

total_keypoints_num = 0
all_keypoints_by_type = []
