Commit

Apply suggestions from code review
Co-authored-by: Kostas Tsampazis <27914645+tsampazk@users.noreply.github.com>
mthodoris and tsampazk authored Dec 21, 2022
1 parent 664d0bb commit 82313ba
Showing 6 changed files with 34 additions and 46 deletions.
18 changes: 9 additions & 9 deletions docs/reference/high-resolution-pose-estimation.md
@@ -8,12 +8,12 @@ Bases: `engine.learners.Learner`
The *HighResolutionLightweightOpenPose* class is an implementation for pose estimation in high resolution images.
This method creates a heatmap of a resized version of the input image.
Using this heatmap, the input image is cropped keeping the area of interest and then it is used for pose estimation.
Since the high resolution pose estimation method is based on Lightweight OpenPose algorithm,the models that could be used have to be trained with Lightweight OpenPose tool.
Since the high resolution pose estimation method is based on the Lightweight OpenPose algorithm, the models that can be used have to be trained with the Lightweight OpenPose tool.
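To make this concrete, the snippet below is a minimal, illustrative sketch of the resize-crop idea only, not the learner's actual code: the grayscale stand-in for the heatmap, the threshold value and the helper name are placeholders, whereas the real method obtains the heatmap from a first network pass.

```python
import cv2
import numpy as np


def resize_crop_sketch(img, first_pass_height=360, heat_threshold=0.1):
    """Illustrative only: downscale the frame, fake a heatmap, crop the area of interest."""
    h, w, _ = img.shape

    # First pass: work on a cheap, downscaled copy of the high resolution frame.
    kernel = max(int(h / first_pass_height), 1)
    small = cv2.resize(img, (w // kernel, h // kernel))

    # Placeholder heatmap; the actual learner runs the network on `small` instead.
    heatmap = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY).astype(np.float32) / 255.0

    # Keep only the area of interest suggested by the heatmap.
    ys, xs = np.where(heatmap > heat_threshold)
    if len(xs) == 0:
        return img  # nothing found, fall back to the full frame
    x_min, x_max = xs.min() * kernel, (xs.max() + 1) * kernel
    y_min, y_max = ys.min() * kernel, (ys.max() + 1) * kernel

    # The cropped high resolution region is what the second, full inference pass sees.
    return img[y_min:y_max, x_min:x_max]
```

In the actual learner the cropped region is then processed at the second pass resolution to produce the final poses.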

In this method there are two important variables which are responsible for the increase in speed and accuracy in high resolution images.
These variables are the *first_pass_height* and the *second_pass_height* that the image is resized in this procedure.
These variables are *first_pass_height* and *second_pass_height* which define how the image is resized in this procedure.
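A minimal usage sketch is given below, assuming that the two heights are exposed as `first_pass_height` and `second_pass_height` keyword arguments of the constructor and that the learner is importable from `opendr.perception.pose_estimation` like the other OpenDR pose estimation tools; the height values and the image path are example values only.

```python
from opendr.engine.data import Image
from opendr.perception.pose_estimation import HighResolutionPoseEstimationLearner

# Example values only; the keyword names are assumed from the description above.
pose_estimator = HighResolutionPoseEstimationLearner(device="cuda",
                                                     first_pass_height=360,
                                                     second_pass_height=540)
pose_estimator.download(path=".", verbose=True)  # pretrained Lightweight OpenPose weights
pose_estimator.load("openpose_default")

img = Image.open("high_resolution_image.jpg")  # hypothetical input image
poses = pose_estimator.infer(img)
for pose in poses:
    print(pose)
```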

The [HighResolutionPoseEstimationLearner](/src/opendr/perception/pose_estimation/hr_pose_estimation/HighResolutionLearner.py) class has the following public methods:
The [HighResolutionPoseEstimationLearner](/src/opendr/perception/pose_estimation/hr_pose_estimation/high_resolution_learner.py) class has the following public methods:

#### `HighResolutionPoseEstimationLearner` constructor
```python
@@ -134,7 +134,7 @@ Parameters:
HighResolutionPoseEstimationLearner.__first_pass(self, net, img)
```

This method is used for extracting from the input image a heatmap about human locations in the picture.
This method is used for extracting a heatmap of human locations from the input image.

Parameters:

Expand All @@ -148,8 +148,8 @@ Parameters:
HighResolutionPoseEstimationLearner.__second_pass(self, net, img, net_input_height_size, max_width, stride, upsample_ratio, pad_value, img_mean, img_scale)
```

On this method it is carried out the second inference step which estimates the human poses on the image that is inserted.
Following the steps of the proposed method this image should be the cropped part of the initial high resolution image that came out from taking into account the area of interest of heatmap generation.
In this method the second inference step is carried out, which estimates the human poses on the provided image.
Following the steps of the proposed method, this image should be the part of the initial high resolution image that was cropped according to the area of interest identified from the generated heatmap.

Parameters:

@@ -253,7 +253,7 @@ The experiments are conducted on a 1080p image.
| OpenDR - Full | 2.9 | 83.1 | 11.2 | 13.5 |


#### Lightweght OpenPoseWithout resizing
#### Lightweight OpenPose without resizing
| Method | CPU i7-9700K (FPS) | RTX 2070 (FPS) | Jetson TX2 (FPS) | Xavier NX (FPS) |
|-------------------|--------------------|-----------------|------------------|-----------------|
| OpenDR - Baseline | 0.05 | 2.6 | 0.3 | 0.5 |
Expand All @@ -270,7 +270,7 @@ The experiments are conducted on a 1080p image.
| HRPoseEstim - H+S | 8.2 | 25.9 | 3.6 | 5.5 |
| HRPoseEstim - Full | 10.9 | 31.7 | 4.8 | 6.9 |

As it is shown in the previous Table, OpenDR Lightweight OpenPose achieves higher FPS when it is resing the input image into 256 pixels.
As shown in the previous tables, OpenDR Lightweight OpenPose achieves higher FPS when it resizes the input image to 256 pixels.
That image is easier to process, but as shown in the next tables the method's accuracy collapses and it produces no detections.

We have evaluated the effect of using different inference settings, namely:
Expand All @@ -282,7 +282,7 @@ We have evaluated the effect of using different inference settings, namely:
- *HRPoseEstim - Full*, which refers to combining all three available optimizations.
was used as input to the models.

The average precision and average recall on the COCO evaluation split is also reported in the Table below:
The average precision and average recall on the COCO evaluation split are also reported in the tables below:


#### Lightweight OpenPose with resizing
@@ -6,6 +6,6 @@ More specifically, the applications provided are:

1. demos/inference_demo.py: A tool that demonstrates how to perform inference on a single high resolution image and then draw the detected poses.
2. demos/eval_demo.py: A tool that demonstrates how to perform evaluation using the High Resolution Pose Estimation algorithm on 720p, 1080p and 1440p datasets.
3. demos/benchmarking.py: A simple benchmarking tool for measuring the performance of High Resolution Pose Estimation in various platforms.
3. demos/benchmarking_demo.py: A simple benchmarking tool for measuring the performance of High Resolution Pose Estimation in various platforms.
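For reference, a condensed sketch of what the benchmarking demo (item 3 above) does is shown below; the learner construction and the sample image path mirror the demos in this commit, while the FPS bookkeeping after the timing loop is an assumption, since that part of the file is collapsed in this diff.

```python
import time
from os.path import join

import cv2
import numpy as np
from tqdm import tqdm

from opendr.perception.pose_estimation import HighResolutionPoseEstimationLearner

pose_estimator = HighResolutionPoseEstimationLearner(device="cuda")
pose_estimator.download(path=".", verbose=True)       # pretrained weights
pose_estimator.load("openpose_default")
pose_estimator.download(path=".", mode="test_data")   # sample 1080p image used by the demo

img = cv2.imread(join("temp", "dataset", "image", "000000000785_1080.jpg"))

fps_list = []
print("Benchmarking...")
for _ in tqdm(range(50)):
    start_time = time.perf_counter()
    pose_estimator.infer(img)                         # perform inference
    end_time = time.perf_counter()
    fps_list.append(1.0 / (end_time - start_time))    # assumed FPS bookkeeping

print(f"Average FPS: {np.mean(fps_list):.2f}")
```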


@@ -23,7 +23,6 @@

if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument("--onnx", help="Use ONNX", default=False, action="store_true")
parser.add_argument("--device", help="Device to use (cpu, cuda)", type=str, default="cuda")
parser.add_argument("--accelerate", help="Enables acceleration flags (e.g., stride)", default=False,
action="store_true")
Expand All @@ -32,7 +31,7 @@

args = parser.parse_args()

onnx, device, accelerate, base_height1, base_height2 = args.onnx, args.device, args.accelerate,\
device, accelerate, base_height1, base_height2 = args.device, args.accelerate,\
args.height1, args.height2

if device == 'cpu':
@@ -61,15 +60,11 @@
image_path = join("temp", "dataset", "image", "000000000785_1080.jpg")
img = cv2.imread(image_path)

if onnx:
pose_estimator.optimize()

fps_list = []
print("Benchmarking...")
for i in tqdm(range(50)):
start_time = time.perf_counter()
# Perform inference

poses = pose_estimator.infer(img)

end_time = time.perf_counter()
@@ -21,7 +21,6 @@

if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument("--onnx", help="Use ONNX", default=False, action="store_true")
parser.add_argument("--device", help="Device to use (cpu, cuda)", type=str, default="cuda")
parser.add_argument("--accelerate", help="Enables acceleration flags (e.g., stride)", default=False,
action="store_true")
Expand All @@ -30,7 +29,7 @@

args = parser.parse_args()

onnx, device, accelerate, base_height1, base_height2 = args.onnx, args.device, args.accelerate,\
device, accelerate, base_height1, base_height2 = args.device, args.accelerate,\
args.height1, args.height2

if accelerate:
Expand All @@ -50,9 +49,6 @@
pose_estimator.download(path=".", verbose=True)
pose_estimator.load("openpose_default")

if onnx:
pose_estimator.optimize()

# Download a sample dataset
pose_estimator.download(path=".", mode="test_data")

@@ -22,7 +22,6 @@

if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument("--onnx", help="Use ONNX", default=False, action="store_true")
parser.add_argument("--device", help="Device to use (cpu, cuda)", type=str, default="cuda")
parser.add_argument("--accelerate", help="Enables acceleration flags (e.g., stride)", default=False,
action="store_true")
Expand All @@ -31,7 +30,7 @@

args = parser.parse_args()

onnx, device, accelerate, base_height1, base_height2 = args.onnx, args.device, args.accelerate,\
device, accelerate, base_height1, base_height2 = args.device, args.accelerate,\
args.height1, args.height2

if accelerate:
Expand All @@ -57,9 +56,6 @@

img = Image.open(image_path)

if onnx:
pose_estimator.optimize()

poses = pose_estimator.infer(img)

img_cv = img.opencv()
@@ -159,7 +159,8 @@ def __second_pass(self, net, img, net_input_height_size, max_width, stride, upsa

return heatmaps, pafs, scale, pad

def __pooling(self, img, kernel): # Pooling on input image for dimension reduction
@staticmethod
def __pooling(img, kernel): # Pooling on input image for dimension reduction
"""This method applies a pooling filter on an input image in order to resize it in a fixed shape
:param img: input image for resizing
Expand All @@ -173,12 +174,15 @@ def __pooling(self, img, kernel): # Pooling on input image for dimension reduct
pool_img = pool_img.squeeze(0).permute(1, 2, 0).cpu().float().numpy()
return pool_img

def fit(self, dataset, val_dataset=None, logging_path='', silent=True, verbose=True):
def fit(self, dataset, val_dataset=None, logging_path='', logging_flush_secs=30,
silent=False, verbose=True, epochs=None, use_val_subset=True, val_subset_size=250,
images_folder_name="train2017", annotations_filename="person_keypoints_train2017.json",
val_images_folder_name="val2017", val_annotations_filename="person_keypoints_val2017.json"):
"""This method is not used in this implementation."""

raise NotImplementedError

def optimize(self, target_device):
def optimize(self, do_constant_folding=False):
"""This method is not used in this implementation."""

raise NotImplementedError
Expand All @@ -187,11 +191,11 @@ def reset(self):
"""This method is not used in this implementation."""
return NotImplementedError

def save(self, path):
def save(self, path, verbose=False):
"""This method is not used in this implementation."""
return NotImplementedError

def eval(self, dataset, silent=False, verbose=True, use_subset=True, subset_size=250, upsample_ratio=4,
def eval(self, dataset, silent=False, verbose=True, use_subset=True, subset_size=250, upsample_ratio=4,
images_folder_name="val2017", annotations_filename="person_keypoints_val2017.json"):
"""
This method is used to evaluate a trained model on an evaluation dataset.
@@ -222,7 +226,7 @@ def eval(self, dataset, silent=False, verbose=True, use_subset=True, subset_siz
:rtype: dict
"""

data = super(HighResolutionPoseEstimationLearner,
data = super(HighResolutionPoseEstimationLearner, # NOQA
self)._LightweightOpenPoseLearner__prepare_val_dataset(dataset, use_subset=use_subset,
subset_name="val_subset.json",
subset_size=subset_size,
@@ -287,13 +291,13 @@ def eval(self, dataset, silent=False, verbose=True, use_subset=True, subset_siz
max_width = w
kernel = int(h / self.first_pass_height)
if kernel > 0:
pool_img = HighResolutionPoseEstimationLearner.__pooling(self, img, kernel)
pool_img = self.__pooling(img, kernel)

else:
pool_img = img

# ------- Heatmap Generation -------
avg_pafs = HighResolutionPoseEstimationLearner.__first_pass(self, self.model, pool_img)
avg_pafs = self.__first_pass(self.model, pool_img)
avg_pafs = avg_pafs.astype(np.float32)

pafs_map = cv2.blur(avg_pafs, (5, 5))
@@ -345,11 +349,9 @@ def eval(self, dataset, silent=False, verbose=True, use_subset=True, subset_siz
h, w, _ = crop_img.shape

# ------- Second pass of the image, inference for pose estimation -------
avg_heatmaps, avg_pafs, scale, pad = \
HighResolutionPoseEstimationLearner.__second_pass(self,
self.model, crop_img,
self.second_pass_height, max_width,
self.stride, upsample_ratio)
avg_heatmaps, avg_pafs, scale, pad = self.__second_pass(self.model, crop_img,
self.second_pass_height, max_width,
self.stride, upsample_ratio)
total_keypoints_num = 0
all_keypoints_by_type = []
for kpt_idx in range(18):
@@ -396,7 +398,7 @@ def eval(self, dataset, silent=False, verbose=True, use_subset=True, subset_siz
if self.visualize:
for keypoints in coco_keypoints:
for idx in range(len(keypoints) // 3):
cv2.circle(img, (int(keypoints[idx * 3]+offset), int(keypoints[idx * 3 + 1])+offset),
cv2.circle(img, (int(keypoints[idx * 3] + offset), int(keypoints[idx * 3 + 1]) + offset),
3, (255, 0, 255), -1)
cv2.imshow('keypoints', img)
key = cv2.waitKey()
@@ -461,12 +463,12 @@ def infer(self, img, upsample_ratio=4, stride=8, track=True, smooth=True,

kernel = int(h / self.first_pass_height)
if kernel > 0:
pool_img = HighResolutionPoseEstimationLearner.__pooling(self, img, kernel)
pool_img = self.__pooling(img, kernel)
else:
pool_img = img

# ------- Heatmap Generation -------
avg_pafs = HighResolutionPoseEstimationLearner.__first_pass(self, self.model, pool_img)
avg_pafs = self.__first_pass(self.model, pool_img)
avg_pafs = avg_pafs.astype(np.float32)
pafs_map = cv2.blur(avg_pafs, (5, 5))

@@ -517,10 +519,9 @@ def infer(self, img, upsample_ratio=4, stride=8, track=True, smooth=True,
h, w, _ = crop_img.shape

# ------- Second pass of the image, inference for pose estimation -------
avg_heatmaps, avg_pafs, scale, pad = \
HighResolutionPoseEstimationLearner.__second_pass(self, self.model, crop_img,
self.second_pass_height,
max_width, stride, upsample_ratio)
avg_heatmaps, avg_pafs, scale, pad = self.__second_pass(self.model, crop_img,
self.second_pass_height,
max_width, stride, upsample_ratio)

total_keypoints_num = 0
all_keypoints_by_type = []
