Update HybrIK support by @Jeff-sjtu

YuliangXiu · May 30, 2022 · 3663704
1 parent 53273e0
commit 3663704

Showing 20 changed files with 2,390 additions and 32 deletions.
diff --git a/.gitignore b/.gitignore
@@ -9,3 +9,4 @@ results/*
 force_push.sh
 scripts/vis*
 scripts/process_all*
+.idea
diff --git a/README.md b/README.md
@@ -46,6 +46,7 @@
 <br />
 
 ## News :triangular_flag_on_post:
+- [2022/04/26] <a href="https://github.com/Jeff-sjtu/HybrIK">HybrIK (SMPL)</a> is supported as optional HPS by <a href="https://jeffli.site/">Jiefeng Li</a>.
 - [2022/03/05] <a href="https://github.com/YadiraF/PIXIE">PIXIE (SMPL-X)</a>, <a href="https://github.com/mkocabas/PARE">PARE (SMPL)</a>, <a href="https://github.com/HongwenZhang/PyMAF">PyMAF (SMPL)</a> are all supported as optional HPS.
 - [2022/02/07] <a href='https://colab.research.google.com/drive/1-AWeWhPvCTBX0KfMtgtMk10uPU05ihoA?usp=sharing' style='padding-left: 0.5rem;'><img src='https://colab.research.google.com/assets/colab-badge.svg' alt='Google Colab'></a> is ready to use.
 
@@ -119,7 +120,7 @@
 ## TODO
 
 - [x] testing code and pretrained models (*self-implemented version)
-  - [x] ICON (w/ & w/o global encoder, w/ PyMAF/PIXIE/PARE as HPS)
+  - [x] ICON (w/ & w/o global encoder, w/ PyMAF/HybrIK/PIXIE/PARE as HPS)
   - [x] PIFu* (RGB image + predicted normal map as input)
   - [x] PaMIR* (RGB image + predicted normal map as input, w/ PyMAF/PARE as HPS)
 - [x] colab notebook <a href='https://colab.research.google.com/drive/1-AWeWhPvCTBX0KfMtgtMk10uPU05ihoA?usp=sharing' style='padding-left: 0.5rem;'>
@@ -150,10 +151,10 @@ python infer.py -cfg ../configs/pifu.yaml -gpu 0 -in_dir ../examples -out_dir ..
 python infer.py -cfg ../configs/pamir.yaml -gpu 0 -in_dir ../examples -out_dir ../results
 
 # ICON w/ global filter (better visual details --> lower Normal Error)
-python infer.py -cfg ../configs/icon-filter.yaml -gpu 0 -in_dir ../examples -out_dir ../results -hps_type {pixie/pymaf/pare}
+python infer.py -cfg ../configs/icon-filter.yaml -gpu 0 -in_dir ../examples -out_dir ../results -hps_type {pixie/pymaf/pare/hybrik}
 
 # ICON w/o global filter (higher evaluation scores --> lower P2S/Chamfer Error)
-python infer.py -cfg ../configs/icon-nofilter.yaml -gpu 0 -in_dir ../examples -out_dir ../results -hps_type {pixie/pymaf/pare}
+python infer.py -cfg ../configs/icon-nofilter.yaml -gpu 0 -in_dir ../examples -out_dir ../results -hps_type {pixie/pymaf/pare/hybrik}
 ```
 
 ## More Qualitative Results
@@ -194,7 +195,7 @@ Here are some great resources we benefit from:
 - [PaMIR](https://github.com/ZhengZerong/PaMIR), [PIFu](https://github.com/shunsukesaito/PIFu), [PIFuHD](https://github.com/facebookresearch/pifuhd), and [MonoPort](https://github.com/Project-Splinter/MonoPort) for Benchmark
 - [SCANimate](https://github.com/shunsukesaito/SCANimate) and [AIST++](https://github.com/google/aistplusplus_api) for Animation
 - [rembg](https://github.com/danielgatis/rembg) for Human Segmentation
-- [smplx](https://github.com/vchoutas/smplx), [PARE](https://github.com/mkocabas/PARE), [PyMAF](https://github.com/HongwenZhang/PyMAF), and [PIXIE](https://github.com/YadiraF/PIXIE) for Human Pose & Shape Estimation
+- [smplx](https://github.com/vchoutas/smplx), [PARE](https://github.com/mkocabas/PARE), [PyMAF](https://github.com/HongwenZhang/PyMAF), [PIXIE](https://github.com/YadiraF/PIXIE), and [HybrIK](https://github.com/Jeff-sjtu/HybrIK) for Human Pose & Shape Estimation
 - [CAPE](https://github.com/qianlim/CAPE) and [THuman](https://github.com/ZhengZerong/DeepHuman/tree/master/THUmanDataset) for Dataset
 - [PyTorch3D](https://github.com/facebookresearch/pytorch3d) for Differential Rendering
 

diff --git a/assets/rendering/080.png b/assets/rendering/080.png
diff --git a/assets/rendering/SMPL_norm_B_080.png b/assets/rendering/SMPL_norm_B_080.png
diff --git a/assets/rendering/SMPL_norm_F_080.png b/assets/rendering/SMPL_norm_F_080.png
diff --git a/assets/rendering/norm_B_080.png b/assets/rendering/norm_B_080.png
diff --git a/assets/rendering/norm_F_080.png b/assets/rendering/norm_F_080.png
diff --git a/docs/dataset.md b/docs/dataset.md
@@ -30,3 +30,10 @@ bash render_batch.sh gen all
 ```
 
 Then you will get the whole generated dataset under `data/thuman2_{num_views}views`
+
+## Examples
+
+|<img src="../assets/rendering/080.png" width="150">|<img src="../assets/rendering/norm_F_080.png" width="150">|<img src="../assets/rendering/norm_B_080.png" width="150">|<img src="../assets/rendering/SMPL_norm_F_080.png" width="150">|<img src="../assets/rendering/SMPL_norm_B_080.png" width="150">|
+|---|---|---|---|---|
+|Image|Normal(Front)|Normal(Back)|Normal(SMPL, Front)|Normal(SMPL, Back)|
+
diff --git a/docs/installation.md b/docs/installation.md
@@ -36,6 +36,9 @@ source activate icon
 pip install -r requirements.txt --use-deprecated=legacy-resolver
 ```
 
+
+:warning: If you have trouble accessing Google Drive, you need a VPN to use `rembg` for the first time.
+
 ## Register at [ICON's website](https://icon.is.tue.mpg.de/)
 
 ![Register](../assets/register.png)
@@ -58,7 +61,7 @@ Optional:
   cd ICON
   bash fetch_data.sh # requires username and password
   ```
-  * Download [PyMAF](https://github.com/HongwenZhang/PyMAF#necessary-files), [PARE (optional, SMPL)](https://github.com/mkocabas/PARE#demo), [PIXIE (optional, SMPL-X)](https://pixie.is.tue.mpg.de/)
+  * Download [PyMAF](https://github.com/HongwenZhang/PyMAF#necessary-files), [PARE (optional, SMPL)](https://github.com/mkocabas/PARE#demo), [PIXIE (optional, SMPL-X)](https://pixie.is.tue.mpg.de/), [HybrIK (optional, SMPL)](https://github.com/Jeff-sjtu/HybrIK)
 
   ```bash
   bash fetch_hps.sh
@@ -75,6 +78,11 @@ data/
 │   ├── normal.ckpt
 │   ├── pamir.ckpt
 │   └── pifu.ckpt
+├── hybrik_data/
+│   ├── h36m_mean_beta.npy
+│   ├── J_regressor_h36m.npy
+│   ├── hybrik_config.yaml
+│   └── pretrained_w_cam.pth
 ├── pare_data/
 │   ├── J_regressor_{extra,h36m}.npy
 │   ├── pare/

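The directory tree above lists four files under `data/hybrik_data/`. A minimal sketch of a pre-flight check for them is shown here. The helper name `missing_hybrik_files` and the default `data` root are assumptions made for illustration and are not part of this commit.

```python
from pathlib import Path

# File names taken from the directory tree in docs/installation.md.
EXPECTED_HYBRIK_FILES = (
    "h36m_mean_beta.npy",
    "J_regressor_h36m.npy",
    "hybrik_config.yaml",
    "pretrained_w_cam.pth",
)


def missing_hybrik_files(data_root="data"):
    """Return the expected HybrIK files that are absent under data_root.

    Hypothetical helper: the repository itself does not ship this check.
    """
    hybrik_dir = Path(data_root) / "hybrik_data"
    return [name for name in EXPECTED_HYBRIK_FILES
            if not (hybrik_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_hybrik_files()
    if missing:
        print("Missing HybrIK files:", ", ".join(missing))
    else:
        print("All HybrIK files found.")
```

Running this from the repository root after `bash fetch_hps.sh` should report nothing missing if the HybrIK download was accepted.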
diff --git a/fetch_hps.sh b/fetch_hps.sh
@@ -20,11 +20,12 @@ rm -rf data && rm -f data.tar.gz
 source activate icon
 pip install gdown --upgrade
 gdown https://drive.google.com/drive/u/1/folders/1CkF79XRaZzdRlj6eJUt4W0nbTORv2t7O -O pretrained_model --folder
-cd ..
+cd ../..
 echo "PyMAF done!"
 
 function download_pare(){
     # (optional) download PARE
+    cd data
     wget https://www.dropbox.com/s/aeulffqzb3zmh8x/pare-github-data.zip
     unzip pare-github-data.zip && mv data pare_data
     rm -f pare-github-data.zip
@@ -54,6 +55,20 @@ function download_pixie(){
   cd ../../
 }
 
+function download_hybrik(){
+    mkdir -p data/hybrik_data
+
+    # (optional) download HybrIK
+    # gdown https://drive.google.com/uc?id=16Y_MGUynFeEzV8GVtKTE5AtkHSi3xsF9 -O data/hybrik_data/pretrained_w_cam.pth
+    gdown https://drive.google.com/uc?id=1lEWZgqxiDNNJgvpjlIXef2VuxcGbtXzi -O data/hybrik_data.zip
+    cd data
+    unzip hybrik_data.zip
+    rm -r *.zip __MACOSX
+    cd ..
+
+    echo "HybrIK done!"
+}
+
 read -p "(optional) Download PARE[SMPL] (y/n)?" choice
 case "$choice" in 
   y|Y ) download_pare;;
@@ -66,4 +81,12 @@ case "$choice" in
   y|Y ) download_pixie;;
   n|N ) echo "PIXIE Done!";;
   * ) echo "Invalid input! Please use y|Y or n|N";;
-esac
+esac
+
+pwd
+read -p "(optional) Download HybrIK[SMPL] (y/n)?" choice
+case "$choice" in 
+  y|Y ) download_hybrik;;
+  n|N ) echo "HybrIK Done!";;
+  * ) echo "Invalid input! Please use y|Y or n|N";;
+esac
diff --git a/lib/dataset/TestDataset.py b/lib/dataset/TestDataset.py
@@ -50,6 +50,9 @@
 from lib.pixielib.pixie import PIXIE
 from lib.pixielib.utils.config import cfg as pixie_cfg
 
+# for hybrik
+from lib.hybrik.models.simple3dpose import HybrIKBaseSMPLCam
+
 
 class TestDataset():
     def __init__(self, cfg, device):
@@ -104,8 +107,12 @@ def __init__(self, cfg, device):
         elif self.hps_type == 'pixie':
             self.hps = PIXIE(config = pixie_cfg, device=self.device)
             self.smpl_model = self.hps.smplx
-
-
+        elif self.hps_type == 'hybrik':
+            smpl_path = osp.join(self.smpl_data.model_dir, "smpl/SMPL_NEUTRAL.pkl")
+            self.hps = HybrIKBaseSMPLCam(cfg_file=path_config.HYBRIK_CFG, smpl_path=smpl_path, data_path=path_config.hybrik_data_dir)
+            self.hps.load_state_dict(torch.load(path_config.HYBRIK_CKPT, map_location='cpu'), strict=False)
+            self.hps.to(self.device)
+
         print(colored(f"Using {self.hps_type} as HPS Estimator\n", "green"))
 
         self.render = Render(size=512, device=device)
@@ -217,6 +224,14 @@ def __getitem__(self, index):
             data_dict['smpl_verts'] = preds_dict['vertices']
             scale, tranX, tranY = preds_dict['cam'][0, :3]
 
+        elif self.hps_type == 'hybrik':
+            data_dict['body_pose'] = preds_dict['pred_theta_mats'][:, 1:]
+            data_dict['global_orient'] = preds_dict['pred_theta_mats'][:, [0]]
+            data_dict['betas'] = preds_dict['pred_shape']
+            data_dict['smpl_verts'] = preds_dict['pred_vertices']
+            scale, tranX, tranY = preds_dict['pred_camera'][0, :3]
+            scale = scale * 2
+
         data_dict['scale'] = scale
         data_dict['trans'] = torch.tensor([tranX, tranY, 0.0]).to(self.device)
 
@@ -246,7 +261,6 @@ def visualize_alignment(self, data):
                                         global_orient=data['global_orient'],
                                         pose2rot=False)
             smpl_verts = ((smpl_out.vertices + data['trans'])* data['scale']).detach().cpu().numpy()[0]
-
         else:
             smpl_verts, _, _ = self.smpl_model(shape_params=data['betas'],
                                         expression_params=data['exp'],
@@ -303,7 +317,7 @@ def visualize_alignment(self, data):
         {
             'image_dir': "../examples",
             'has_det': True,    # w/ or w/o detection
-            'hps_type': 'pixie'  # pymaf/pare/pixie
+            'hps_type': 'hybrik'  # pymaf/pare/pixie/hybrik
         }, device)
 
 

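The `hybrik` branch in `__getitem__` splits the predicted rotation matrices into a root orientation and a body pose, then doubles the camera scale. The sketch below reproduces that slicing on NumPy arrays so it can be run without the model. It assumes `pred_theta_mats` has shape `(B, 24, 3, 3)` (SMPL's root joint plus 23 body joints) and `pred_camera` has shape `(B, 3)`; these shapes are not stated in the diff, and the function name is hypothetical.

```python
import numpy as np


def split_hybrik_preds(pred_theta_mats, pred_camera):
    """Mirror the slicing done in the 'hybrik' branch of __getitem__.

    Assumed shapes (not confirmed by the diff):
      pred_theta_mats: (B, 24, 3, 3) rotation matrices
      pred_camera:     (B, 3) as (scale, tranX, tranY)
    """
    # Joint 0 is the global (root) orientation; keep the joint axis.
    global_orient = pred_theta_mats[:, [0]]
    # The remaining joints form the body pose.
    body_pose = pred_theta_mats[:, 1:]
    # Camera values are read from the first sample only, as in the diff.
    scale, tran_x, tran_y = pred_camera[0, :3]
    scale = scale * 2
    trans = np.array([tran_x, tran_y, 0.0])
    return global_orient, body_pose, scale, trans


if __name__ == "__main__":
    mats = np.tile(np.eye(3), (2, 24, 1, 1))   # identity rotations
    cam = np.array([[0.5, 0.1, -0.2], [0.4, 0.0, 0.0]])
    go, bp, scale, trans = split_hybrik_preds(mats, cam)
    print(go.shape, bp.shape, scale, trans)
```

Indexing with `[0]` rather than `0` keeps the joint dimension, so `global_orient` and `body_pose` share the same rank when passed on with `pose2rot=False`.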
diff --git a/lib/hybrik/models/layers/Resnet.py b/lib/hybrik/models/layers/Resnet.py
@@ -0,0 +1,166 @@
+import torch.nn as nn
+import torch.nn.functional as F
+
+
+def conv3x3(in_planes, out_planes, stride=1, groups=1, dilation=1):
+    """3x3 convolution with padding"""
+    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
+                     padding=dilation, groups=groups, bias=False, dilation=dilation)
+
+
+class BasicBlock(nn.Module):
+    expansion = 1
+
+    def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1,
+                 base_width=64, dilation=1, norm_layer=None, dcn=None):
+        super(BasicBlock, self).__init__()
+        if norm_layer is None:
+            norm_layer = nn.BatchNorm2d
+        if groups != 1 or base_width != 64:
+            raise ValueError('BasicBlock only supports groups=1 and base_width=64')
+        if dilation > 1:
+            raise NotImplementedError("Dilation > 1 not supported in BasicBlock")
+        # Both self.conv1 and self.downsample layers downsample the input when stride != 1
+        self.conv1 = conv3x3(inplanes, planes, stride)
+        self.bn1 = norm_layer(planes)
+        self.relu = nn.ReLU(inplace=True)
+        self.conv2 = conv3x3(planes, planes)
+        self.bn2 = norm_layer(planes)
+        self.downsample = downsample
+        self.stride = stride
+
+    def forward(self, x):
+        identity = x
+
+        out = self.conv1(x)
+        out = self.bn1(out)
+        out = self.relu(out)
+
+        out = self.conv2(out)
+        out = self.bn2(out)
+
+        if self.downsample is not None:
+            identity = self.downsample(x)
+
+        out += identity
+        out = self.relu(out)
+
+        return out
+
+
+class Bottleneck(nn.Module):
+    expansion = 4
+
+    def __init__(self, inplanes, planes, stride=1,
+                 downsample=None, norm_layer=nn.BatchNorm2d, dcn=None):
+        super(Bottleneck, self).__init__()
+        self.dcn = dcn
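+        self.with_dcn = dcn is not None  # NOTE: forward()'s DCN branches use conv2_offset, with_modulated_dcn and deformable_groups, which are never set in this class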
+
+        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
+        self.bn1 = norm_layer(planes, momentum=0.1)
+        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
+                               padding=1, bias=False)
+
+        self.bn2 = norm_layer(planes, momentum=0.1)
+        self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
+        self.bn3 = norm_layer(planes * 4, momentum=0.1)
+        self.downsample = downsample
+        self.stride = stride
+
+    def forward(self, x):
+        residual = x
+
+        out = F.relu(self.bn1(self.conv1(x)), inplace=True)
+        if not self.with_dcn:
+            out = F.relu(self.bn2(self.conv2(out)), inplace=True)
+        elif self.with_modulated_dcn:
+            offset_mask = self.conv2_offset(out)
+            offset = offset_mask[:, :18 * self.deformable_groups, :, :]
+            mask = offset_mask[:, -9 * self.deformable_groups:, :, :]
+            mask = mask.sigmoid()
+            out = F.relu(self.bn2(self.conv2(out, offset, mask)))
+        else:
+            offset = self.conv2_offset(out)
+            out = F.relu(self.bn2(self.conv2(out, offset)), inplace=True)
+
+        out = self.conv3(out)
+        out = self.bn3(out)
+
+        if self.downsample is not None:
+            residual = self.downsample(x)
+
+        out += residual
+        out = F.relu(out)
+
+        return out
+
+
+class ResNet(nn.Module):
+    """ ResNet """
+
+    def __init__(self, architecture, norm_layer=nn.BatchNorm2d, dcn=None, stage_with_dcn=(False, False, False, False)):
+        super(ResNet, self).__init__()
+        self._norm_layer = norm_layer
+        assert architecture in ["resnet18", "resnet34", "resnet50", "resnet101", 'resnet152']
+        layers = {
+            'resnet18': [2, 2, 2, 2],
+            'resnet34': [3, 4, 6, 3],
+            'resnet50': [3, 4, 6, 3],
+            'resnet101': [3, 4, 23, 3],
+            'resnet152': [3, 8, 36, 3],
+        }
+        self.inplanes = 64
+        if architecture == "resnet18" or architecture == 'resnet34':
+            self.block = BasicBlock
+        else:
+            self.block = Bottleneck
+        self.layers = layers[architecture]
+
+        self.conv1 = nn.Conv2d(3, 64, kernel_size=7,
+                               stride=2, padding=3, bias=False)
+        self.bn1 = norm_layer(64, eps=1e-5, momentum=0.1, affine=True)
+        self.relu = nn.ReLU(inplace=True)
+        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
+
+        stage_dcn = [dcn if with_dcn else None for with_dcn in stage_with_dcn]
+
+        self.layer1 = self.make_layer(
+            self.block, 64, self.layers[0], dcn=stage_dcn[0])
+        self.layer2 = self.make_layer(
+            self.block, 128, self.layers[1], stride=2, dcn=stage_dcn[1])
+        self.layer3 = self.make_layer(
+            self.block, 256, self.layers[2], stride=2, dcn=stage_dcn[2])
+
+        self.layer4 = self.make_layer(
+            self.block, 512, self.layers[3], stride=2, dcn=stage_dcn[3])
+
+    def forward(self, x):
+        x = self.maxpool(self.relu(self.bn1(self.conv1(x))))  # 64 * h/4 * w/4
+        x = self.layer1(x)  # 256 * h/4 * w/4
+        x = self.layer2(x)  # 512 * h/8 * w/8
+        x = self.layer3(x)  # 1024 * h/16 * w/16
+        x = self.layer4(x)  # 2048 * h/32 * w/32
+        return x
+
+    def stages(self):
+        return [self.layer1, self.layer2, self.layer3, self.layer4]
+
+    def make_layer(self, block, planes, blocks, stride=1, dcn=None):
+        downsample = None
+        if stride != 1 or self.inplanes != planes * block.expansion:
+            downsample = nn.Sequential(
+                nn.Conv2d(self.inplanes, planes * block.expansion,
+                          kernel_size=1, stride=stride, bias=False),
+                self._norm_layer(planes * block.expansion),
+            )
+
+        layers = []
+        layers.append(block(self.inplanes, planes, stride, downsample,
+                            norm_layer=self._norm_layer, dcn=dcn))
+        self.inplanes = planes * block.expansion
+        for i in range(1, blocks):
+            layers.append(block(self.inplanes, planes,
+                                norm_layer=self._norm_layer, dcn=dcn))
+
+        return nn.Sequential(*layers)
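
The shape comments in `ResNet.forward` (`256 * h/4 * w/4` and so on) hold for the `Bottleneck` variants, where `expansion = 4`. For `resnet18` and `resnet34` the channel counts are four times smaller. The sketch below computes the per-stage output shapes from the same constants, without needing PyTorch. The function name is hypothetical, and it assumes the input height and width are multiples of 32 so that every stride divides evenly.

```python
# Architectures that use BasicBlock (expansion = 1) in Resnet.py.
BASIC_BLOCK_ARCHS = ("resnet18", "resnet34")


def stage_shapes(architecture, height, width):
    """Return (channels, h, w) after layer1..layer4 for an input of height x width.

    Illustrative only; assumes height and width are multiples of 32.
    """
    if height % 32 or width % 32:
        raise ValueError("height and width must be multiples of 32")
    expansion = 1 if architecture in BASIC_BLOCK_ARCHS else 4
    shapes = []
    stride = 4  # conv1 (stride 2) followed by maxpool (stride 2)
    for index, planes in enumerate((64, 128, 256, 512)):
        if index > 0:
            stride *= 2  # layer2, layer3 and layer4 each downsample by 2
        shapes.append((planes * expansion, height // stride, width // stride))
    return shapes


if __name__ == "__main__":
    print(stage_shapes("resnet50", 256, 256))
    # [(256, 64, 64), (512, 32, 32), (1024, 16, 16), (2048, 8, 8)]
    print(stage_shapes("resnet34", 256, 256))
    # [(64, 64, 64), (128, 32, 32), (256, 16, 16), (512, 8, 8)]
```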