Visualization for SemanticKITTI demo #72

Closed
Alexander-Yao opened this issue Sep 18, 2023 · 0 comments

Hi @anhquancao, thanks for your excellent repo.

I ran into a problem while trying to generate the demo for the SemanticKITTI dataset: I have generated all the .pkl files for the validation set (sequence 08), but I am not sure where to find a camera pose I could use, or how to fix the same viewpoint across these frames. The script I wrote is as follows:

```python
import pickle
from glob import glob

import cv2
import numpy as np
import open3d as o3d


def get_grid_coords(dims, resolution):
    """
    :param dims: the dimensions of the grid [x, y, z] (i.e. [256, 256, 32])
    :return coords_grid: the center coords of the voxels in the grid
    """
    # The sensor is centered in X (we go to dims/2 + 1 for the histogramdd)
    g_xx = np.arange(0, dims[0] + 1)
    # The sensor is at Y=0 (we go to dims + 1 for the histogramdd)
    g_yy = np.arange(0, dims[1] + 1)
    # The sensor is at Z=1.73. I observed that the ground was two voxel levels
    # above the grid bottom, so the Z pose is at 10 if the bottom voxel is 0.
    # If we want the sensor to be at (0, 0, 0), then the bottom in z is -10 and
    # the top is 22 (we go to 22 + 1 for the histogramdd).
    # ATTENTION: it is 11 for old grids, 10 for new grids (v1.1)
    # (PRBonn/semantic-kitti-api#49)
    sensor_pose = 10
    g_zz = np.arange(0, dims[2] + 1)

    # Obtain the grid with the coords of each voxel center
    xx, yy, zz = np.meshgrid(g_xx[:-1], g_yy[:-1], g_zz[:-1])
    coords_grid = np.array([xx.flatten(), yy.flatten(), zz.flatten()]).T
    coords_grid = coords_grid.astype(np.float32)
    coords_grid = (coords_grid * resolution) + resolution / 2

    # Swap the x and y columns
    temp = np.copy(coords_grid)
    temp[:, 0] = coords_grid[:, 1]
    temp[:, 1] = coords_grid[:, 0]
    coords_grid = np.copy(temp)

    return coords_grid, g_xx, g_yy, g_zz


def draw_semantic_open3d(voxels, cam_param_path="", voxel_size=0.2):
    colors = np.array([
        [0, 0, 0, 255],
        [100, 150, 245, 255],
        [100, 230, 245, 255],
        [30, 60, 150, 255],
        [80, 30, 180, 255],
        [100, 80, 250, 255],
        [255, 30, 30, 255],
        [255, 40, 200, 255],
        [150, 30, 90, 255],
        [255, 0, 255, 255],
        [255, 150, 255, 255],
        [75, 0, 75, 255],
        [175, 0, 75, 255],
        [255, 200, 0, 255],
        [255, 120, 50, 255],
        [0, 175, 0, 255],
        [135, 60, 0, 255],
        [150, 240, 80, 255],
        [255, 240, 150, 255],
        [255, 0, 0, 255],
    ]) / 255.0

    grid_coords, _, _, _ = get_grid_coords(
        [voxels.shape[0], voxels.shape[1], voxels.shape[2]], voxel_size
    )

    points = np.vstack([grid_coords.T, voxels.reshape(-1)]).T

    # Keep only voxels with a semantic class (remove empty voxels and the unknown class)
    points = points[(points[:, 3] != 0) & (points[:, 3] != 255)]

    colors = np.take_along_axis(
        colors, points[:, 3].astype(np.int32).reshape(-1, 1), axis=0
    )

    vis = o3d.visualization.Visualizer()
    vis.create_window(width=1200, height=600)
    ctr = vis.get_view_control()
    param = o3d.io.read_pinhole_camera_parameters(cam_param_path)
    # line 80: param.extrinsic = pose_matrix  # set extrinsic matrix from 4x4 camera pose matrix

    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points[:, :3])
    pcd.colors = o3d.utility.Vector3dVector(colors[:, :3])
    pcd.estimate_normals()
    vis.add_geometry(pcd)

    ctr.convert_from_pinhole_camera_parameters(param)

    vis.run()  # the user changes the view and presses "q" to terminate
    param = vis.get_view_control().convert_to_pinhole_camera_parameters()
    o3d.io.write_pinhole_camera_parameters(cam_param_path, param)


def process(scan, dataset="Semantic_KITTI"):
    with open(scan, "rb") as handle:
        b = pickle.load(handle)

    fov_mask_1 = b["fov_mask_1"]
    T_velo_2_cam = b["T_velo_2_cam"]
    vox_origin = np.array([0, -25.6, -2])
    y_pred = b["y_pred"]

    print(T_velo_2_cam.shape)

    if dataset == "Semantic_KITTI":
        # Visualize SemanticKITTI
        draw_semantic_open3d(
            y_pred,
            cam_param_path="./Semantic_KITTI/cam_params.json",
            voxel_size=0.2,
        )
    else:
        # (unused in my run; these arguments do not match the
        # draw_semantic_open3d signature above)
        draw_semantic_open3d(
            y_pred,
            T_velo_2_cam,
            vox_origin,
            fov_mask_1,
            img_size=(1220, 370),
            f=707.0912,
            voxel_size=0.2,
            d=7,
        )


if __name__ == "__main__":
    pkl_file = glob("./Semantic_KITTI/*.pkl")
    process(pkl_file[0])
```
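As a quick standalone sanity check of the voxel-center computation in `get_grid_coords` (NumPy only, no Open3D needed; `voxel_centers` here is my own stripped-down copy without the x/y swap):

```python
import numpy as np

def voxel_centers(dims, resolution):
    # Same construction as get_grid_coords: integer corner indices,
    # then centers at (index + 0.5) * resolution
    g = [np.arange(0, d + 1) for d in dims]
    xx, yy, zz = np.meshgrid(g[0][:-1], g[1][:-1], g[2][:-1])
    coords = np.array([xx.flatten(), yy.flatten(), zz.flatten()]).T
    return coords.astype(np.float32) * resolution + resolution / 2

centers = voxel_centers([2, 2, 2], 0.2)
print(centers.shape)  # (8, 3)
print(centers[0])     # first center at half a voxel: ~[0.1, 0.1, 0.1]
```

For a 2x2x2 grid at 0.2 m resolution, every center sits half a voxel in from the corner, which matches the `+ resolution / 2` term above.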

Could you please tell me how to set the camera matrix at the line 80 I marked? I have tried the camera matrices from https://www.cvlibs.net/datasets/kitti/eval_odometry.php and #16, but neither of them seems to work.
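For context, what I tried at the marked line was along these lines (my own sketch, not from the repo; the `pose` matrix here is a made-up placeholder, and I am assuming Open3D's `extrinsic` is world-to-camera, so a camera pose matrix has to be inverted first):

```python
import numpy as np

# Hypothetical 4x4 camera pose (camera-to-world); in my runs I tried
# substituting T_velo_2_cam and the KITTI odometry poses here.
pose = np.eye(4)
pose[:3, 3] = [0.0, 0.0, 1.73]  # e.g. sensor 1.73 m above the ground

# Open3D's extrinsic is the world-to-camera transform, i.e. the inverse pose
extrinsic = np.linalg.inv(pose)

# Then, at line 80:
# param.extrinsic = extrinsic
# ctr.convert_from_pinhole_camera_parameters(param)
```

With this I at least get a valid extrinsic (`extrinsic @ pose` is the identity), but the viewpoint still does not match the one I want, which is why I suspect I am using the wrong pose source.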

Thank you so much for your time and help.
