Two-Stream CNN was proposed in Skeleton-based Action Recognition with Convolutional Neural Networks for skeleton-based action recognition. It maps a skeleton sequence to an image (the coordinates x, y, z become the channels R, G, B). The authors also designed a Skeleton Transformer module that rearranges and selects important skeleton joints automatically.
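The coordinate-to-color mapping can be sketched in a few lines of NumPy. This is only an illustration, not the code from this repository: the function name `skeleton_to_image`, the (frames, joints, 3) layout, and the per-axis min-max scaling to [0, 255] are all assumptions made for the example.

```python
import numpy as np


def skeleton_to_image(seq):
    """Map a skeleton sequence of shape (frames, joints, 3) to a uint8 RGB image.

    Assumption: each coordinate axis (x, y, z) is min-max scaled to [0, 255]
    over the whole sequence. The repository may normalize differently.
    """
    seq = np.asarray(seq, dtype=np.float64)
    lo = seq.min(axis=(0, 1), keepdims=True)  # per-axis minimum, shape (1, 1, 3)
    hi = seq.max(axis=(0, 1), keepdims=True)  # per-axis maximum, shape (1, 1, 3)
    scale = np.where(hi > lo, hi - lo, 1.0)   # avoid dividing by zero on a constant axis
    img = (seq - lo) / scale * 255.0
    # Rows are frames, columns are joints, channels are x, y, z.
    return np.rint(img).astype(np.uint8)
```

Calling `skeleton_to_image(seq)` on an array of shape (frames, joints, 3) returns an image of the same shape that can be fed to a standard ConvNet or displayed with matplotlib.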
- Python3
- Keras
- h5py
- matplotlib
- numpy
The network consists of four modules: Skeleton Transformer, ConvNet, Feature Fusion, and Classification. The two streams take the raw coordinates (x, y, z) and the frame differences as input, respectively.
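A minimal sketch of how the two stream inputs relate to each other, assuming a sequence stored as a (frames, joints, 3) array. The function name `two_stream_inputs` is hypothetical, and padding the first frame of the difference stream with zeros (so both streams keep the same shape) is an assumption rather than something the repository confirms.

```python
import numpy as np


def two_stream_inputs(seq):
    """Return (raw, motion) arrays for a sequence of shape (frames, joints, 3)."""
    raw = np.asarray(seq, dtype=np.float32)
    motion = np.zeros_like(raw)
    # Difference between consecutive frames; the first frame stays zero (assumed padding).
    motion[1:] = raw[1:] - raw[:-1]
    return raw, motion
```

Each returned array would then go to its own stream of the network.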
- function/data_generator.py: generates the NumPy input arrays for the two streams
- layers/transformer: the Skeleton Transformer layer, implemented in Keras
- network/: this folder contains four files, each using a different feature fusion method
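The idea behind the Skeleton Transformer layer can be illustrated without Keras. As described in the paper, the module applies a learned linear map over the joint axis, so each output joint is a weighted combination of the input joints. The sketch below uses plain NumPy with a random matrix standing in for the trainable weight; the function name `skeleton_transform` and the shapes are assumptions for the example and do not mirror the code in layers/transformer.

```python
import numpy as np


def skeleton_transform(seq, weights):
    """Apply a linear map over the joint axis.

    seq:     (frames, joints, 3) skeleton sequence
    weights: (joints, new_joints) matrix; trainable in a real layer
    returns: (frames, new_joints, 3)
    """
    # out[t, m, c] = sum over j of seq[t, j, c] * weights[j, m]
    return np.einsum("tjc,jm->tmc", seq, weights)


rng = np.random.default_rng(0)
seq = rng.standard_normal((30, 25, 3))    # illustrative sizes only
weights = rng.standard_normal((25, 30))   # stand-in for the learned weight
out = skeleton_transform(seq, weights)
print(out.shape)  # (30, 30, 3)
```

If `weights` is a permutation matrix, the layer simply reorders joints; a general matrix lets the network also weight and mix them, which is what "rearrange and select" refers to.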
model | accuracy (cs) |
---|---|
baseline | 83.2% |
my model | 80.7% |
Introducing an attention mechanism into the Skeleton Transformer module raises the accuracy to 82.1%.
If you have any questions, please feel free to contact me.
Duohan Liang (duohanl@outlook.com)