Skip to content

Backend implementing a multimodal AI model in PyTorch and serving it using FastAPI to automatically identify and label dance moves in videos.

License

Notifications You must be signed in to change notification settings

dance-segmentation/dance-bits-api

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

27 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

DanceBits API: ML-powered automated choreography video segmentation

Backend implementing a multimodal AI model in PyTorch and serving it using FastAPI and Docker to automatically identify and label dance moves in videos to create an interactive learning platform. Features: video preprocessing, pose estimation, audio processing, and multimodal segmentation model.

Features

  • Advanced pose estimation and motion feature extraction using MediaPipe
  • Audio feature extraction for enhanced move detection
  • Real-time multimodal dance move segmentation model
  • User-friendly learning interface with customizable speeds and segment sizes
  • Side-by-side webcam/video option with recording functionality
  • Similarity score calculation for comparing dance performances

Requirements

  • Python 3.11 or higher
  • FFmpeg
  • CUDA-compatible GPU (optional, for faster inference)
  • Docker (optional, for containerized deployment)

Installation

  1. Clone this repository:

    git clone https://github.com/your-username/dance-bits-api.git
    cd dancebits
  2. Create a Conda environment and activate it:

    conda create --name dance-bits-api python
    conda activate dance-bits-api
  3. Install the required packages:

    pip install -r requirements.txt

    Note: You can also install via Conda, but some packages may not be available:

    conda install --file requirements.txt

Local Deployment

Running the Model Locally

  1. Set up environment variables in a .env file:

    WANDB_API_KEY=your_key
    WANDB_ORG=your_org
    WANDB_PROJECT=your_project
    WANDB_MODEL_NAME=your_model
    WANDB_MODEL_VERSION=your_version
  2. Install FFmpeg (required for video processing):

    • On Ubuntu/Debian:
      sudo apt-get update
      sudo apt-get install ffmpeg libsm6 libxext6
    • On macOS:
      brew install ffmpeg
    • On Windows: Download from FFmpeg website and add to PATH
  3. Start the FastAPI server:

    uvicorn app.main:app --reload --host 0.0.0.0 --port 8080
  4. Access the API:

Testing the API

  1. Test video segmentation:

    curl -X POST "http://localhost:8080/predict/" \
      -H "accept: application/json" \
      -H "Content-Type: multipart/form-data" \
      -F "video=@path/to/your/dance_video.mp4" \
      -F "min_segmentation_prob=0.5"
  2. Test video comparison:

    curl -X POST "http://localhost:8080/compare/" \
      -H "accept: application/json" \
      -H "Content-Type: multipart/form-data" \
      -F "user_video=@path/to/user_video.mp4" \
      -F "teacher_video=@path/to/teacher_video.mp4"

Troubleshooting

  1. Model Loading Issues:

    • Ensure all environment variables are set correctly
    • Check if the model weights are downloaded properly
    • Verify CUDA availability if using GPU
  2. Video Processing Issues:

    • Verify FFmpeg installation: ffmpeg -version
    • Check video format compatibility (MP4, AVI, MOV supported)
    • Ensure sufficient disk space for temporary files
  3. Memory Issues:

    • Reduce video resolution if experiencing OOM errors
    • Consider using CPU inference if GPU memory is limited
    • Monitor system resources during processing

Docker Deployment

  1. Build the Docker image:

    docker build -t dancebits-api .
  2. Run the container:

    docker run -d --name dancebits-api \
      -p 8080:8080 \
      -e WANDB_API_KEY=your_key \
      -e WANDB_ORG=your_org \
      -e WANDB_PROJECT=your_project \
      -e WANDB_MODEL_NAME=your_model \
      -e WANDB_MODEL_VERSION=your_version \
      dancebits-api

Environment Variables

The following environment variables are required for the application:

  • WANDB_API_KEY: Weights & Biases API key
  • WANDB_ORG: Weights & Biases organization name
  • WANDB_PROJECT: Weights & Biases project name
  • WANDB_MODEL_NAME: Name of the model to use
  • WANDB_MODEL_VERSION: Version of the model to use

API Endpoints

Predict Dance Segments

POST /predict/

Segments a dance video into individual moves.

Parameters:

  • video: Video file (MP4, AVI, or MOV)
  • min_segmentation_prob: Minimum probability threshold for segmentation (default: 0.5)

Response:

{
    "segmented_probs": [...],
    "segmented_percentages": [...]
}

Compare Videos

POST /compare/

Calculates similarity score between two dance videos.

Parameters:

  • user_video: User's dance video file
  • teacher_video: Teacher's reference video file

Response:

{
    "similarity_score": float
}

Technical Details

Video Processing Pipeline

  1. Frame Extraction: Videos are processed frame by frame using OpenCV
  2. Pose Estimation: MediaPipe Pose is used to extract 35 bone vectors per frame
  3. Audio Processing:
    • Audio is extracted from video using MoviePy
    • Mel spectrogram is generated using Librosa
    • Tempo analysis for beat detection
  4. Model Inference:
    • Processes both visual (pose) and audio features
    • Returns frame-by-frame segmentation probabilities
  5. Post-processing:
    • Smoothing of segmentation probabilities
    • Dynamic adjustment based on beat detection
    • Segment identification based on probability thresholds

Performance Considerations

  • The API supports both CPU and GPU inference
  • Video processing is optimized for real-time performance
  • Temporary files are automatically cleaned up after processing
  • CORS is enabled for all origins by default

Contributing

We welcome contributions! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

About

Backend implementing a multimodal AI model in PyTorch and serving it using FastAPI to automatically identify and label dance moves in videos.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •