GitHub - ap-atul/Torpido: Allows you to edit videos automatically

Introduction

As digital life expands, people record almost everything and then spend a great deal of time editing that footage to make it watchable.

Raw footage needs a lot of cleaning and tuning before the result is easy to follow and highlights the regions of interest. Only then is it ready to post on media sites such as YouTube, Instagram, or Twitter.

We automate this task by analyzing the audio and video aspects of the raw footage and generating a cleaner, summarized output.

How are we doing this?

Automated summarization of digital video sequences is accomplished using a vector rank filter. The output of the filter is determined by the minimum rank that the input sequence must reach: the filter selects the highest-ranking continuous subsequences that satisfy that minimum rank.

Each frame in a video segment can be ranked according to the significance of its features. Every feature produces its own ranking vector.

The ranking vectors of all features are summed, and the filter is applied to that sum to extract subsequences.
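
The summation and filtering step can be sketched as follows. This is a minimal illustration and not Torpido's actual implementation: the function name `select_slices`, the sample rank values, and the choice to return every qualifying run (rather than only the highest-ranking one) are assumptions.

```python
import numpy as np

def select_slices(rank_vectors, min_rank):
    """Sum per-feature rank vectors and return (start, end) frame index
    pairs (end exclusive) of contiguous runs whose summed rank is at
    least min_rank. Hypothetical helper, not part of Torpido."""
    total = np.sum(np.asarray(rank_vectors, dtype=float), axis=0)
    keep = total >= min_rank
    # Pad with False so every run has a rising and a falling edge.
    padded = np.concatenate(([False], keep, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return [(int(s), int(e)) for s, e in zip(edges[::2], edges[1::2])]

# Made-up rank vectors for six frames
motion = [0, 2, 2, 2, 0, 0]
blur = [1, 1, 1, 0, 0, 1]
audio = [0, 1, 1, 1, 0, 1]
print(select_slices([motion, blur, audio], min_rank=3))  # [(1, 4)]
```

Here the summed ranks are `[1, 4, 4, 3, 0, 2]`, so only frames 1 to 3 reach the minimum rank of 3.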

Which features are we talking about?

1. Visual

Motion

Every video has some moments of activity, and nobody wants to watch a static image presented as a video, so we propose a motion-based ranking. The amount of motion determines the rank of each frame in the sequence.
The rank is set to 0 if the motion is below a certain threshold.
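
One way to picture the motion ranking is plain frame differencing. The sketch below is an assumption about how motion could be measured (mean absolute difference between consecutive grayscale frames); the source does not say which motion estimator Torpido uses, and the names `motion_ranks`, `threshold`, and `rank` are hypothetical.

```python
import numpy as np

def motion_ranks(frames, threshold, rank=1.0):
    """frames: array of shape (n, height, width) holding grayscale frames.
    Returns one rank per frame: `rank` where motion reaches the
    threshold, otherwise 0."""
    frames = np.asarray(frames, dtype=np.float32)
    # Mean absolute pixel difference between each frame and its predecessor
    diffs = np.abs(frames[1:] - frames[:-1]).mean(axis=(1, 2))
    # The first frame has no predecessor, so it is treated as motionless
    motion = np.concatenate(([0.0], diffs))
    return np.where(motion >= threshold, rank, 0.0)

# Three 4x4 frames: the last one changes completely
clip = np.zeros((3, 4, 4))
clip[2] = 10
print(motion_ranks(clip, threshold=5).tolist())  # [0.0, 0.0, 1.0]
```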

Blur

The sharpness of each video frame is measured to rank the subsequence.
If the sharpness is below a certain threshold, the rank is set to 0.
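
A common sharpness measure is the variance of the Laplacian: blurred frames have few edges, so the variance is low. Whether Torpido uses this exact measure is not stated in the source, so treat the sketch below as an assumption. It is written in plain NumPy to stay self-contained, and `laplacian_variance` and `blur_rank` are hypothetical names.

```python
import numpy as np

def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over the interior pixels
    of a 2-D grayscale image."""
    g = np.asarray(gray, dtype=np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())

def blur_rank(gray, threshold, rank=1.0):
    """Return `rank` for a sharp frame and 0 for a blurred one."""
    return rank if laplacian_variance(gray) >= threshold else 0.0

flat = np.zeros((8, 8))                                  # no edges at all
checker = (np.indices((8, 8)).sum(axis=0) % 2) * 255.0   # many sharp edges
print(blur_rank(flat, threshold=100.0), blur_rank(checker, threshold=100.0))
```

The flat image is ranked 0 and the checkerboard keeps its rank; the threshold of 100 is arbitrary and would need tuning on real footage.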

2. Auditory

Audio energy

The video sequence is ranked by its audio activity, i.e. talking, sound, music, etc.
A threshold determines whether a section of the sequence is ranked or not.
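
Audio activity can be approximated by short-time energy, the mean squared amplitude over fixed windows. The window length, the threshold, and the function name `audio_energy_ranks` below are assumptions made for illustration only.

```python
import numpy as np

def audio_energy_ranks(signal, sample_rate, threshold, window_s=1.0, rank=1.0):
    """Split a mono signal into windows of window_s seconds and return
    `rank` for each window whose mean energy reaches the threshold."""
    signal = np.asarray(signal, dtype=np.float64)
    window = max(1, int(sample_rate * window_s))
    n_windows = len(signal) // window
    # A trailing partial window is dropped
    chunks = signal[: n_windows * window].reshape(n_windows, window)
    energy = (chunks ** 2).mean(axis=1)
    return np.where(energy >= threshold, rank, 0.0)

# One second of silence followed by one second of full-scale signal
samples = np.concatenate((np.zeros(8), np.ones(8)))
print(audio_energy_ranks(samples, sample_rate=8, threshold=0.5).tolist())  # [0.0, 1.0]
```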

De-noising

Audio is de-noised using a wavelet transform.
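
To illustrate the idea, here is a single-level Haar wavelet de-noiser with soft thresholding. The source does not specify the wavelet, the number of decomposition levels, or the thresholding rule, so all three are assumptions, and `haar_denoise` is a hypothetical name.

```python
import numpy as np

def haar_denoise(signal, threshold):
    """Single-level Haar transform, soft-threshold the detail
    coefficients, then invert the transform."""
    x = np.asarray(signal, dtype=np.float64)
    n = len(x)
    if n % 2:                      # pad to even length by repeating the last sample
        x = np.append(x, x[-1])
    pairs = x.reshape(-1, 2)
    approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2)
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2)
    # Soft thresholding shrinks small (noise-like) details to zero
    detail = np.sign(detail) * np.maximum(np.abs(detail) - threshold, 0.0)
    out = np.empty_like(x)
    out[0::2] = (approx + detail) / np.sqrt(2)
    out[1::2] = (approx - detail) / np.sqrt(2)
    return out[:n]

print(haar_denoise([1.0, 3.0, 5.0, 5.0], threshold=10.0).tolist())
```

With a threshold larger than every detail coefficient, each pair of samples collapses to its mean; with a threshold of 0 the signal is reconstructed unchanged.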

3. Textual

Text Detection

The video sequence is ranked by the text detected in it.
If text is detected, the rank is added; otherwise 0 is added.

EAST model

The EAST model, loaded through OpenCV, is used to detect text in the video.
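
A rough sketch of how a frame could be scored with EAST through OpenCV's `dnn` module is shown below. The input size, the confidence cut-off, and the helper names `east_scores` and `text_rank` are assumptions; the layer names are those of the widely distributed `frozen_east_text_detection.pb` file. This is not taken from Torpido's code.

```python
import numpy as np

# Output layers of the frozen EAST graph: text scores and box geometry
EAST_LAYERS = ["feature_fusion/Conv_7/Sigmoid", "feature_fusion/concat_3"]

def east_scores(frame, net, size=(320, 320)):
    """Run EAST on a BGR frame and return its 2-D text score map.
    `net` is expected to come from cv2.dnn.readNet(<path to the .pb file>);
    width and height in `size` must be multiples of 32."""
    import cv2  # imported lazily so text_rank works without OpenCV
    blob = cv2.dnn.blobFromImage(frame, 1.0, size,
                                 (123.68, 116.78, 103.94),
                                 swapRB=True, crop=False)
    net.setInput(blob)
    scores, _geometry = net.forward(EAST_LAYERS)
    return scores[0, 0]

def text_rank(scores, min_confidence=0.5, rank=1.0):
    """Return `rank` if any cell of the score map looks like text, else 0."""
    return rank if np.max(scores) >= min_confidence else 0.0

# The ranking rule on its own, with a made-up score map
print(text_rank(np.array([[0.1, 0.9], [0.0, 0.2]])))  # 1.0
```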

Basic Working

- Start
- Accepts video from the user
- Reads the video [All processes below are parallel]

    - Processes the video stream for [Visual] :
        - Motion ranking, no motion will be ranked 0
        - Blur detection, blur detected ranked 0

    - Processes the video stream for [Textual] :
        - Text detection, high rank for text detected

    - Processes the audio stream for [Auditory] :
        - Audio de-noising with DWT & FWT
        - Audio activity ranking

- Calculate the sum of all ranks
- Select slices satisfying the minimum rank
- Trim the video at the timestamps of the selected slices
- End
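
The flow above can be sketched end to end as follows. The thread pool stands in for whatever parallel mechanism the project really uses, and `run_pipeline`, the lambda rankers, and the frame rate are hypothetical placeholders.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def run_pipeline(rankers, fps, min_rank):
    """rankers: callables that each return one rank value per frame.
    Returns (start_s, end_s) timestamps of the slices to keep."""
    # Run every ranker in parallel and collect the rank vectors
    with ThreadPoolExecutor() as pool:
        vectors = list(pool.map(lambda ranker: ranker(), rankers))
    total = np.sum(vectors, axis=0)
    # Locate contiguous runs where the summed rank reaches min_rank
    keep = np.concatenate(([False], total >= min_rank, [False]))
    edges = np.flatnonzero(keep[1:] != keep[:-1])
    # Convert frame indices to seconds
    return [(float(s) / fps, float(e) / fps)
            for s, e in zip(edges[::2], edges[1::2])]

visual = lambda: [0, 1, 1, 0]   # stand-in for motion and blur ranking
audio = lambda: [0, 1, 1, 1]    # stand-in for audio activity ranking
print(run_pipeline([visual, audio], fps=2, min_rank=2))  # [(0.5, 1.5)]
```

The resulting timestamps are what a trimming tool such as ffmpeg would then be given to cut the video.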

Applications

Automatic video editing for any video
Extraction of the important parts of security footage
Tutoring videos: text detection and motion ranking can extract much of the useful content
General video editing
Audio de-noising of vlogging videos

Architecture

Screens

Sample Test

Docs

For all docs visit torpido

For dev logs visit logs

Execution

Install ffmpeg

$ sudo apt install ffmpeg

Install all the dependencies

$ pip install -r requirements.txt

Compile the cython files

$ python setup.py build_ext --inplace

Download the EAST model and add it to the path

$ wget -O frozen_east_text_detection.tar.gz "https://www.dropbox.com/s/r2ingd0l3zt8hxs/frozen_east_text_detection.tar.gz?dl=1"
$ tar -xvf frozen_east_text_detection.tar.gz

// set environment variable
$ sudo gedit /etc/environment

// add new var
EAST_MODEL="path_to_frozen_east_text_detection.pb"

// log out and back in so the new variable is loaded, then test it
$ echo $EAST_MODEL
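
The variable can also be checked from Python before a long run starts. The helper below is a hypothetical convenience, not part of Torpido; it only assumes that `EAST_MODEL` should point at an existing file.

```python
import os
from pathlib import Path

def east_model_path(env=os.environ):
    """Return the EAST model path from the EAST_MODEL variable,
    failing early if it is unset or does not point at a file."""
    path = env.get("EAST_MODEL")
    if not path:
        raise RuntimeError("EAST_MODEL is not set")
    if not Path(path).is_file():
        raise FileNotFoundError(path)
    return Path(path)
```

Calling `east_model_path()` with no arguments reads the real environment; passing a dictionary makes the check easy to try out.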

Run run.py on a video file

$ python run.py /example/sample.mp4

// or with the UI
$ python3 start_up.py
