In this task, I will focus on three different levels of searching for objects within videos:
- Level 1: Find similar objects with no properties: "A truck".
- Level 2: Find an object with a color property: "The red truck".
- Level 3: Find this person.
Eventually, I locate all frames containing the target object X, draw bounding boxes around it, and export these frames as JPG files.
The structure of the output folders is as follows:
- Video 1
  - Object X
    - Frame 15.jpg
    - Frame 32.jpg
    - Frame 120.jpg
  - Object X
- Video 2
  - Object X
- Video 3
  - Object X
    - Frame 215.jpg
  - Object X
I recommend creating an Anaconda environment:

```bash
conda create --name [environment-name] python=3.9
```

Then, install the Python requirements:

```bash
pip install -r requirements.txt
```

Finally, to reproduce the results, you first have to download the provided example videos here. Then, activate the [environment-name] environment and, from the project root, run:

```bash
python demo.py
```
At this level, I employ the YOLOv8 model to detect all objects in the video, and subsequently extract and draw bounding boxes exclusively around objects classified as `truck`.
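A minimal sketch of this step is shown below, assuming the Ultralytics YOLOv8 package and a hypothetical input file `video1.mp4`; the actual script in this repository may differ in its details.

```python
import os

import cv2
from ultralytics import YOLO

model = YOLO("yolov8l-seg.pt")  # the large segmentation model used throughout this project

out_dir = os.path.join("Video 1", "Object X")
os.makedirs(out_dir, exist_ok=True)

# Stream predictions frame by frame; in the COCO label set, class 7 is "truck".
for frame_idx, result in enumerate(model.predict(source="video1.mp4", classes=[7], stream=True)):
    if len(result.boxes) == 0:
        continue  # no truck detected in this frame
    annotated = result.plot()  # draw bounding boxes (and masks) onto the frame
    cv2.imwrite(os.path.join(out_dir, f"Frame {frame_idx}.jpg"), annotated)
```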
Moving to the next stage, I commence by replicating the procedures of Level 1, utilizing the YOLOv8 model to extract `truck` objects. Note that I employ the large segmentation YOLOv8 model (`yolov8l-seg.pt`) for all three levels. This choice is made not only to enhance prediction accuracy thanks to its larger size, but also because it can generate masks for the detected objects, as sketched below.
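As an illustration, the mask for a given detection can be pulled out of a segmentation result roughly as follows; `object_mask` is a hypothetical helper written against the Ultralytics result object, not necessarily the code in this repository.

```python
import cv2
import numpy as np

def object_mask(result, index, frame_shape):
    """Return a binary uint8 mask, sized to the frame, for the detection
    at position `index` in a YOLOv8-seg prediction result."""
    m = result.masks.data[index].cpu().numpy()           # (h, w) float mask in [0, 1]
    m = cv2.resize(m, (frame_shape[1], frame_shape[0]))  # match the frame's width/height
    return (m > 0.5).astype(np.uint8) * 255
```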
In the event that the background contains elements of a similar color to the object, I further improve accuracy by extracting the detected object based on its mask and applying a color detection algorithm as follows:
To determine whether the object's pixel values fall within the red color range, I check whether the blue and green channel values lie in the range (0, 50) and the red channel values lie in the range (120, 255). This yields the following red mask:
Eventually, I can determine whether the detected truck is red by calculating the ratio of red pixels to the object's total pixels and comparing it against a specific threshold.
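Combining the mask and the color range above, a minimal sketch of this red-ratio check might look as follows; the 0.3 threshold is an illustrative assumption, not necessarily the value used in this repository.

```python
import cv2

def is_red(frame, mask, ratio_threshold=0.3):
    """frame: BGR image (OpenCV channel order); mask: binary uint8 mask of the object.
    ratio_threshold is an assumed, illustrative value."""
    # Red range from above: blue and green in (0, 50), red in (120, 255).
    red_mask = cv2.inRange(frame, (0, 0, 120), (50, 50, 255))
    red_on_object = cv2.bitwise_and(red_mask, mask)  # keep only the object's own pixels
    object_pixels = cv2.countNonZero(mask)
    if object_pixels == 0:
        return False
    return cv2.countNonZero(red_on_object) / object_pixels >= ratio_threshold
```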
At this final stage, I incorporate the YOLOv8 model together with the Detector-Free Local Feature Matching with Transformers model (LoFTR for short); you can find its paper here.
- The first task follows similar procedures to those of Level 1, but it focuses on the `human` class.
- The next step is to identify the similarities between the target person (the input for this task) and each detected person. LoFTR identifies and extracts keypoints from the given image and the detected human, then establishes mappings between pairs of keypoints and provides confidence scores for these pairs; the following example gives a deeper understanding:
- Subsequently, I check whether the number of confidence scores greater than 0.5 satisfies a particular threshold (I use a threshold of 65 in my code); see the sketch after this list. Eventually, I employ the YOLOv8 model to track the ID of the detected human. If the model loses track of the person, the process starts over.
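Below is a minimal sketch of this matching check, assuming the kornia implementation of LoFTR and two grayscale person crops already prepared as tensors; the repository may load and invoke LoFTR differently.

```python
import torch
import kornia.feature as KF

matcher = KF.LoFTR(pretrained="outdoor")  # kornia's pretrained LoFTR weights

def matches_target(target_gray, detected_gray, conf_threshold=0.5, min_matches=65):
    """target_gray / detected_gray: float tensors of shape (1, 1, H, W) in [0, 1].
    Returns True when enough keypoint pairs exceed the confidence threshold."""
    with torch.no_grad():
        out = matcher({"image0": target_gray, "image1": detected_gray})
    # Count keypoint correspondences whose confidence exceeds the threshold.
    strong_matches = (out["confidence"] > conf_threshold).sum().item()
    return strong_matches >= min_matches
```

For the tracking part, the Ultralytics package exposes a tracking mode (e.g. `model.track(...)`) that assigns persistent IDs to detections across frames, which corresponds to the ID-tracking behaviour described above.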
In this section, I provide an overview of the results from the provided examples, which you can access and download from here. Furthermore, please access the result frames for each video and level via the following link.
At the final level, you may want to see the full video result via this link.