Document Clustering And Classification

This repo contains notebooks that perform clustering and classification on documents from the FUNSD dataset.

The first notebook implements K-means and agglomerative clustering on the FUNSD dataset using visual and textual features. It also applies Principal Component Analysis (PCA) to the tokenized content of the documents in order to visualize the clusters.
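As a rough illustration of the clustering workflow, the sketch below runs K-means, agglomerative clustering, and a 2-D PCA projection with scikit-learn. It is not the notebook's code: the toy `docs` list, the TF-IDF vectorization, and the choice of two clusters are assumptions made for the example, and visual features are left out.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical stand-in for the text extracted from FUNSD documents.
docs = [
    "invoice total amount due payment",
    "invoice number payment date amount",
    "fax cover sheet recipient sender",
    "fax number sender pages recipient",
    "invoice amount tax total",
    "fax transmission sender date",
]

# Assumption: TF-IDF is used here to turn tokenized text into numeric features.
features = TfidfVectorizer().fit_transform(docs).toarray()

# Assumption: two clusters, chosen only for this toy example.
n_clusters = 2
kmeans_labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(features)
agglo_labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(features)

# Project the features to two dimensions so the clusters can be plotted.
coords = PCA(n_components=2).fit_transform(features)

for doc, k_label, a_label, (x, y) in zip(docs, kmeans_labels, agglo_labels, coords):
    print(f"kmeans={k_label} agglo={a_label} pca=({x:.2f}, {y:.2f}) {doc}")
```

The `coords` array can be passed to any scatter-plot function, colored by either label array, to inspect how well the clusters separate.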

The second notebook implements supervised classification via transfer learning on the VGG architecture, using the cluster assignments learned through clustering as labels.
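A minimal sketch of this kind of transfer learning with Keras is shown below. It is not the notebook's code: the VGG16 variant, the `build_vgg_classifier` helper, the 224x224 input size, and the pooling-plus-softmax head are assumptions made for illustration.

```python
from tensorflow import keras


def build_vgg_classifier(num_classes, weights="imagenet", input_shape=(224, 224, 3)):
    """Hypothetical helper: frozen VGG16 base with a small trainable head."""
    # Assumption: VGG16 is the variant used; the README only says "VGG".
    base = keras.applications.VGG16(
        weights=weights, include_top=False, input_shape=input_shape
    )
    base.trainable = False  # keep the pretrained convolutional filters fixed

    inputs = keras.Input(shape=input_shape)
    x = base(inputs, training=False)
    x = keras.layers.GlobalAveragePooling2D()(x)
    # One output unit per cluster; cluster ids act as class labels.
    outputs = keras.layers.Dense(num_classes, activation="softmax")(x)

    model = keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer labels
        metrics=["accuracy"],
    )
    return model
```

Under these assumptions, training would amount to calling `model.fit(images, cluster_labels)`, where `images` holds the document page images and `cluster_labels` holds the integer cluster ids produced by the clustering step.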

The notebooks use the scikit-learn and Keras libraries.


DamiFass/document-clustering-and-classification
