
Give this algorithm a body of text and it will compress it into a set of core concepts. It can take massive bodies of text, significantly larger than the context window of even the best current GPT-4 model.


rscarbel/word-vector-compression


Heads up: the point of this repository is not to be a beautiful UI or clean code. It is to demonstrate that it is possible to compress the conceptual content of massive bodies of text. The practical use is that the compressed output can be fed into a GPT model so that it can handle bodies of text that far exceed its usual context window.

There is a lot of nuance in how I think this would best integrate with a GPT model, but at a high level: you could train one GPT model to predict which topic should come next, and a second GPT model specialized in generating the actual words and sentences that fill in that topic.
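To make the two-model idea concrete, here is a minimal sketch of the data flow. Both "models" are hypothetical toy stand-ins (nothing in this repository implements them); the point is only to show how a topic-predicting model and a text-expanding model would hand off to each other.

```python
# Hypothetical sketch of the two-model pipeline described above.
# Neither function is a real GPT; they are deterministic stand-ins
# that show the shape of the hand-off, not an actual implementation.

def predict_next_topic(topic_history):
    # Stand-in for a model trained on compressed topic sequences:
    # here it simply cycles through a fixed list of topics.
    topics = ["setup", "conflict", "resolution"]
    return topics[len(topic_history) % len(topics)]

def expand_topic(topic):
    # Stand-in for a model that "fills in" words and sentences
    # for a given topic.
    return f"<text elaborating on '{topic}'>"

def generate(n_topics):
    history, passages = [], []
    for _ in range(n_topics):
        topic = predict_next_topic(history)   # model 1: pick the next concept
        history.append(topic)
        passages.append(expand_topic(topic))  # model 2: expand it into prose
    return passages

print(generate(3))
```

The key design point is that model 1 only ever sees the compressed topic stream, which is what lets the pipeline cover far more source text than a single context window.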

Demo

Screenshare.-.2024-01-13.10_24_30.PM.mp4

Fair warning: the code itself is not especially clean. I just wanted a quick proof of concept that you can compress sentences into smaller units based on clusters in vector space.
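The core idea, stripped to its essentials, looks something like the sketch below: embed each sentence as a vector, cluster the vectors, and keep one representative sentence per cluster as a "core concept". The 2-D embeddings here are made up for illustration (a real run would use an actual sentence-embedding model), and the k-means loop is a deliberately simple hand-rolled version, not this repository's code.

```python
import numpy as np

# Toy sentences and made-up 2-D "embeddings" for illustration only.
sentences = [
    "Cats purr when content.",
    "Dogs wag their tails.",
    "Stocks rose sharply today.",
    "Markets rallied on the news.",
]
embeddings = np.array([
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
])

def kmeans(X, k, iters=20, seed=0):
    # Minimal k-means: pick k random points as starting centroids,
    # then alternate assignment and centroid-update steps.
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared distance).
        labels = np.argmin(((X[:, None] - centroids) ** 2).sum(-1), axis=1)
        # Move each centroid to the mean of its assigned points.
        for j in range(k):
            if (labels == j).any():
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

labels, centroids = kmeans(embeddings, k=2)

# For each cluster, keep the sentence closest to the centroid as the
# "core concept" standing in for the whole cluster.
summary = []
for j in range(2):
    idx = np.where(labels == j)[0]
    best = idx[np.argmin(((embeddings[idx] - centroids[j]) ** 2).sum(-1))]
    summary.append(sentences[best])

print(summary)
```

The compression comes from the last step: a cluster of many near-duplicate sentences collapses to a single representative, so the output grows with the number of distinct concepts rather than the length of the input.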
