Super simple and easy-to-use text splitter for Node.js.
Perfect for quickly building LLM prototypes or small-scale applications, with a compressed (ZIP) file size of just 1 KB.
npm install llm-chunk
Easily integrate it into your project with just a few lines of code:
import { chunk } from 'llm-chunk'
const text = `
Hello World.
This is
a test sentence! Have a good day? Haha. Haha
`;
// Default options
const chunks = chunk(text, {
  minLength: 0, // minimum number of characters per chunk
  maxLength: 1000, // maximum number of characters per chunk
  splitter: "paragraph", // "paragraph" | "sentence"
  overlap: 0, // number of overlapping characters between chunks
  delimiters: "" // regex string for the base split method
});
// The 'paragraph' splitter is used by default
chunk(text)
// Results
[
'Hello World.\nThis is\n a test sentence! Have a good day? Haha. Haha'
]
chunk(text, { minLength: 7, maxLength: 9 })
// Results
[
'Hello World.\nThis',
' is\n a test',
' sentence! Have a good day? Haha. Haha'
]
Use the 'sentence' splitter:
chunk(text, { splitter: "sentence" })
// Results
[
'Hello World.',
'This is\n',
'a test sentence!',
'Have a good day?',
'Haha.',
'Haha'
]
chunk(text, { minLength: 10, splitter: "sentence" })
// Results
[
'Hello World.',
'This is\n a test sentence!',
'Have a good day?',
'Haha. Haha'
]
chunk(text, { overlap: 3, splitter: "sentence" })
// Results
[
'Hello World.',
' World. This is\n',
' is\n a test sentence!',
' sentence! Have a good day?',
' day? Haha.',
' Haha. Haha',
' Haha'
]
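The 'delimiters' option listed in the defaults above is not demonstrated in these examples. A minimal sketch of how it might be used, assuming it accepts a regular-expression string for the base split method (the exact splitting behaviour is not verified here):

// Assumption: 'delimiters' is a regex string consumed by the base split method,
// e.g. to split the input on "---" dividers.
chunk(text, { delimiters: "---" })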
For more examples and chunk results, please check the "samples" folder.
It's super fast, but there's still room for performance improvement. Patches and PRs are welcome.
----------
Chunk 163948 characters into 436 chunks
----------
Total: 12.169ms (100 times)
Average: 0.122ms
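The timings above are from the project's own benchmark run; a minimal sketch of a comparable timing loop (the sample file path and iteration count here are illustrative, not the actual benchmark script):

import { readFileSync } from 'node:fs';
import { chunk } from 'llm-chunk';

// Illustrative benchmark: time 100 chunking runs over a large sample document.
const text = readFileSync('./samples/sample.txt', 'utf8'); // hypothetical sample file
const runs = 100;

const start = performance.now();
let chunks = [];
for (let i = 0; i < runs; i++) {
  chunks = chunk(text);
}
const total = performance.now() - start;

console.log(`Chunk ${text.length} characters into ${chunks.length} chunks`);
console.log(`Total: ${total.toFixed(3)}ms (${runs} times)`);
console.log(`Average: ${(total / runs).toFixed(3)}ms`);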
- In-memory VectorDB: a super simple and easy-to-use in-memory vector DB for Node.js
MIT