Summarization Parameters not working #453

Open
kwlayman opened this issue Dec 12, 2023 · 4 comments
Labels
question Further information is requested

Comments

@kwlayman

Question

I've tried several of the supported summarization models with the code used in the browser extension example.

The only one I get any results from in a reasonable time is t5-small.

My problem is that no matter which parameters I pass in, the result is always the same length.

I've traced through the code and it appears that the config params get passed in.

I've tried max_new_tokens, min_new_tokens, and max_length: no joy.

I initially pinned version 2.5.3, and most recently let the CDN pick the version (it looks like 2.10.x). No joy, same result.

Could someone please provide an example of running a summarization task with, in my case, the t5-small model, where the parameters actually control the output length?

@kwlayman kwlayman added the question Further information is requested label Dec 12, 2023
@xenova
Collaborator

xenova commented Dec 12, 2023

Hi there 👋 Can you provide the code which you tried? That'll make debugging a lot easier :)

@kwlayman
Author

kwlayman commented Dec 15, 2023 via email

@kwlayman
Author

Hello, I'm dead in the water on this. Is there something else I can submit so I can get some help please?

@xenova
Collaborator

xenova commented Dec 19, 2023

Hi there 👋 Sorry for the delay. It looks like a typo: max_target_length should be max_length.

import { pipeline } from "@xenova/transformers";

// Create a summarization pipeline
const summarizer = await pipeline('summarization', 'Xenova/t5-small');

// Text to summarize
const text = "Data science is an interdisciplinary field[10] focused on extracting knowledge from typically large data sets and applying the knowledge and insights from that data to solve problems in a wide range of application domains.[11] The field encompasses preparing data for analysis, formulating data science problems, analyzing data, developing data-driven solutions, and presenting findings to inform high-level decisions in a broad range of application domains. As such, it incorporates skills from computer science, statistics, information science, mathematics, data visualization, information visualization, data sonification, data integration, graphic design, complex systems, communication and business.[12][13] Statistician Nathan Yau, drawing on Ben Fry, also links data science to human–computer interaction: users should be able to intuitively control and explore data.[14][15] In 2015, the American Statistical Association identified database management, statistics and machine learning, and distributed and parallel systems as the three emerging foundational professional communities.[16]";

// Generate summary
const output = await summarizer(text, { min_length: 32, max_length: 128 });
console.log(output);
// [{ summary_text: "data science is an interdisciplinary field focused on extracting knowledge from typically large data sets. it encompasses preparing data for analysis, formulating data science problems, analyzing data, developing data-driven solutions. it also combines skills from computer science, statistics, information science, mathematics, data visualization, information visualization, data sonification, data integration, graphic design, complex systems, communication and business." }]

In contrast, using no parameters:

const output = await summarizer(text);
console.log(output);
// [{ summary_text: 'data science is an interdisciplinary field focused on extracting knowledge from typically large data sets ' }]
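
Since a misspelled option name such as max_target_length appears to be ignored silently rather than rejected, a small guard in front of the call can catch this earlier. The following is only a minimal sketch: `LENGTH_OPTIONS`, `checkLengthOptions`, and `summarize` are hypothetical helper names, not part of @xenova/transformers, and the list of accepted names is limited to the length options mentioned in this thread. It also assumes the pipeline returns an array of `{ summary_text }` objects, as in the output above.

```javascript
// Hypothetical helpers, NOT part of @xenova/transformers.
// Only the length-related option names mentioned in this thread are accepted;
// extend the list if you pass other generation options.
const LENGTH_OPTIONS = ['min_length', 'max_length', 'min_new_tokens', 'max_new_tokens'];

// Throws on unrecognized option names (e.g. a typo like max_target_length)
// and on an inverted min/max range; otherwise returns the options unchanged.
function checkLengthOptions(options) {
  const unknown = Object.keys(options).filter((key) => !LENGTH_OPTIONS.includes(key));
  if (unknown.length > 0) {
    throw new TypeError(`Unrecognized option(s): ${unknown.join(', ')}`);
  }
  if (
    options.min_length !== undefined &&
    options.max_length !== undefined &&
    options.min_length > options.max_length
  ) {
    throw new RangeError('min_length must not exceed max_length');
  }
  return options;
}

// Calls any summarizer function shaped like the pipeline above and
// returns the text of the first result.
async function summarize(summarizer, text, options = {}) {
  const output = await summarizer(text, checkLengthOptions(options));
  return output[0].summary_text;
}
```

With the `summarizer` and `text` from the example above, `await summarize(summarizer, text, { min_length: 32, max_length: 128 })` would return the summary string, while passing `{ max_target_length: 128 }` would throw instead of quietly falling back to the default length.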

2 participants