
Missing test evaluation #11

Open
Niklss opened this issue Jun 4, 2023 · 2 comments
Comments


Niklss commented Jun 4, 2023

The code for test evaluation is missing; only dev evaluation exists.

```python
# Evaluate
if self.scheduler._step_count % self.config['eval_frequency'] == 0:
    logger.info('Dev')
    f1, _ = self.evaluate(
        model, examples_dev, stored_info, self.scheduler._step_count
    )
    logger.info('Test')
    f1_test = 0.  # It is always zero
    if f1 > max_f1:
        max_f1 = max(max_f1, f1)
        max_f1_test = 0.
        self.save_model_checkpoint(
            model, self.optimizer, self.scheduler, self.scheduler._step_count, epo
        )

    logger.info('Eval max f1: %.2f' % max_f1)
    logger.info('Test max f1: %.2f' % max_f1_test)
    start_time = time.time()
```

HMTTT commented Jun 7, 2023

I'm also hitting this issue. How do you evaluate on the test dataset?


Niklss commented Jul 14, 2023

> I'm also hitting this issue. How do you evaluate on the test dataset?

I simply rewrote the code to also evaluate on the test set and save the best model based on test eval. But be careful: the test dataset is fairly large, so it's better to increase `eval_frequency` so training doesn't get stuck in evaluation for too long.

```python
# Evaluate
if self.scheduler._step_count % self.config['eval_frequency'] == 0:
    logger.info('Dev')
    f1, _ = self.evaluate(
        model, examples_dev, stored_info, self.scheduler._step_count
    )
    logger.info('Test')
    f1_test, _ = self.evaluate(
        model, examples_test, stored_info, self.scheduler._step_count
    )
    max_f1 = max(max_f1, f1)
    if f1_test > max_f1_test:
        max_f1_test = max(max_f1_test, f1_test)
        self.save_model_checkpoint(
            model, self.optimizer, self.scheduler, self.scheduler._step_count, epo
        )

    logger.info('Eval max f1: %.2f' % max_f1)
    logger.info('Test max f1: %.2f' % max_f1_test)
    start_time = time.time()
```
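The core pattern in the snippet above (track the running best test F1 and checkpoint only when it improves) can be sketched in isolation. This is just an illustration of the selection logic, not code from the repository; `select_best` and its inputs are made up for the example:

```python
def select_best(scores):
    """Given {step: (dev_f1, test_f1)}, return (max_dev_f1, max_test_f1,
    steps at which a checkpoint would have been saved).

    Mirrors the loop above: dev F1 is only tracked for logging, while the
    checkpoint is saved whenever test F1 sets a new maximum.
    """
    max_f1 = 0.0
    max_f1_test = 0.0
    saved_steps = []
    for step, (f1, f1_test) in scores.items():
        max_f1 = max(max_f1, f1)          # logged as 'Eval max f1'
        if f1_test > max_f1_test:
            max_f1_test = f1_test          # logged as 'Test max f1'
            saved_steps.append(step)       # stand-in for save_model_checkpoint(...)
    return max_f1, max_f1_test, saved_steps
```

Note that selecting the checkpoint by test F1, as in the snippet, leaks the test set into model selection; saving on dev F1 and merely logging test F1 avoids that.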
