Regression Test Pipeline #120

Abhinay1997 · 2024-04-20T18:14:50Z

Add English text normalization.
Add WER calculations.
Compare and check normalization outputs against the Python implementation
Add WER to the regression tests once Memory and Latency Regression Tests #99 is merged
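
For readers unfamiliar with the metric, here is a minimal sketch of how a word error rate can be computed once both transcripts are normalized. It is not the code from this PR: `simpleNormalize`, `wordEditDistance`, and `wordErrorRate` are hypothetical names, the normalizer only lowercases and strips punctuation (the PR's English normalizer also handles numbers and more), and the distance uses the classic two-row dynamic-programming table rather than the Hirschberg-based approach the commits later switch to.

```swift
import Foundation

/// Hypothetical, deliberately simple normalizer: lowercases the text and
/// replaces everything except letters, digits, and apostrophes with spaces.
func simpleNormalize(_ text: String) -> [String] {
    let cleaned = text.lowercased().map { ch -> Character in
        (ch.isLetter || ch.isNumber || ch == "'") ? ch : " "
    }
    return String(cleaned).split(separator: " ").map(String.init)
}

/// Word-level Levenshtein distance (substitutions + deletions + insertions),
/// keeping only two rows of the dynamic-programming table in memory.
func wordEditDistance(_ ref: [String], _ hyp: [String]) -> Int {
    if ref.isEmpty { return hyp.count }
    if hyp.isEmpty { return ref.count }
    var previous = Array(0...hyp.count)
    var current = [Int](repeating: 0, count: hyp.count + 1)
    for i in 1...ref.count {
        current[0] = i
        for j in 1...hyp.count {
            let substitution = previous[j - 1] + (ref[i - 1] == hyp[j - 1] ? 0 : 1)
            let deletion = previous[j] + 1
            let insertion = current[j - 1] + 1
            current[j] = min(substitution, deletion, insertion)
        }
        swap(&previous, &current)
    }
    return previous[hyp.count]
}

/// WER = edit distance / number of reference words.
/// Assumption for this sketch: an empty reference yields 0 if the hypothesis
/// is also empty, otherwise 1.
func wordErrorRate(reference: String, hypothesis: String) -> Double {
    let ref = simpleNormalize(reference)
    let hyp = simpleNormalize(hypothesis)
    guard !ref.isEmpty else { return hyp.isEmpty ? 0 : 1 }
    return Double(wordEditDistance(ref, hyp)) / Double(ref.count)
}
```

With this sketch, a reference of "the quick brown fox" and a hypothesis of "The quick brown dog" differ by one substitution out of four reference words, so the rate is 0.25. Normalizing first is what keeps casing and punctuation differences from counting as errors.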

Abhinay1997 · 2024-05-02T16:03:03Z

Code is still messy. Needs cleanup once the normalization starts working.

Abhinay1997 · 2024-05-04T16:29:29Z

Running tests locally. Adding more unit tests for the new normalization code.

atiorh

@Abhinay1997 As discussed offline, here are the last few items before we can merge this one:

Removing example test result JSONs
Removing the Fraction implementation altogether
Adding AudioEncoder latency measurements to LatencyStats

We discussed the following as nice-to-haves (could defer to a future PR):

Writing the following static attributes to the resulting JSON:

"static_attributes": 
- whisperkit_version (string)
- os (string)
- encoder_compute_units (string)
- decoder_compute_units (string)
- enable_word_timestamps (boolean)
- enable_eager_decoding (boolean)
- enable_vad (boolean)
- silence_threshold 
- is_low_power_mode (boolean)
- is_stream_simulated (boolean)
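
One way the attribute list above could be written to JSON is with a `Codable` struct and snake_case key conversion. This is only a sketch under stated assumptions: `StaticAttributes`, `BenchmarkReport`, and `encodeStaticAttributes` are hypothetical names, the sample values are placeholders, and `silence_threshold` is typed as `Double` purely for illustration because the list above does not give its type.

```swift
import Foundation

/// Hypothetical container mirroring the proposed "static_attributes" keys.
struct StaticAttributes: Codable {
    let whisperkitVersion: String
    let os: String
    let encoderComputeUnits: String
    let decoderComputeUnits: String
    let enableWordTimestamps: Bool
    let enableEagerDecoding: Bool
    let enableVad: Bool
    let silenceThreshold: Double // ASSUMPTION: type is not stated in the list above
    let isLowPowerMode: Bool
    let isStreamSimulated: Bool
}

/// Hypothetical top-level wrapper so the output has a "static_attributes" key.
struct BenchmarkReport: Codable {
    let staticAttributes: StaticAttributes
}

/// Encodes the attributes as pretty-printed JSON with snake_case keys.
func encodeStaticAttributes(_ attributes: StaticAttributes) throws -> String {
    let encoder = JSONEncoder()
    encoder.keyEncodingStrategy = .convertToSnakeCase
    encoder.outputFormatting = [.prettyPrinted, .sortedKeys]
    let data = try encoder.encode(BenchmarkReport(staticAttributes: attributes))
    return String(decoding: data, as: UTF8.self)
}

// Placeholder values for illustration only; none come from a real run.
let sampleAttributes = StaticAttributes(
    whisperkitVersion: "0.0.0",
    os: "placeholder-os",
    encoderComputeUnits: "placeholder",
    decoderComputeUnits: "placeholder",
    enableWordTimestamps: false,
    enableEagerDecoding: false,
    enableVad: false,
    silenceThreshold: 0.0,
    isLowPowerMode: false,
    isStreamSimulated: false
)

if let json = try? encodeStaticAttributes(sampleAttributes) {
    print(json)
}
```

Letting the encoder derive the snake_case keys keeps the Swift property names idiomatic while the JSON matches the key names listed above.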

Periodically (e.g. every 1 minute) measuring the following stats in a separate thread and writing the time-series results to the final JSON:

"system_measurements":
- thermal_state (string)
- device_temperature (int)
- memory_total_available_gb (float)
- memory_total_used_gb (float)
- memory_app_allocated_gb (float)
- memory_app_used_gb (float)
- memory_swap_used_gb (float)
- battery_level (float)
- disk_total_space_gb (float)
- disk_free_space_gb (float)
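
The periodic measurement idea can be sketched with a timer on a background dispatch queue. This is not the PR's implementation: `PeriodicSampler`, `SystemSample`, and `readSystemSample` are hypothetical names, and only a few of the listed stats are read (thermal state on Apple platforms, physical memory, and disk space). The remaining stats, such as battery level and per-app memory, need platform-specific APIs and are left out.

```swift
import Foundation

/// Hypothetical sample covering a subset of the proposed "system_measurements".
struct SystemSample: Codable {
    let timestamp: Date
    let thermalState: String
    let physicalMemoryGb: Double
    let diskTotalSpaceGb: Double?
    let diskFreeSpaceGb: Double?
}

/// Reads one sample. Disk values are nil if the file system query fails.
func readSystemSample() -> SystemSample {
    let bytesPerGb = 1_073_741_824.0
    let info = ProcessInfo.processInfo
    var thermal = "unknown"
    #if canImport(Darwin)
    switch info.thermalState {
    case .nominal: thermal = "nominal"
    case .fair: thermal = "fair"
    case .serious: thermal = "serious"
    case .critical: thermal = "critical"
    @unknown default: thermal = "unknown"
    }
    #endif
    let attrs = try? FileManager.default.attributesOfFileSystem(forPath: NSHomeDirectory())
    let total = (attrs?[.systemSize] as? NSNumber)?.doubleValue
    let free = (attrs?[.systemFreeSize] as? NSNumber)?.doubleValue
    return SystemSample(
        timestamp: Date(),
        thermalState: thermal,
        physicalMemoryGb: Double(info.physicalMemory) / bytesPerGb,
        diskTotalSpaceGb: total.map { $0 / bytesPerGb },
        diskFreeSpaceGb: free.map { $0 / bytesPerGb }
    )
}

/// Collects samples on a private serial queue at a fixed interval.
final class PeriodicSampler<Sample> {
    private let queue = DispatchQueue(label: "benchmark.system-sampler")
    private var timer: DispatchSourceTimer?
    private var samples: [Sample] = []
    private let interval: TimeInterval
    private let read: () -> Sample

    init(interval: TimeInterval, read: @escaping () -> Sample) {
        self.interval = interval
        self.read = read
    }

    func start() {
        let timer = DispatchSource.makeTimerSource(queue: queue)
        timer.schedule(deadline: .now(), repeating: interval)
        timer.setEventHandler { [weak self] in
            guard let self = self else { return }
            // Runs on the sampler queue, so appends never race with stop().
            self.samples.append(self.read())
        }
        self.timer = timer
        timer.resume()
    }

    /// Stops the timer and returns everything collected so far.
    func stop() -> [Sample] {
        timer?.cancel()
        timer = nil
        return queue.sync { samples }
    }
}

// Usage (for example, once per minute during a benchmark run):
//   let sampler = PeriodicSampler(interval: 60, read: readSystemSample)
//   sampler.start()
//   ... run the benchmark ...
//   let timeseries = sampler.stop()
```

Keeping the sampler generic over the read closure means the measurement logic can change per platform without touching the timing code, and the collected array can be encoded into the final JSON alongside the test results.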

Abhinay1997 · 2024-08-12T16:47:52Z

@atiorh I made the changes except for the AudioEncoder latency stats. I need to add a callback for that and am discussing it with Zack.

Co-authored-by: Arda Atahan Ibis <ardaibis@gmail.com>

ZachNagengast

Added a comprehensive benchmarking automation script to this PR. It is ready to merge with passing tests 👍 Check out BENCHMARKS.md for details on how to test locally.

More info:
This is now a fully automated system that runs benchmarks in parallel on any connected devices, with full checks on every model for both short-form and long-form audio. It now uses fastlane via the Makefile: running the command make benchmark-devices starts the script, which runs through a list of models and writes the results to an upload_folder for processing. We are also building a Hugging Face Space that will process and display the data in a convenient way; more on that soon.

Abhinay1997 added 5 commits April 20, 2024 21:29

daa68c8

Merge branch 'main' into wer_utils

e2e5632

Add basic Fraction type to handle Number normalization

6af9f5a

Add EnglishNumberNormalizer

87c230a

Merge branch 'main' into wer_utils

d8cda9f

Abhinay1997 added 2 commits May 4, 2024 21:55

Adds Basic Fraction type for WER

b8c30fe

Refactor + Add english normalizers

06e66e4

ZachNagengast linked an issue May 7, 2024 that may be closed by this pull request

English text normalization utilization for Eager Streaming Mode #111

Open

ZachNagengast removed a link to an issue May 7, 2024

English text normalization utilization for Eager Streaming Mode #111

Open

ZachNagengast mentioned this pull request May 7, 2024

English text normalization utilization for Eager Streaming Mode #111

Open

Abhinay1997 added 8 commits May 9, 2024 00:20

Bug fixes in number normalization. regex, multiplier processing.

3334d44

wer evaluate function + string optimization

da3a719

Add wer test on long audio

acb80ff

Remove Wagner-Fischer, fix normalization bugs.

dbbf9bf

Hirschberg's LCS Algorithm for edit operations

16a5525

Remove warnings in Fraction implementation

70456b3

Add tests

a3c94cc

Merge branch 'main' into wer_utils

b7e52fa

Abhinay1997 marked this pull request as ready for review May 28, 2024 03:22

Abhinay1997 added 7 commits May 29, 2024 07:46

Refactoring

60f8956

Refactor regression tests

89df136

Add WER to regression test results, fix overflow

ad13284

clean up files

47be844

Merge branch 'main' into wer_utils

bf46309

patch overflow for now.

6296506

Re-add file needed for tests

6a28fc1

ZachNagengast changed the title ~~English Normalisation and WER Utils~~ Regression Test Pipeline Jun 25, 2024

ZachNagengast linked an issue Jun 25, 2024 that may be closed by this pull request

Benchmark for WhisperAX & CLI #28

Closed

ZachNagengast removed a link to an issue Jun 25, 2024

Benchmark for WhisperAX & CLI #28

Closed

ZachNagengast added the enhancement Improves existing code label Jun 25, 2024

ZachNagengast and others added 3 commits July 28, 2024 15:10

Fix xcode test attachment

26bb7c6

Fix overflow when using Int.

01baf7b

Add flag to run only on first audio file of the dataset

cca6f50

atiorh requested changes Aug 5, 2024

Abhinay1997 and others added 5 commits August 6, 2024 18:33

3fceef3

PR Cleanup: Remove Fraction.swift; remove commented-out redundant code

ad4c7f5

Merge branch 'main' into wer_utils

74ad9be

Adds system memory, disk space and battery level tracking.

525657b

Remove sample JSON

83ffc3f

Merge branch 'main' into wer_utils

a8d6e27

Abhinay1997 requested a review from atiorh August 12, 2024 16:54

Abhinay1997 and others added 6 commits August 12, 2024 22:37

Fix compilation on non macOS

2f3be51

Fix battery checks for watchOS

d9bc43b

Fix imports

c99bd94

Merge branch 'main' into regression-test-automations

f2e2fac

Regression test automations

6962d0d

Co-authored-by: Arda Atahan Ibis <ardaibis@gmail.com>

Add modelSizeMB to testInfo

a74a592

ZachNagengast approved these changes Oct 25, 2024

ZachNagengast added 2 commits October 27, 2024 16:07

Cleanup for merge

039003e

Upgrade example app

442eb24

ZachNagengast merged commit 0054f3e into argmaxinc:main Oct 28, 2024
6 checks passed
