Actions: open-compass/opencompass

deploy

703 workflow runs

[Update] Add AIME2025 oss info (#1936)
deploy #942: Commit 709bc4a pushed by MaiziXiao
March 12, 2025 10:41 1s main
[Feature] Add support for BBEH dataset (#1925)
deploy #941: Commit bc2969d pushed by MaiziXiao
March 12, 2025 02:53 Skipped main
[Feature] Support SuperGPQA (#1924)
deploy #940: Commit 59e49ae pushed by MaiziXiao
March 11, 2025 11:32 2s main
[Fix] Fix math-verify evaluator (#1917)
deploy #939: Commit e403fd2 pushed by MaiziXiao
March 11, 2025 09:35 3s main
[Feature] Update LLM Evaluation for MMLU-Pro (#1923)
deploy #938: Commit cbf84fb pushed by tonysy
March 7, 2025 13:01 1s main
[Fix] Fix CLI option for results persistence (#1920)
deploy #937: Commit 570c30c pushed by MaiziXiao
March 7, 2025 10:24 2s main
[Fix] Fix typo in deepseed_r1.md (#1916)
deploy #936: Commit 277d794 pushed by MaiziXiao
March 5, 2025 11:37 1s main
[Feature] Evaluation Results Persistence (#1894)
deploy #935: Commit 1585c0a pushed by MaiziXiao
March 5, 2025 10:33 1s main
[Docs] Results persistance (#1908)
deploy #934: Commit 5432465 pushed by MaiziXiao
March 5, 2025 10:23 1s main
[Update] Code evaluation alignment (#1909)
deploy #933: Commit fff2d51 pushed by MaiziXiao
March 4, 2025 10:49 2s main
[Bump] Bump version to 0.4.1
deploy #932: Commit 5547fd1 pushed by MaiziXiao
March 4, 2025 10:35 33s 0.4.1
[Bump] Bump version to 0.4.1
deploy #931: Commit 5547fd1 pushed by MaiziXiao
March 4, 2025 10:26 2s main
[Feature] Add HLE (Humanity's Last Exam) dataset (#1902)
deploy #930: Commit 198c086 pushed by MaiziXiao
March 4, 2025 08:42 2s main
March 3, 2025 10:56 2s
[Update] Fix Hard Configs With General GPassK (#1906)
deploy #928: Commit f0809fe pushed by MaiziXiao
March 3, 2025 10:17 3s main
[Fix] Fix compatible issue
deploy #927: Commit 6a573f6 pushed by MaiziXiao
March 3, 2025 07:36 3s main
February 26, 2025 11:43 3s
[CI] update dailytest sceduler and baseline's score(#1898)
deploy #925: Commit 6042b88 pushed by MaiziXiao
February 26, 2025 11:04 2s main
[Feature] Add general math, llm judge evaluator (#1892)
deploy #924: Commit bdb2d46 pushed by tonysy
February 26, 2025 07:08 3s main
[Update] Support AIME-24 Evaluation for DeepSeek-R1 series (#1888)
deploy #923: Commit fd6fbf0 pushed by tonysy
February 25, 2025 12:34 3s main
[Update] Update LiveMathBench Hard Configs (#1826)
deploy #922: Commit 22a33d8 pushed by tonysy
February 25, 2025 09:24 3s main
[Update] Academic bench llm judge update (#1876)
deploy #921: Commit 465e93e pushed by MaiziXiao
February 24, 2025 07:45 3s main
[Update] Update Greedy Config & README of LiveMathBench (#1862)
deploy #920: Commit 046b6f7 pushed by MaiziXiao
February 20, 2025 11:47 2s main
[Update] OpenAI model update, bigcodebench update (#1879)
deploy #919: Commit d7daee6 pushed by tonysy
February 20, 2025 11:33 2s main
[Feature] Math Verify with model post_processor (#1881)
deploy #918: Commit 27c9166 pushed by tonysy
February 20, 2025 11:32 2s main