centersl/lobehub

Mirror of https://github.com/lobehub/lobehub.git, synced 2026-03-26 13:19:34 +07:00

Commit Graph

Author

SHA1

Message

Date

Arvin Xu

e7598fe90b

✨ feat: support agent benchmark (#12355)

* improve total

fix page size issue

fix error message handler

fix eval home page

try to fix batch run agent step issue

fix run list

fix dataset loading

fix abort issue

improve jump and table column

fix error streaming

try to fix error output in vercel

refactor qstash workflow client

improve passK

add evals to proxy

refactor metrics

try to fix build

refactor tests

improve detail page

fix passK issue

improve eval-rubric

fix types

support passK

fix type

update

fix db insert issue

improve dataset ui

improve run config

finish step limit now

add step limited

100% coverage to models

add failed tests todo

support interruptOperation

fix lint

improve report detail

improve pass rate

improve sort order issue

fix timeout issue

Update db schema

get a complete case running end to end

update database

improve error handling

refactor to improve database

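improve the test case processing flow

improve some UX details and implementation

mostly complete the end-to-end Benchmark feature

improve run case display

fix run case numbering issue

improve eval test case page

add eval test mode

add dataset page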

update schema

support

finish create test run

fix

update

improve import exp

refactor data flow

improve import workflow

rubric Benchmark detail page

improve import ux

update schema

finish eval home page

add eval workflow endpoint

implement benchmark run model

refactor RAG eval

implement backend

update db schema

update db migration

init benchmark

* support rerun error test case

* fix tests

* fix tests

2026-02-21 20:36:40 +08:00

1 commit