Rylan Cai ea329113be feat(eval): add external scoring mode (#12729)
* wip: add llm relevant & BrowseComp

* wip: add widesearch desc

* wip: dsqa, hle, widesearch

* wip: add dsqa

* wip: add awaiting eval status for runs

* wip: add awaiting status for run

* wip: adjust hle-verified

* 🐛 fix: browsecomp topics

* 📝 docs: add annotations

* wip: add awaiting status for pass@k

* wip: add complete status

* wip: update thread dots

* wip: update run status page

* wip: remove useless impl

* wip: update prompt

* feat: add external eval routes

* wip: add eval cli

* 🐛 fix: support authorization in a no-browser environment

* wip: pass tests

* ♻️ refactor: remove tests

* ♻️ refactor: no camel case
2026-03-10 09:53:26 +08:00