Timothy Jaeryang Baek
2025-01-31 01:34:19 -08:00
parent 61fad2681b
commit 5360cb5d50
59 changed files with 200 additions and 31 deletions

@@ -59,11 +59,11 @@ For your feedback to affect the leaderboard, you need what's called a **siblin
Here's a sneak peek at how the Arena Model interface works:
-![Arena Model Example](/img/evaluation/arena.png)
+![Arena Model Example](/images/evaluation/arena.png)
Need more depth? You can even replicate a [**Chatbot Arena**](https://lmarena.ai/)-style setup!
-![Chatbot Arena Example](/img/evaluation/arena-many.png)
+![Chatbot Arena Example](/images/evaluation/arena-many.png)
### **2. Normal Interaction**
@@ -71,11 +71,11 @@ No need to switch to “arena mode” if you don't want to. You can use Open Web
For instance, this is how you can rate during a normal interaction:
-![Normal Model Rating Interface](/img/evaluation/normal.png)
+![Normal Model Rating Interface](/images/evaluation/normal.png)
And here's an example of setting up a multi-model comparison, similar to an arena:
-![Multi-Model Comparison](/img/evaluation/normal-many.png)
+![Multi-Model Comparison](/images/evaluation/normal-many.png)
---
@@ -85,7 +85,7 @@ After rating, check out the **Leaderboard** under the Admin Panel. This is where
This is a sample leaderboard layout:
-![Leaderboard Example](/img/evaluation/leaderboard.png)
+![Leaderboard Example](/images/evaluation/leaderboard.png)
### Topic-Based Reranking
@@ -100,7 +100,7 @@ Don't skip this! Tagging is super powerful because it allows you to **re-rank mo
Here's an example of how re-ranking looks:
-![Reranking Leaderboard by Topic](/img/evaluation/leaderboard-reranked.png)
+![Reranking Leaderboard by Topic](/images/evaluation/leaderboard-reranked.png)
---