# LLM Ranking Challenge
We have a 1000 conversations, but we want to find the 10 most interesting ones for the user to read. We can show LLM *4* conversations at a time and LLM will rank those relative to each other.
Now we need to find an algorithm to find the top 10 conversations with as few LLM calls as possible.
Now we need to find an algorithm to find the top 10 conversations with as few LLM calls as possible.
## Leaderboard
╔════════════╤═══════╤══╗║ Name │ Score │ ║╟────────────┼───────┼──╢║ │ │ ║╟────────────┼───────┼──╢║ Loading... │ │ ║╟────────────┼───────┼──╢║ │ │ ║╚════════════╧═══════╧══╝
## How it works
In this challenge, we simulate the LLM call with an API endpoint. We make the following assumptions:
- Ground Truth Ranking: Each conversation c_i has an underlying, hidden rank rank(c_i)
- LLM's Accuracy: For some pairs (a, b) with rank(a) < rank(b), LLM is wrong and thinks that b < a.
LLM is more likely to mix up the order of conversations that are close in rank and less likely to do so for conversations that are far apart in rank. We model this using a logistic function:
P(a < b) = 1 / (1 + e^((rank(a) - rank(b))/3)where P(a < b) is the probability that LLM ranks conversation a before conversation b.
Eg. for sequential ranks r_0, ..., r_30:P(r_0 < r_1) = 58%
P(r_0 < r_2) = 66%
P(r_0 < r_3) = 73%
P(r_0 < r_5) = 84%
P(r_0 < r_10) = 97%
Note that the LLM calls for a fixed pair are deterministic.
1. Create a new game:
/new-game?n=100&name=Alice
You can choose n with 4 < n < 1000 but you will only be eligible for the leaderboard if you choose n=1000.
name will be displayed on the leaderboard.
Returns:
{ "gameId": "2b7175d8-73ce-46f9-8a93-7970faf6ca64"}2. Get a partial ranking:
/order ?ids=0,9,13,99 &gameId=2b7175d8-73ce-46f9-8a93-7970faf6ca64
This will be a noisy ordering of the conversations as described above. The elments are ordered from most interesting (lowest index) to least interesting (highest index).
Returns:
{ "sortedIds": [13, 9, 0, 99] }3. Submit top 10:
/submit ?ids=4,2,11,32,12,0,62,85,3,8 &gameId=2b7175d8-73ce-46f9-8a93-7970faf6ca64
Once you think you know the top 10, you can submit them. Note that you only have one try per game. Your submission will be considered correct if all of your 10 conversations rank within the top 15 according to the true order.
To avoid lucky submissions, you name will only appear on the leaderboard if you have 3 correct solution for n=1000 in a row.
Returns:
{ "correct": true, "score": 83, "top15": [4,2,11,32,12,0,62,85,3,8,1,5,55,33,45] }A Listen Labs challenge.