Idea in 2-3 sentences
Currently, most companies use a single model provider and rarely switch, because changing the provider means changing a fair amount of code. Often you don't need the best model for every request, and some models perform better at one task (e.g. math knowledge) while others are better at other tasks (e.g. coding). So what if you could route each incoming request to the best-fitting LLM for an optimal response?
The current problems are:
- Lack of Overview of Newly Published LLM Models: new LLM models are published on a daily basis; it's difficult to stay up to date on which model is currently best for a given task (e.g. math/coding/general knowledge) ⇒ many use sub-optimal/old models (me included!)
- Lack of Unified APIs for Different LLM Models: Changing the model provider (e.g. from OpenAI to Anthropic) means changing a lot of code in the code base ⇒ many developers do not change providers even though they know they are using a sub-optimal model. Working with multiple models also means handling multiple API keys and code files.
- Lack of Flexibility to Switch Between Model Providers: Using the same model for each request is certainly sub-optimal. Each request should be routed to the best fitting model - and each company might have different criteria for this:
- Latency of the Model (e.g. OpenAI had big latency issues with GPT-4 in Europe in the beginning, sometimes resulting in time-outs). Sometimes you need low latency and are willing to trade off quality ⇒ an LLM router should handle this.
- Quality of Model depending on request: A coding request should be routed to the model performing best on code, while a general request might be routed to a different model.
- Cost of Request: Not every request needs the newest, most expensive model. Basic requests can be routed to older/smaller models.
Why are/were we excited about it
- We are long on the LLM market ⇒ growing market
- product with international focus
- solves a real pain point one of us experienced (Charlotte)
- potential research focus: developing proper routers
End user
- Open-source first; hosted enterprise version later
- Any company working with LLMs (for internal workflows, in their external products) could be a potential customer
Why we killed it
- We wanted to build an open-source router as a competitor to the closed-source Martian and Unify. LMSYS Org published 5 open-source routers in the week we worked on routing.
- Usually, competition should not hold us back, but:
- LMSYS Org trained their RouteLLM model on user data (in an Elo-score manner) from the Chatbot Arena → very powerful, and attaining this amount of user data would take us very long
- Questionable how to monetize/build a big company out of this ⇒ is it a product or a feature?