You don’t need hosted LLMs... do you?

September 8, 2023

If you read only one thing about AI/LLMs this week, make it this: You don’t need hosted LLMs…do you? by Sergei Savvov


• The choice between a self-hosted LLM and a cloud LLM is probably the biggest decision teams make when standing up an LLM-powered app. Both approaches involve real tradeoffs.
• Cost estimates depend heavily on usage volume and quality requirements, and for self-hosting also on the hardware and support staff needed to operate the model. Cloud LLMs will, on average, deliver better results at lower cost for low- to medium-volume use cases.
• Hosting your own LLM has equally real benefits: privacy, reliability, code control, and more. Cloud LLMs take less effort to implement (usually just an API integration), so prototyping against the cloud usually makes sense as a first step, if you can do so safely.
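The usage-volume tradeoff above is ultimately break-even arithmetic: hosted API costs scale with tokens, while self-hosting is roughly a fixed monthly spend. Here is a minimal sketch of that comparison; all prices are hypothetical placeholders, not quotes from any vendor.

```python
# Rough break-even sketch: hosted API vs. self-hosted GPU instance.
# All numbers below are hypothetical assumptions for illustration.

HOSTED_COST_PER_1K_TOKENS = 0.002   # assumed blended API price (USD)
SELF_HOSTED_MONTHLY_FIXED = 1200.0  # assumed GPU instance + ops overhead (USD/month)


def monthly_cost_hosted(tokens_per_month: int) -> float:
    """Hosted cost scales linearly with usage."""
    return tokens_per_month / 1000 * HOSTED_COST_PER_1K_TOKENS


def monthly_cost_self_hosted(tokens_per_month: int) -> float:
    """Self-hosted cost is roughly flat until you outgrow one instance."""
    return SELF_HOSTED_MONTHLY_FIXED


def cheaper_option(tokens_per_month: int) -> str:
    """Name the cheaper option at a given monthly volume."""
    hosted = monthly_cost_hosted(tokens_per_month)
    return "hosted" if hosted <= monthly_cost_self_hosted(tokens_per_month) else "self-hosted"


# Low volume favors the hosted API; high volume favors self-hosting.
print(cheaper_option(10_000_000))     # 10M tokens/month → "hosted"
print(cheaper_option(1_000_000_000))  # 1B tokens/month → "self-hosted"
```

Under these assumed prices the crossover sits around 600M tokens per month; plugging in your own vendor quotes and ops costs moves that line, but the shape of the comparison stays the same.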

My take:

Let’s look at the data!

• I chose this piece because we at Stride have some real-world context that makes the picture richer. The article does a great job of enumerating the high-level tradeoffs between self-hosted and cloud LLMs, and in practice we’ve learned a lot about the details, especially around costs.
• The cost of hosting a large model like Llama 2 70B can be prohibitive, but smaller models typically aren’t: we were able to run Llama 2 7B just fine on a g5.2xlarge on AWS. For prototyping, and for many use cases that don’t demand high creativity, this setup works and lets you build in a secure local environment. Benchmarking against ChatGPT can indeed make local models look bad, but it’s all relative: small models can prove even complicated concepts, and can handle large tasks when those tasks are properly broken down.
• If you’re looking for a great but secure LLM playground without rolling your own, look at Azure. It’s the only place you can access GPT-4 without going directly to OpenAI, and that’s a huge near-term advantage for Microsoft: GPT-4 remains best-of-breed by a fair margin, though Google and others are working to narrow that gap.
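For concreteness, the self-hosted path described above can be sketched with Hugging Face transformers. This is a minimal sketch, not our production setup: it assumes you have accepted Meta's license for the gated `meta-llama/Llama-2-7b-chat-hf` weights, and that an A10G-class GPU (like the one in a g5.2xlarge) is available for fp16 inference.

```python
# Minimal sketch of self-hosted Llama 2 7B chat inference.
# Assumes access to the gated meta-llama weights on Hugging Face
# and a GPU with ~24 GB of memory for fp16 (e.g. AWS g5.2xlarge).

def format_llama2_chat(system: str, user: str) -> str:
    """Build a single-turn prompt in the Llama 2 chat template."""
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"


def generate(user_prompt: str) -> str:
    # Imports kept inside the function so the formatting helper above
    # stays usable without a GPU or the downloaded weights.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "meta-llama/Llama-2-7b-chat-hf"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )
    prompt = format_llama2_chat("You are a concise assistant.", user_prompt)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Breaking a large task into several small prompts like this, rather than one giant one, is also how the "properly broken down" point above plays out in practice with a 7B model.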


Final plug in case you missed it: Debbie Madden and I discussed more of our learnings from working with LLMs in a webinar earlier this week, link is here. Check it out!

Link to original post by Dan Mason on LinkedIn