Then how can we estimate the number of users who can use the LLM at once?
LiamVDB
Mastering LLM (Large Language Model)
Aug 28, 2024
The formula is only a rough approximation: it does not take into account model architecture, number of users, KV cache, number of tokens, etc.
You can get a more precise memory requirement for a specific model from the Hugging Face model size estimator: https://huggingface.co/docs/accelerate/en/usage_guides/model_size_estimator
Unfortunately, calculating the exact requirement for a given number of users is difficult; it will likely require some benchmarking on your own infrastructure.
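
As a starting point before benchmarking, a back-of-the-envelope sketch can bound how many sequences fit in memory at once. It assumes a standard decoder-only transformer in which each token stores one key and one value vector per layer, and it treats each concurrent user as one sequence of fixed length. The function names, the `overhead_fraction` reserve, and the example model shape are illustrative assumptions, not figures from the article or from any library.

```python
def kv_cache_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """KV cache size for one token, assuming one K and one V vector per layer."""
    # Factor of 2 covers keys and values; bytes_per_elem=2 assumes fp16/bf16.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem


def max_concurrent_sequences(gpu_mem_bytes, n_params, bytes_per_param,
                             n_layers, n_kv_heads, head_dim, tokens_per_seq,
                             kv_bytes_per_elem=2, overhead_fraction=0.1):
    """Rough upper bound on sequences whose KV cache fits next to the weights."""
    weight_bytes = n_params * bytes_per_param
    # Reserve a fraction of memory for activations, fragmentation, runtime, etc.
    # The 10% default is a guess, not a measured value.
    usable = gpu_mem_bytes * (1 - overhead_fraction) - weight_bytes
    if usable <= 0:
        return 0  # the weights alone do not fit
    per_seq = kv_cache_bytes_per_token(
        n_layers, n_kv_heads, head_dim, kv_bytes_per_elem) * tokens_per_seq
    return int(usable // per_seq)


if __name__ == "__main__":
    # Hypothetical 7B-parameter model in fp16 on an 80 GB GPU,
    # with 32 layers, 32 KV heads of dimension 128, and 4096 tokens per user.
    users = max_concurrent_sequences(
        gpu_mem_bytes=80 * 10**9, n_params=7 * 10**9, bytes_per_param=2,
        n_layers=32, n_kv_heads=32, head_dim=128, tokens_per_seq=4096)
    print(f"Approximate upper bound on concurrent sequences: {users}")
```

Treat the result as an upper bound only. Real throughput also depends on batching strategy, attention implementation, and how long requests actually are, which is why benchmarking is still needed.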