The main takeaway is that this comparison is not useful when you are working with a small number of requests.
We found Mistral to be the best alternative to re-ranking models when you want better recall and the input is large in terms of tokens (in our experiment, we were able to increase recall from ~5% to ~70% compared to open-source re-ranking models).
Another use case is processing a large number of news articles, where taking some hit on accuracy is acceptable in exchange for lower cost.
Finally, Mistral is a very promising model and may act as an intermediate layer in front of the best models, such as GPT-4, to save cost.
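
One way to picture the "intermediate layer" idea is a simple cascade: send each request to the cheaper model first and escalate to the stronger model only when the cheaper one is not confident. The sketch below is illustrative only and is not what we ran in the experiment. The `cascade` function, the confidence score, the `0.8` threshold, and the `fake_cheap` / `fake_strong` stand-ins are all assumptions made for this example; in practice they would be replaced by real model clients and whatever confidence signal you trust.

```python
from typing import Callable, Tuple

# Hypothetical signatures: the cheap model returns (answer, confidence in [0, 1]),
# the strong model returns only an answer.
CheapModel = Callable[[str], Tuple[str, float]]
StrongModel = Callable[[str], str]


def cascade(prompt: str, cheap: CheapModel, strong: StrongModel,
            threshold: float = 0.8) -> Tuple[str, str]:
    """Return (answer, tier), where tier names the model that answered."""
    answer, confidence = cheap(prompt)
    if confidence >= threshold:
        # The cheap model is confident enough, so the expensive call is skipped.
        return answer, "cheap"
    # Otherwise escalate to the stronger, more expensive model.
    return strong(prompt), "strong"


# Stand-in models for illustration only; swap in real API clients.
def fake_cheap(prompt: str) -> Tuple[str, float]:
    # Pretend short prompts are easy and long prompts are uncertain.
    confidence = 0.9 if len(prompt.split()) <= 5 else 0.4
    return f"cheap answer to: {prompt}", confidence


def fake_strong(prompt: str) -> str:
    return f"strong answer to: {prompt}"


if __name__ == "__main__":
    prompts = [
        "Summarize this headline",
        "Compare the legal implications of these two long filings in detail",
    ]
    for p in prompts:
        print(cascade(p, fake_cheap, fake_strong))
```

The cost saving depends on how many requests stay on the cheap tier, so the threshold is the knob to tune against your accuracy budget.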