Background
Lower inference latency improves user experience, so it is important to choose the right compute backend (CPU or GPU) for the LLM inference engine.
Objectives
To benchmark CPU vs. GPU inference of an LLM runner using Mistral 7B Q4_K_M.
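A minimal sketch of the measurement harness such a benchmark could use: time a fixed number of generated tokens and report tokens per second, once with the runner on CPU and once on GPU. The runner itself (e.g. llama.cpp loading the Mistral 7B Q4_K_M GGUF) is an assumption and is stood in for by a placeholder callable here.

```python
import time

def benchmark(generate, n_tokens=128):
    """Time a token-generation callable and return throughput in tokens/s."""
    start = time.perf_counter()
    generate(n_tokens)              # run the backend under test
    elapsed = time.perf_counter() - start
    return n_tokens / elapsed

# Placeholder backend: in the real benchmark this would invoke the
# LLM runner (CPU or GPU build) on the Mistral 7B Q4_K_M model.
def fake_generate(n_tokens):
    time.sleep(0.01)                # stands in for decode time

tps = benchmark(fake_generate)
print(f"{tps:.1f} tokens/s")
```

In practice the same harness would be run twice, once per backend, keeping prompt, token count, and sampling settings identical so the tokens/s figures are directly comparable.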
Deliverables
An article with accompanying illustrations.