Nano-vLLM: How a vLLM-style inference engine works

(neutree.ai)

162 points | by yz-yu 6 hours ago

21 comments