ChunkLLM: A Lightweight Pluggable Framework for Accelerating LLMs Inference

(arxiv.org)

90 points | by PaulHoule 2 days ago

8 comments