Show HN: Duplicate 3 layers in a 24B LLM, logical deduction .22→.76. No training

(github.com)

234 points | by xlayn a day ago

73 comments