Alignment whack-a-mole: Finetuning activates recall of copyrighted books in LLMs

(github.com)

152 points | by reconnecting 9 hours ago

109 comments