Why can't transformers learn multiplication?

(arxiv.org)

147 points | by PaulHoule 4 days ago

89 comments