
Nah, the paper explicitly states that their system is neither recurrent nor convolutional:

> To the best of our knowledge, however, the Transformer is the first transduction model relying entirely on self-attention to compute representations of its input and output without using RNNs or convolution.
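The mechanism that quote refers to is scaled dot-product attention from the same paper, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. A minimal NumPy sketch (the weight matrices and shapes here are invented for illustration) makes the point concrete: there's no hidden state carried across time steps and no sliding kernel, just one matrix product relating every position to every other:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention: softmax(QK^T / sqrt(d_k)) V.

    Note what's absent: no loop over time steps carrying a hidden state
    (recurrence) and no sliding kernel (convolution). Every position
    attends to every other position in a single step.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # (seq, seq) pairwise scores
    # numerically stable softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                        # each output is a weighted mix of all values

# toy shapes, purely illustrative: 5 tokens, model dimension 8
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)          # shape (5, 8)
```

(The real model adds multiple heads, masking, and positional encodings, but none of those reintroduce recurrence or convolution.)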


