Aside
● There are ways to make transformers more efficient (architecture-wise)
● BUT recall: a major appeal of using transformers is that they scale well relative to compute
● Transformer architectures are supposed to be simple: self-attention is just huge matrix multiplications (sketched below)
  ○ huge matrix multiplications are good for parallelization
  ○ want to keep the architecture as simple as possible
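
As a rough illustration of the "just matrix multiplications" point, here is a minimal NumPy sketch of single-head self-attention; the shapes and names (d_model, W_q, etc.) are illustrative assumptions, not something taken from the slides or a specific library.

```python
# Minimal sketch (illustrative, not the slides' code): single-head
# self-attention written as plain matrix multiplications in NumPy.
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """X: (seq_len, d_model) token embeddings; W_*: (d_model, d_head) projections."""
    Q = X @ W_q                                   # queries  - one matmul
    K = X @ W_k                                   # keys     - one matmul
    V = X @ W_v                                   # values   - one matmul
    d_head = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_head)            # (seq_len, seq_len) matmul
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                            # weighted sum - one more matmul

# Example: 8 tokens, model width 16, head width 4 (arbitrary illustrative sizes)
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 16))
W_q, W_k, W_v = (rng.normal(size=(16, 4)) for _ in range(3))
out = self_attention(X, W_q, W_k, W_v)            # shape (8, 4)
```

Aside from the element-wise softmax, every step above is a dense matmul, which is the kind of operation GPUs/TPUs parallelize well; that is the sense in which keeping the architecture simple helps it scale with compute.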