
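Source: Guangdong University of Technology, "Machine Learning" course teaching resources (lecture slides), Lecture 19: ViT and Attention Mechanism Improvements (Vision Transformers, ViTs)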


MACHINE LEARNING BERKELEY

Naive Solution (imageGPT)
Paper: "Generative Pretraining from Pixels"

● Pixels are kinda discrete: just treat each color value like a separate word in your vocabulary!
  ○ Each pixel is commonly represented by a 24-bit value (integers in the range [0, 255] for each of the 3 color channels)
  ○ Vocab size of 2^24 = 16,777,216!
● Who needs that many colors anyway?
  ○ Use a 9-bit representation (integers in the range [0, 7] for each of the 3 color channels, i.e. 3 bits per channel)
  ○ Vocab size of 2^9 = 512
● Read pixels in raster order (row by row, from left to right) to get the input sequence
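The tokenization described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide does not say how the 9-bit palette is chosen, so the sketch simply keeps the top 3 bits of each channel, which may differ from the paper's approach. The helper names `quantize_pixel` and `image_to_tokens` are hypothetical and do not come from the paper or the slides.

```python
def quantize_pixel(r, g, b):
    """Map an 8-bit-per-channel RGB pixel to a single 9-bit token id.

    Assumption: uniform truncation to the top 3 bits of each channel.
    """
    # Keep the top 3 bits of each channel: [0, 255] -> [0, 7].
    r3, g3, b3 = r >> 5, g >> 5, b >> 5
    # Pack three 3-bit values into one integer in [0, 511].
    return (r3 << 6) | (g3 << 3) | b3


def image_to_tokens(image):
    """Flatten an image (list of rows of (r, g, b) tuples) in raster order.

    Rows are read top to bottom, pixels within a row left to right.
    """
    return [quantize_pixel(*pixel) for row in image for pixel in row]


VOCAB_SIZE = 2 ** 9  # 512 possible tokens

# A tiny 2x2 example image: black, white / red, blue.
image = [
    [(0, 0, 0), (255, 255, 255)],
    [(255, 0, 0), (0, 0, 255)],
]
tokens = image_to_tokens(image)
print(tokens)  # [0, 511, 448, 7]
```

Each token is an integer in [0, 511], so the resulting sequence can be fed to a language model with a 512-entry vocabulary. The sequence length equals the number of pixels, which is what makes this approach expensive for anything beyond small images.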

