MACHINE LEARNING BERKELEY

Naive Solution (imageGPT)
Paper: "Generative Pretraining from Pixels"
● Pixels are kinda discrete: just treat each color value like a separate word in your vocabulary!
  ○ Each pixel is commonly represented by a 24 bit value (integers in the range [0, 255] for each of the 3 color channels)
  ○ Vocab size of 2^24 = 16,777,216!
● Who needs that many colors anyway?
  ○ Use a 9 bit representation (integers in the range [0, 7] for each of the 3 color channels)
  ○ Vocab size of 2^9 = 512
● Read pixels in raster order (row by row, from left to right) to get the input sequence
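The recipe on this slide can be sketched in a few lines of Python. This is only an illustration, not the paper's code: the helper names `pixel_to_token` and `image_to_sequence` are hypothetical, and two details are assumptions the slide does not spell out. First, each 8-bit channel is reduced to 3 bits by keeping its top 3 bits (values 0 through 7). Second, the three 3-bit values are packed in R, G, B order into a single token id.

```python
def pixel_to_token(r, g, b, bits_per_channel=3):
    """Map one 8-bit RGB pixel to a single token id.

    Assumption: quantize by keeping the top `bits_per_channel` bits of
    each channel, then pack the channels as R, G, B (most to least significant).
    """
    shift = 8 - bits_per_channel
    rq, gq, bq = r >> shift, g >> shift, b >> shift  # each in [0, 7] for 3 bits
    return (rq << (2 * bits_per_channel)) | (gq << bits_per_channel) | bq


def image_to_sequence(image, bits_per_channel=3):
    """Flatten an image (list of rows of (r, g, b) tuples) in raster order:
    row by row, left to right within each row."""
    return [
        pixel_to_token(r, g, b, bits_per_channel)
        for row in image
        for (r, g, b) in row
    ]


full_vocab = 2 ** 24     # one token per 24-bit color: 16,777,216
reduced_vocab = 2 ** 9   # one token per 9-bit color: 512

# A tiny 2x2 image: black, white / red, blue
image = [
    [(0, 0, 0), (255, 255, 255)],
    [(255, 0, 0), (0, 0, 255)],
]
tokens = image_to_sequence(image)
print(tokens)  # [0, 511, 448, 7]
```

Every token falls in the range 0 to 511, so the resulting sequence can be fed to a language model with a 512-entry vocabulary, one token per pixel.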