MACHINE LEARNING BERKELEY

Naive Solution (imageGPT)
● Another problem: time complexity :(
  ○ Recall: transformers are O(n^2) w.r.t. input length
  ○ AND input length is O(n^2) w.r.t. length of each side
  ○ 256 x 256 image => 65536 pixels
  ○ For reference, BERT only has a max length of 512 tokens
● Solution: just use smaller images lmao
  ○ Max size of 64 x 64
● Trained on a similar objective to language models (next pixel prediction instead of next token prediction)
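To make the scaling on this slide concrete, here is a rough back-of-the-envelope sketch. It assumes one token per pixel (color channels ignored) and counts only the pairwise attention scores for a single head in a single layer. The helper name `attention_cost` and the constant `BERT_MAX_LEN` are made up for illustration and are not part of any library.

```python
from typing import Tuple


def attention_cost(side: int) -> Tuple[int, int]:
    """Return (sequence length, pairwise attention scores) for a side x side image.

    Assumption: one token per pixel, full (dense) self-attention.
    """
    seq_len = side * side      # input length grows quadratically with the side length
    pairs = seq_len * seq_len  # attention compares every token with every token
    return seq_len, pairs


BERT_MAX_LEN = 512  # max sequence length quoted on the slide

for side in (64, 256):
    seq_len, pairs = attention_cost(side)
    print(f"{side} x {side}: {seq_len:,} tokens, {pairs:,} attention scores")

# How many BERT-length sequences fit in one flattened 256 x 256 image
print(attention_cost(256)[0] // BERT_MAX_LEN)
```

Under these assumptions, shrinking from 256 x 256 to 64 x 64 cuts the sequence length by 16x and the attention-score count by 256x, which is why the naive fix of using smaller images helps so much.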