Browse "Kim Jaechul Graduate School of AI(김재철AI대학원)" by Author Cheng, Xiang

Showing results 1 to 1 of 1

Linear attention is (maybe) all you need (to understand Transformer optimization)

Ahn, Kwangjun; Cheng, Xiang; Song, Minhak; Yun, Chulhee; Jadbabaie, Ali; Sra, Suvrit. 12th International Conference on Learning Representations (ICLR 2024), 2024-05-07.
