Poster

LUT-GEMM: Quantized Matrix Multiplication based on LUTs for Efficient Inference in Large-Scale Generative Language Models

Gunho Park · Baeseong Park · Minsub Kim · Sungjae Lee · Jeonghoon Kim · Beomseok Kwon · Se Jung Kwon · Byeongwook Kim · Youngjoo Lee · Dongsoo Lee
2024 Poster
