2024 International Workshop on Efficient Generative AI


24-25 May 2024, Edinburgh, UK
Contact Us: Luo Mai | Edoardo Ponti
Please contact the organizers for more details.

Register for Workshop | Give a talk and/or present poster of your work


The 2024 International Workshop on Efficient GenAI aims to bring together researchers and practitioners in generative AI, focusing on Large Language Models (LLMs) and Large Multi-Modal Models (LMMs). The objective is to foster collaboration and share insights to improve the training and deployment efficiency of these models. This initiative is expected to drive innovation across the entire system stack, encompassing models, algorithms, software, and hardware. The workshop is funded by the recently launched Edinburgh Generative AI Lab (GAIL).

We are pleased to announce confirmed invited speakers from the University of Edinburgh, UC Berkeley, MIT, Imperial College London, Mila - Quebec AI Institute, Graphcore, Cohere AI, Tenstorrent, Amazon AI Lab, Malted AI, and EPCC (the UK's national supercomputing centre). A tentative schedule has been released.


Venue

The workshop will be held at the Informatics Forum, University of Edinburgh, in room G.07.


Format and Participation

The workshop will include keynote speeches, panel discussions, invited talks, and networking sessions.

Sign Up

Please reserve your place for the workshop by clicking the “Register for Workshop” button at the top of the page.

Lightning Talks and Posters

PhD students and researchers are invited to give lightning talks and present posters. To apply, click the “Give a talk and/or present poster of your work” button at the top of the page.


Topics

The workshop covers the following topics:

Efficient Model Architectures

Exploring innovations in language and multi-modal models, such as nanoT5, retrieval-based LLMs, and multi-modal integration techniques exemplified by LLaVA and GPT-4V.

Training and Fine-tuning Methods

Delving into efficient strategies for LLM/LMM training, including parameter-efficient fine-tuning and resource-efficient distributed training approaches.

Inference Methods

Addressing efficient inference methodologies, such as speculative decoding, along with compression and quantization techniques.

Dataset Management and Retrieval-Based Methods

Focusing on the challenges of efficiently managing datasets for LLMs and LMMs and enhancing the effectiveness of large-scale, intelligent retrieval.

Efficient System Software

Evaluating software solutions pivotal in training and deploying LLMs, including vLLM, PagedAttention, DeepSpeed, Megatron-LM, and AI compilers.

New Hardware Adoption

Discussing the deployment of LLMs on emerging hardware platforms such as NVIDIA Grace Hopper, Graphcore IPUs, Cerebras systems, and Tenstorrent chips.

Interdisciplinary Efficiency Perspectives

Encouraging dialogues at the intersection of LLM efficiency and sectors such as energy, environment, and social studies, highlighting shared challenges and solutions.



Organization Chairs

Luo Mai
University of Edinburgh

Edoardo Ponti
University of Edinburgh

Organization Committee Members

Pasquale Minervini
University of Edinburgh

Antreas Antoniou
University of Edinburgh

Xuan Sun
University of Edinburgh

Shangmin Guo
University of Edinburgh

Yao Fu
University of Edinburgh

Yeqi Huang
University of Edinburgh

GAIL Interim Director

Mirella Lapata
University of Edinburgh