Hi,
I'm running Task 1 zero-shot evaluation on an NVIDIA A100 (80 GB VRAM) using the default batch_size=1024 in gena_lm.py and encountering CUDA OOM errors. Could you please clarify what batch sizes you used for this task? If 1024 was used, how did you avoid running into memory issues on similar hardware?
Thanks for your help!
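
For context, as a temporary workaround I've been experimenting with halving the batch size whenever an out-of-memory error occurs, along the lines of the sketch below. This is just a generic retry loop, not code from `gena_lm.py`; `run_eval` is a hypothetical stand-in for whatever function runs one evaluation pass at a given batch size (in PyTorch one would catch `torch.cuda.OutOfMemoryError` and call `torch.cuda.empty_cache()` between retries, and wrap the pass in `torch.no_grad()` since no gradients are needed for zero-shot evaluation):

```python
def run_with_adaptive_batch(run_eval, batch_size, min_batch=1):
    """Retry an evaluation pass, halving the batch size on each OOM.

    run_eval: callable taking a batch size and returning the eval result.
    Returns (result, batch_size_that_succeeded).
    """
    while batch_size >= min_batch:
        try:
            return run_eval(batch_size), batch_size
        except MemoryError:  # with PyTorch: torch.cuda.OutOfMemoryError
            # Halve and retry; optionally free cached blocks here
            # (torch.cuda.empty_cache()) before the next attempt.
            batch_size //= 2
    raise MemoryError("OOM even at the minimum batch size")
```

This keeps the evaluation running, but it obviously changes the effective batch size, which is why I'd like to know what value was actually used for the reported results.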