Speeding up the data loader
https://stackoverflow.com/questions/9619199/best-way-to-preserve-numpy-arrays-on-disk
"Best way to preserve numpy arrays on disk" (stackoverflow.com): the asker wants to save large numpy arrays to disk in a binary format and read them back into memory quickly; cPickle is not fast enough.
https://discuss.pytorch.org/t/how-to-speed-up-the-data-loader/13740
"How to speed up the data loader" (discuss.pytorch.org): training resnet18 with torch.utils.data.DataLoader (8 workers) on Ubuntu 16.04, 3× Titan Xp, 1 TB SSD; the log shows data loading dominating each batch, e.g. Time 5.149 (5.149) Data 5.056 (5.056).
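When the Data time in a log like that is nearly equal to the total batch time, the loader itself is the bottleneck. A minimal sketch of the usual DataLoader knobs discussed in that thread (num_workers, pin_memory, persistent_workers); the dataset here is a random stand-in:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in dataset: 1,000 fake "images" of shape (3, 32, 32) with integer labels.
dataset = TensorDataset(
    torch.randn(1000, 3, 32, 32),
    torch.randint(0, 10, (1000,)),
)

loader = DataLoader(
    dataset,
    batch_size=64,
    shuffle=True,
    num_workers=2,            # worker processes that load/decode in parallel
    pin_memory=True,          # page-locked host memory -> faster GPU transfers
    persistent_workers=True,  # keep workers alive across epochs (PyTorch >= 1.7)
)

for images, labels in loader:
    pass  # training step would go here
```

Increasing num_workers helps until the workers saturate disk or CPU; it is worth profiling a few values rather than assuming more is better.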
For saving numpy arrays, npy/npz or HDF5 are the best on-disk formats.
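A minimal sketch of round-tripping arrays through the .npy/.npz formats mentioned above; the file names are placeholders:

```python
import os
import tempfile
import numpy as np

arr = np.random.rand(1000, 128).astype(np.float32)
tmpdir = tempfile.mkdtemp()

# .npy: one array per file, raw binary plus a small header -> fast save/load.
npy_path = os.path.join(tmpdir, "features.npy")
np.save(npy_path, arr)
loaded = np.load(npy_path)
# For huge files, np.load(npy_path, mmap_mode="r") memory-maps the array
# instead of reading it all into RAM up front.

# .npz: several named arrays in one archive (compressed here).
npz_path = os.path.join(tmpdir, "dataset.npz")
np.savez_compressed(npz_path, x=arr, y=np.arange(1000))
with np.load(npz_path) as data:
    x_back, y_back = data["x"], data["y"]
```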
NVIDIA DALI is reportedly useful for data augmentation; what still needs checking is whether it helps even when no augmentation is applied, and whether it stays fast when image data is mixed with other kinds of data.
A speed and size comparison of the storage formats can be found on the GitHub page below!
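In the same spirit, a small self-contained harness that measures save time, load time, and file size for pickle vs. .npy (the exact numbers will vary by machine and disk):

```python
import os
import pickle
import tempfile
import time
import numpy as np

arr = np.random.rand(2000, 1000)  # ~16 MB of float64
tmpdir = tempfile.mkdtemp()
results = {}

def bench(name, save, load):
    """Time one save and one load, record the file size, return the loaded array."""
    path = os.path.join(tmpdir, name)
    t0 = time.perf_counter(); save(path); t_save = time.perf_counter() - t0
    t0 = time.perf_counter(); back = load(path); t_load = time.perf_counter() - t0
    results[name] = (t_save, t_load, os.path.getsize(path))
    return back

def save_pickle(path):
    with open(path, "wb") as f:
        pickle.dump(arr, f, protocol=pickle.HIGHEST_PROTOCOL)

def load_pickle(path):
    with open(path, "rb") as f:
        return pickle.load(f)

back_pkl = bench("arr.pkl", save_pickle, load_pickle)
back_npy = bench("arr.npy", lambda p: np.save(p, arr), np.load)

for name, (t_save, t_load, size) in results.items():
    print(f"{name}: save {t_save:.4f}s  load {t_load:.4f}s  {size / 1e6:.1f} MB")
```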