In Chainer, how to write a BPTT updater using multiple GPUs?
I can't find an example, because the existing example extends training.StandardUpdater and uses only one GPU.
I assume you are talking about the BPTTUpdater of the PTB example of Chainer.
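For reference, that updater overrides StandardUpdater.update_core roughly as follows (paraphrased from memory of the PTB example, so details may differ between Chainer versions):

```python
import chainer
from chainer import training


class BPTTUpdater(training.StandardUpdater):
    """Single-GPU truncated-BPTT updater from the PTB example (paraphrased)."""

    def __init__(self, train_iter, optimizer, bprop_len, device):
        super(BPTTUpdater, self).__init__(train_iter, optimizer, device=device)
        self.bprop_len = bprop_len

    def update_core(self):
        loss = 0
        train_iter = self.get_iterator('main')
        optimizer = self.get_optimizer('main')

        # Advance the iterator for bprop_len time steps and accumulate the loss.
        for _ in range(self.bprop_len):
            batch = train_iter.__next__()
            x, t = self.converter(batch, self.device)
            loss += optimizer.target(chainer.Variable(x), chainer.Variable(t))

        optimizer.target.cleargrads()   # clear the accumulated gradients
        loss.backward()                 # backprop through bprop_len steps
        loss.unchain_backward()         # truncate the graph (the "T" in BPTT)
        optimizer.update()              # update the parameters
```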
It is not straightforward to make a customized updater support learning on multiple GPUs. MultiprocessParallelUpdater hard-codes the way the gradient is computed (only the target link implementation is customizable), so you have to copy the overall implementation of MultiprocessParallelUpdater and modify the gradient computation parts. In other words, you have to copy and edit chainer/training/updaters/multiprocess_parallel_updater.py.
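For instance, after copying that file into your own project, the training script would wire up the edited copy much like the stock MultiprocessParallelUpdater. This is only a sketch: the module name bptt_multiprocess_parallel_updater and the class name BPTTMultiprocessParallelUpdater are hypothetical placeholders for your edited copy, and `train_iters` (one iterator per GPU, see the iterator note at the end) and `optimizer` are assumed to be set up as in the PTB example.

```python
from chainer import training

# Hypothetical: your edited copy of multiprocess_parallel_updater.py, with the
# gradient computation changed to do truncated BPTT as described below.
from bptt_multiprocess_parallel_updater import BPTTMultiprocessParallelUpdater

# `train_iters` holds one iterator per GPU; `optimizer` is already set up on
# the RNN language model, as in the single-GPU PTB example.
updater = BPTTMultiprocessParallelUpdater(
    train_iters, optimizer, devices={'main': 0, 'second': 1})
trainer = training.Trainer(updater, (39, 'epoch'), out='result')
```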
There are two parts in this file that compute gradients: one in _Worker.run, which represents the worker process's task, and the other in MultiprocessParallelUpdater.update_core, which represents the master process's task. You have to make this code do BPTT by modifying the code starting from _calc_loss down to backward in each of these two parts:
```python
# Change self._master to self.model for the code in _Worker.run
loss = _calc_loss(self._master, batch)
self._master.cleargrads()
loss.backward()
```
It should be modified by inserting the code of BPTTUpdater.update_core, for example as in the sketch below.
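A minimal sketch of what the modified section could look like on the master side, assuming you add a `self._bprop_len` attribute to the copied updater yourself (in _Worker.run, the same loop would use self.model, self.iterator, and self.device instead). The attribute and helper names follow the copied file, so adjust them to whatever your Chainer version actually contains:

```python
# Inside the copied MultiprocessParallelUpdater.update_core, replacing the
# original _calc_loss ... backward() lines.  `self._bprop_len` is an attribute
# you would add yourself when copying the class.
loss = 0
for _ in range(self._bprop_len):
    batch = self.get_iterator('main').next()
    in_arrays = self.converter(batch, self._devices[0])
    loss += _calc_loss(self._master, in_arrays)   # accumulate loss over time steps

self._master.cleargrads()
loss.backward()
loss.unchain_backward()  # truncate the graph, as BPTTUpdater.update_core does
```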
You also have to take care of the data iterators. MultiprocessParallelUpdater accepts a set of iterators that are distributed to the master/worker processes. Since the PTB example uses a customized iterator (ParallelSequentialIterator), you have to make sure that these iterators iterate over different portions of the dataset, or use different initial offsets of word positions. This may require customizing ParallelSequentialIterator as well, for example as sketched below.
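One possible sketch: the `offset` argument below is not part of the example's ParallelSequentialIterator and would have to be added by you, and the `self.offsets` attribute follows the example's implementation, which precomputes one starting word position per batch lane. `train` and `batch_size` are assumed to come from the PTB example script.

```python
class OffsetParallelSequentialIterator(ParallelSequentialIterator):
    """ParallelSequentialIterator variant whose streams start `offset` words later."""

    def __init__(self, dataset, batch_size, offset=0, repeat=True):
        super(OffsetParallelSequentialIterator, self).__init__(
            dataset, batch_size, repeat=repeat)
        # Shift every lane's starting word position so that each GPU reads a
        # different portion of the corpus.
        self.offsets = [(o + offset) % len(dataset) for o in self.offsets]


n_gpus = 2
train_iters = [
    OffsetParallelSequentialIterator(
        train, batch_size, offset=i * len(train) // (batch_size * n_gpus))
    for i in range(n_gpus)
]
```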