python - How to give a Dataset pipeline distributed reading and consuming
It is easy to use two threads, where one keeps feeding a data queue and the other consumes the data queue and performs the computation. Since TensorFlow recommends the Dataset input pipeline as of 1.2.0, I would like to use a Dataset and an Iterator to accomplish the task above, namely:

- there are two processes, one feeds and the other consumes;
- the pipeline suspends whenever it is either full or empty, and stops when the computation at the consuming end finishes.

P.S. Why does TensorFlow use threads instead of processes in its tutorial on threading and queues?

Thanks in advance.
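For reference, the two-thread feed/consume pattern described in the question can be sketched with Python's standard library alone. This is a generic illustration, not TensorFlow-specific; the names producer, consumer, and the sentinel value are assumptions made for the example. A bounded queue.Queue gives exactly the suspend-when-full/suspend-when-empty behaviour:

```python
import threading
import queue

# A bounded queue: put() blocks when the queue is full and get() blocks
# when it is empty, so each thread suspends automatically.
q = queue.Queue(maxsize=4)
SENTINEL = None  # hypothetical marker telling the consumer to stop

results = []

def producer():
    """Keeps feeding the data queue."""
    for i in range(10):
        q.put(i)      # blocks if the queue is full
    q.put(SENTINEL)   # signal that feeding has finished

def consumer():
    """Consumes the data queue and performs the computation."""
    while True:
        item = q.get()  # blocks if the queue is empty
        if item is SENTINEL:
            break
        results.append(item * item)  # stand-in for the real computation

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()

print(results)
```

The same idea underlies TensorFlow's queue runners: the framework-level queue blocks producers and consumers instead of the Python-level one.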
Distributed tf.contrib.data pipelines are not yet supported as of TensorFlow 1.3. We are working on support for splitting datasets across devices and/or processes, but that support is not yet ready.

In the meantime, the easiest way to achieve your goal is to use a tf.FIFOQueue. You can define a Dataset that reads from a queue as follows:

    q = tf.FIFOQueue(...)

    # Define a dummy dataset that contains the same value repeated indefinitely.
    dummy = tf.contrib.data.Dataset.from_tensors(0).repeat(None)
    dataset_from_queue = dummy.map(lambda _: q.dequeue())

You can compose other dataset transformations with dataset_from_queue.