python - How to enable a Dataset pipeline to have distributed reading and consuming
It is easy to use 2 threads, where one keeps feeding a data queue and the other consumes from the data queue and performs the computation. Since TensorFlow recommends the Dataset input pipeline after 1.2.0, I would like to use Dataset and its iterator to accomplish the task above, namely:

- there are 2 processes, one feeds and the other consumes;
- the pipeline suspends when it is either full or empty, and it stops when the computation finishes at the consuming end.
P.S. In the Threading and Queues tutorial, why does TensorFlow use threads instead of processes?

Thanks in advance.
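For context, the behaviour described above, a bounded queue where the feeder blocks when the queue is full and the consumer blocks when it is empty, is the classic producer/consumer pattern. A minimal pure-Python sketch using only the standard library (function names and the sentinel convention are illustrative, not part of any TensorFlow API):

```python
import queue
import threading

def feed(q, n_items):
    # Producer: q.put blocks automatically when the bounded queue is full.
    for i in range(n_items):
        q.put(i)
    q.put(None)  # sentinel value signalling the consumer to stop

def consume(q, results):
    # Consumer: q.get blocks automatically when the queue is empty.
    while True:
        item = q.get()
        if item is None:
            break
        results.append(item * item)  # stand-in for the real computation

q = queue.Queue(maxsize=4)  # bounded queue -> producer suspends when full
results = []
producer = threading.Thread(target=feed, args=(q, 10))
consumer = threading.Thread(target=consume, args=(q, results))
producer.start()
consumer.start()
producer.join()
consumer.join()
print(results)  # squares of 0..9
```

Replacing `threading.Thread` with `multiprocessing.Process` (and `queue.Queue` with `multiprocessing.Queue`) gives the two-process variant, at the cost of serializing items across process boundaries.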
Distributed tf.contrib.data pipelines are not yet supported as of TensorFlow 1.3. We are working on support for splitting datasets across devices and/or processes, but that support is not yet ready.

In the meantime, the easiest way to achieve your goal is to use a tf.FIFOQueue. You can define a Dataset that reads from the queue as follows:
```python
q = tf.FIFOQueue(...)

# Define a dummy dataset that contains the same value repeated indefinitely.
dummy = tf.contrib.data.Dataset.from_tensors(0).repeat(None)

# Map each (ignored) dummy element to a dequeue from the queue.
dataset_from_queue = dummy.map(lambda _: q.dequeue())
```
You can then compose other Dataset transformations with dataset_from_queue.