TensorFlow: scalar summaries not showing up with distributed ML Engine custom tier
I ran an ML Engine model on the cloud with `--scale-tier BASIC_GPU`, and all of my scalar/histogram summaries were displayed in TensorBoard. I then decided to switch to a custom tier with the following configuration:
    trainingInput:
      scaleTier: CUSTOM
      masterType: complex_model_s
      workerType: complex_model_m_gpu
      parameterServerType: large_model
      workerCount: 1
      parameterServerCount: 1
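For reference, this kind of `trainingInput` block is supplied in a YAML file passed to the job-submission command. A sketch of how I submit the job (the job name, paths, and region here are placeholders, not the actual values from my setup):

```shell
# Submit the training job with the custom-tier config above saved as config.yaml.
# my_job, trainer.task, trainer/, and us-central1 are illustrative placeholders.
gcloud ml-engine jobs submit training my_job \
    --config config.yaml \
    --module-name trainer.task \
    --package-path trainer/ \
    --region us-central1 \
    --job-dir gs://my-bucket/my_job
```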
When looking at the logs, the model training and evaluation are being performed and the `eval_metric_ops` are being output, but the summary scalars/histograms are not being output to TensorBoard anymore. In fact, even the loss is not showing updates on TensorBoard (even though I can see it decreasing in the Stackdriver logs). I am running `tf.estimator.Estimator` with a custom `model_fn` defined, and I am assuming I have not set up my `model_fn` to handle distributed training. However, it still seems peculiar that the logs show training occurring while TensorBoard behaves as if nothing is occurring. Should summaries be explicitly handled by the master, or do I need to ensure they are written to the right place?
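One thing I have been looking at: in a distributed run, ML Engine sets a `TF_CONFIG` environment variable on every replica, and only the master/chief replica is supposed to write summaries and checkpoints. A minimal sketch of inspecting it (the `is_chief` helper is my own illustrative name, not an ML Engine API) to check which replica the code is running on:

```python
import json
import os

def is_chief():
    """Return True if this replica is the master/chief.

    ML Engine sets TF_CONFIG on each replica of a distributed job;
    the master ("chief" in newer TensorFlow versions) is the replica
    responsible for writing summaries and checkpoints. When TF_CONFIG
    is absent (single-machine training), we treat the process as chief.
    """
    tf_config = json.loads(os.environ.get("TF_CONFIG", "{}"))
    task_type = tf_config.get("task", {}).get("type", "master")
    return task_type in ("master", "chief")

# Example: a TF_CONFIG roughly as ML Engine would set it on a worker
# replica of the custom-tier job above (hosts are placeholders).
os.environ["TF_CONFIG"] = json.dumps({
    "cluster": {
        "master": ["master-0:2222"],
        "worker": ["worker-0:2222"],
        "ps": ["ps-0:2222"],
    },
    "task": {"type": "worker", "index": 0},
})
print(is_chief())  # → False: this replica is a worker, not the master
```

So if summaries were somehow being emitted only on a worker replica, I would not expect them to show up in the master's event files, which might explain the empty TensorBoard.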