Java - Hadoop YARN: write to local file system
I have a scenario where I process thousands of small files using Hadoop, and the output of the Hadoop job is used as input to a non-Hadoop algorithm. In the current workflow, data is read, converted into sequence files, processed, and the resulting small files are output to HDFS in the form of sequence files. However, the non-Hadoop algorithm cannot understand sequence files. Therefore, I have written another simple Hadoop job that reads the resulting data from the sequence files and creates the final small files that the non-Hadoop algorithm can use.
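To make the setup concrete, that conversion step can be written as a map-only job along these lines. This is a minimal sketch, not the asker's actual code: the Text/Text key and value types, the class name, and the argument handling are all assumptions about how the first job wrote its sequence files.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

    public class SequenceToPlain {
        // Identity mapper: passes each record through unchanged so that
        // TextOutputFormat writes it as plain text instead of a sequence file.
        public static class PassThroughMapper extends Mapper<Text, Text, Text, Text> {
            @Override
            protected void map(Text key, Text value, Context context)
                    throws IOException, InterruptedException {
                context.write(key, value);
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "sequence-to-plain");
            job.setJarByClass(SequenceToPlain.class);
            job.setMapperClass(PassThroughMapper.class);
            job.setNumReduceTasks(0);                       // map-only job
            job.setInputFormatClass(SequenceFileInputFormat.class);
            job.setOutputFormatClass(TextOutputFormat.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));   // sequence files in HDFS
            FileOutputFormat.setOutputPath(job, new Path(args[1])); // plain-text output
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }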
The catch here is that this final job has to read the sequence files from HDFS and write its output to the local file system of each node, so the files can be processed by the non-Hadoop algorithm there. I have tried setting the output path to file:///<local-fs-path> and using the Hadoop LocalFileSystem class. However, doing so outputs the final results to the namenode's local file system only.
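For reference, the LocalFileSystem attempt amounts to something like the sketch below (class name and path are hypothetical). The key point is that writes through LocalFileSystem land only on whichever machine happens to execute the code, which is consistent with the behavior described above.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class LocalFsAttempt {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // FileSystem.getLocal returns a LocalFileSystem instance, but any
            // writes through it affect only the machine running this code.
            FileSystem localFs = FileSystem.getLocal(conf);
            localFs.mkdirs(new Path("/data/final-output")); // hypothetical path
        }
    }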
Just to complete the picture, I have a 10-node Hadoop setup with YARN. Is there a way in Hadoop YARN mode to read data from HDFS and write the results to the local file system of each processing node?

Thanks
Not really. While you can write to LocalFileSystem, you can't ask YARN to run your application on all nodes. Also, depending on how the cluster is configured, YARN's node managers might not be running on all nodes of the system.
A possible workaround is to keep the converted files in HDFS and have the non-Hadoop process first call hdfs dfs -copyToLocal.
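If the non-Hadoop process is launched from Java, the same copy can be done programmatically with the FileSystem API instead of shelling out. This is a sketch under that assumption; both paths are hypothetical placeholders.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class PullOutputToLocal {
        public static void main(String[] args) throws Exception {
            FileSystem hdfs = FileSystem.get(new Configuration());
            // Copy the job's output directory out of HDFS onto this node's
            // local disk, then hand the local copy to the non-Hadoop algorithm.
            hdfs.copyToLocalFile(new Path("/user/me/final-output"),
                                 new Path("/tmp/final-output"));
        }
    }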