hadoop - Compare X level directory in HDFS using shell script
I have a base directory in HDFS:

/user/a123

The HDFS directory a123 has nested subdirectories:
/user/a123/foldera/2017-08-17/xyz
/user/a123/foldera/2017-08-18/abc
/user/a123/folderb/2017-08-17
/user/a123/folderb/2017-08-19
/user/a123/folderc/2017-08-17
/user/a123/folderd/2017-08-20/def
/user/a123/folderd/2017-08-17
/user/a123/foldere/2017-08-17/xyz
Starting from the base directory, I need to select the second-level directories, i.e. the 'yyyy-mm-dd' directories, and check whether each one is older than 2 days.
If a directory is older than 2 days, print "archive" followed by the full folder name; otherwise print "existing".
This is what I have tried so far:

today=$(date +%s)

hdfs dfs -ls /user/a123/ | grep "^d" | while read -r line ; do
    dir_date=$(echo ${line} | awk '{print $6}')        # modification date column of hdfs ls output
    difference=$(( ( today - $(date -d ${dir_date} +%s) ) / ( 24*60*60 ) ))
    filepath=$(echo ${line} | awk '{print $8}')         # path column

    if [ ${difference} -gt 2 ]; then                    # older than 2 days -> archive
        echo "archive : "${filepath}
    else
        echo "existing : "${filepath}
    fi
done
Expected output:

archive : /user/a123/foldera/2017-08-17
archive : /user/a123/foldera/2017-08-18
archive : /user/a123/folderb/2017-08-17
existing : /user/a123/folderb/2017-08-19
archive : /user/a123/folderc/2017-08-17
existing : /user/a123/folderd/2017-08-20
archive : /user/a123/folderd/2017-08-17
archive : /user/a123/foldere/2017-08-17
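For reference, the following is only a minimal sketch of one way to walk the second-level directories directly, not a tested solution. It assumes GNU date (for date -d), that your Hadoop version supports glob patterns and the -d option of hdfs dfs -ls, and that the yyyy-mm-dd date is always the basename of the second-level directory:

#!/bin/bash
# Sketch: classify the yyyy-mm-dd (second-level) directories under /user/a123.
# The date is taken from the directory name itself, not from the modification time.

today=$(date +%s)

# -d lists the matched directories themselves instead of their contents
hdfs dfs -ls -d /user/a123/*/* | grep "^d" | while read -r line ; do
    filepath=$(echo "${line}" | awk '{print $8}')      # path column of hdfs ls output
    dir_date=$(basename "${filepath}")                 # e.g. 2017-08-17
    difference=$(( ( today - $(date -d "${dir_date}" +%s) ) / 86400 ))

    if [ "${difference}" -gt 2 ]; then                 # older than 2 days -> archive
        echo "archive : ${filepath}"
    else
        echo "existing : ${filepath}"
    fi
done

Taking the date from the directory name rather than from the modification-time column ($6) keeps the classification independent of when the data was actually written into the directory.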