linux - Data Extraction from a .seg file
I have a .seg file that holds data about the clusters formed after diarization of an audio file. The file has the following data:
    ;; cluster s0 [ score:fs = -32.694324625945725 ] [ score:ft = -33.32942628147711 ] [ score:ms = -32.847416329096404 ] [ score:mt = -33.45196981196905 ]
    elonn 1 0 758 f s u s0
    ;; cluster s1 [ score:fs = -33.14490351155562 ] [ score:ft = -33.420111126893076 ] [ score:ms = -32.29039025858266 ] [ score:mt = -32.85038927851203 ]
    elonn 1 758 308 m s u s1
    elonn 1 1110 700 m s u s1
    elonn 1 1887 2794 m s u s1
    elonn 1 4849 1190 m s u s1
    ;; cluster s10 [ score:fs = -34.466969784129404 ] [ score:ft = -34.951981832991414 ] [ score:ms = -34.83408030011385 ] [ score:mt = -35.17326803680231 ]
    elonn 1 6731 352 f s u s10
    ;; cluster s11 [ score:fs = -33.57333115273301 ] [ score:ft = -33.93961876513661 ] [ score:ms = -32.6529742867516 ] [ score:mt = -33.397218081762475 ]
    elonn 1 7459 2542 m s u s11
    ;; cluster s16 [ score:fs = -33.29482735979043 ] [ score:ft = -33.687616298740195 ] [ score:ms = -32.189984103971135 ] [ score:mt = -33.13899965310298 ]
    elonn 1 10001 3051 m s u s16
    elonn 1 13086 912 m s u s16
    ;; cluster s9 [ score:fs = -33.4457701986847 ] [ score:ft = -34.70059869569136 ] [ score:ms = -33.958162156208914 ] [ score:mt = -34.79598011488008 ]
    elonn 1 6039 692 f s u s9

I have to extract the start time (3rd column), the speaking duration (4th column), and the speaker name (last column).
For example, in the segment below:

    elonn 1 6039 692 f s u s9

6039 is the start time of the segment, 692 is the duration of the segment, and s9 is the speaker name.
The following shell script, which I wrote, extracts each whole segment line and stores it in a file.
    echo "Enter audio file name (file must be in .wav format)"
    read -r filename
    echo "Enter path of audio file"
    read -r path
    echo "Enter folder name"
    read -r outputfolder
    mkdir -p "$outputfolder"
    echo "Processing $filename"
    ./ilp_diarization2.sh "$path/$filename.wav" 120 "$outputfolder"
    # Keep only the segment lines; the output goes to a file literally named "cat"
    grep "$filename.*s" "$outputfolder/$filename/$filename.g.3.seg" > cat
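You can use awk, like so:

    var=$(awk '{ print $3" "$4" "$NF }' filename)

or

    awk '{ print $3" "$4" "$NF }' filename > outputfile

$number refers to the whitespace-delimited (awk's default) field you are interested in, and $NF is the last field on the line. Note that NF must be uppercase: awk variable names are case-sensitive.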
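If you run awk directly on the .seg file rather than on the grep output, the ";; cluster" header lines would be printed too. Here is a minimal sketch that skips them, assuming every header starts with ";;" as in the sample from the question. The function name extract_segments and the temporary sample file are made up for illustration.

```shell
# extract_segments FILE: print "start duration speaker" for each segment line.
# Assumption: cluster header lines always begin with ";;" and are skipped.
extract_segments() {
    awk '!/^;;/ && NF >= 4 { print $3, $4, $NF }' "$1"
}

# Hypothetical sample file built from two clusters shown in the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
;; cluster s0 [ score:fs = -32.694324625945725 ] [ score:ft = -33.32942628147711 ] [ score:ms = -32.847416329096404 ] [ score:mt = -33.45196981196905 ]
elonn 1 0 758 f s u s0
;; cluster s9 [ score:fs = -33.4457701986847 ] [ score:ft = -34.70059869569136 ] [ score:ms = -33.958162156208914 ] [ score:mt = -34.79598011488008 ]
elonn 1 6039 692 f s u s9
EOF

extract_segments "$sample"
# prints:
# 0 758 s0
# 6039 692 s9
```

The comma in `print $3, $4, $NF` joins the fields with awk's output separator, which is a single space by default, so the result is the same as concatenating " " by hand.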