unicode - Regex to delete emojis from string


I have a list of Unicode emojis and I want to strip the emojis from them (i.e. I just want the whole first part and the name at the end of the row). Sample rows are these ones:

1f468 1f3fd 200d 2695 fe0f   ; fully-qualified # 👨🏽‍⚕️ man health worker: medium skin tone
1f469 1f3ff 200d 2695        ; non-fully-qualified # 👩🏿‍⚕ woman health worker: dark skin tone

(From the original list; I have deleted some spaces for the sake of simplicity.) I want to match the [non-]fully-qualified part, the # and the emoji, so that I can delete them with sed. I have tried the following regex:

 sed -e 's/\<[on-]*fully-qualified\># *.+?(?=[a-zA-Z]) //g'

which tries to match the words [non-]fully-qualified, a space, the # symbol, and then whatever it can find (non-greedy) until the first letter, and replace all of that with an empty string.

I would like to have this output:

1f468 1f3fd 200d 2695 fe0f   ; man health worker: medium skin tone
1f469 1f3ff 200d 2695        ; woman health worker: dark skin tone

I have tried several posted answers to no avail and, besides, I'm trying to match a pattern between two boundaries, which is where I'm having trouble.

Edit: I'm trying to run this command in Git Bash, as shipped with Git for Windows.

I'm still not completely sure, but this might work:

sed 's/;.*fully-qualified\s*#[^a-zA-Z]*/; /'

This matches a semicolon ;, followed by any characters .*, followed by the "fully-qualified" text, followed by any number of spaces \s*, followed by a hash sign #, followed by any run of characters that are not in a-zA-Z, [^a-zA-Z]*, and replaces the whole match with a semicolon followed by a space.
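A quick way to check the substitution is to feed it one of the sample rows on standard input. This is only a sketch: the strip_emoji function name is made up for illustration, and [[:space:]] stands in for \s, which is a GNU sed extension and may not exist in other sed implementations.

```shell
# strip_emoji is a hypothetical helper name, not part of sed.
# It reads rows on stdin and replaces everything from the semicolon
# up to (but not including) the first ASCII letter of the name.
strip_emoji() {
  sed 's/;.*fully-qualified[[:space:]]*#[^a-zA-Z]*/; /'
}

printf '%s\n' '1f468 1f3fd 200d 2695 fe0f   ; fully-qualified # 👨🏽‍⚕️ man health worker: medium skin tone' | strip_emoji
# prints: 1f468 1f3fd 200d 2695 fe0f   ; man health worker: medium skin tone
```

Because .* is greedy and sits before "fully-qualified", the same expression also swallows the "non-" prefix on the non-fully-qualified rows.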

To be sure that [a-zA-Z] captures only a to z and A to Z without any other characters, which seems to be the problem, a quick fix is to run the command with LC_ALL=C:

LC_ALL=C sed 's/;.*fully-qualified\s*#[^a-zA-Z]*/; /' file
