linux - Remove duplicate words/strings from a tab-separated file


I want to remove duplicate words/strings from a large tab-separated file using Linux commands.

names	john, cnn, mac, tommy, mac, patrick, ngc, discovery, john, cnn, adam, patrick
cities	san jose, santa clara, san franscisco, new york, san jose, santa clara

The above is the file format. I want to retain the tabs and commas after removing the duplicate words, like this:

names	john, cnn, mac, tommy, patrick, ngc, discovery, adam
cities	san jose, santa clara, san franscisco, new york

Any help is appreciated.

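awk 'BEGIN {
         # split on ", " (between words) or on a tab (after the label)
         FS = ", |\t"
     }
     {
         # print the label (first field) followed by the original tab
         printf "%s\t", $1
         delim = ""
         for (i = 2; i <= NF; i++) {
             # print each word only the first time it appears on this line
             if (! ($i in seen)) {
                 printf "%s%s", delim, $i
                 delim = ", "
             }
             seen[$i]
         }
         printf "\n"
         # forget the words seen so far before reading the next line
         delete seen
     }' inputfile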

If you're not using GNU awk (gawk) and your awk can't delete a whole array with "delete seen", use split("", seen) instead.
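For an awk that cannot delete a whole array, the portable variant might look like the sketch below. It is the same program with split("", seen) in place of the delete statement. The function name dedupe_fields is only a hypothetical wrapper added here so the command can be reused, and the sample line is a shortened version of the data in the question.

```shell
# dedupe_fields is a hypothetical helper name, not part of the original answer.
# It reads the named files (or standard input) and removes repeated words
# from each line, keeping the leading label, the tab and the ", " separators.
dedupe_fields() {
    awk 'BEGIN {
             FS = ", |\t"
         }
         {
             printf "%s\t", $1
             delim = ""
             for (i = 2; i <= NF; i++) {
                 if (! ($i in seen)) {
                     printf "%s%s", delim, $i
                     delim = ", "
                 }
                 seen[$i]
             }
             printf "\n"
             # portable replacement for "delete seen": empties the array
             split("", seen)
         }' "$@"
}

# Try it on one line of sample data.
printf 'names\tjohn, cnn, mac, tommy, mac\n' | dedupe_fields
```

Because the array is emptied after every line, a word is only treated as a duplicate within its own line, so the same word can still appear once on each of several lines.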

