Question : Bash scripts: how to strip these html tags
Hi experts,
Here is an HTML file. How can I print the HTML tags and "common words" such as "in", "the", and "a" to one file, "tag_file.txt", and print everything else to another file, "content_file.txt"?
I use Bash scripts and am new to Bash. Any help is highly appreciated!
For example, in the code snippet below, all the HTML tags and the word "the" should be printed to "tag_file.txt", whereas the rest should be printed to "content_file.txt".
Code Snippet:
3019337
story
the element is +
-
Answer : Bash scripts: how to strip these html tags
Do you mean like this?
tr ' >' '\n' < file | grep '<\|\<\(in\|the\|a\)\>' > tag_file.txt
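Note that the one-liner above writes only "tag_file.txt", and because tr replaces each ">" with a newline, the tags come out without their closing bracket. If both output files are wanted, something along these lines might work. It is only a sketch under a few assumptions: the function name split_tokens is made up for illustration, grep must support the -o option (GNU and BSD grep do, but it is not required by POSIX), every tag must open and close on the same line, and a word with punctuation attached (such as "the,") is treated as content.

```shell
#!/usr/bin/env bash
# split_tokens is a hypothetical helper name, not part of the original answer.
# Usage: split_tokens FILE
# Writes tag_file.txt and content_file.txt in the current directory.
split_tokens() {
    local input=$1

    # Create or empty both outputs so they exist even if one gets no tokens.
    : > tag_file.txt
    : > content_file.txt

    # Step 1: emit one token per line. A token is either a whole tag
    # (<...>) or a run of characters that are neither whitespace nor '<'.
    # Step 2: route each token. Tags and the common words in/the/a
    # (compared case-insensitively) go to tag_file.txt, the rest to
    # content_file.txt.
    grep -oE '<[^>]*>|[^<[:space:]]+' "$input" |
    awk '
        /^</ || tolower($0) ~ /^(in|the|a)$/ { print > "tag_file.txt"; next }
        { print > "content_file.txt" }
    '
}
```

To try it, put the function in a script or paste it into a shell, then run split_tokens yourfile.html from the directory where the two output files should appear. To treat more words as common, extend the (in|the|a) list in the awk pattern.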