Strip HTML from text strings with Perl
In the age of ever-increasing spam, stripping HTML from text strings can be a very useful function, especially when processing form input. Stripping HTML from form input removes the links and markup that spammers try to inject, which makes the form a much less attractive target.
This article will show you how to accomplish this task.
Strip HTML
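This example will strip all HTML markup, including the text between tags. Because the .* in the first pattern is greedy and does not match newlines, everything from the first tag to the last tag on a line is removed; the second pattern then removes any tags that are left.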
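$text = strip_html($text);
sub strip_html {
    my $string = shift;
    # Remove everything from the first tag to the last tag on a line
    # (.* is greedy and does not match newlines).
    $string =~ s/<[^>]+>.*<[^>]+>//g;
    # Remove any tags that are left.
    $string =~ s/<[^>]+>//g;
    return $string;
}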
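The function above also removes plain text that sits between two separate elements on the same line. If only the content of each element should go, one option is to pair every opening tag with the closing tag of the same name. The following is a minimal sketch under that assumption, not a replacement for a real HTML parser. The name strip_html_elements is made up for illustration, and nested elements of the same name are not handled correctly.

```perl
use strict;
use warnings;

# Hypothetical variant of strip_html (the name is an assumption): removes
# each element together with its content, leaving text between elements.
sub strip_html_elements {
    my $string = shift;
    # <(\w+)\b[^>]*> : opening tag, tag name captured in \1
    # .*?            : shortest possible content (/s lets . match newlines)
    # <\/\1\s*>      : closing tag with the same name (/i ignores case)
    $string =~ s/<(\w+)\b[^>]*>.*?<\/\1\s*>//gis;
    # Remove tags that had no matching partner, such as <br>.
    $string =~ s/<[^>]+>//g;
    return $string;
}

# Prints "Hello  world !" (the text outside the two elements is kept).
print strip_html_elements("Hello <b>bold</b> world <i>italic</i>!"), "\n";
```

The \b after the captured name keeps a tag such as <br> from being treated as an opening <b> tag.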
Strip HTML Tags
This example will strip HTML tags only, leaving the text between tags.
$text = strip_html_tags($text);
sub strip_html_tags {
    my $string = shift;
    # Remove anything that looks like a tag: '<', one or more non-'>' characters, '>'.
    $string =~ s/<[^>]+>//g;
    return $string;
}
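The pattern <[^>]+> ends a tag at the first '>' it finds, so a '>' inside a quoted attribute value cuts the tag short and leaves part of the attribute in the output. A quote-aware pattern is one way around this. The sketch below is an illustration only; the function name strip_html_tags_quoted is an assumption, and it still does not cope with comments, scripts, or malformed markup.

```perl
use strict;
use warnings;

# Hypothetical quote-aware variant of strip_html_tags (name is an assumption).
sub strip_html_tags_quoted {
    my $string = shift;
    # Inside a tag, accept a double-quoted value, a single-quoted value, or
    # any single character other than '>' and the quote characters, so a
    # quoted '>' does not end the tag early.
    $string =~ s/<(?:"[^"]*"|'[^']*'|[^>"'])+>//g;
    return $string;
}

my $input = q{<a href="x" title="a > b">link</a> text};
# Prints "link text"
print strip_html_tags_quoted($input), "\n";
```

With the simpler pattern, the same input would leave the fragment b"> in front of the link text.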
posted September 2, 2009 in Perl
