Getting rid of HTML formatting in incoming emails


coverup

Can anybody give me a hint on how I can get rid of HTML formatting in incoming email? I've looked at several tools such as demime, but they remove all the MIME content, including attachments. Others are meant for servers, not for email clients.

 

I don't want to get rid of all attachments; I need almost all of them. It's only when I read mail from home using elm or pine that the HTML formatting becomes really annoying. Does anybody know a simple script that could be used with pine or elm to remove the HTML rubbish? What about kmail or netscape?

 

Thanks.

Va.


http://www.museum.state.il.us/ismdepts/lib...eg-replace.html

 

Example 3. Convert HTML to text



// $document should contain an HTML document.
// This will remove HTML tags, javascript sections
// and white space. It will also convert some
// common HTML entities to their text equivalent.

$search = array ("'<script[^>]*?>.*?</script>'si",  // Strip out javascript
                 "'<[\/\!]*?[^<>]*?>'si",           // Strip out html tags
                 "'([\r\n])[\s]+'",                 // Strip out white space
                 "'&(quot|#34);'i",                 // Replace html entities
                 "'&(amp|#38);'i",
                 "'&(lt|#60);'i",
                 "'&(gt|#62);'i",
                 "'&(nbsp|#160);'i",
                 "'&(iexcl|#161);'i",
                 "'&(cent|#162);'i",
                 "'&(pound|#163);'i",
                 "'&(copy|#169);'i",
                 "'&#(\d+);'e");                    // evaluate as php (the 'e' modifier was removed in PHP 7)

$replace = array ("",
                  "",
                  "\\1",
                  "\"",
                  "&",
                  "<",
                  ">",
                  " ",
                  chr(161),
                  chr(162),
                  chr(163),
                  chr(169),
                  "chr(\\1)");

$text = preg_replace ($search, $replace, $document);

     



Note: Parameter limit was added after PHP 4.0.1pl2.

 

This is PHP and I'm not sure how it would work.


I couldn't test the PHP file, but sed works wonders:

 

sed -e 's/<[^>]*>//g' myfile.html

 

This removes anything between < and >, so if you get email with links like <a href="blah">Click here</a>, you won't be able to see where they point.
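
If losing the link targets is a problem, here is a rough sketch of one way around it: rewrite each link to show its URL in brackets before the remaining tags are stripped, then decode a handful of the entities the PHP example above handles. The function name html_to_text is made up for illustration, and the sketch assumes each tag sits on a single line with a lowercase, double-quoted href attribute, which real-world mail often violates.

```shell
# Hypothetical helper: reads HTML on stdin, writes rough plain text on stdout.
html_to_text() {
    # 1. turn <a href="URL"> into "[URL] " so the target stays visible
    # 2. strip every remaining tag
    # 3. decode a few common entities (&amp; last, to avoid double decoding)
    sed -e 's/<a [^>]*href="\([^"]*\)"[^>]*>/[\1] /g' \
        -e 's/<[^>]*>//g' \
        -e 's/&quot;/"/g' \
        -e 's/&lt;/</g' \
        -e 's/&gt;/>/g' \
        -e 's/&nbsp;/ /g' \
        -e 's/&amp;/\&/g'
}

# Usage: html_to_text < myfile.html
```

Decoding the entities only after the tags are gone matters: otherwise text such as &lt;b&gt; would be turned into <b> and then removed as if it were a tag.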

 

You could redirect the output to another file rather than your inbox, like this:

 

sed -e 's/<[^>]*>//g' /location/of/my/Inbox > /location/of/my/html-removed/Inbox
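
One caveat when running this over a whole mailbox: the pattern also matches angle-bracketed addresses in the headers, such as <user@example.com>. A possible workaround, assuming the Inbox is in mbox format (each message starts with a "From " line and its headers end at the first blank line), is to strip tags only in the message bodies. The function name below is made up for illustration, and the sketch does not understand MIME, so angle brackets in part headers inside the body are still removed.

```shell
# Hypothetical helper: strip tags from message bodies in an mbox, leave headers alone.
strip_html_in_bodies() {
    awk '
        /^From / { in_header = 1 }            # mbox separator line: a header block starts
        in_header && /^$/ { in_header = 0 }   # first blank line: the headers are over
        !in_header { gsub(/<[^>]*>/, "") }    # remove tags only outside the headers
        { print }
    '
}

# Usage:
#   strip_html_in_bodies < /location/of/my/Inbox > /location/of/my/html-removed/Inbox
```

Base64-encoded attachments should pass through unchanged, since the base64 alphabet contains no angle brackets.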


You might want to check out sylpheed.

 

One problem: sylpheed seems to be incompatible with kmail and netscape mail. How do I import existing mail folders into sylpheed? I googled this topic, but could not find anything that answers the question.

 

Any help? Many thanks.

Va.


Can anybody give me a hint on how I can get rid of HTML formatting in incoming email? I've looked at several tools such as demime, but they remove all the MIME content, including attachments. Others are meant for servers, not for email clients.

 

Configure your /etc/mailcap or your ~/.mailcap properly. For example, I use mutt, and my ~/.mailcap has this entry (1):

text/html; lynx -force_html -dump %s; copiousoutput

 

So every HTML-formatted email is rendered by lynx before it is displayed in mutt.

 

Also my ~/.muttrc has this entry:

auto_view text/html #Automatically use entries from ~/.mailcap

 

I'm sure that pine or elm will work in a similar way.

 

(1) Search Google for the hundreds of variants and examples of mailcap configuration, and use the one that best fits your needs. For example, if you are always in X you might want something like this: text/html; netscape -remote 'openURL(%s)'
