Converting HTML To Plain Text In PHP For E-mail


Answer:

Use html2text (example HTML to text), licensed under the Eclipse Public License. It uses PHP's DOM methods to load from HTML, and then iterates over the resulting DOM to extract plain text. Usage:



// usage when installed using the Composer package
// (Composer's autoloader must be loaded first)
require 'vendor/autoload.php';
$text = Html2Text\Html2Text::convert($html);

// usage when installed using html2text.php
require 'html2text.php';
$text = convert_html_to_text($html);


Although incomplete, it is open source and contributions are welcome.
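To make the "load into a DOM, then iterate" idea concrete, here is a minimal sketch of that technique. It is not the library's code: the function names `sketch_html_to_text` and `sketch_node_to_text`, the list of block-level tags, and the `[text](url)` rendering of links are all assumptions chosen for illustration, and it assumes PHP 7.1+ with the DOM extension and UTF-8 input.

```php
<?php
// Hypothetical helper (not part of html2text): parse HTML and walk the DOM.
function sketch_html_to_text(string $html): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings on sloppy markup
    // Assumption: input is UTF-8; the encoding hint keeps libxml from guessing.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $text = sketch_node_to_text($doc->documentElement);

    // Tidy whitespace: single spaces, at most one blank line between blocks.
    $text = preg_replace('/[ \t]+/', ' ', $text);
    $text = preg_replace('/ ?\n ?/', "\n", $text);
    $text = preg_replace('/\n{3,}/', "\n\n", $text);
    return trim($text);
}

// Hypothetical recursive walker: returns the plain text for one node.
function sketch_node_to_text(?DOMNode $node): string
{
    if ($node === null) {
        return '';
    }
    if ($node instanceof DOMText) {
        // Runs of whitespace in HTML source collapse to one space.
        return preg_replace('/\s+/u', ' ', $node->nodeValue);
    }
    if (!($node instanceof DOMElement)) {
        return ''; // comments, processing instructions, etc.
    }

    $name = strtolower($node->nodeName);
    if (in_array($name, ['script', 'style', 'head'], true)) {
        return ''; // never visible text
    }
    if ($name === 'br') {
        return "\n";
    }

    $out = '';
    foreach ($node->childNodes as $child) {
        $out .= sketch_node_to_text($child);
    }

    // Illustrative choice: keep link targets, which plain-text e-mail cannot hide.
    if ($name === 'a' && $node->hasAttribute('href')) {
        $out = '[' . trim($out) . '](' . $node->getAttribute('href') . ')';
    }
    // Illustrative list of block-level tags that get their own lines.
    if (in_array($name, ['p', 'div', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'li', 'tr'], true)) {
        $out = "\n" . $out . "\n";
    }
    return $out;
}

echo sketch_html_to_text('<h1>Hello</h1><p>Visit <a href="https://example.com">our site</a>.</p>'), "\n";
```

A real converter handles many more cases (lists, tables, images, nested formatting), which is the main reason to prefer a maintained library over a hand-rolled walker.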



Issues with other conversion scripts:




  • html2text (GPL) is not EPL-compatible.

  • lkessler's link (attribution) is incompatible with most open source licenses.



Here is another solution:


$cleaner_input = strip_tags($text);

For other variations of sanitization functions, see:


https://github.com/ttodua/useful-php-scripts/blob/master/filter-php-variable-sanitize.php
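On its own, strip_tags() only removes the tags: HTML entities such as `&amp;` stay encoded, the contents of `<script>` and `<style>` elements are kept, and block boundaries disappear. A minimal sketch of how those gaps might be covered follows; the function name `sketch_strip_to_text` and the list of tags treated as line breaks are assumptions for illustration, and the regular expressions are a rough approximation rather than a real HTML parser.

```php
<?php
// Hypothetical helper: strip_tags() plus entity decoding and whitespace tidying.
function sketch_strip_to_text(string $html): string
{
    // strip_tags() keeps the text inside <script>/<style>, so drop those elements first.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', '', $html);
    // Turn <br> and a few closing block tags into newlines before the tags vanish.
    $html = preg_replace('#<br\s*/?>|</(p|div|h[1-6]|li|tr)>#i', "\n", $html);

    $text = strip_tags($html);
    // Decode entities such as &amp; and &quot; into real characters (UTF-8 assumed).
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Tidy whitespace: single spaces, at most one blank line.
    $text = preg_replace('/[ \t]+/', ' ', $text);
    $text = preg_replace('/ ?\n ?/', "\n", $text);
    $text = preg_replace('/\n{3,}/', "\n\n", $text);
    return trim($text);
}

echo sketch_strip_to_text('<p>Fish &amp; chips<br>tonight</p><style>p{color:red}</style>'), "\n";
```

This stays within the strip_tags() approach and is adequate for simple markup; malformed or deeply nested HTML is better served by a DOM-based converter.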



Converting from HTML to text using a DOMDocument is a viable solution. Consider HTML2Text, which requires PHP5:




  • http://www.howtocreate.co.uk/php/html2texthowto.html

  • http://www.howtocreate.co.uk/php/

  • http://www.howtocreate.co.uk/jslibs/termsOfUse.html



Regarding UTF-8, the write-up on the "howto" page states:




PHP's own support for unicode is quite poor, and it does not always handle utf-8 correctly. Although the html2text script uses unicode-safe methods (without needing the mbstring module), it cannot always cope with PHP's own handling of encodings. PHP does not really understand unicode or encodings like utf-8, and uses the base encoding of the system, which tends to be one of the ISO-8859 family. As a result, what may look to you like a valid character in your text editor, in either utf-8 or single-byte, may well be misinterpreted by PHP. So even though you think you are feeding a valid character into html2text, you may well not be.




The author provides several approaches to solving this and states that version 2 of HTML2Text (using DOMDocument) has UTF-8 support.
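When using DOMDocument directly, one commonly used way to avoid the encoding problem described above is to tell the parser explicitly that the input is UTF-8. The sketch below is not taken from HTML2Text; the helper name `sketch_load_utf8_html` is hypothetical, and it assumes the input string really is UTF-8 and that PHP 7.0+ with the DOM extension is available.

```php
<?php
// Hypothetical helper: load UTF-8 HTML without letting the parser guess the encoding.
function sketch_load_utf8_html(string $html): DOMDocument
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings on sloppy markup
    // Without a declared charset, loadHTML() typically does not treat the bytes
    // as UTF-8. Prepending an XML-style encoding hint is a widely used workaround.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $doc;
}

// "\u{e9}" and "\u{e8}" are UTF-8 escapes for e-acute and e-grave.
$doc  = sketch_load_utf8_html("<p>Caf\u{e9} cr\u{e8}me</p>");
$body = $doc->getElementsByTagName('body')->item(0);
echo $body->textContent, "\n";
```

The DOM API returns strings as UTF-8, so once the document has been parsed with the right input encoding, text extracted from it can be passed on to the mailer unchanged.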



Note the restrictions on commercial use in the terms of use linked above.


