html - How can I create a basic human-readable plain text representation of XHTML using Java?


Given simple XHTML, I'd like to create a human-readable plain text version of it. This would involve removing the HTML tags while adding or preserving whitespace.

For example, this input:

<div>
  <p>this text, <b>bold</b>.</p>
  <ul>
    <li>point one</li>
    <li>point two</li>
  </ul>
</div>

would become:

"this text, bold. point one point two"

(Commas between the <li> items would be ideal... :)

Try the Jericho HTML Parser. You can either strip the tags, or call on its "Renderer" class, which tries to mimic how the markup would be displayed (e.g. bulleted lists are tabbed in).
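
If adding a library is not an option, the tag-stripping half of the problem can be sketched with only the JDK's built-in XML parser, since XHTML is well-formed XML. This is not Jericho and not a real renderer. The class name XhtmlToText, the method toPlainText, and the short list of block-level tags are all assumptions made for illustration. Input that is not well-formed, or that uses HTML-only entities such as &nbsp;, will be rejected.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of any library.
final class XhtmlToText {

    // Assumed set of tags that should be followed by a word break.
    private static final List<String> BLOCK_TAGS =
            Arrays.asList("p", "div", "ul", "ol", "li", "br");

    /** Returns the text of a well-formed XHTML fragment with whitespace collapsed. */
    static String toPlainText(String xhtml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Refuse DOCTYPE declarations so no external DTD is ever fetched.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            StringBuilder out = new StringBuilder();
            collect(doc.getDocumentElement(), out);
            // Collapse the runs of whitespace left over from source indentation.
            return out.toString().replaceAll("\\s+", " ").trim();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Input is not well-formed XHTML", e);
        }
    }

    // Depth-first walk: copy text nodes, add separators after block-level tags.
    private static void collect(Node node, StringBuilder out) {
        short type = node.getNodeType();
        if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            out.append(node.getNodeValue());
            return;
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), out);
        }
        String name = node.getNodeName();
        if ("li".equals(name) && hasFollowingListItem(node)) {
            out.append(", "); // comma between consecutive list items
        } else if (BLOCK_TAGS.contains(name)) {
            out.append(' ');
        }
    }

    // True if the next element sibling (skipping whitespace text) is another <li>.
    private static boolean hasFollowingListItem(Node li) {
        for (Node n = li.getNextSibling(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return "li".equals(n.getNodeName());
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String input = "<div> <p>this text, <b>bold</b>.</p> "
                + "<ul>   <li>point one</li>   <li>point two</li> </ul> </div>";
        System.out.println(toPlainText(input));
        // prints: this text, bold. point one, point two
    }
}
```

Inline tags such as <b> add no separator, so "bold" stays attached to the full stop that follows it, while the comma between list items is only emitted when another <li> comes next. For anything beyond simple markup (tables, nested lists, real-world tag soup), a dedicated parser like the one suggested above is likely the better choice.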

