Android - How To Get Plain HTML Using EvaluateJavascript From Webview? JSOUP Not Able To Parse The Result HTML
Answer :
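The value passed to onReceiveValue is JSON-encoded, so the HTML arrives as a quoted string containing escapes such as \u003C and \". Use JsonReader to decode it: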
webView.evaluateJavascript(
        "(function() {return document.getElementsByTagName('html')[0].outerHTML;})();",
        new ValueCallback<String>() {
            @Override
            public void onReceiveValue(final String value) {
                JsonReader reader = new JsonReader(new StringReader(value));
                reader.setLenient(true);
                try {
                    if (reader.peek() == JsonToken.STRING) {
                        String domStr = reader.nextString();
                        if (domStr != null) {
                            handleResponseSuccessByBody(domStr);
                        }
                    }
                } catch (IOException e) {
                    // handle exception
                } finally {
                    IoUtil.close(reader);
                }
            }
        });
To decode the \uXXXX escape sequences, use this function:
public static StringBuffer removeUTFCharacters(String data) {
    // Matches a literal backslash-u followed by four hex digits
    Pattern p = Pattern.compile("\\\\u(\\p{XDigit}{4})");
    Matcher m = p.matcher(data);
    StringBuffer buf = new StringBuffer(data.length());
    while (m.find()) {
        String ch = String.valueOf((char) Integer.parseInt(m.group(1), 16));
        m.appendReplacement(buf, Matcher.quoteReplacement(ch));
    }
    m.appendTail(buf);
    return buf;
}
Call it inside onReceiveValue(String html) like this:
@Override
public void onReceiveValue(String html) {
    String result = removeUTFCharacters(html).toString();
}
You will obtain a string with clean HTML.
Bye, Alex
Try this:
v=StringEscapeUtils.unescapeJavaScript(v.substring(1,v.length()-1));
unescapeJavaScript is from Apache commons-lang (in commons-lang3 the equivalent method is named unescapeEcmaScript).
So much string processing for an Android WebView, why...
The removeUTFCharacters method provided in the previous answer does not clean the string completely. Escape sequences such as \" still remain.
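If neither JsonReader nor commons-lang is an option, the remaining escapes can be handled by hand. The following is a minimal sketch, not taken from any of the answers above: the class name JsStringDecoder and the method decode are hypothetical, and it assumes the callback value is a well-formed JSON string literal. It strips the surrounding quotes and decodes \", \\, \/, the control-character escapes, and \uXXXX in a single pass.

```java
public final class JsStringDecoder {

    private JsStringDecoder() {}

    /**
     * Decodes a JSON string literal (surrounding quotes included) into plain text.
     * Returns null when the input is not a quoted string, for example the
     * literal null that the callback receives when the script returns nothing.
     * Assumes well-formed input: a malformed backslash-u escape makes
     * Integer.parseInt throw NumberFormatException.
     */
    public static String decode(String value) {
        if (value == null || value.length() < 2
                || value.charAt(0) != '"'
                || value.charAt(value.length() - 1) != '"') {
            return null;
        }
        int end = value.length() - 1; // index of the closing quote
        StringBuilder out = new StringBuilder(value.length());
        int i = 1; // skip the opening quote
        while (i < end) {
            char c = value.charAt(i++);
            if (c != '\\' || i >= end) {
                out.append(c); // ordinary character (or a trailing lone backslash)
                continue;
            }
            char esc = value.charAt(i++);
            switch (esc) {
                case 'n': out.append('\n'); break;
                case 't': out.append('\t'); break;
                case 'r': out.append('\r'); break;
                case 'b': out.append('\b'); break;
                case 'f': out.append('\f'); break;
                case 'u':
                    if (i + 4 <= end) {
                        // Four hex digits encode one UTF-16 code unit
                        out.append((char) Integer.parseInt(value.substring(i, i + 4), 16));
                        i += 4;
                    } else {
                        out.append("\\u"); // truncated escape: keep it unchanged
                    }
                    break;
                default:
                    out.append(esc); // escaped quote, backslash, or slash
            }
        }
        return out.toString();
    }
}
```

Inside onReceiveValue(String value) this would be used as String html = JsStringDecoder.decode(value); and, if the result is not null, html can then be handed to Jsoup.parse(html).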