$('#table_A-1 ~ div tbody')
The next step is to write some JavaScript that iterates over each tr in the tbody and writes out the UID and name as JavaScript source, so I can paste it into my file. A bit of trial and error later, I come up with the following:
(function () {
  var elements = document.querySelectorAll('#table_A-1 ~ div tbody tr');
  var result = "";
  for (var i = 0; i < elements.length; i++) {
    result += "'" + elements[i].childNodes[1].childNodes[1].innerText + "':'" +
      elements[i].childNodes[3].childNodes[1].innerText + "',\n";
  }
  return result;
})();
This generates exactly what I want! I paste the resulting string into a new file and try it out, but it's not working. For some reason, the lookup on UID is not matching. I look a bit closer and notice that the values in the HTML have some non-printable characters in them:
1.2.840.10008.5.1.4.1.​1.​2
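Hidden characters like these are easy to miss by eye, so a quick check in the console can help. Here is a minimal sketch: findHiddenChars is a hypothetical helper name, and the sample string assumes the hidden character is a zero-width space (U+200B), which I have not confirmed for every row.

```javascript
// Hypothetical helper: report every character outside printable ASCII
// (0x20-0x7E) along with its position in the string.
function findHiddenChars(text) {
  var found = [];
  for (var i = 0; i < text.length; i++) {
    var code = text.charCodeAt(i);
    if (code < 0x20 || code > 0x7E) {
      found.push({
        index: i,
        codePoint: 'U+' + code.toString(16).toUpperCase().padStart(4, '0')
      });
    }
  }
  return found;
}

// Assumed sample: a UID with zero-width spaces after two of the dots.
var sample = '1.2.840.10008.5.1.4.1.\u200B1.\u200B2';
var hidden = findHiddenChars(sample);
console.log(hidden);
```

In the browser you could pass it the innerText of any cell to see exactly which code points are hiding in there.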
I make another change to my JavaScript to strip out the non-printable characters:
(function () {
  var elements = document.querySelectorAll('#table_A-1 ~ div tbody tr');
  var result = "";
  for (var i = 0; i < elements.length; i++) {
    result += "'" + elements[i].childNodes[1].childNodes[1].innerText.replace(/[^\x20-\x7E]+/g, '') + "':'" +
      elements[i].childNodes[3].childNodes[1].innerText.replace(/[^\x20-\x7E]+/g, '') + "',\n";
  }
  return result;
})();
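One thing the string concatenation above does not handle is a name containing a single quote, which would break the generated source. A possible variation, sketched here under my own assumptions, is to collect the values into an object and let JSON.stringify do the quoting. The names clean and buildLookup are hypothetical, and the rows array stands in for the text pulled out of the table cells.

```javascript
// Remove everything outside printable ASCII, using the same regex as above.
function clean(text) {
  return text.replace(/[^\x20-\x7E]+/g, '');
}

// rows: array of [uid, name] string pairs, e.g. collected from the table cells.
function buildLookup(rows) {
  var lookup = {};
  rows.forEach(function (row) {
    lookup[clean(row[0])] = clean(row[1]);
  });
  return lookup;
}

// Sample input; the zero-width spaces (U+200B) in the UID are an assumption.
var rows = [['1.2.840.10008.5.1.4.1.\u200B1.\u200B2', 'CT Image Storage']];
var lookup = buildLookup(rows);
var output = JSON.stringify(lookup, null, 2);
console.log(output);
```

The output is valid JSON, which is also a valid JavaScript object literal, so it can be pasted into a file the same way.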
And now I have the data I want! Here is a link to the resulting JavaScript. It's a pretty cool little hack that shows what you can do with JavaScript in a web browser. The same strategy can be used to quickly extract data from any web page into any format you want.
LOL, I always feel the same way about XML.
This "fear of HTML" is pretty amusing, when you consider that I use the HTML form of tables to encode the DocBook XML source of the standard (from which the HTML and CHTML and everything else is derived); so your JavaScript could just as easily look for <table/> elements with an id or label attribute in the DocBook XML without having to skip the rendering cruft in the HTML.
There are also a bunch of XSL-T stylesheets in the "support" folder of the "sourceandrenderingpipeline" file in the distribution of each release, which were intended to inspire folks to do this sort of thing. E.g., you could just do <xsl:template match="docbook:table[@label = 'A-1']"/>, etc.
I.e., XSL-T is nothing to be afraid of either.
David