I did not get any replies from the group. Anyway, I took a look at two
products, including Metalogix. I don't remember the name of the other one.
My conclusion is that if your HTML file is not well structured,
Metalogix can help to a certain degree, but you will still need to do
a lot of manual work. Metalogix uses XPath to extract data from the HTML file,
so you could clean up the HTML file before using Metalogix.
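
To illustrate why the structure of the HTML matters for XPath-based extraction in general, here is a rough sketch using only Python's standard library. It is not Metalogix's API and says nothing about how Metalogix works internally. The sample markup, the field names, and the helper functions `extract_fields` and `try_extract` are all made up for the example. The same XPath expressions work on the well-formed page and cannot even be applied to the badly formed one.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page: every tag is properly closed.
CLEAN_HTML = """
<html>
  <body>
    <div class="article">
      <h1>Quarterly report</h1>
      <p class="author">J. Smith</p>
    </div>
  </body>
</html>
"""

# The same page with unclosed <h1> and <p> tags, as often found in old HTML.
MESSY_HTML = """
<html>
  <body>
    <div class="article">
      <h1>Quarterly report
      <p class="author">J. Smith
    </div>
  </body>
</html>
"""


def extract_fields(markup):
    """Pull a title and an author out of the markup with XPath expressions."""
    root = ET.fromstring(markup)  # requires well-formed markup
    return {
        "title": root.findtext(".//div[@class='article']/h1"),
        "author": root.findtext(".//p[@class='author']"),
    }


def try_extract(markup):
    """Return the extracted fields, or None if the markup cannot be parsed."""
    try:
        return extract_fields(markup)
    except ET.ParseError:
        return None


if __name__ == "__main__":
    print(try_extract(CLEAN_HTML))
    print(try_extract(MESSY_HTML))
```

ElementTree is a strict XML parser, so it is less forgiving than a tool built for real-world HTML, but the point is the same: the more regular the markup, the less manual work the XPath rules need.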