HtmlZap ATL ActiveX Control
HtmlZap isn't exciting to look at. In fact, it's invisible! However, this little control can make your life much easier if you need to parse HTML files for one reason or another. In essence, HtmlZap is a "canned" HTML parsing engine. If you feed HtmlZap an HTML file, it will dice the text and tags faster than a Ginsu® and give you the pieces one at a time for your code to digest. The control takes on all the tedious work of stepping through the HTML code; all you need to do is act on what it finds. You can even modify the tag by setting properties and write it out again! Since I wrote this control, I've used it to perform dozens of tedious HTML reformatting and copying tasks. It makes processing HTML so simple that you can afford to write little Visual Basic programs to solve your HTML mangling problems. I have one (now very large) VB project that I use whenever I need to do something non-trivial to an HTML file. I just add a new button or form for each task, write a few lines of code, click the new button, and get on with life.
Download
Installing HtmlZap
Installing HtmlZap is fairly straightforward. Begin by downloading the HtmlZap ZIP file, then decompress it into a fresh directory. The only remaining step is to "register" the control; most programming languages and development environments for Windows provide a simple way of doing this. In VB 6, select the "Project / Components" command, then click the "Browse" button on the "Controls" page of the resulting dialog box. VB will pop up a "File Open" dialog box; navigate to the directory where you unzipped HtmlZap, then "open" the HtmlZap control file you unzipped there.
To use the control in your VB application, make sure the "HtmlZap ATL Control" item is checked in the Components list, then simply draw an HtmlZap control on the form where you want to use it. Change the name to something useful, then use the control's properties and methods in your code. That's it!
Though I've concentrated on Visual Basic so far, there's no reason why you couldn't use HtmlZap controls with other languages. HtmlZap should work with any language or development environment that supports ActiveX (OCX) controls, including scripting environments like VBScript and ASP.
You may want to look at the HtmlZap help page and the sample code.
Using HtmlZap as a COM object instead of a component
While using HtmlZap as a component on your form is simple, it may be more efficient to use it as an object declared locally to the method where you want to use it. You can also declare the control using a VB statement like:
Important: If you want to use HtmlZap in this way you should add it to your VB project in the "References" dialog box, not the "Components" dialog box.
Using HtmlZap with Web ActiveX objects
HtmlZap was built with Microsoft's ATL lightweight COM framework. That's jargon which basically means that the control is very small (64K, not bad even for a DOS app!) and that it doesn't use any of the gargantuan MFC DLLs.
In particular, HtmlZap doesn't need
Boiled down to the bottom line, this means that HtmlZap needs only the basic APIs supplied by Windows 9x and NT. You don't have to ship any "redistributable" DLLs with your HtmlZap-based application (unless, of course, your app needs them).
This is especially important if you're building ActiveX controls for the Web (somebody out there must be! <g>). If your ActiveX control needs redistributable DLLs, the people who visit your Web site are going to have to wait for those DLLs to download before your control can start. Not a good thing, since some of those DLLs are near a megabyte in size. You can use HtmlZap to build Web ActiveX components with confidence, because it's entirely self-reliant, and it only adds 64K to your download.
HtmlZap and scripting languages
HtmlZap can be called from scripting environments like ASP and VBScript. Here's an example of a VBScript program that prints all the hyperlinks in an HTML document to the screen:
    '
    ' HtmlZap test
    '
    option explicit
    dim hz
    set hz = CreateObject("HtmlZap.HtmlZap.1")
    hz.load "index.htm"
    while not hz.eof
      if hz.tagname = "a" then
        wscript.echo hz.param("href")
      end if
      hz.next
    wend
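The same loop adapts to other tags. As a minimal sketch, this variant lists the image sources in a document and counts them. It uses only the calls shown in the hyperlink example (load, eof, tagname, param, next). It assumes that tagname reports "img" for image tags and that param returns the "src" attribute the same way it returns "href"; neither is confirmed here, so check the help page. The file name is a placeholder.

```vbscript
option explicit

dim hz, count
set hz = CreateObject("HtmlZap.HtmlZap.1")  ' same ProgID as the hyperlink example
hz.load "index.htm"                          ' placeholder file name
count = 0

while not hz.eof
  ' LCase guards against upper-case tags; whether HtmlZap
  ' normalizes tag case itself is an assumption
  if LCase(hz.tagname) = "img" then
    wscript.echo hz.param("src")             ' assumed to behave like param("href")
    count = count + 1
  end if
  hz.next
wend

wscript.echo count & " image(s) found"
```

Run scripts like this with cscript rather than wscript, so that each wscript.echo prints a line to the console instead of opening a message box.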
HtmlZap Source Code
The source code for the HtmlZap component is available below. The current version was compiled and tested on Windows 7 x64 using Microsoft Visual Studio 2010.
Using the source
If you create something new using the source supplied from this site, please remember:
Most importantly, if you fix a bug or add an enhancement, please let me know! I'd love to incorporate your improvements into the official source!
Platform SDK
Important! If you decide to build HtmlZap from the source code, you need to have Microsoft's Platform SDK installed on your development system. The default set of libraries and header files installed with Visual Studio will not work; you'll get errors like:
Download source
HtmlZap's source code is available from GitHub:
Legal Stuff
HtmlZap is freeware published under the GNU General Public License without any warranties of any kind whatsoever! Use at your own risk! HtmlZap is Copyright © 1997, 2001, 2002, 2013 Michael Newcomb.
Contact Information
If you have any questions, comments, feature suggestions, or problems, please don't hesitate to send me mail at htmlzap@miken.com.
Revision History
Version 1.1.1 -- 23 January 2013
Version 1.1 -- 23 June 2002
Archive
Previous versions of HtmlZap.
Last revised: 21 January 2013