Oh, man. φεῦ φεῦ!
Before I decided on the wiki format for Scholiastae.org, I spent some quality time with the Perseus XML. I tried running it through a validating XML parser. That was not a happy experience, mostly because of missing DTD files, both local and remote. The missing entity definitions made me give up on using the data that way. Now when I pilfer an open text from Perseus, I tweak a crude, hand-rolled XML parsing engine to rip out only the bits I want.
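For anyone curious what "crude, hand-rolled" can look like, here is a minimal sketch in Python. It is not my actual engine, just an illustration of the idea: skip the DTD and entity definitions entirely and pull out the text of one kind of element with regular expressions. The sample markup, the element names (`l`, `note`), and the function name `extract_lines` are all assumptions made up for the example, not taken from any particular Perseus file.

```python
import re

# Hypothetical sample, loosely in the style of a TEI-like file.
# The DTD it points to does not exist, and &mdash; is never defined.
SAMPLE = '''<?xml version="1.0"?>
<!DOCTYPE TEI.2 SYSTEM "missing.dtd">
<TEI.2>
<teiHeader><title>Sample</title></teiHeader>
<text><body>
<l n="1">first line &mdash; with an undefined entity</l>
<l n="2">second <note>editorial note</note>line</l>
</body></text>
</TEI.2>'''

# Match <l ...>...</l>; the \b keeps this from matching other tags
# whose names merely start with "l", such as <lb/>.
LINE_RE = re.compile(r'<l\b([^>]*)>(.*?)</l>', re.DOTALL)
N_RE = re.compile(r'\bn="([^"]*)"')
NOTE_RE = re.compile(r'<note\b[^>]*>.*?</note>', re.DOTALL)
TAG_RE = re.compile(r'<[^>]+>')


def extract_lines(xml_text):
    """Return (n, text) pairs for each <l> element, ignoring the DTD."""
    results = []
    for attrs, body in LINE_RE.findall(xml_text):
        match = N_RE.search(attrs)
        n = match.group(1) if match else None
        body = NOTE_RE.sub('', body)   # drop notes, content and all
        body = TAG_RE.sub('', body)    # strip any remaining inline tags
        body = ' '.join(body.split())  # normalise whitespace
        results.append((n, body))
    return results


if __name__ == '__main__':
    for n, text in extract_lines(SAMPLE):
        print(n, text)
```

Entity references are left in the output untouched, so they can be mapped to real characters by hand later. This approach breaks on nested elements of the same name and on anything clever in the markup, which is why it needs tweaking for each text.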
Diogenes interacts nicely with the dictionaries, but it does not really understand the Perseus XML format, only the scary record-based file system that the TLG continues to use. I have it in my mind that someone is funding Heslin to make Diogenes work with Perseus data, too, but I cannot find proof of that right now.