Validation & use of XmlResolver

Mar 18, 2014 at 7:59 AM
Hi Jeff,
Great job on Gepsio. I’m using it to parse SEC filings and noticed a few things.

1. When the schema is loaded, it’s validated, but Gepsio records only the first error, which is always reported as the schema node being missing. I’ve got a change that records all the errors, along with their source code locations.
2. The validation is only one level deep. To fix the validation errors, you need to load the additionally referenced schemas. I’m not sure how many other users’ scenarios need full validation, but providing the option seems sensible at least.
3. Like others in this forum, I want to use a custom XmlResolver to cache common linkbases and schemas. I added an XmlResolver property to XbrlDocument to facilitate this.
I’ve got code changes for all of 1-3. I’m not familiar with CodePlex – what’s the best way to get fixes to you for consideration?

Will include tests, and happy to help out with bugs, etc.
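
For readers following along, here is a minimal sketch of the caching resolver idea described above. It is not the submitted patch and not part of Gepsio: the class name CachingXmlResolver is hypothetical, and the sketch assumes only the standard System.Xml XmlResolver and XmlUrlResolver types. It keeps each downloaded linkbase or schema in memory and hands out a fresh read-only stream on every request.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Xml;

// Hypothetical helper (not part of Gepsio): wraps another resolver and
// keeps the bytes of every resolved resource in memory, keyed by URI.
public sealed class CachingXmlResolver : XmlResolver
{
    private readonly XmlResolver inner;
    private readonly Dictionary<string, byte[]> cache =
        new Dictionary<string, byte[]>(StringComparer.Ordinal);

    public CachingXmlResolver(XmlResolver inner)
    {
        if (inner == null) throw new ArgumentNullException("inner");
        this.inner = inner;
    }

    // Credentials are simply forwarded to the wrapped resolver.
    public override ICredentials Credentials
    {
        set { inner.Credentials = value; }
    }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        if (absoluteUri == null) throw new ArgumentNullException("absoluteUri");

        string key = absoluteUri.AbsoluteUri;
        byte[] bytes;
        if (!cache.TryGetValue(key, out bytes))
        {
            // First request for this URI: read it fully through the inner resolver.
            using (var source = (Stream)inner.GetEntity(absoluteUri, role, typeof(Stream)))
            using (var buffer = new MemoryStream())
            {
                source.CopyTo(buffer);
                bytes = buffer.ToArray();
            }
            cache[key] = bytes;
        }

        // Each caller gets its own read-only stream over the cached bytes.
        return new MemoryStream(bytes, false);
    }
}
```

An instance such as new CachingXmlResolver(new XmlUrlResolver()) could then be passed to whatever accepts an XmlResolver, for example the XmlResolver property on XbrlDocument proposed above. The cache here is unbounded and not thread-safe, which is a simplification for illustration.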
Coordinator
Mar 18, 2014 at 10:25 AM
Hi Mark,

Thank you for the kind words!

Thank you for your willingness to submit a patch! To do so, click the "SOURCE CODE" tab near the top of the page and then select the "Upload Patch" link just under the tab row once you get to the Source Code page. I would be more than happy to take a look at your patches! Once I see them, I will most likely be able to review them and discuss them in a blog post at Gepsio.wordpress.com. When I do, I will provide a link to the blog post here, so that you can read about the follow-up.

I may end up with some difficulty in the longer term, because I am building a custom implementation of XbrlSchema for the future. I am doing this out of necessity, as I am readying Gepsio to support .NET 4.5, WinRT, Windows Phone 8, and possibly Xamarin, and the base XmlSchema class is not available on all of those platforms. I have no idea why, but that's the way it is. In the interim, though, I would be happy to review your patches and fold them in to what I have today.

Thanks for using Gepsio,
Jeff Ferguson
Coordinator
Mar 18, 2014 at 1:16 PM
Hello again, Mark,

The page at http://msdn.microsoft.com/en-us/library/vstudio/gg597392(v=vs.100).aspx seems to imply that the XmlResolver class is available in the Portable Class Library, so your XmlResolver work should survive Gepsio's Portable Class Library port (note my comment in my previous message that "I am readying Gepsio to support .NET 4.5, WinRT, Windows Phone 8, and possibly Xamarin") even if I have to provide an updated internal implementation of XbrlSchema. Thank you again for the idea!

Thanks again for using Gepsio,
Jeff Ferguson
Mar 19, 2014 at 1:18 AM
Hi Jeff,
I submitted a patch implementing the XmlResolver. This seems to have been requested a fair bit, and I included a test case, which uses one of your existing Xbrl files as sample data.

ExecuteXBRLCONFCR320070305Testcases failed when I first checked out the source, and this is still the case, but everything else passes.

I decided against automatically enabling deep validation of schemas. It's not what Gepsio does at present, and it throws up errors in many of the existing test cases. If someone were using Gepsio to fully validate Xbrl, then they'd probably want this, but there is a hefty performance hit.
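
To illustrate the trade-off described above, here is a minimal sketch of opt-in deep validation that also collects every message instead of only the first. It is not Gepsio's code or the submitted patch: the SchemaChecker class and its resolveReferences parameter are hypothetical names, and the sketch assumes only the standard System.Xml.Schema XmlSchemaSet API.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper (not part of Gepsio).
public static class SchemaChecker
{
    // Compiles the schema at schemaUri and returns every validation message.
    // When resolveReferences is false, imported and included schemas are not
    // loaded, so validation stays one level deep (cheaper, but references to
    // externally defined types will be reported as problems).
    public static List<string> Check(string schemaUri, bool resolveReferences)
    {
        var problems = new List<string>();
        var set = new XmlSchemaSet();
        set.XmlResolver = resolveReferences ? new XmlUrlResolver() : null;

        // With a handler attached, problems are reported here instead of
        // being thrown, so all of them can be recorded.
        set.ValidationEventHandler += (sender, e) =>
        {
            XmlSchemaException ex = e.Exception;
            string where = ex == null
                ? string.Empty
                : string.Format("{0}({1},{2}): ", ex.SourceUri, ex.LineNumber, ex.LinePosition);
            problems.Add(where + e.Severity + ": " + e.Message);
        };

        using (XmlReader reader = XmlReader.Create(schemaUri))
        {
            set.Add(null, reader);
        }
        set.Compile();
        return problems;
    }
}
```

Calling SchemaChecker.Check(path, false) approximates the shallow behaviour, while passing true follows imports and includes and pays the loading cost for every referenced schema.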

Regarding XmlSchema support in the PCL: Microsoft go to some length to keep the mobile versions of .NET as small as possible. Although it makes your task harder, I suspect schema validation is rarely required in client apps. Most will parse the XML data and give up if it's not what they expect, which is probably the same thing they'd do if they got a schema failure.

I noticed from my testing that the taxonomy is loaded twice, once as a schema and then as an XmlDocument. I've not looked into why. Have you considered conditionally compiling Gepsio? Perhaps on mobile platforms, you could skip the schema set validation.

Good job again.
Mark.
Coordinator
Mar 19, 2014 at 2:17 AM
Hi Mark,

Many thanks for the patch. I will review it soon. I'm sure that it works well.

The failure of the ExecuteXBRLCONFCR320070305Testcases unit test is the reason that Gepsio is not yet a 1.0 product. I will consider Gepsio to be out of "Alpha/CTP" mode once that test passes. That test represents the full conformance suite from the XBRL organization, and, once Gepsio passes that unit test, it will have passed the full conformance suite and I can advertise it as a 100% compliant XBRL parser and validator (which is what takes Gepsio above and beyond just a parser -- parsing is one thing, but validation of XBRL semantics is quite another). Until that time, the ExecuteXBRLCONFCR320070305Testcases unit test will indeed fail.

I think you're on target with your PCL/XmlSchema comments. The difficulty is that schema types have semantic meaning in XBRL. For example, monetaryItemTypes defined in XBRL schemas have to have an ISO 4217 namespace and semantics for currency values, and Gepsio needs type information from the schema to validate that sort of thing. In fact, my current (temporary) roadblock with "PCL Gepsio" is with the validation of monetary types, which have test cases in the aforementioned XBRL-CONF-CR-320070305 conformance suite. I'm afraid that there is no getting around the need for schema types if I hope for this thing to be compliant.

Thanks again for your comments, patches, and suggestions. Keep them coming. Folks like you keep this thing going. Feel free to post here in the forums, send email to Gepsio@outlook.com, visit the blog at Gepsio.wordpress.com, visit on Facebook at facebook.com/Gepsio, or follow the Twitter updates at twitter.com/gepsioxbrl.

Keep in touch,
Jeff Ferguson
Coordinator
Mar 19, 2014 at 10:15 PM