MLIST Limits #23343 01 May 18 12:14 AM
Joined: Jun 2001
Posts: 153
OmniLedger - Tom Reynolds Offline OP
Member
I finally got around to porting my old XML parser over to MLIST for what should be better performance. I've been intending to do it since MLISTs were demonstrated at the last conference, but the existing parser has been working fine, and I've been too busy to give it the time.

Anyway, I finally had cause to look into it due to some new imports causing issues.

I've knocked a version together that seems to handle parsing an XML file in the way I'd expect. However, what are the limits and traps I should look out for?

Previously we've only really parsed relatively small XML files, primarily for config, but I've now got an XML file that's fairly large, with over 600 records of data in it, and lots of tags as a result.

I'm using a multi-layered MLIST to map all of the individual tags and their relationships, so that specific pieces of data can easily be pulled out as required.

With small files, everything is fine, but with the large file I'm encountering a segmentation fault, which I'm guessing is related to the number of relationships being built. According to the logs, the handle table is being expanded to handle this data, and I'm guessing I'm just exhausting it!

We're currently on 6.4.1547.7.

Code
01-May-18 13:01:36 [p11904-10]<XML:3b6> Expanded global handle table from 500 to 1000 handles
01-May-18 13:01:36 [p11904-10]<XML:3b6> Expanded global handle table from 1000 to 2000 handles
01-May-18 13:01:36 [p11904-10]<XML:3b6> Expanded global handle table from 2000 to 4000 handles
01-May-18 13:01:42 [p11904-10]<XML:3b6> Expanded global handle table from 4000 to 8000 handles
01-May-18 13:01:58 [p11904-10]<XML:3b6> Expanded global handle table from 8000 to 16000 handles
01-May-18 13:02:00 [p11904-10]<XML:36> SIGSEGV trapped on: TSKAAJ (omni)
01-May-18 13:02:00 [p11904-10]<XML:36> Last instr: 0xff, line #: 0, location counter: fec03c94, last sbr: LSTLIN, last file #1 

Re: MLIST Limits #23344 01 May 18 03:53 AM
Joined: Jun 2001
Posts: 11,945
Jack McGregor Online Content
Member
I'm not sure there should be any limit, other than the usual vague boundaries of available resources (memory, etc.) and the explicit 32-bit limitation on pointers. At some point, the DOM model of parsing the entire document into memory at once becomes impractical and you need to switch to a stream-type parsing algorithm, but "600 records" doesn't sound anywhere near big enough to start worrying about that.
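
For anyone unfamiliar with the distinction, here is a minimal sketch of the stream-type approach. It is written in Python rather than A-Shell BASIC, purely for illustration, and has nothing to do with how MLIST or the EXLIB samples work. The `record` tag name and the `count_records` function are hypothetical. The idea is to handle each element as its closing tag arrives and then discard its contents, instead of building the whole tree first.

```python
import io
import xml.etree.ElementTree as ET

def count_records(source, tag="record"):
    """Count <tag> elements while streaming, without keeping their contents.

    'record' is a hypothetical tag name used only for this illustration.
    """
    count = 0
    # iterparse yields each element as soon as its closing tag has been read.
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            count += 1
            # Drop this record's children and text now that it has been handled.
            elem.clear()
    return count

sample = io.BytesIO(
    b"<data><record><id>1</id></record><record><id>2</id></record></data>"
)
print(count_records(sample))  # prints 2
```

The trade-off is that a streaming parser cannot navigate back to earlier elements, so anything needed later has to be copied out while the element is still in hand.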

The largest sample XML file that I used for testing is about 2MB with about 65K ... elements. Interestingly the log shows the same expansion to 16000 handles. But it doesn't fail. So it's not clear to me if my sample was just beneath some unseen limit. I'm guessing it is something else, because 16000 handles doesn't seem like that much either.

I'll email you the sample XML file I was using, in case you want to see if your parser runs into the same issue. It would also be helpful to try the sample parsers in the EXLIB 908053 directory (MLISTXML, MLISTXML2, MLISTXML3) against your sample document. That should help pin down whether the issue is related to the document or to the parsing logic.

If you can send me the document, and ideally your parsing function, I'll get to the bottom of it.

