ISAM-A - Secondary Index & Compressed Records
#23850
18 Oct 18 04:16 AM
|
Joined: Jun 2001
Posts: 153
OmniLedger - Tom Reynolds
OP
Member
|
OP
Member
Joined: Jun 2001
Posts: 153 |
Hope the conference went well!
I’m gearing up for disappearing at any moment, as Kate is due on the 4th of November, so could be any day now. As a result I’ve been looking into a few little projects that don’t have immediate delivery times, and won’t be disrupted by me disappearing!
So I’ve been looking into the possibility of storing deltas of ISAM records for the purposes of auditing the system. FileHooks are responsible for catching the data changes, and then filing off the deltas, and everything is working fine on that front. However, I’m trying to work out the best way of actually storing them. As such I was looking into actually dipping our toes into ISAM-A, as they appear to have a compression feature built in that would work perfectly with deltas.
However, in trying to make use of them I’ve hit a bit of a snag, and I’m not sure if there is a but somewhere, or if I’m just missing something.
I’ve explored two different ways of making the files, either with ISMUTL, or with ALLOCATE'INDEXED. ALLOCATE'INDEXED works perfectly, but is missing the ability to enable the data compression. I can create the same core files with ISMUTL, but for some reason I'm having issues getting more than the primary key setup.
I'm only really having this issue with ISMUTL, and using the options it provides, the key is listed as being correctly build using the STAT/REBUILD options etc. There are no issues with setup, but upon trying to use the secondary index I get a "No such secondary key number" error.
I'm trying to resolve this with a very simple file structure, just two 6 byte string keys for the primary and secondary, with a block of data behind them.
|
|
|
Re: ISAM-A - Secondary Index & Compressed Records
#23851
18 Oct 18 05:42 AM
|
Joined: Jun 2001
Posts: 11,945
Jack McGregor
Member
|
Member
Joined: Jun 2001
Posts: 11,945 |
Everyone missed you, but your comrade Stuart (and Viv) performed admirably in all the essential roles of the event (eating, drinking, schmoozing, ...) (If we can't manage to decouple your reproductive cycle from the Conference cycle, you can at least rest assured that your backup is solid.)
As for your ISAM-A scheme, I'm all for promoting ISAM-A (as I think it offers several advantages over the old ISAM, more than enough to justify the extra cost). But I may need to play with your idea a bit before deciding if it really is the right application for it (not to mention whether there are some inconsistencies that need to be addressed).
It sounds to me like you want a fixed record size of 12 (for the two six-byte keys) and a total record size of 12 plus whatever your maximum data record size is.
Offhand it seems that ALLOCATE'INDEXED should handle this by setting the REC'SIZE field to the total record size and the VARIABLE'REC'SIZE field to the difference between that and 12. (I'm not sure why ISMUTL uses the total and fixed, while ALLOCATE'INDEXED uses the total and variable, but in theory they should both arrive at the same place.) The compression option is implicit whenever dealing with variable record sizes.
As for the efficacy of the compression, it only squeezes contiguous runs of nulls, 0's and spaces. This is generally about the best you can do for small blocks of data like individual records, at least without having a 'dictionary' of common patterns available. It's the same scheme now employed in the LOGFIL: type of file hook.
Currently the LOGFIL: hooks write out both the pre- and post- versions of each changed record, along with the "envelope", so it's a lot more verbose than just a "delta".
I'm curious whether you intend to either trust that the pre- version of the record matches the last post- version logged, or perhaps verify it, in order to avoid redundantly re-logging the pre- version on each change. There didn't seem to be any good way to do that in the general-purpose file hook, particularly without indexing, but perhaps in your version it would be realistic. But there's also the question of whether the extra work involved in minimizing the amount of log data written is a good trade-off.
One of the things we talked about at the Conference was the increased pressure on software to log everything so as to be able to go back and determine who changed anything. With that objective, it seems like ease of implementation and lowest performance overhead would be more important than saving disk space. Particularly since you can apply some kind of more effective compression algorithm to the entire log files in background (as you archive them hourly, daily, or whatever) than you can to the the individual records.
Well I'm rambling. I'll play with it a bit later today and maybe have a more definite answer by tomorrow.
Best of luck to both of you on your upcoming release!
|
|
|
Re: ISAM-A - Secondary Index & Compressed Records
#23852
18 Oct 18 04:10 PM
|
Joined: Jun 2001
Posts: 11,945
Jack McGregor
Member
|
Member
Joined: Jun 2001
Posts: 11,945 |
I modified the EXLIB example program ISPVAR.BP[908,33] to have 2 indices, so it should now be similar to what you are trying to do. (Just RUN ISPVAR and it will build the file and then load data into it from program source files in the [9??,*] directories.) The compression overall isn't terribly effective, since the total IDX size is still slightly larger than the total of the file data being loaded into it, but given the overhead of the two indices, the general overhead of index structures in general, and the fact that it is hard to do much compression one record at a time, perhaps it's as good as it's likely to get.
|
|
|
Re: ISAM-A - Secondary Index & Compressed Records
#23853
18 Oct 18 08:18 PM
|
Joined: Jun 2001
Posts: 153
OmniLedger - Tom Reynolds
OP
Member
|
OP
Member
Joined: Jun 2001
Posts: 153 |
Looks like you might have forgotten to push it to the bitbucket repository 
|
|
|
Re: ISAM-A - Secondary Index & Compressed Records
#23854
19 Oct 18 03:54 AM
|
Joined: Jun 2001
Posts: 11,945
Jack McGregor
Member
|
Member
Joined: Jun 2001
Posts: 11,945 |
Oops! (One week out of the office and I start forgetting basic things.) Should be good now.
|
|
|
Re: ISAM-A - Secondary Index & Compressed Records
#23855
22 Oct 18 09:14 PM
|
Joined: Jun 2001
Posts: 153
OmniLedger - Tom Reynolds
OP
Member
|
OP
Member
Joined: Jun 2001
Posts: 153 |
Thanks Jack, that all works perfectly, but only on the primary still.
It's possible I have an incomplete setup. I've not looked into ISAM-A in any great depth, as it was already functional on our development system, but I'm guessing that's possibly down to some old testing Barry did with it years back. As such I'm not actually familiar with the additional licence costs, or if what additional libraries or modules may need to be installed. It's more than likely we have some old components that might not be properly building the ISAM files.
Can you point me in the right direction for some documentation, or give me the details, so I can check everything is installed and setup correctly?
I've turned ISAM tracing on, and I'm simply getting the following using a simple FIND to test for existing data:
23-Oct-18 10:13:04 [p26845-7] FIND #1 23-Oct-18 10:13:04 [p26845-7] Trapped Basic Error #223 (No such secondary key number) at location counter 135
|
|
|
Re: ISAM-A - Secondary Index & Compressed Records
#23856
23 Oct 18 04:26 AM
|
Joined: Jun 2001
Posts: 11,945
Jack McGregor
Member
|
Member
Joined: Jun 2001
Posts: 11,945 |
In order to use ISAM-A, you need the ISMA flag on the license, as seen in ABOUT.LIT:
License Options: ISMA ATE ATS PDFX APEX AXL ASQL
Without that, I think you get some cryptic error, but not as cryptic as the one you're getting. So I'm guessing that your in-house development license has the option. (In general, we try to enable all the possible options for the development licenses.)
The library itself is static linked into the ashell or ashw32 executable, so that's not a factor. Although the version of the library might be an issue if running an older copy. You can see that with ISMUTL, which displays it on the top line, something like:
A-Shell Indexed Sequential Access Method Version: 6.11q MS9632/0999
(It's the ltter after 6.11 which matters.) If you have something much below q, maybe your version doesn't support the feature. Although support for multiple indices was there from the beginning, there could be a conflict with the variable length/compression feature.
Also, in the process of investigating this, I realized that we haven't configured the options for ISAM-A under Linux the same as for Windows. (Inexplicably, the shipping version under Windows doesn't enable variable length records; that's on my list to resolve but apparently none of the Windows users are missing it.)
As for the cost, I believe it is a 10% surcharge on the base license. That's probably not an issue if you're using it extensively, but it might be a stumbling block if you just want to use it for one file. It's mostly a pass-through cost though so I'm not sure we have much wiggle room.
If the light doesn't suddenly turn on here, feel free to send me the file you create with ISMUTL (along with the relevant A-Shell and ISAM-A/ISMUTL versions) and I'll see if it's any different than mine.
|
|
|
Re: ISAM-A - Secondary Index & Compressed Records
#23857
30 Oct 18 04:30 AM
|
Joined: Jun 2001
Posts: 153
OmniLedger - Tom Reynolds
OP
Member
|
OP
Member
Joined: Jun 2001
Posts: 153 |
I've not had much luck with the code you sent over, versions etc all seem to be correct, and it seems to have no issue creating the file or accessing it. But aborts if you attempt to interact with the second index.
On a related note, I've been exploring MySQL exports as an alternative, as we're planning on mirroring our data in MySQL via filehooks to give us some additional options for interfaces. While looking into a solution using MySQL I cam across MariaDB which is a branch of MySQL. It has some nice versioning utilities built into it. As such it might solve everything in one hit, rather than managing several solutions.
It seems to work perfectly fine with the existing MySQL interfaces, but I was wondering if you'd explored it at all.
|
|
|
Re: ISAM-A - Secondary Index & Compressed Records
#23858
30 Oct 18 07:15 AM
|
Joined: Jun 2001
Posts: 11,945
Jack McGregor
Member
|
Member
Joined: Jun 2001
Posts: 11,945 |
Sorry for the quality control failure here - indeed it was throwing a spurious invalid key # error when trying to access secondary keys within the context of a compressed or variable length record due to misinterpretation of some bit flags. (Apparently no one has ever tried to actually use a secondary key with variable length records; probably because variable length records are kind of exotic and rarely used.)
What version/platform are you interested in having this work in? (I'm inclined to leave 6.4 "stable", i.e. as-is, but could easily release the patch as 6.4.1558.5 if that's where you're at. Otherwise it would show up in 6.5.1649.7 or 6.5.1650.0, depending on whether it gets bundled with some other things in progress.)
As for MariaDB versioning, I haven't yet had a chance to investigate it, although it certainly sounds interesting. It was new in MariaDB 10.3; I'm not yet even sure if the existing ASQL connector is drop-in compatible with it or not. I'm guessing it needs a recompile.
As for the merits of versioning directly integrated into the DB tables vs. separated in log files, I guess my initial reaction would be fear, i.e. fear of possible ramifications of complicating the database with all of that extra information and configuration options. But it does appear that they have thought of most of the concerns, such as allowing you to partition the history to keep it out of the way of the active data, getting rid of old changes, excluding certain columns from versioning, etc.
So apart from putting a little more pressure on the DB adminstrator's management skills, it may actually be close to an ideal solution. At least in places where you had other reasons to want to store or mirror your data in a relational database. For people who just want to maintain change logs of ISAM or random data, the new LOGFIL: type of file hooks seem like a much simpler solution.
I'm definitely going to carve out some time to investigate this, starting with making sure we have a working ASQL connector.
|
|
|
Re: ISAM-A - Secondary Index & Compressed Records
#23859
30 Oct 18 09:11 PM
|
Joined: Jun 2001
Posts: 153
OmniLedger - Tom Reynolds
OP
Member
|
OP
Member
Joined: Jun 2001
Posts: 153 |
Thanks Jack. We are currently still in 6.4, but this would be part of a larger future release, so I'm not opposed to moving to 6.5 for new features.
I did a little testing with MariaDB myself, and it appears to be functional with the existing connector. From the little I've so far read, this isn't unusual though, as they appear to have designed it to keep it compatible with existing MySQL implementations.
As for the LOGFIL file hooks, they were also on my list of things to look into! The largest thing I could see needing from them is the ability for them to store more about the environment that triggered them. This is mainly related to holding onto information about the user that actually performed the action. FileHooks already pass the user information, but from what I recall that is the linux user. The issue for us is that every user uses the same linux login, and their application account is managed separate to that inside our software. My existing FileHooks implementation is able to get this data as it's a subroutine executed in the running process, so has access to some of that information. The LOGFIL is outside of our control, so we'd need an additional way of passing information to it in some way.
|
|
|
Re: ISAM-A - Secondary Index & Compressed Records
#23860
31 Oct 18 01:08 AM
|
Joined: Nov 2006
Posts: 2,278
Stephen Funkhouser
Member
|
Member
Joined: Nov 2006
Posts: 2,278 |
Hi Tom,
We also have a single Linux login, and application enforced authentication system. We use MX_SETUSRNAM to override the server login, and that is set correctly in MX_FILEHOOK event envelopes. I assume that should also be the case for LOGFIL.
Stephen Funkhouser Diversified Data Solutions
|
|
|
Re: ISAM-A - Secondary Index & Compressed Records
#23861
31 Oct 18 03:06 AM
|
Joined: Jun 2001
Posts: 153
OmniLedger - Tom Reynolds
OP
Member
|
OP
Member
Joined: Jun 2001
Posts: 153 |
Thanks Stephen, I hadn't actually noticed that before. Barry's old user management code has been in place for so long that I unwisely didn't go looking too deeply into it.
This will actually be very useful!
|
|
|
Re: ISAM-A - Secondary Index & Compressed Records
#23862
31 Oct 18 04:39 PM
|
Joined: Jun 2001
Posts: 11,945
Jack McGregor
Member
|
Member
Joined: Jun 2001
Posts: 11,945 |
Ok, so here's a 6.5 update which fixes the ISAMA issue. (If you need another platform/variation, let me know.) ash65notes.txt ash-6.5.1650.0-w32-upd.zip ash-6.5.1650.0-w32c-upd.zip ash-6.5.1650.0-el7-upd.tz compil-6.5.888-el7.tz As for the ASQL connector, it doesn't surprise me that the existing ones work with the MariaDB or MySQL servers (as you say, that's by design). What I'm not yet sure about is whether there will be a library incompatibility between the SQL connector and the MariaDB client library (like you ran into recently with MySQL 8). I still have to look into that.
|
|
|
Re: ISAM-A - Secondary Index & Compressed Records
#23863
31 Oct 18 08:42 PM
|
Joined: Jun 2001
Posts: 153
OmniLedger - Tom Reynolds
OP
Member
|
OP
Member
Joined: Jun 2001
Posts: 153 |
Thanks Jack, will take a look at get back to you.
|
|
|
Re: ISAM-A - Secondary Index & Compressed Records
#23864
25 Nov 18 09:11 PM
|
Joined: Jun 2001
Posts: 153
OmniLedger - Tom Reynolds
OP
Member
|
OP
Member
Joined: Jun 2001
Posts: 153 |
I got completely distracted away from this as Daisy was born just over 2 weeks ago now! 6.5 should be fine for this thanks Jack. I haven't had a chance to look back into it since I've been back, but should this feature roll out it will definitely be in a general update to the system, so there no need to be too precious about sticking to 6.4.
Thanks!
|
|
|
Re: ISAM-A - Secondary Index & Compressed Records
#23865
26 Nov 18 02:36 AM
|
Joined: Jun 2001
Posts: 11,945
Jack McGregor
Member
|
Member
Joined: Jun 2001
Posts: 11,945 |
Excellent! And congratulations!
|
|
|
|
|