A question for the DB gurus ...
Given a data set out of my control (I only load the data into my system) , and which has 30 million records, and which is uniquely identified by three keys
AddressID (9 digit int)
OrgID (9 digit int)
Type (1 char)
what would be the most efficient way of indexing said record in order to identify it ?
1) three fields, one unique index with the members AddressID, OrgID and Type
2) 1 composite char field (AddressID-OrgID-Key) and one unique index with the composite field as the only member
The keys are only ever used to id a record for deletion or update (we get an update file with these three keys).
It throws up interesting questions about the index mechansims used by progress .
I know that integers are stored more efficiently than char, so both the record and index size would be a lot bigger with a composite key, but is finding a single record using a single (composite) key more or less efficient than a three-member key (field a, field b, field c)
My inclination is to use the three-field strategy rather than the composite, but really want to hear the opinions of others more older and wiser . No insults meant . Except for Tom. You probably haven't seen the faecesbook comment he made about me
I would be less concerned about some putative efficiency than I would about rightness. You have three distinct things so I would vote for three distinct fields. No telling how you might want to access it some day.
Consulting in Model-Based Development, Transformation, and Object-Oriented Best Practice http://www.cintegrity.com
Are you saying you will catenate the three field values together into a single string (converting ints to char) and then index that?
If that is what you mean, then defining an index where the key has three field compnents will be slightly better. They key values will be a little smaller, the index will be a little smaller, and performance will a little faster. But maybe not enough to notice.
Catenating the three fields together into an additional char fields will make the record bigger. So by doing that you will consume more storage space for the records as well as for the index.
No, I have not seen that faecesbook comment. Not any other.
Not to mention that, if one *ever* wants to access the components, one has to have *both* the concatenated string version *and* the integer version or pay the price (ugh) of trying to access via substring.
That is what I meant when I said the record would be bigger, but reading what I wrote, I see I was not very clear.
The record would be bigger for several reasons:
0) two copies of the data: three separate fields and the catenated field.
1) the integers are smaller as integer than when converted into characters.
2) the more field values there are, the more overhead there is in a record.
Furthermore, some cleverness would be required to get the catnenated field to sort right if you cared about the ordering of the index entries. Consider the strings "1", "2", "11".
Your requirement is unclear to me.
When you say "is uniquely identified by 3 keys" do you mean that each of them could work? Or that only all 3 in combination make up a unique identifier?
I suspect that you mean that only the combination of the 3 fields is expected to be unique.
Are there queries where only 1 of the 3 is important? How about 2? Are the leading components always the same?
IMHO an index with 3 components is your best approach. Composite fields pretty much always end up biting you in the posterior. They are, after all, a violation of 3rd normal form.
As for the alleged FB comment -- I complimented your better half! I should think that some gratitude would be in order...
What is 3rd normal form? What is wrong with violating it? Will I have to pay a fine? Will someone make a notation in my permanent folder?
1) only the combination of the 3 fields is expected to be unique
2) the only query on this key would be for all three components. There
is no other combination
3) I have gone down the 3 component index route
FB: yeaaahhhh - the other half was very suitably happy . I wasn't .
"Each attribute must be a fact about the key, the whole key and nothing but the key." -- Wiorkowski & Kull