dbTalk Databases Forums  

Internal database structure

comp.databases.pick

#1
Noah

Internal database structure - 09-16-2005, 03:28 PM






I must be a real geek. I got interested in the internal structure of the
underlying database and now I just can't stop.

HELP! SAVE ME FROM MYSELF! <GRIN>

Seriously, can anyone save me some time and point me to a document that
details the Pick/Unidata internals?

Thanks,

Noah
replace spam with myname


#2
rog

Re: Internal database structure - 09-16-2005, 03:56 PM






"Noah" <spam.hart (AT) gmail (DOT) com> wrote

Quote:
I must be a real geek. I got interested in the internal structure of the
underlying database and now I just can't stop.

HELP! SAVE ME FROM MYSELF! <GRIN>

Seriously, can anyone save me some time and point me to a document that
details the Pick/Unidata internals?

http://www.u2ug.org/

Will point you to a user email list that may help with specific
problems.

http://www-306.ibm.com/software/data/u2/pubs/library/

Will then let you download specific Unidata documents.

Good luck.

rog




#3
Mark Brown

Re: Internal database structure - 09-16-2005, 06:38 PM



I think I can safely say without fear of lawsuit or having my legs broken:


<hdr> data <sm> <hdr> data <sm><sm>

Header is different for different implementations. In R83, it was a 4-byte
ASCII hex length of the item. That's why R83 items were limited to 8FFF or
32K.

On D3, there's an 8-byte header with a 4-byte length field and 4 bytes of
"flags" that tell if the item is new, stolen, moved, updated since last file
save, deleted, etc.

Data is just a string of characters with embedded delimiters.
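
As a rough illustration of the <hdr> data <sm> layout, here is a minimal Python sketch that walks a group buffer. It assumes the R83-style header (4 ASCII hex digits) and assumes the length counts only the data bytes; a real implementation may well count the header and the mark too. It uses 0xFF for the segment mark. The names parse_group and build_group are hypothetical and not part of any Pick system.

```python
SM = 0xFF  # segment mark
HDR_LEN = 4  # assumed R83-style header: 4 ASCII hex digits


def build_group(items):
    """Lay out items as <hdr> data <sm> ... and close the group with a second <sm>."""
    out = bytearray()
    for data in items:
        # Assumption: the header holds the length of the data bytes only.
        out += f"{len(data):04X}".encode("ascii") + data
        out.append(SM)
    out.append(SM)  # two segment marks in a row end the group
    return bytes(out)


def parse_group(buf):
    """Return the data portion of each item in a group buffer."""
    items = []
    pos = 0
    # A segment mark where a header should start means end of group.
    while pos < len(buf) and buf[pos] != SM:
        if pos + HDR_LEN > len(buf):
            raise ValueError(f"truncated header at offset {pos}")
        length = int(buf[pos:pos + HDR_LEN].decode("ascii"), 16)
        start = pos + HDR_LEN
        end = start + length
        # Every item must be followed by a segment mark.
        if end >= len(buf) or buf[end] != SM:
            raise ValueError(f"group format error near offset {pos}")
        items.append(buf[start:end])
        pos = end + 1
    return items
```

Building a group from two items and parsing it back returns the same two byte strings, while a truncated buffer raises ValueError, which is roughly the condition a group format error describes.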

Some versions (ARev, IIRC) store a length field at the start of every
"segment" defined as a delimited string. That way they can have embedded
binaries and arrays within arrays.

Every frame has an area outside the data area for storage of LINK data.
Links usually contain forward and backward FIDs and forward and backward
"number of contiguous" frames for binaries such as PickBasic object code.

A segment mark <sm> ends the item. Two segment marks in a row end the
group. On D3, because we are "half-word aligned", every item starts on an
even byte (0, 22, 2022) and every item ends on an odd byte. In that case,
there MAY be two segment marks together to pad out the actual length of the
item. Also, in D3, groups are updated in one of two ways: pull-up or
in-place. If the update is pull-up, the item is removed from the group, the
group is pulled up to fill in the empty space, and the new item is appended
to the end. If it's an in-place update (item length <= original, or a group
update lock is in effect), the item may be padded with segment marks, but the
length field still points to the last byte (sm) of the item. It's just that
when the item is processed, processing stops at the first segment mark.
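
The two update styles can be sketched in a few lines of Python. This is a toy model only: the group is a list of [item-id, slot bytes] pairs rather than a real frame, the group update lock case is left out, and update_item and read_item are hypothetical names.

```python
SM = b"\xff"  # segment mark


def read_item(slot):
    """Processing stops at the first segment mark, so padding is ignored."""
    return slot.split(SM, 1)[0]


def update_item(group, item_id, new_data):
    """Update an item in a toy group; returns which strategy was used."""
    for idx, (iid, slot) in enumerate(group):
        if iid == item_id:
            break
    else:
        # Item not present yet: simply append it to the end of the group.
        group.append([item_id, new_data + SM])
        return "appended"

    if len(new_data) + 1 <= len(slot):
        # In-place: the slot keeps its original length, padded with segment marks.
        group[idx][1] = (new_data + SM).ljust(len(slot), SM)
        return "in-place"

    # Pull-up: remove the old item (the rest of the group closes the gap)
    # and append the new version at the end.
    del group[idx]
    group.append([item_id, new_data + SM])
    return "pull-up"
```

With a group holding items A and B, shrinking A leaves it in the same slot with extra segment marks behind the data, while growing A moves it behind B.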

Mark Brown
x-Pick cave-dweller (Rick Davies has a picture to prove it)

"The only kind of gun control this country needs is a sharp eye and a steady
hand."


"Noah" <spam.hart (AT) gmail (DOT) com> wrote

Quote:
I must be a real geek. I got interested in the internal structure of the
underlying database and now I just can't stop.

HELP! SAVE ME FROM MYSELF! <GRIN>

Seriously, can anyone save me some time and point me to a document that
details the Pick/Unidata internals?

Thanks,

Noah
replace spam with myname




#4
David Morris

Re: Internal database structure - 09-17-2005, 01:10 AM



Mark Brown once wrote in <FHIWe.7121$Gh.4232 (AT) tornado (DOT) socal.rr.com>...
Quote:
I think I can safely say without fear of lawsuit or having my legs broken:


<hdr> data <sm> <hdr> data <sm><sm>

Header is different for different implementations. In R83, it was a 4-byte
ASCII hex length of the item. That's why R83 items were limited to 8FFF or
32K.

On D3, there's an 8-byte header with a 4-byte length field and 4 bytes of
"flags" that tell if the item is new, stolen, moved, updated since last file
save, deleted, etc.

I don't know whether it's unique to mvBase or not, but there are also
'indirect' items... I'm pretty certain it's not. Indirect items are
items longer than (I think) 1k (on a 2k frame system). Instead of
storing the item 'in group', the item is stored in linked space outside
the file and a pointer to the start frame of the item is stored in the
file itself. It's quite reasonable to get a mixture of direct and
indirect items in the same file. It's invisible at application level.


--
David Morris


#5
Tony Gravagno

Re: Internal database structure - 09-17-2005, 05:22 AM



David Morris <david (AT) 127 (DOT) 0.0.1> wrote:
Quote:
I don't know whether it's unique to mvBase or not, but there are also
'indirect' items... I'm pretty certain it's not. Indirect items are
items longer than (I think) 1k (on a 2k frame system). Instead of
storing the item 'in group', the item is stored in linked space outside
the file and a pointer to the start frame of the item is stored in the
file itself. It's quite reasonable to get a mixture of direct and
indirect items in the same file. It's invisible at application level.

D3 does this too with items over 1700 characters (D3 has 4k frames).
A file with a DP pointer will store all items in pointer space
regardless of size. Before creating a file like this you need to
think of how it will be used, sort of like U2 file types.
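
To make the direct versus indirect idea concrete, here is a toy Python model. The 1700-character threshold and 4k frame size are the D3 figures quoted above; everything else (the ToyFile class, the dicts standing in for the group and for linked frame space, the starting FID of 1000) is an assumption for illustration and says nothing about how D3 or mvBase lay things out on disk.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFile:
    indirect_threshold: int = 1700  # D3 figure quoted in this thread
    frame_size: int = 4096          # D3 4k frames
    group: dict = field(default_factory=dict)     # item-id -> (kind, payload)
    overflow: dict = field(default_factory=dict)  # fid -> (next fid, chunk)
    next_fid: int = 1000            # hypothetical first free frame

    def write(self, item_id, data):
        if len(data) <= self.indirect_threshold:
            # Direct item: the data lives in the group itself.
            self.group[item_id] = ("direct", data)
            return
        # Indirect item: data goes to linked frames, the group keeps a pointer.
        chunks = [data[i:i + self.frame_size]
                  for i in range(0, len(data), self.frame_size)]
        first = self.next_fid
        for n, chunk in enumerate(chunks):
            fid = first + n
            nxt = fid + 1 if n + 1 < len(chunks) else 0  # 0 ends the chain
            self.overflow[fid] = (nxt, chunk)
        self.next_fid += len(chunks)
        self.group[item_id] = ("indirect", first)

    def read(self, item_id):
        kind, payload = self.group[item_id]
        if kind == "direct":
            return payload
        out = bytearray()
        fid = payload
        while fid:  # follow forward links until the end of the chain
            fid, chunk = self.overflow[fid]
            out += chunk
        return bytes(out)
```

The caller uses write and read the same way for both kinds of item, which is the "invisible at application level" point.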

T


#6
Noah

Re: Internal database structure - 09-19-2005, 01:42 PM



That's clear, Mark, and the investigation I've done does show a
structure similar to this:
For each File:
000 Modulo 4 WORDS (Aka GROUPS)
....
013 1 (Blocksize -1)
<snip>

For each Frame (ie, group * &h400)
xx04 WORDS LEFT IN BLOCK
<snip>
xx14 NKEYS
xx16 KEY BUFFER LENGTH
xx18 OFFSET to Start of Data
<snip>
xx20 KEY 1 DATA LOCATION
xx24 KEY 1 DATA SIZE
xx28 KEY 2 DATA LOCATION
xx32 KEY 2 DATA SIZE
<snip>

Without wanting you to get your legs broken ...
Sometimes, the start of the block data is not at
(block size + 1)*&h400, but is offset a number of words. Isn't the
frame size a multiple of 1024?
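
For reference, the two address calculations in these notes work out as follows in Python. This only restates the arithmetic from the listing above; the function names are made up, and whether the stored field really means "blocksize - 1" is my reading of the file, not something confirmed.

```python
FRAME_UNIT = 0x400  # 1024 bytes, the &h400 in the notes above


def frame_offset(group_number):
    """Offset of a group's frame, per the 'group * &h400' note."""
    return group_number * FRAME_UNIT


def expected_data_start(stored_blocksize):
    """Where block data was expected to start: (block size + 1) * &h400."""
    return (stored_blocksize + 1) * FRAME_UNIT
```

With the stored value of 1 seen at offset 013, expected_data_start(1) is &h800, and any data found past that point is the unexplained offset.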

Thanks,

Noah


#7
Mark Brown

Re: Internal database structure - 09-19-2005, 03:56 PM



That depends on the implementation. The frame size is usually a multiple of
1024, but frames may run from .5K (R83) to 4K (D3 and others) to larger in
D3's FSI.

Mark


"Noah" <spam.hart (AT) gmail (DOT) com> wrote

Quote:
Without wanting you to get your legs broken ...
Sometimes, the start of the block data is not at
(block size + 1)*&h400, but is offset a number of words. Isn't the
frame size a multiple of 1024?

Thanks,

Noah




#8
dennis bartlett

Re: Internal database structure - 09-21-2005, 10:03 AM



Actually it depends on what the guy really means by wanting to know the
structure... in days of yore we commonly got terrible sores in our
databases - these blistering pustules were called GFEs and had to be
exorcised using the mighty powers begat from fiddling about in debug.

A GFE is named thus as an acronym from Group Format Error, a condition
so vile that imminent loss of data was certain, not just suspected,
unless a skillful practitioner of the unheard of art of GFE repairs was
present.

What said practitioner had to do was list the contents of the hard disk
in frames, taking note of start and end delimiters. Each frame begat a
special section at the start that recorded such things as the number of
the frame which led to this frame, the number of the frame to which the
HDD controller would jump after leaving this frame, and, most
importantly, how long this frame was (in bytes).

With this detail, the druid would squint knowingly at the screen,
counting off bytes, verifying these hidden details. The GFE
notification (usually a system log error) would have given him or her
an offset address at which to start the weary count. The error itself
would have caused one frame to report an incorrect next frame, or two
frames to report a single frame as both their destinations.

Changes were made via the debugger, using a series of peeks and pokes.
At every one of these the audience would hold their breath, knowing
that had this high priest of the dark arts not been available they
would have been doomed to recapture many hours of data.

Once the change had been made, and the whole succession of naughty
frames duly dealt with, long sighs of relief were made and hearty
adulation applied to the geek, but alas, all these things were not
remembered the next day, whereupon the airhead was relegated to the
back office to await the glory that would once again be showered upon
their blessed souls at the next GFE!

To this day other operating environments have similar tribulations,
only they were clever enough to cover the symptoms using busy screens
like scandisk, or the BSOD. However they cover it up, you can be sure
that you never really needed to lose that data - the airheads could
have reclaimed it. You can be sure... :-)


#9
Mark Brown

Re: Internal database structure - 09-21-2005, 02:01 PM



Sort of...

The most common GFE of all time happened after a power failure of some
sort. Whatever was in memory didn't get flushed, so the overflow table
wasn't updated. The next time the system started, it grabbed the old
overflow table and started handing out frames. If you had just used one of
those to link onto your file just before the failure, you might have a
data frame now pointing into a report or another file. When two groups link
forward to the same frame, one always loses.
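
The "two groups link forward to the same frame" condition can be checked mechanically once the forward links are in hand. A minimal Python sketch, assuming the links have already been read into a dict of FID to next FID, with 0 meaning end of chain. That representation and the function name are assumptions for illustration, not how any Pick system exposes its link fields.

```python
from collections import defaultdict


def find_link_conflicts(forward_links):
    """Return {target fid: [source fids]} for frames claimed by more than one link."""
    sources = defaultdict(list)
    for fid, nxt in forward_links.items():
        if nxt:  # 0 means the chain ends here
            sources[nxt].append(fid)
    return {target: sorted(fids)
            for target, fids in sources.items()
            if len(fids) > 1}
```

For example, if frames 100 and 200 both link forward to frame 101, the result is {101: [100, 200]}, and one of those two chains is the one that "loses".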

Mark

"dennis bartlett" <dennis (AT) freshmarksystems (DOT) co.za> wrote

Quote:
actually it depends on what the guy really means by wanting to know the
structure...
However they cover it up, you can be sure
that you never really needed to lose that data - the airheads could
have reclaimed it. You can be sure... :-)




#10
dennis bartlett

Re: Internal database structure - 09-22-2005, 08:09 AM



yeah, true.

most pick systems I ever worked on had so much duplication of data one
could actually recover if one just had the time, but yes, anything in
cache is lost. The essential bummer was when the GFE wasn't picked up
until that segment of the data was accessed, and then all hope was
lost.

anyhow we survived, we grew old, we created forums (or rather the youth
did) just so we could fondly reminisce about "those days" while we fish
around in the glass for our teeth... (hee hee)

by the way, how come my email addy is showing up when you quoted my
reply?

