![]() | |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
Hi, I'm currently working on a system that consists of several sites. Each site runs a system with its own local database, and then periodically each site updates the "master" site with its data, allow the data to be viewed centrally. It must be done like this because each site may not always be on the network, but the data must always be available from the central server. To enable the master database to contain all the site database information, I use a composite key, with the first part being the site identification, and the second part being the row identification. This allows user with row id 5 for instance to exist more than once because paired with the site id it is still unique. On top of this, there is another layer. Each site must be contained within an "Enterprise", which is just another identifier. So, I have a table called enterprise which contains the list of site ids. So far then, I have this (obviously cut down to only show the relevant columns) : - table: enterprise column: id table: site column: id table: user column: site_id column: id column: label etc.. With the site_id and id making the composite key. Now, I've just had to add the concept of "enterprise" level users. Initially the only clean way I could think of doing this is having two separate user tables; enterprise_user and site_user : - table: enterprise_user column[key]: enterprise_id column[key]: id column: label etc.. table: site_user column[key]: site_id column[key]: id column: label etc.. However this does not seem very "normalized".. Other ways I considered are : - table: user column[key]: id column[key]: enterprise_id column: site_id column: label etc.. But this does not make sense as an enterprise user is not associated with any site. I could have a dummy site, but I really do not like that. I could just set site_id to NULL, but I'm not a big fan of NULL in database and it just seems a bit hacky. So then I thought of adding a link table: table: user column[key]: id column[key]: enterprise_id etc.. table: user_site <== link table column: enterprise_id column: user_id column: site_id etc.. with a unique constraint on enterprise_id, user_id (the composite key that identifies a user in an enterprise). I think that also works.. basically with no entry in the "user_site" table, they are enterprise users. An entry in user_site effectively limits them to being a site user. However.. this might create complications in the "master" update process. Part of the batch update process involves updating user information. Enterprise users (only editable at the master) must be propagated down to each site, so that the site is "aware" of enterprise users. Site users (only editable at the site) must be propagated up to the master, purely so all the (read only) data at the master is complete. So I'm wondering if having the two tables separate will be just much simpler (although not being terribly efficient in terms of pretty much repeating a table structure). I hope all that makes sense! Thanks, Mark. |
#3
| |||
| |||
|
|
Mark S. (UK) wrote: Hi, I'm currently working on a system that consists of several sites. Each site runs a system with its own local database, and then periodically each site updates the "master" site with its data, allow the data to be viewed centrally. It must be done like this because each site may not always be on the network, but the data must always be available from the central server. To enable the master database to contain all the site database information, I use a composite key, with the first part being the site identification, and the second part being the row identification. This allows user with row id 5 for instance to exist more than once because paired with the site id it is still unique. On top of this, there is another layer. Each site must be contained within an "Enterprise", which is just another identifier. So, I have a table called enterprise which contains the list of site ids. So far then, I have this (obviously cut down to only show the relevant columns) : - table: enterprise column: id table: site column: id table: user column: site_id column: id column: label etc.. With the site_id and id making the composite key. Now, I've just had to add the concept of "enterprise" level users. Initially the only clean way I could think of doing this is having two separate user tables; enterprise_user and site_user : - table: enterprise_user column[key]: enterprise_id column[key]: id column: label etc.. table: site_user column[key]: site_id column[key]: id column: label etc.. However this does not seem very "normalized".. Other ways I considered are : - table: user column[key]: id column[key]: enterprise_id column: site_id column: label etc.. But this does not make sense as an enterprise user is not associated with any site. I could have a dummy site, but I really do not like that. I could just set site_id to NULL, but I'm not a big fan of NULL in database and it just seems a bit hacky. So then I thought of adding a link table: table: user column[key]: id column[key]: enterprise_id etc.. table: user_site <== link table column: enterprise_id column: user_id column: site_id etc.. with a unique constraint on enterprise_id, user_id (the composite key that identifies a user in an enterprise). I think that also works.. basically with no entry in the "user_site" table, they are enterprise users. An entry in user_site effectively limits them to being a site user. However.. this might create complications in the "master" update process. Part of the batch update process involves updating user information. Enterprise users (only editable at the master) must be propagated down to each site, so that the site is "aware" of enterprise users. Site users (only editable at the site) must be propagated up to the master, purely so all the (read only) data at the master is complete. So I'm wondering if having the two tables separate will be just much simpler (although not being terribly efficient in terms of pretty much repeating a table structure). I hope all that makes sense! Thanks, Mark. After some further thought, I've realised there is another problem: user name (or label as I called it above) uniqueness. A user name at a site may be re-used, as the site is pretty much a self contained system. So a user "fred" can exist at site 1, and at site 2. Enterprise users however must remain unique. So.. site users are unique to sites, and enterprise users are unique to the enterprise (which is the "container" for sites). This brings up a problem with the normalized design: table: user column[key]: enterprise_id column[key]: id column: label add constraint unique on user(enterprise_id, label) table: user_site <== link table column: enterprise_id column: user_id column: site_id This is fine for keeping users unique at the enterprise level, but doesn't work out for site users when they've been propagated back to the "master" database. So, currently I'm thinking two separate tables. I'm not repeating data, I'm just repeating some elements of a table, so perhaps it isn't that bad? Having read these posts back, I'm not sure I've explained the scenario very clearly. I hope it makes some sense at least. Mark. |
#4
| |||
| |||
|
#5
| |||
| |||
|
|
Mark S. (UK) wrote: Mark S. (UK) wrote: Hi, I'm currently working on a system that consists of several sites. Each site runs a system with its own local database, and then periodically each site updates the "master" site with its data, allow the data to be viewed centrally. It must be done like this because each site may not always be on the network, but the data must always be available from the central server. To enable the master database to contain all the site database information, I use a composite key, with the first part being the site identification, and the second part being the row identification. This allows user with row id 5 for instance to exist more than once because paired with the site id it is still unique. On top of this, there is another layer. Each site must be contained within an "Enterprise", which is just another identifier. So, I have a table called enterprise which contains the list of site ids. So far then, I have this (obviously cut down to only show the relevant columns) : - table: enterprise column: id table: site column: id table: user column: site_id column: id column: label etc.. With the site_id and id making the composite key. Now, I've just had to add the concept of "enterprise" level users. Initially the only clean way I could think of doing this is having two separate user tables; enterprise_user and site_user : - table: enterprise_user column[key]: enterprise_id column[key]: id column: label etc.. table: site_user column[key]: site_id column[key]: id column: label etc.. However this does not seem very "normalized".. Other ways I considered are : - table: user column[key]: id column[key]: enterprise_id column: site_id column: label etc.. But this does not make sense as an enterprise user is not associated with any site. I could have a dummy site, but I really do not like that. I could just set site_id to NULL, but I'm not a big fan of NULL in database and it just seems a bit hacky. So then I thought of adding a link table: table: user column[key]: id column[key]: enterprise_id etc.. table: user_site <== link table column: enterprise_id column: user_id column: site_id etc.. with a unique constraint on enterprise_id, user_id (the composite key that identifies a user in an enterprise). I think that also works.. basically with no entry in the "user_site" table, they are enterprise users. An entry in user_site effectively limits them to being a site user. However.. this might create complications in the "master" update process. Part of the batch update process involves updating user information. Enterprise users (only editable at the master) must be propagated down to each site, so that the site is "aware" of enterprise users. Site users (only editable at the site) must be propagated up to the master, purely so all the (read only) data at the master is complete. So I'm wondering if having the two tables separate will be just much simpler (although not being terribly efficient in terms of pretty much repeating a table structure). I hope all that makes sense! Thanks, Mark. After some further thought, I've realised there is another problem: user name (or label as I called it above) uniqueness. A user name at a site may be re-used, as the site is pretty much a self contained system. So a user "fred" can exist at site 1, and at site 2. Enterprise users however must remain unique. So.. site users are unique to sites, and enterprise users are unique to the enterprise (which is the "container" for sites). This brings up a problem with the normalized design: table: user column[key]: enterprise_id column[key]: id column: label add constraint unique on user(enterprise_id, label) table: user_site <== link table column: enterprise_id column: user_id column: site_id This is fine for keeping users unique at the enterprise level, but doesn't work out for site users when they've been propagated back to the "master" database. So, currently I'm thinking two separate tables. I'm not repeating data, I'm just repeating some elements of a table, so perhaps it isn't that bad? Having read these posts back, I'm not sure I've explained the scenario very clearly. I hope it makes some sense at least. Mark. Mark, I don't think theory-based answers to your questions exist, and I also find design via usenet counterproductive at best. You are struggling with a number of issues including distribution and partial fault tolerance. When needed, these are very important considerations. As a general principle, it is best to handle your distribution and fault tolerance needs at the physical level of discourse without exposing any of that in your logical data model. You are trying to handle it entirely at the logical level, which I think is probably inappropriate. Pragmatically, it is best to consider these issues at the earliest stages and to use your distribution and fault tolerance requirements as a basis for comparing dbms products and then choosing the dbms product that best addresses your distribution needs. Once you have chosen a dbms product, which it sounds like you have already done, then it is best to consider the specific features that dbms product has for handling distribution, replication, fault tolerance et cetera physically with the least impact to your logical design. Many products now have features for distribution or replication, for assigning globally unique numeric identifiers, for reconciling off-line changes and more. Perhaps even the product you use has such features. The issue you have with user id's is a basic requirements and design issue. Assuming you are using user id in the conventional way, the purpose of a user id is to uniquely identify users of a system. Pragmatically, the scope of such identity has to be an entire independent system. Generally, that means globally throughout a system. In your case, that may not be true. Individual sites, though, are not independent systems because they are tied together into enterprises. It may be your enterprises are entirely independent from one another. Or it may be they have interdependencies too. Sometimes, it makes sense to scope user ids to have partial independence, which is what email addresses do. The entire globally unique id is the full email address with a user name, the @ symbol, and a domain, while the locally unique id is simply the user name. Doing so simplifies some things while complicating others. In any case, I think you need to decide the scope of your user ids based on what users will expect and what meets the needs of the system you are designing. You may have to trade off conflicts between them. In the end, at this point in time, design is more art than science. |
#6
| |||
| |||
|
|
Bob Badour wrote: |
|
Thank you for your well thought out and detailed answer Bob; it makes a lot of sense and you've given me a lot to think about. Take care, Mark. |
![]() |
| Thread Tools | |
| Display Modes | |
| |