#2
I'm hoping to write another blog entry (have been slack for a while) related to modeling data in MV and I could use some help. I thought I would state that up front so you know to ignore if you have no interest in such.

Where traditional relational theory is easy to teach because it has that nice 1, 2, 3 NF thing going for it so you can teach how to put the data in first normal form and then go from there, typical MV s/w developers might be more inclined to say something like "you just do it" or "model the data the way it makes sense" or "model the data to align with reality." My take is that there really are some best practices that could be taught, even if many of them are not as mathematical as the normal forms. Perhaps some of the following are agreeable?

--designing a solution without sub-values, just files, attributes including single and multi-values as well as associated multi-values

--using a single file for all of the relatively small lookup tables (valid values for marital status, gender, hair color, customer status, etc) which is frowned upon in the SQL world (we discussed this a while back on cdp)

--using files for metadata and generating dicts from those, using these files as the basis for data entry forms/screens/pages & related constraints as well as user-maintained business rules

--adding some misc user-defined single and multi-valued fields to various files without any specified intent so users can use them as needed to help tailor the system for their needs over time (like the star in gmail)

--creating return links so you can navigate from file to file in both directions (things like that really fly in the face of relational theory, but hey ho)

--adding associated multivalues with a status code and date stamp for any status codes where history should be kept (rather than requiring data marts and data warehouses for such information) and using this same pattern for other specific attributes as needed, howbeit sparingly.

--keeping properties (not well-defined), whether single or multi-valued, as attributes of an entity (also not well-defined other than entity = "thing") in a single file definition rather than splitting them out into separate files

--splitting out attributes into separate files based on functional dependencies, such as is done in relational normalization, but getting data into 3rd normal form without the need of 1NF.

--identifying the main entities about which users will want to report and beefing up those dictionaries (entities) with a lot of virtual fields (or using 3rd party tools for reporting)

--choosing naming conventions, such as plural names for multivalues, singular for single-values, and plural for entities (relational standards are for tables to be singular, but LIST statements read much better with plural names)

I'm guessing I could come up with a bunch of other ideas for data modeling best practices with multi-values since the above possibilities are jumping off my fingertips, but this gives the idea of what I'm looking for.

So, in answer to the question of what might be some industry best practices for modeling/designing data for a Pick/MV implementation, which of the above do you agree with and what would you add? Thanks.

--dawn

P.S. In the resulting blog entry, I'll acknowledge all those who help build or refine the list unless you request no such acknowledgement.
#3
Hi Dawn,

I'm going to top, inter-, and bottom post - I hope for clarity.

I think it's easiest if one considers the Pick data structure to be a physical implementation of a version of the Relational Model.

For me, that seems to guide my efforts without restricting what I can do.

dawn wrote:
> I'm hoping to write another blog entry (have been slack for a while) related to modeling data in MV and I could use some help. I thought I would state that up front so you know to ignore if you have no interest in such. Where traditional relational theory is easy to teach because it has that nice 1, 2, 3 NF thing going for it so you can teach how to put the data in first normal form and then go from there, typical MV s/w developers might be more inclined to say something like "you just do it" or "model the data the way it makes sense" or "model the data to align with reality." My take is that there really are some best practices that could be taught, even if many of them are not as mathematical as the normal forms. Perhaps some of the following are agreeable?
>
> --designing a solution without sub-values, just files, attributes including single and multi-values as well as associated multi-values

I agree that subvalues ought not to be used. Primarily this is because they are not supported by the enquiry language. Secondarily because the structure of an item becomes too complex to be readily understood.

However, I have seen them used in one circumstance that made sense to me. This was in an application that was originally written with ALL (Applied Language Liberator - I think). In ALL, an attribute could contain a table - with the value marks being record separators and the subvalue marks acting as attribute marks. In order to report on the subvalues I used a dictionary program to convert them to multivalues, which was something like this:

    SUBROUTINE MULTISUB(THAT,ATTR,SUBV)
    *
    * Where data is kept in subvalues, this returns a multi-valued list
    * from a given attribute of each occurrence of the specified subvalue.
    *
    THAT = ''
    * Both positions must be whole numbers, otherwise return an empty list
    IF NOT(ATTR MATCHES '1N0N') THEN RETURN
    IF NOT(SUBV MATCHES '1N0N') THEN RETURN
    * Take the attribute that holds the embedded table
    LINE = @RECORD<ATTR>
    YYNO = DCOUNT(LINE,@VM)
    * Copy subvalue SUBV of each value into the matching value of THAT
    FOR YY = 1 TO YYNO
       THAT<1,YY> = LINE<1,YY,SUBV>
    NEXT YY
    RETURN

> --using a single file for all of the relatively small lookup tables (valid values for marital status, gender, hair color, customer status, etc) which is frowned upon in the SQL world (we discussed this a while back on cdp)

This is a reasonable way to reduce the number of files. It makes perfect sense when this is regarded as a physical implementation of a relational model. Each table is its own relation from the logical point of view, but each one is an item in a file from the physical point of view.

I would use multiple files to store tables of similar types. One might be for the simplest type where there is a code and a description (like gender). Another might be for more complex ones, for example those that have a code, a description, an implementation date, and an obsolescence date. For me, this is a bit tidier than cramming all the tables in one file.
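
As an illustration only (not from the original post, and written in Python rather than Pick BASIC so it can be run anywhere), the "one item per lookup table, one file per table shape" idea might be sketched as below. The file names `SIMPLE_TABLES` and `DATED_TABLES`, the item ids, and the helper `describe` are all made up for the example; a "file" is modelled as a dict of items and a multivalued attribute as a list.

```python
# Hypothetical model: a "file" is a dict keyed by item-id, an item is a
# list of attributes, and a multivalued attribute is a Python list.

# File for the simplest shape: attribute 1 = codes, attribute 2 = descriptions.
SIMPLE_TABLES = {
    "GENDER": [["F", "M"], ["Female", "Male"]],
    "MARITAL.STATUS": [["S", "M", "D"], ["Single", "Married", "Divorced"]],
}

# File for a richer shape: code, description, implementation date,
# obsolescence date (all four attributes are associated multivalues).
DATED_TABLES = {
    "CUSTOMER.STATUS": [
        ["A", "H"],
        ["Active", "On hold"],
        ["2001-01-01", "2003-06-01"],
        ["", "2005-12-31"],
    ],
}


def describe(file, table, code):
    """Return the description for `code` in the named table, or None."""
    item = file.get(table)
    if item is None:
        return None
    codes, descriptions = item[0], item[1]
    if code not in codes:
        return None
    # Associated multivalues line up by position.
    return descriptions[codes.index(code)]
```

Logically each item is still its own relation; the grouping into two files only reflects which tables share the same attribute layout.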

> --using files for metadata and generating dicts from those, using these files as the basis for data entry forms/screens/pages & related constraints as well as user-maintained business rules

This makes a lot of sense to me. In all the sites I have been on, the dictionaries get filled up with a whole bunch of alternatives for reporting. They just can't be relied on for understanding the application system.

> --adding some misc user-defined single and multi-valued fields to various files without any specified intent so users can use them as needed to help tailor the system for their needs over time (like the star in gmail)

This doesn't really resonate with me. It's so easy to add new fields in that I don't see the need to pre-add them, as it were.

> --creating return links so you can navigate from file to file in both directions (things like that really fly in the face of relational theory, but hey ho)

Again, this is a physical system. This sort of thing is just adding an index to the severely minimalist logical model.
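
A rough sketch of what a return link amounts to, again in Python and with invented names (`customers`, `orders`, `write_order`, `links_consistent`): the forward link lives on the order, and the return link is a multivalued list on the customer that behaves like a hand-maintained index.

```python
# Hypothetical files: dicts keyed by item-id.
customers = {"C100": {"NAME": "Acme Ltd", "ORDERS": []}}
orders = {}


def write_order(order_id, customer_id):
    """Write the order and keep the customer's return link in step."""
    orders[order_id] = {"CUSTOMER": customer_id}      # forward link
    back = customers[customer_id]["ORDERS"]           # return link
    if order_id not in back:
        back.append(order_id)


def links_consistent():
    """True when every forward link has a return link and vice versa."""
    forward = {(o["CUSTOMER"], oid) for oid, o in orders.items()}
    backward = {
        (cid, oid) for cid, c in customers.items() for oid in c["ORDERS"]
    }
    return forward == backward
```

Because the return link is derivable from the forward link, a check like `links_consistent` is the price of keeping it by hand; the logical model is unchanged.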

> --adding associated multivalues with a status code and date stamp for any status codes where history should be kept (rather than requiring data marts and data warehouses for such information) and using this same pattern for other specific attributes as needed, howbeit sparingly.

The history is its own relation. In the MV world we just have a useful way to clump it physically with the "source" relation.
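
The status-plus-date-stamp pattern could look something like the following Python sketch. The attribute names `STATUSES` and `STATUS.DATES` and both helper functions are assumptions made for the example, as is the convention that the newest status is kept in the first value position and that statuses are added in date order.

```python
def set_status(item, code, date):
    """Insert a new status and its date stamp as the first value of each list."""
    item["STATUSES"].insert(0, code)
    item["STATUS.DATES"].insert(0, date)


def status_on(item, date):
    """Return the status in force on an ISO date (YYYY-MM-DD), or None."""
    # Values are newest first, so the first stamp on or before `date` wins.
    for code, since in zip(item["STATUSES"], item["STATUS.DATES"]):
        if since <= date:
            return code
    return None


# One customer item with two associated multivalued attributes.
customer = {"NAME": "Acme Ltd", "STATUSES": [], "STATUS.DATES": []}
set_status(customer, "PROSPECT", "2005-01-10")
set_status(customer, "ACTIVE", "2005-03-01")
set_status(customer, "ON.HOLD", "2006-09-15")
```

The current status is simply the first value, while `status_on` answers the "what was it back then" question without a separate history file.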

> --keeping properties (not well-defined), whether single or multi-valued, as attributes of an entity (also not well-defined other than entity = "thing") in a single file definition rather than splitting them out into separate files

In general, this ability to clump information is the heart of MV.

However, it is not antithetical to the relational model; rather, it is orthogonal to it.

> --splitting out attributes into separate files based on functional dependencies, such as is done in relational normalization, but getting data into 3rd normal form without the need of 1NF.

Yes, functionality is the reason for the clumping. At the logical level, the data should be fully normalised.

It should be a mere mechanical process to convert the MV structure into a table structure that obviously follows the relational model.

(Items are subtle, tables are clumsy?)
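
To make the "mere mechanical process" concrete, here is one possible sketch in Python. It assumes the caller already knows which attributes are single-valued and which multivalued attributes are associated with each other; the function `normalise`, the `POS` column, and the sample order item are all invented for the illustration.

```python
def normalise(item_id, item, single, groups):
    """Split one MV item into a parent row plus child rows per association.

    single: names of the single-valued attributes
    groups: {child table name: [names of associated multivalued attributes]}
    """
    parent = {"ID": item_id}
    for name in single:
        parent[name] = item[name]

    children = {}
    for table, names in groups.items():
        columns = [item[name] for name in names]
        rows = []
        # Walk the associated values position by position (zip stops at
        # the shortest list, so ragged associations lose trailing values).
        for pos, values in enumerate(zip(*columns), start=1):
            row = {"ID": item_id, "POS": pos}
            row.update(zip(names, values))
            rows.append(row)
        children[table] = rows
    return parent, children


order = {
    "CUSTOMER": "C100",
    "ORDER.DATE": "2006-10-18",
    "PARTS": ["P1", "P2"],
    "QTYS": [3, 1],
}
parent, children = normalise(
    "1001", order, ["CUSTOMER", "ORDER.DATE"], {"ORDER.LINES": ["PARTS", "QTYS"]}
)
```

The parent row and the `ORDER.LINES` rows keyed by `ID` and `POS` are what a table-based design would hold; the item just clumps them physically.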

> --identifying the main entities about which users will want to report and beefing up those dictionaries (entities) with a lot of virtual fields (or using 3rd party tools for reporting)

Yes, and this is why the dictionaries fill up with cruft. I don't really have a strong objection to this, it's just that I seem to want to use a word that someone else has used. The problem is that one cannot change the dictionary item in case some report somewhere throws a fit (or a did-not-fit).

> --choosing naming conventions, such as plural names for multivalues, singular for single-values, and plural for entities (relational standards are for tables to be singular, but LIST statements read much better with plural names)

That's very much a matter of taste. A bit too specific to be a rule.

> I'm guessing I could come up with a bunch of other ideas for data modeling best practices with multi-values since the above possibilities are jumping off my fingertips, but this gives the idea of what I'm looking for. So, in answer to the question of what might be some industry best practices for modeling/designing data for a Pick/MV implementation, which of the above do you agree with and what would you add? Thanks. --dawn
>
> P.S. In the resulting blog entry, I'll acknowledge all those who help build or refine the list unless you request no such acknowledgement.

To make the resulting blog as useful as possible, may I suggest you present the pros and cons of each point.

I'm assuming there will be arguments presented on each side. Thank God this isn't C.D.T. where the trump argument is "Read the same book I once did, you self-aggrandising ignorant - I'm gonna filter you out - nya nya nya - I'm not listening again!".

A few things I do - coding standards really. It's not exactly on topic, but what the heck.

1. Always open a file to a variable of the same name, e.g.

    OPEN 'CUSTOMERS' TO CUSTOMERS

because this makes for easier searching through the source code. I have seen the following variant advised (which seemed reasonable too):

    OPEN 'CUSTOMERS' TO CUSTOMERS.F
    READ CUSTOMERS.R FROM CUSTOMERS.F, CUSTOMERS.I ....

2. Don't use capital i for a loop variable. On a bad printout it's very difficult to distinguish between capital i and numeric one. I use XX, YY, and ZZ because they are easy to search for - capital i is really, really hard - SO many hits.

3. Don't comment the bleeding obvious. I don't need "* OPEN FILES" set off and underlined, followed by 15 open file commands. Set them off with a little white space and I'll figure it out!

4. Comment what is difficult to understand!

5. Comment _why_ something was done - I can read _what_ was done.

6. And for crying out loud, SOMEWHERE say what the program is there for!

7. Remove obsolete code, so I can readily see the logic that's used now. Use a versioning system to keep old stuff, not the live source file.

You may have guessed that I'm supporting some badly commented code, and feel a bit frazzled sometimes.

Regards,
Keith.
#7
Having designed a couple of hundred MV applications over the past 30+ years, and having given this considerable thought, I am inclined to agree with the "you just do it" sentiment. The only best practice that comes to mind is "common sense." I don't have any words of wisdom on how to teach or acquire that. Most people know it when they see it.

- Steve Alexander

On 18 Oct 2006 15:11:42 -0700, "dawn" <dawnwolthuis (AT) gmail (DOT) com> wrote:
> I'm hoping to write another blog entry (have been slack for a while) related to modeling data in MV and I could use some help.
#8
dawn wrote:
> I'm hoping to write another blog entry (have been slack for a while) related to modeling data in MV and I could use some help.

Certainly not MV specific, but every developer can probably find a lot of useful best practices in Steve McConnell's book "Code Complete".

Amazon links below my sig. IMO, it's a must read for any developer.

--
Kevin Powick

Tiny URL: http://tinyurl.com/ycjtbo
Full URL: http://www.amazon.com/Code-Complete-...295161?ie=UTF8
#9
Hi

I have snipped all as I agree with all your points Dawn and I don't want to upset Bruce's bandwidth.

One I would like to add is: separate fast-moving data from static data, particularly where audit trails are required. For example, I believe that recalculating the balance of an account from the transactions as the sole source of that information is bad accounting practice.

However, one should keep the check totals separate from the rest of the debtor control information, or your audit update of a Debtor will become buried in a myriad of unnecessary transaction changes.
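
One way to read this, sketched in Python with invented file and attribute names (`debtors`, `debtor_totals`, `transactions`, `post`, `totals_agree`): the stored balance is a check total kept in its own file, so posting a transaction never touches the static debtor item, and the total can still be verified against the transactions. Amounts are held as integer cents to avoid rounding issues.

```python
# Static debtor control data: changes here are rare and worth auditing.
debtors = {"D1": {"NAME": "Acme Ltd", "CREDIT.LIMIT": 500000}}

# Fast-moving check totals, kept apart from the static item.
debtor_totals = {"D1": {"BALANCE": 0}}

# Transactions as (debtor id, amount in cents).
transactions = []


def post(debtor_id, amount):
    """Record a transaction and update the stored check total."""
    transactions.append((debtor_id, amount))
    debtor_totals[debtor_id]["BALANCE"] += amount


def totals_agree(debtor_id):
    """Compare the stored check total with a recalculation from transactions."""
    recalculated = sum(a for d, a in transactions if d == debtor_id)
    return recalculated == debtor_totals[debtor_id]["BALANCE"]


post("D1", 12500)   # invoice
post("D1", -2500)   # part payment
```

After the two postings the `debtors` item is untouched, so an audit trail on that file would show only genuine changes to the debtor's details.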

Also separate data that should be encrypted from other data, so keep a bank information file separate from the rest of the debtor's file.

Allow redundancy. A poster some time back suggested keeping the full address data in the Invoice file as well as the key to that data. Not something I did, but with many years' history it can be handy for tax investigations etc.
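
A small Python sketch of the address suggestion, using made-up names (`customers`, `invoices`, `raise_invoice`, `BILL.TO`): the invoice stores both the key to the customer and a copy of the address as it stood when the invoice was raised.

```python
customers = {"C100": {"NAME": "Acme Ltd", "ADDRESS": ["1 High St", "Leeds"]}}
invoices = {}


def raise_invoice(invoice_id, customer_id, amount):
    """Write an invoice holding the customer key and an address snapshot."""
    customer = customers[customer_id]
    invoices[invoice_id] = {
        "CUSTOMER": customer_id,                 # key to the live data
        "BILL.TO": list(customer["ADDRESS"]),    # copy taken at invoice time
        "AMOUNT": amount,
    }


raise_invoice("INV1", "C100", 10000)

# The customer later moves; the invoice keeps the address it was sent to.
customers["C100"]["ADDRESS"] = ["9 New Rd", "York"]
```

The copy is deliberate: the key answers "where is this customer now", the snapshot answers "where did this invoice go".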

I look forward to your blog resumption.

Peter McMurray
#10
> However one should keep the check totals separate from the rest of the debtor control information or your audit update of a Debtor will become buried in a myriad of unnecessary transaction changes.

I'm not sure I'm tracking here. Are you saying that if you store a value redundantly, that is, a total as well as the components of the total (rather than only deriving the total from the components), then you want to be sure that the amounts are partitioned from other data so that you do not have to recalc the total unnecessarily? Maybe a baby example data model to illustrate what you are suggesting would help, if you have a chance.

> Also separate data that should be encrypted from other data, so keep a bank information file separate from the rest of the debtor's file.

Good point. When you say "separate data" I gather you mean that you want to have attributes in different entity files even if they are related to the same external (real world) entity.

> Allow redundancy. A poster sometime back suggested keeping the full address data in the Invoice file as well as the key to that data. Not something I did but with many years history it can be handy for tax investigations etc.

Another good point on redundancy. These addresses are not really redundant in that one is a "point in time" address while the other is fluid.

> I Look forward to your blog resumption. Peter McMurray

Thanks, Peter.

--dawn