Quote:
I'm not using flags or procedural code without some thought as to alternatives. |
Let's talk about those options. First, I turned your dialect into
Standard SQL whenever possible and your data element names into
ISO-11179 forms.
Quote:
Database is about a Parts Catalog. We have Categories, Subcategories etc at infinitum (typically <= 5), and at the lowest level we have about 50,000 Parts. Both Categories and Parts may have a defined life time, so both tables have effective_date, expire_date, deleted_flag. We added a computed column active_flag, which is set to: |
(active_flagdbo.active_flag SetActiveFlag(active_flageffective_date,
active_flagexpire_date, active_flagdeleted_flag))
The scalar function SetActiveFlag is below. I felt the procedural code
was probably not very fast, so I was looking to rewrite. Ideally
something like:
SET @return = (effective_date < CURRENT_TIMESTAMP)
AND (expire_date > CURRENT_TIMESTAMP)
AND (deleted_flag = 0) <<
That is very nice in a procedural language that has BOOLEAN data
types. Did you notice that your mindset is still stuck in your native
language? Too many parens that we used in C to force order of
execution in early compilers, and the failure to use a NOT BETWEEN
short hand are the give-away that you are still thinking in C, C++,
VB, Fortran, or whatever. Also, an SQL programmer would put part of
this into a CHECK() constraint, not DML.
Quote:
But as we have concluded that's not possible in T-SQL. So I was looking for alternatives. |
But you have not written SQL yet! You are still faking your
procedural language in SQL and hitting a wall. Let's take it a step
at a time, so you can see how to think in sets. I cleaned up
SetActiveFlag follows (cleaned up just a little, but still procedural
code. You seem to like to do a lot of needless IF-THEN testing rather
than just using a RETURN() when you have an answer. That is bad
procedural programming and has nothing to do with SQL; you might want
to read Gries and some of the books on Structured Programming.
CREATE FUNCTION SetActiveFlag -- notice <verb><object> naming
(@effective_date DATETIME,
@expire_date DATETIME,
@deleted_flag BIT) -- <attribute><property> naming
RETURNS BIT
AS
BEGIN
DECLARE @return_flag BIT; -- useless local variable
IF @deleted_flag = 1
SET @return_flag = 0
ELSE IF @return_flag IS NULL
IF GETDATE() < @effective_date
SET @return_flag = 0;
--Is the current date after the expiration date?
IF @return_flag IS NULL
IF GETDATE() > @expire_date
SET @return_flag = 0;
--Must be OK
IF @return_flag IS NULL
SET @return_flag = 1;
RETURN @return_flag;
END;
Here is a version using SQL instead of structured procedural code.
CREATE FUNCTION SetActiveFlag
(@effective_date DATETIME, @expire_date DATETIME, @deleted_flag BIT)
RETURNS BIT
AS
RETURN (CASE
WHEN @deleted_flag = 1 THEN 0
WHEN CURRENT_TIMESTAMP NOT BETWEEN @effective_date AND @expire_date
THEN 0 ELSE 1 END);
This still stinks; we know that (deleted_flag, active_flag) have to
have opposite values, so at least one of them is redundant. We don't
do redundancy in an RDBMS. In fact, both of them are redundant; the
expiration date tells us if something is active.
This should be handled in the DDL that you never posted. SQL is
declarative. So some wild guesses are in order:
Quote:
Both Categories and Parts may have a defined life time .. |
I find that a bit weird; books come and go from a library, but the
Dewey Decimal Classification system remains pretty stable. Can I
assume that every part falls into a category? Do you set future
expiration dates? Lot of questions here. But let's start with a
typical history table:
CREATE TABLE InventoryHistory
(part_nbr CHAR(15) NOT NULL,
effective_date DATETIME NOT NULL,
PRIMARY KEY (part_nbr, effective_date),
expire_date DATETIME, -- null means current
CHECK (effective_date < expire_date),
part_category CHAR(10) NOT NULL
REFERENCES PartsCategories(part_category)
ON UPDATE CASCADE,
etc.);
Then you create an updatable VIEW of the current Inventory and use
it:
CREATE VIEW Inventory (...)
AS
SELECT ..
FROM InventoryHistory
WHERE CURRENT_TIMESTAMP <= COALESCE (expire_date,
CURRENT_TIMESTAMP);
The part categories can be in a nested sets model table (Google it if
you don't know this standard pattern or get a copy of TREES &
HIERARCHIES IN SQL). I am not sure how to handle the changing
categories, since I have no idea what the rate of change is or even
how they change (i.e. in Dewey Decimal, logic moved from the 100's
(Philosophy) to the 500's (math); new categories were created under
the 500's (nanotech); etc.) so the safest way is to create a forest
table.
Each tree in the forest will be at least 50,001 rows, according to
your specs, so I will guess they fall into 250 categories in the
average tree (I am not going to do the Bell numbers or what that
series is called for the number of possible trees). This is not big
on modern computer, and partitioning the table by time periods should
really help.
CREATE PartCategories -- leaving out a lot of constraints
(part_nbr CHAR(15) NOT NULL,
part_category CHAR(15) NOT NULL,
lft INTEGER NOT NULL CHECK (lft > 0),
rgt INTEGER NOT NULL CHECK (rgt > 0),
C HECK (lft < rgt), -- assume all the usual nested sets constraints
effective_date DATETIME NOT NULL
expire_date DATETIME, -- null means current
CHECK (effective_date < expire_date),
PRIMARY KEY (effective_date, part_category, part_nbr),
etc.)
The (effective_date, expire_date) pair defines the heirarchy during
that time period -- a tree in a forest identified by its effective
date. Again, you can use a VIEW to find the current hierarchy. Now
add a calendar table and use BEWTWEEN predicates with (effective_date,
expire_date) pairs in the history tables to determine the state of the
database for any day you have.
You can now write very simple SQL and app code using VIEWs because all
the effort has been moved to the DDL. You do not have to do data
integrity checking in hundred of programs. Your SQL will port and
should scale. But again, this is skeleton based on your narrative and
a lack of specs.