![]() | |
![]() |
| | Thread Tools | Display Modes |
#21
| |||
| |||
|
|
You can also persist the computed column so long as its deterministic and index it, saving you having to store the data twice. |
|
Good point Uri. You can also persist the computed column so long as its deterministic and index it, saving you having to store the data twice. In data modelling terms VIN should actually be a computed column - most encoding "keys" should be because they should be stored in their constituent parts. Tony. "Uri Dimant" <urid (AT) iscar (DOT) co.il> wrote in message news:Op#OaBejKHA.2132 (AT) TK2MSFTNGP05 (DOT) phx.gbl... --CELKO-- Why did you store a computed column in a table? Have you ever been to client for performnace tunning, are you familiar how SQL Server estimates the number of rows satisfying this condition using the column statistics ? What do you think about those approaches? WHERE UnitPrice * Quantity > 100 and ADDING a computed column on AS UnitPrice * Quantity SQL Server will detect the use of the computed column in the second example and it will create statistics on the computed column. The statistics will allow the optimizer to determine the appropriate cardinality estimation on the filter. "--CELKO--" <jcelko212 (AT) earthlink (DOT) net> wrote in message news:687c5bf6-50b5-4a56-ac28-30a31d7c9898 (AT) 35g2000yqa (DOT) googlegroups.com... Why did you store a computed column in a table? SQL is a declarative language. But you are using procedural code all over the place, in a TRIGGER and a UDF! You are still thinking as if you were writing for magnetic tape files or punch cards. You even use the word "record" because your mindset is all wrong. There is no such thing as an IDENTITY field. First of all, a column is nothing like a field. It is a proprietary, non-relational table property. Programmers who do not know RDBMS often use it to get a "record number" so that they can make their SQL look like a sequential file -- magnetic tape files or punch cards. The DDL skeleton would be something like this: CREATE TABLE Clients (client_id CHAR(7) NOT NULL PRIMARY KEY CHECK(client_id LIKE '<pattern>'), signup_date DATE DEFAULT CURRENT_DATE NOT NULL, (<build card number here>) AS card_nbr, etc); You need a way of getting a client identifier that can be validated and verified. IDENTITY has neither and it cannot even give you consecutive numbers. I would not expose the client id in the card number -- do you write your PIN on your bank card? You need to learn how to follow ISO-11179 data element naming conventions and formatting rules. Temporal data should use ISO-8601 formats. Code should be in Standard SQL as much as possible and not local dialect. |
#22
| |||
| |||
|
|
This is in correct. You can certinly insert many rows in one go. You miss read what I was saying Erland. Unless you use an instead of trigger combined with a cursor you cannot calculate the next number because you need to physically do a SELECT max( id ) + 1 query. You can imagine the performance of that! |
#23
| |||
| |||
|
|
If there is no index on id, performance will certainly be a problem. However, I would expect an index to exist on the primary key. In fact, I would take it for granted. |
|
Tony Rogerson (tonyrogerson (AT) torver (DOT) net) writes: This is in correct. You can certinly insert many rows in one go. You miss read what I was saying Erland. Unless you use an instead of trigger combined with a cursor you cannot calculate the next number because you need to physically do a SELECT max( id ) + 1 query. You can imagine the performance of that! If there is no index on id, performance will certainly be a problem. However, I would expect an index to exist on the primary key. In fact, I would take it for granted. If you insert many rows at one time, there is no problem to calculate the ids row all rows with row_number, as Celko demonstrated. In fact, in SQL 2005, this is the only way you can do this, and know which row that got which ID. Say that you have data for two tables in some semi-structured way, for instance an XML document. Take orders and order details. You insert the orders into the Orders table that has an IDENTITY column, but of the data you insert, there is nothing you can correlate with the IDENTITY column, even if there is information for this in the XML document. Thus, the OUTPUT clause is not going to help you. You have but one possibility to sort this situation out: run a cursor. In SQL 2008, there is a way out: rather than using INSERT you can insert the rows with MERGE, and then you can save the identificiation in the XML document to a table variable in the OUTPUT clause. -- Erland Sommarskog, SQL Server MVP, esquel (AT) sommarskog (DOT) se Links for SQL Server Books Online: SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx SQL 2000: http://www.microsoft.com/sql/prodinf...ons/books.mspx |
#24
| ||||
| ||||
|
|
Even if there is an index performance is going to suck unless you play with the commit size of the BCP and do it in chunks. |
|
The cursor in the INSTEAD OF trigger will be a killer and because of the extra transaction logging etc... performance will be dramatically slower. |
|
You can't parallel stream it either. |
|
On normal INSERT you can use the OUTPUT clause - I do this myself in some jobs I have for loading debt files from collection agencies. |
#25
| ||||
| ||||
|
|
Obviously, it is difficult to run a SELECT MAX from BCP. Then again, that is not really the typical usage scenario. |
|
Which INSTEAD OF trigger? I have not proposed any. Celko has not proposed any. Overall, using an INSTEAD OF trigger to generate a key does not strike me as the best idea. |
|
Yes, but you can only retrieve coluns that were inserted into the table. You cannot retrieve columns that were only in the source table. Which means that if you need to know the IDENTITY values to use for inserting into a subtable, you lose. |
|
Tony Rogerson (tonyrogerson (AT) torver (DOT) net) writes: Even if there is an index performance is going to suck unless you play with the commit size of the BCP and do it in chunks. Obviously, it is difficult to run a SELECT MAX from BCP. Then again, that is not really the typical usage scenario. The cursor in the INSTEAD OF trigger will be a killer and because of the extra transaction logging etc... performance will be dramatically slower. Which INSTEAD OF trigger? I have not proposed any. Celko has not proposed any. Overall, using an INSTEAD OF trigger to generate a key does not strike me as the best idea. You can't parallel stream it either. Yes, if you need to support high-concurrency inserts, there is all reason to use IDENTITY. But that is also more or less precise scenario where you should use it. On normal INSERT you can use the OUTPUT clause - I do this myself in some jobs I have for loading debt files from collection agencies. Yes, but you can only retrieve coluns that were inserted into the table. You cannot retrieve columns that were only in the source table. Which means that if you need to know the IDENTITY values to use for inserting into a subtable, you lose. -- Erland Sommarskog, SQL Server MVP, esquel (AT) sommarskog (DOT) se Links for SQL Server Books Online: SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx SQL 2000: http://www.microsoft.com/sql/prodinf...ons/books.mspx |
#26
| |||
| |||
|
|
Obviously, it is difficult to run a SELECT MAX from BCP. Then again, that is not really the typical usage scenario. But that is what we were talking about.... The whole point was cleko saying bcp scrambling identity to which I replied you can't effectively use bcp unless you use an instead of trigger and max( x ) + 1 etc.... go back and reread, hence my point about you miss reading what I'd posted. |
#27
| |||
| |||
|
|
The fact that IDENTITY has to queue the rows in a multi-row insertion in a non-deterministic fashion has been a big problem for me. It means there is no hope of validation. Obviously, the systems you've worked on in the past have been very small chunks of data. If you are loading lots of data you need parallel streams, absolutely in a parallel load scenario the number given is not deterministic but often we don't care, after all the ID 12 given by the issuing database to the email blah (AT) xyz (DOT) com AT INITIAL INSERT is meaningless so long as it NEVER changes. see if that works. But I also need to play with INSERT INTO Foobar (foo_seq, ..) VALUES ( (SELECT COALESCE (MAX(Foobar.foo_seq), 0) + ROW_NUMBER() OVER (ORDER BY some_sequencing_columns)) FROM NewFoobar) ..); Remember to check it on each product specifically for concurrency implications - I will be checking your research for errors. I already know the outcome on SQL Server. --ROGGIE-- "--CELKO--" <jcelko212 (AT) earthlink (DOT) net> wrote in message news:39a88c3c-d6b6-4608-af09-df2e19140e6b (AT) u7g2000yqm (DOT) googlegroups.com... You can certainly insert many rows in one go. The fact that IDENTITY has to queue the rows in a multi-row insertion in a non-deterministic fashion has been a big problem for me. It means there is no hope of validation. I need to look at the Standard CREATE SEQUENCE statement in depth to see if that works. But I also need to play with INSERT INTO Foobar (foo_seq, ..) VALUES ( (SELECT COALESCE (MAX(Foobar.foo_seq), 0) + ROW_NUMBER() OVER (ORDER BY some_sequencing_columns)) FROM NewFoobar) ..); or worse |
![]() |
| Thread Tools | |
| Display Modes | |
| |