![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
I have the below table that I need to load about 1 billion records into. I was wondering if anyone could offer some tips on how to optimize the process. I load data_table via a staging table that could have 1000 records - 50 million records. I use a left outer join to ensure that no duplicates are inserted. I have read that using one copy column transformation for all columns can increase performance, and I do have "Use Fast Load" selected. Also, both tables are on the same sever. Is there a way to speed up this load? Would it be a bad idea to use the transformation to load a csv text file and then use the bulk insert command to load the data? CREATE TABLE [dbo].[data_table] ( [project] [int] NULL , [responseid] [int] NULL , [attrib1] [varchar] (55) COLLATE SQL_Latin1_General_CP1_CI_AS NULL , [data] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL , [data_type] [varchar] (155) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ) ON [PRIMARY] CREATE CLUSTERED INDEX [icx__cluster] ON [dbo].[data_table]([responseid], [project]) WITH FILLFACTOR = 70, PAD_INDEX ON [PRIMARY] CREATE INDEX [ix__data_table__attrib1] ON [dbo].[data_table]([attrib1]) WITH FILLFACTOR = 70, PAD_INDEX ON [INDEXES] CREATE INDEX [ix__data_table__data] ON [dbo].[data_table]([data]) WITH FILLFACTOR = 70, PAD_INDEX ON [INDEXES] Other Notes: project: value is too large for a smallint, but there are only about 1200 project values responseid: unique to project. The largest project has 3.5 million responseid values attrib: ~100,000 unique values data: ~ 500,000 unique values |
#3
| |||
| |||
|
|
Hello Dave, A number of things for you to try 1. get rid of indexes These will hurt you during loading 2. Recovery model set to SIMPLE 3. How about not using DTS. What about pure TSQL? 4. Make sure the log has lots of room to expand likewise with the data files. 5. Make sure you have no triggers on the destination tables. Have a look in BOL for any other requirements for FAST LOAD Allan I have the below table that I need to load about 1 billion records into. I was wondering if anyone could offer some tips on how to optimize the process. I load data_table via a staging table that could have 1000 records - 50 million records. I use a left outer join to ensure that no duplicates are inserted. I have read that using one copy column transformation for all columns can increase performance, and I do have "Use Fast Load" selected. Also, both tables are on the same sever. Is there a way to speed up this load? Would it be a bad idea to use the transformation to load a csv text file and then use the bulk insert command to load the data? CREATE TABLE [dbo].[data_table] ( [project] [int] NULL , [responseid] [int] NULL , [attrib1] [varchar] (55) COLLATE SQL_Latin1_General_CP1_CI_AS NULL , [data] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL , [data_type] [varchar] (155) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ) ON [PRIMARY] CREATE CLUSTERED INDEX [icx__cluster] ON [dbo].[data_table]([responseid], [project]) WITH FILLFACTOR = 70, PAD_INDEX ON [PRIMARY] CREATE INDEX [ix__data_table__attrib1] ON [dbo].[data_table]([attrib1]) WITH FILLFACTOR = 70, PAD_INDEX ON [INDEXES] CREATE INDEX [ix__data_table__data] ON [dbo].[data_table]([data]) WITH FILLFACTOR = 70, PAD_INDEX ON [INDEXES] Other Notes: project: value is too large for a smallint, but there are only about 1200 project values responseid: unique to project. The largest project has 3.5 million responseid values attrib: ~100,000 unique values data: ~ 500,000 unique values |
#4
| |||
| |||
|
![]() |
| Thread Tools | |
| Display Modes | |
| |