![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
|
SSIS seems to lack a powerful feature that Informatica PowerCenter and GXS AI software provides. That is the ability to extract and group data in a semi-structured flat text file to XML with multi-level hierachal XML intutively without hand coding. With SSIS, I had to extract the formatted report flat text files in to SQL Server temp tables, then use lookups and OLE DB SQL commands to update the foreign key relationships across temp tables. Then manipulate the temp tables and load them to the working tables. That is a lot of work, and the performance is bad for this type of work. Using other products such PowerCenter, it can transform flat text file into XML. Then pass the XML to a stored procedure for additional extraction and process into multiple tables. The performance wise seems better. One solution now is to have the provider of flat text file provide us the unformatted report file in multiple grouping in CSV file, or provide us XML files so we can easily do the ETL part. Has anyone experience the same issues I'm having? Will SSIS have XML Destination capability like PowerCenter in the future? Thanks, Benjamin |
#2
| |||
| |||
|
|
Hello Benjamin, I don't know whether MS intend to release an XML destination adapter. As for your solution. Why did you need to drop into temp tables? You are right though in that it may be easier for you to ask the feed providers to drop to you multiple csv files, load them into the pipeline, use a merge join perhaps then eventually use a Script component as a destination and craft the XML yourself. Regards Allan Mitchell Konesans Ltd T +44 7966 476 572 F +44 2071 008 479 http://www.konesans.com SSIS seems to lack a powerful feature that Informatica PowerCenter and GXS AI software provides. That is the ability to extract and group data in a semi-structured flat text file to XML with multi-level hierachal XML intutively without hand coding. With SSIS, I had to extract the formatted report flat text files in to SQL Server temp tables, then use lookups and OLE DB SQL commands to update the foreign key relationships across temp tables. Then manipulate the temp tables and load them to the working tables. That is a lot of work, and the performance is bad for this type of work. Using other products such PowerCenter, it can transform flat text file into XML. Then pass the XML to a stored procedure for additional extraction and process into multiple tables. The performance wise seems better. One solution now is to have the provider of flat text file provide us the unformatted report file in multiple grouping in CSV file, or provide us XML files so we can easily do the ETL part. Has anyone experience the same issues I'm having? Will SSIS have XML Destination capability like PowerCenter in the future? Thanks, Benjamin |
#3
| |||
| |||
|
|
The reason I had to drop into temp tables was the structure of the formatted report flat file. The report has multiple headers, multiple details per header, multiple types of details having different detail columns, and some detail has sub-detail rows etc. So to parse out the data and join them together I had to first, add line numbers to the flat file using script component; the the line number is used for relating/join header to detail and sub-details. Each line or lines (header, detail, sub-details) I use split component to split the headers and attach to derive column component to extract the columns for each type. SSIS seems to execute tasks completely in sequence. E.g. if there are 100 lines, it'll pass all lines to the header extraction component first to get all headers. Then it'll go to the next step to extract all detail lines. I was hoping it wouldn't do that so I can join them more easily; e.g. when it finishes the first header it'll get it's detail line, then join them. Then goto the next header and detail line and join them again. But that wasn't the case. Is it possible to configure SSIS to do that in sequence for line by line is there something like (asynchronize/synchronize process)? So, for now it was extracting all sections, then I dump them to temp tables. Then the next flow was to join them together using lookup components. e.g. for each detail, lookup the header adjacent to this detail line; e.g. header line is 5 and detail line is 8 then header line 5 is the closes header, so it means header line 5 is related to detail line 8. In the lookup I had to use SQL query to sort and return only the closes header. That was the reason I had to use temp table; to sort and join the header and details. SSIS doesn't provide that type of sorting and joining capability. The Merge Join component only join the keys that are identical (not the relative join). And I couldn't join them in SSIS could I? maybe union them together and then use script component? hmm.... If the provider can provide us the data only flat files, then it'll be much easier. Then we probably won't even need XML. XML has a benefit where it's passed to stored procedure (SP) and SP would do all the ETL tasks. It would improve performance, and most likely faster than SSIS. Any suggestion or recommandation is appreciated. Thanks ![]() Thanks, Benjamin "Allan Mitchell" wrote: Hello Benjamin, I don't know whether MS intend to release an XML destination adapter. As for your solution. Why did you need to drop into temp tables? You are right though in that it may be easier for you to ask the feed providers to drop to you multiple csv files, load them into the pipeline, use a merge join perhaps then eventually use a Script component as a destination and craft the XML yourself. Regards Allan Mitchell Konesans Ltd T +44 7966 476 572 F +44 2071 008 479 http://www.konesans.com SSIS seems to lack a powerful feature that Informatica PowerCenter and GXS AI software provides. That is the ability to extract and group data in a semi-structured flat text file to XML with multi-level hierachal XML intutively without hand coding. With SSIS, I had to extract the formatted report flat text files in to SQL Server temp tables, then use lookups and OLE DB SQL commands to update the foreign key relationships across temp tables. Then manipulate the temp tables and load them to the working tables. That is a lot of work, and the performance is bad for this type of work. Using other products such PowerCenter, it can transform flat text file into XML. Then pass the XML to a stored procedure for additional extraction and process into multiple tables. The performance wise seems better. One solution now is to have the provider of flat text file provide us the unformatted report file in multiple grouping in CSV file, or provide us XML files so we can easily do the ETL part. Has anyone experience the same issues I'm having? Will SSIS have XML Destination capability like PowerCenter in the future? Thanks, Benjamin |
![]() |
| Thread Tools | |
| Display Modes | |
| |