![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
|
Hi, While checking a performance issue, I came across something really strange: From Sybase Central, I stopped and restarted a database on an engine which kept running. Then I opened ISQL which was the only connection to the newly started database, ran "call sa_flush_cache()", changed the plan settings to graphical with statistics, entered the select statement and hit Get Plan without having run the statement before. But the plan tells me it found most of the data in cache (see attachment). How can that happen? ASA 9.0.2.3182. Service start line is: -xtcpip -c2300m -ca 0 -m -o <file> -os1m -ti0 -tl0 <db file> -cw Frank |
#2
| |||
| |||
|
|
The DiskRead=73,408 statistic shows that the engine did in fact readthe table from disk. In this case, I/O hinting was used effectively. Those 73,408 disk reads were mostly group reads of 64K each. The DiskReadTable=1.1604e6 shows that 1.16 million table pages were readby the group reads (working out to a 4K page size) I/O hinting issues asynchronous read requests for 64K blocks, trying to keep ahead of the execution plan so that data is already loaded into the cache by the time the scan reads it. In this case, the I/O hinting worked perfectly, as seen by CacheReads = 3.4813e6 and CacheHits=3.4813e6. Every time the scan object tried to latch a page, it had already been read in by the I/O hinting. Note that individual pages can be latched multiple times to retrieve multiple rows off of the page, leading to CachReadTable DiskReadTable (and the number of table pages in that table). This behaviour is reduced in version 10.0 and above. You will note that QueryRowsBuffereFetch =1.5129e6 is less than RowsReturned=2.6733e6(and both may be lower than the number of rows in the table if there are predicates evaluated at the scan). In 10.0 and above, for this type of plan most of the rows would be buffer fetched and CacheRead should be lower (a savings of CPU time). Since it looks like the IO hinting was very effective at overlapping IO wait time with CPU time, I'd suggest that you look at the CPU costs in your query, particularly at the scan node. Do you have predicates that are expensive to evaluate? Are there UDF or other complex functions? Are there complications in the schema that make the scan expensive? The graphical plan with statistics adds CPU costs as it monitors the timing of fetching each row, and this effect can be exacerbated at the leavesof the plan as they see the most rows. Using fetchtst may give a more representative timing as it does not have the statistic monitors installed. Alternatively, you can select to only have the root-level statistics monitored when you fetch the graphical plan; that reduces the overheadto rows returned by the root. With the plan I am seeing, I would expect that fetchtst -ga would report that the CPU time is nearly equal to the total elapsed time. The command line switch -zt and sa_performance_diagnostics() may be of use, the procedure will report the time that the server spent blocked (waiting for IO, a lock, or for a shared engine data structure) and the time spent active (actively using CPU). My feel from the portion of the plan that you sent is that the time active will be most of the execution time, and that improving the CPU cost of the query should be the first line of attack.. In 10.0 and above, parallel execution plans could be of use for this query if you have multiple logical processors availalble. Another caution with caching behaviour is that the server can pre-warm the cache at server startup. Since you call sa_flush_cache() explicitly, you should avoid that complication but I mention it here for completeness. Regards, |
#3
| |||
| |||
|
|
Hi, While checking a performance issue, I came across something really strange: From Sybase Central, I stopped and restarted a database on an engine which kept running. Then I opened ISQL which was the only connection to the newly started database, ran "call sa_flush_cache()", changed the plan settings to graphical with statistics, entered the select statement and hit Get Plan without having run the statement before. But the plan tells me it found most of the data in cache (see attachment). How can that happen? ASA 9.0.2.3182. Service start line is: -xtcpip -c2300m -ca 0 -m -o <file> -os1m -ti0 -tl0 <db file> -cw Frank |
#4
| |||
| |||
|
|
As a side note, you can save a plan to a file directly from DBISQL. Here is my 'canned' request for plans that includes how to save the file... To generate a Graphical Plan with Statistics 1. From DBISQL, go to Tools | Options and set the "Plan" option to Graphical Plan with Statistics 2. Press Shift-F5 to get the plan 3. For ASA 8 and ASA9, Select File | Save As... and Set the file type to XML 4. For SA 10, Select File | Save Plan... 5. provide a file name 6. Click Ok You can view a saved graphical plan by 1. For ASA 8 and ASA9, from DBISQL, select File | Open... and Set the file type to XML 2. For SA 10, select File | Open Plan... 3. Select the Graphical plan file Frank Ploessel wrote: Hi, While checking a performance issue, I came across something really strange: From Sybase Central, I stopped and restarted a database on an engine which kept running. Then I opened ISQL which was the only connection to the newly started database, ran "call sa_flush_cache()", changed the plan settings to graphical with statistics, entered the select statement and hit Get Plan without having run the statement before. But the plan tells me it found most of the data in cache (see attachment). How can that happen? ASA 9.0.2.3182. Service start line is: -xtcpip -c2300m -ca 0 -m -o <file> -os1m -ti0 -tl0 <db file> -cw Frank |
#5
| |||
| |||
|
|
Chris, I know. But as these files contain some currently confidential information (prospect names in directory names ...), I prefered to not post them completely. Frank On Thu, 25 Oct 2007 15:16:29 +0200, Chris Keating (Sybase iAnywhere) keating_spam_free (AT) ianywhere (DOT) com> wrote: As a side note, you can save a plan to a file directly from DBISQL. Here is my 'canned' request for plans that includes how to save the file... To generate a Graphical Plan with Statistics 1. From DBISQL, go to Tools | Options and set the "Plan" option to Graphical Plan with Statistics 2. Press Shift-F5 to get the plan 3. For ASA 8 and ASA9, Select File | Save As... and Set the file type to XML 4. For SA 10, Select File | Save Plan... 5. provide a file name 6. Click Ok You can view a saved graphical plan by 1. For ASA 8 and ASA9, from DBISQL, select File | Open... and Set the file type to XML 2. For SA 10, select File | Open Plan... 3. Select the Graphical plan file Frank Ploessel wrote: Hi, While checking a performance issue, I came across something really strange: From Sybase Central, I stopped and restarted a database on an engine which kept running. Then I opened ISQL which was the only connection to the newly started database, ran "call sa_flush_cache()", changed the plan settings to graphical with statistics, entered the select statement and hit Get Plan without having run the statement before. But the plan tells me it found most of the data in cache (see attachment). How can that happen? ASA 9.0.2.3182. Service start line is: -xtcpip -c2300m -ca 0 -m -o <file> -os1m -ti0 -tl0 <db file> -cw Frank |
#6
| |||
| |||
|
|
Ivan, Thank you for this insightful and detailed answer. My original post was just as I did not understand the plan details about cachings, and you answered that in detaiil. Actually, this query is the fast version taking one minute, and the slow one is the exact same query but on a table half the size, but taking ten minutes. I keep searching for reasons of the bad performance. Frank On Wed, 24 Oct 2007 23:24:23 +0200, Ivan T. Bowman ibowman (AT) ianywhere (DOT) NOSPAM.com> wrote: The DiskRead=73,408 statistic shows that the engine did in fact read the table from disk. In this case, I/O hinting was used effectively. Those 73,408 disk reads were mostly group reads of 64K each. The DiskReadTable=1.1604e6 shows that 1.16 million table pages were read by the group reads (working out to a 4K page size) I/O hinting issues asynchronous read requests for 64K blocks, trying to keep ahead of the execution plan so that data is already loaded into the cache by the time the scan reads it. In this case, the I/O hinting worked perfectly, as seen by CacheReads = 3.4813e6 and CacheHits=3.4813e6. Every time the scan object tried to latch a page, it had already been read in by the I/O hinting. Note that individual pages can be latched multiple times to retrieve multiple rows off of the page, leading to CachReadTable DiskReadTable (and the number of table pages in that table). This behaviour is reduced in version 10.0 and above. You will note that QueryRowsBuffereFetch =1.5129e6 is less than RowsReturned=2.6733e6 (and both may be lower than the number of rows in the table if there are predicates evaluated at the scan). In 10.0 and above, for this type of plan most of the rows would be buffer fetched and CacheRead should be lower (a savings of CPU time). Since it looks like the IO hinting was very effective at overlapping IO wait time with CPU time, I'd suggest that you look at the CPU costs in your query, particularly at the scan node. Do you have predicates that are expensive to evaluate? Are there UDF or other complex functions? Are there complications in the schema that make the scan expensive? The graphical plan with statistics adds CPU costs as it monitors the timing of fetching each row, and this effect can be exacerbated at the leaves of the plan as they see the most rows. Using fetchtst may give a more representative timing as it does not have the statistic monitors installed. Alternatively, you can select to only have the root-level statistics monitored when you fetch the graphical plan; that reduces the overhead to rows returned by the root. With the plan I am seeing, I would expect that fetchtst -ga would report that the CPU time is nearly equal to the total elapsed time. The command line switch -zt and sa_performance_diagnostics() may be of use, the procedure will report the time that the server spent blocked (waiting for IO, a lock, or for a shared engine data structure) and the time spent active (actively using CPU). My feel from the portion of the plan that you sent is that the time active will be most of the execution time, and that improving the CPU cost of the query should be the first line of attack. In 10.0 and above, parallel execution plans could be of use for this query if you have multiple logical processors availalble. Another caution with caching behaviour is that the server can pre-warm the cache at server startup. Since you call sa_flush_cache() explicitly, you should avoid that complication but I mention it here for completeness. Regards, |
#7
| |||
| |||
|
|
From looking at the plan fragment you have posted, I agree that the scan of the small table is not doing 64K group reads. Instead, each table page is being read individually. If you use a file system monitor, you will see that the read size is smaller than the Big case and that the number of reads is higher. There are a few reasons why the 64K reads might not be used. One of these reasons can come up if you have rows that are continued across pages; 64K reads only operate on the head of rows, not on continuations. The statistic of 1.28 segments per row indicated that each row was an average of 1.28 segments. If the columns on the continuation segment were needed by the query, then the continuation page would need to be read as it would not have been read by an asynchronous hint. The problem of continued rows does not match the statistcs that you see, though. Other than continued rows, you mention the possibility that the table could be spread out. This is certainly a reason why 64K reads would not be used, as the group reads must start on a 64K boundary. It is possible that the pages are allocated in such a way that group reads can not be used (although the server does try to allocate table pages in clusters). Due to the attempt to allocate pages in clusters, I would not recommend being inordinately careful about table fragmentation within the dbspace. Of course, avoiding continued rows through appropriate PCTFREE settings is a really good idea. From your statistics, though, it looks like _no_ 64K reads at all were issued in the Small case (since DiskRead > DiskReadTable). I am not sure that the non-contiguous pages idea explains the stats you are seeing; I would have expected some blocks of contiguous pages even in a very non-contiguos table. If you are investigating this further, I would suggest creating a new dbspace then creating a copy of Small in the new dbspace. The pages should all be contiguous in that space, so performance ought to be as good as Big unless there is another factor at work besides contiguity. There are some other reasons why 64K reads might not be used. One is related to database file version, but since both Big and Small tables are in the same database, that should not apply. There may be some reason that I am not thinking of that Small would not use the group reads. I do not have a good answer based on the data available. I don't know of a good way in 9.0 to identify how table pages are allocated throughout the database. One mechanism is to use a filemonitor to observe the disk reads performed for a sequential scan, then correlate those back to an allocation pattern; that is a bit tedious. In 10.0 and above, the allocated pages of a table are stored in a long varbit column in the catalog and that column can be manipulated from SQL. I am interested in your performance case as a possible regression test for us (wide tables of 400+ columns can have unique performance problems). I don't know if there is a way for you to create a non-sensitive reproducible case for us to look at, but if you could I would like to investigate a bit further and possibly create regression performance tests based on your setup. Regards, -- Ivan T. Bowman SQLAnywhere Research and Development "Frank Ploessel" <fpl... (AT) d_e (DOT) i_m_s_h_e_a_l_t_h.c_o_m> wrote in message news p.t0rkbjmlj0bybf (AT) bonw00164 (DOT) internal.imsglobal.com...Ivan, Let me summarize my general performance issue: I have the exact same query in the same database, run on table Big (450 columns, 2.7 million rows), it took 40-60 columns. Run on table Small (400 columns, a subset of the columns of table Big in the same physical order, and 1.4 million rows), it took 9-12 minutes. As table Small only has half the size (545 thousand vs. 1160 thousand pages), this was not as expected. The query is: SELECT IntegerCol, coalesce(sum(numericCol * 1.), 0) FROM Big GROUP BY IntegerCol HAVING coalesce(sum(numericCol * 1.), 0) <> 0 having a result set of 16 rows after HAVING eliminates one row. ASA uses a hash join, so the main driver of runtime is the full table scan of the millions of records from the base table (Big or Small), which cannot be avoided. 1. I found the db file had 13000 fragments. Defragmentig it down to 8 fragments did not help much. 2. I found that table Small was fragmented (1.28 segments per row), while Big was not. Defragmenting it (actually creating a table with the same structure and running insert select, to keep the original available) brought the runtime down to about 90 to 120 seconds, better, but still much longer than on table Big. Going through the differences in the plans now, what I found was that in the reading from the base table - as expected - less rows and less pages were read from the new table Small than from Big. But the number of pages read is the same as the number of reads for table Small, while for table Big, Disk Table reads / DIsk reads was 15.8, so as I learnt from your post mainly 64K reads were issued. I assume ASA did not issue 64K read requests at all but 4K read requests for table Small. And probably the reason is that the table is spread too much across the database file and not saved continuously. And eventually, the higher number of I/O requests made the query on table Small slower. So this means we should take more care of fragmentation in future, preallocating database space before filling big tables to avoid table spreading in the db file, and try to avoid table fragmentation as far as feasable, maybe experimenting with pctfree settings. Do you think my interpretation sounds reasonable? Is there any possibility to find out if a table is saved in the DB file continously or not? Frank On Thu, 25 Oct 2007 14:46:13 +0200, Frank Ploessel fpl... (AT) d_e (DOT) i_m_s_h_e_a_l_t_h.c_o_m> wrote: Ivan, Thank you for this insightful and detailed answer. My original post was just as I did not understand the plan details about cachings, and you answered that in detaiil. Actually, this query is the fast version taking one minute, and the slow one is the exact same query but on a table half the size, but taking ten minutes. I keep searching for reasons of the bad performance. Frank On Wed, 24 Oct 2007 23:24:23 +0200, Ivan T. Bowman ibowman (AT) ianywhere (DOT) NOSPAM.com> wrote: The DiskRead=73,408 statistic shows that the engine did in fact read the table from disk. In this case, I/O hinting was used effectively. Those 73,408 disk reads were mostly group reads of 64K each. The DiskReadTable=1.1604e6 shows that 1.16 million table pages were read by the group reads (working out to a 4K page size) I/O hinting issues asynchronous read requests for 64K blocks, trying to keep ahead of the execution plan so that data is already loaded into the cache by the time the scan reads it. In this case, the I/O hinting worked perfectly, as seen by CacheReads = 3.4813e6 and CacheHits=3.4813e6. Every time the scan object tried to latch a page, it had already been read in by the I/O hinting. Note that individual pages can be latched multiple times to retrieve multiple rows off of the page, leading to CachReadTable DiskReadTable (and the number of table pages in that table). This behaviour is reduced in version 10.0 and above. You will note that QueryRowsBuffereFetch =1.5129e6 is less than RowsReturned=2.6733e6 (and both may be lower than the number of rows in the table if there are predicates evaluated at the scan). In 10.0 and above, for this type of plan most of the rows would be buffer fetched and CacheRead should be lower (a savings of CPU time). Since it looks like the IO hinting was very effective at overlapping IO wait time with CPU time, I'd suggest that you look at the CPU costs in your query, particularly at the scan node. Do you have predicates that are expensive to evaluate? Are there UDF or other complex functions? Are there complications in the schema that make the scan expensive? The graphical plan with statistics adds CPU costs as it monitors the timing of fetching each row, and this effect can be exacerbated at the leaves of the plan as they see the most rows. Using fetchtst may give a more representative timing as it does not have the statistic monitors installed. Alternatively, you can select to only have the root-level statistics monitored when you fetch the graphical plan; that reduces the overhead to rows returned by the root. With the plan I am seeing, I would expect that fetchtst -ga would report that the CPU time is nearly equal to the total elapsed time. The command line switch -zt and sa_performance_diagnostics() may be of use, the procedure will report the time that the server spent blocked (waiting for IO, a lock, or for a shared engine data structure) and the time spent active (actively using CPU). My feel from the portion of the plan that you sent is that the time active will be most of the execution time, and that improving the CPU cost of the query should be the first line of attack. In 10.0 and above, parallel execution plans could be of use for this query if you have multiple logical processors availalble. Another caution with caching behaviour is that the server can pre-warm the cache at server startup. Since you call sa_flush_cache() explicitly, you should avoid that complication but I mention it here for completeness. Regards, |
#8
| |||
| |||
|
|
After further discussion with a colleague, it doesn't seem likely that non-contiguous allocation of your Small table pages would lead the server to not use the 64K group reads. You would need to have each 64K block containing only 1 table page, and that is very unlikely (c.f. the birthday paradox). This layout particularly should not occur due to the server's attempt to grow a table in 64K blocks. There remain a few other reasons why group reads would not be used. One possibility is that a table bitmap has not yet been created for the table. The first sequential scan will create the bitmap if the table is sufficiently large, so it may be that the first execution of the query on Small would be slow (with no page bitmap) but then later ones should be faster. In order to investigate this issue further it may be most expedient to open a case with technical support and give us a copy of the database or other information to reproduce the problem. Regards, |
![]() |
| Thread Tools | |
| Display Modes | |
| |