Tuesday, April 28, 2009

Beware of index contention after mass delete and insert

One insert can cause an outage on your system:

If you delete a large number of records from a table with an index and then re-insert the same records as part of another transaction, one of the subsequent inserts may have to perform a tremendous amount of work while splitting an index block: the split has to find a free block from the freelist, but the blocks emptied by the delete-and-insert cycle are still sitting on the freelist, so the split has to unlink them one by one until it finds a completely empty block.
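One way to confirm that the expensive insert is paying for a leaf block split is to check the session-level split counters in the standard v$ views before and after the insert (a generic diagnostic query, not specific to this test case):

```sql
-- Session split counters; run before and after the final single-row insert
-- and compare the values:
select n.name, s.value
  from v$statname n, v$mystat s
 where n.statistic# = s.statistic#
   and n.name like '%node splits%';
```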

This can be demonstrated easily with the following test case (on 10.2.0.3, 8k block size, non-ASSM tablespace):
create tablespace idx1 datafile '?/dbs/idx1.dbf' size 500M
segment space management manual
extent management local uniform size 10M;

create table idx1(a number) tablespace idx1;

create index idx1_idx on idx1 (a) tablespace idx1 pctfree 0;

insert into idx1 select rownum from all_objects, all_objects where rownum <= 250000;

commit;

create table idx2 tablespace idx1 as select * from idx1 where 1=2;

-- Copy every 250th row (in index order) from the range about to be deleted,
-- so the rows re-inserted later are scattered across the index leaf blocks:
insert into idx2
select * from idx1 where rowid in
(select rid from
(select rid, rownum rn from
(select rowid rid from idx1 where a between 10127 and 243625 order by a)
)
where mod(rn, 250) = 0
)
/

commit;

delete from idx1 where a between 10127 and 243625;

commit;

insert into idx1 select * from idx2;

commit;

insert into idx1 select 250000+rownum from all_objects where rownum <= 126;

commit;

select sql_id, executions, buffer_gets, disk_reads, cpu_time, elapsed_time, rows_processed, sql_text from v$sql
where sql_text like '%insert%idx1%' and sql_text not like '%v$sql%';

insert into idx1 values (251000);

commit;

select sql_id, executions, buffer_gets, disk_reads, cpu_time, elapsed_time, rows_processed, sql_text from v$sql
where sql_text like '%insert%idx1%' and sql_text not like '%v$sql%';


If you run the above statements, you will see one insert statement consuming over 900 buffer gets. While it is in the middle of getting those 900 blocks (some of which may require disk reads), all other inserts into that index have to wait for it to finish.

If you either coalesce or shrink the index with an alter statement, this problem goes away (an online index rebuild also works), but the best way to deal with it is to perform the delete and insert in a single transaction.
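For reference, the maintenance statements for the index from the test case would look like the following (standard Oracle DDL; any one of them cleans up the sparsely populated leaf blocks left behind by the delete):

```sql
alter index idx1_idx coalesce;        -- merge adjacent sparse leaf blocks
alter index idx1_idx shrink space;    -- coalesce and release the freed space
alter index idx1_idx rebuild online;  -- full rebuild without blocking DML
```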

I've opened bug# 8446989 with Oracle support on this one, hopefully it will be fixed soon.


Update (06/2012):
Oracle finally fixed this issue; you need to set event 45051 to enable the fix. You will still see about 150 buffer gets for that one insert, but most of them are repeated reads of cached blocks, so the impact is minimal. Big thanks to Oracle for fixing this issue.
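The event would be set through Oracle's standard event mechanism, along these lines; the "level 1" setting and system-wide scope are assumptions, so verify the exact syntax against the support note for the bug before using it:

```sql
-- Enable the fix delivered for bug# 8446989 (verify level/scope with Oracle
-- support; this is the generic event-setting syntax, not a confirmed value):
alter system set events '45051 trace name context forever, level 1';
```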

Thanks,
Sai.

9 Comments:

At June 21, 2009 at 6:54:00 AM PDT, Blogger Arup Nanda said...

Sai - thanks for the post. I am wondering if the test can be reproduced with an ASSM tablespace. The issue could be linked to free buffer linking rather than index block splitting, and thus ASSM may have a different behavior.

 
At June 21, 2009 at 10:39:00 PM PDT, Blogger Saibabu Devabhaktuni said...

Thanks for your note, Arup. Yes, you are right: even with an ASSM tablespace, a single insert statement takes many buffer gets.

Thanks,
Sai.

 
At July 23, 2009 at 12:47:00 AM PDT, Blogger Unknown said...

Hello Sai,

I have a similar case.
session 1 in a loop
{
S11. insert max 1000 rows - fed from another process
S12. make further processing
S13. update those 1000 rows with status = B
S14. back to S11
}

session 2 in a loop
{
S21. reads max 1000 rows with status = B
S22. makes further processing
S23. delete those 1000 rows with status = B
S24. back to S21
}

I noticed that after some time (45 days) I have high ITL waits on the index segment header. We are using a 4k block size with ASSM.
The only workaround is to kill those two sessions and restart them after some minutes.
Your comments are welcome.

 
At February 12, 2010 at 3:12:00 PM PST, Anonymous Anonymous said...

Good post - this entry helped me a lot with my college assignment. Thank you for the information.

 
At March 6, 2010 at 5:13:00 AM PST, Anonymous Anonymous said...

In my opinion, someone here has gone in circles.

 
At March 13, 2010 at 9:23:00 PM PST, Anonymous Anonymous said...

There is something to it. Earlier I thought differently; thanks for the help with this question.

 
At September 24, 2010 at 4:39:00 AM PDT, Anonymous Anonymous said...

Status 44 - Not Feasible to fix, to Filer

:(

 
At October 24, 2012 at 7:35:00 AM PDT, Anonymous sap pp said...

Creating an index will work, but if the underlying table has millions of records, then irrespective of how skinny the index is, it would still have to count all the records one by one for the select count(*) query, which will be more expensive than having a tracking table for big tables.

 
At April 24, 2013 at 12:24:00 PM PDT, Anonymous Anonymous said...

Hi Sai,

I am observing a similar issue in one of our testing environments; the version is 11.2.0.3 on Linux x86-64. I came across your page while searching for details, and thanks for sharing them. I have raised a low-severity SR with Oracle. However, although the bug# you opened with Oracle is published, I am not able to see the Oracle-suggested fix of setting event 45051. May I please request your advice.

Thanks for your time.
LV

 