corrupt block in ASM disk

comp.databases.oracle.server


#1 lsllcm - 04-28-2011, 03:10 AM

Hi All,

I have run into a corrupt block on an ASM disk. Steps to reproduce:

1. Create a tablespace:
create tablespace aa_data
datafile
'+DATA/dbs11g/aa_data01.dbf' size 20M
EXTENT MANAGEMENT LOCAL AUTOALLOCATE
SEGMENT SPACE MANAGEMENT AUTO
/

2. The statement fails with:
ORA-01119: error in creating database file '+DATA/dbs11g/aa_data01.dbf'
ORA-17502: ksfdcre:4 Failed to create file +DATA/dbs11g/aa_data01.dbf
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATAVOL1" may result in a data loss

3. Check alert.log:
WARNING: IO Failed. group:1 disk(number.incarnation):0.0xe96892e8 disk_path:ORCL:DATAVOL1
AU:2 disk_offset(bytes):2097152 io_size:4096 operation:Read type:synchronous
result:I/O error process_id:11679
WARNING: cache failed reading from group=DATA fn=1 blk=0 count=1 from disk= 0 DATAVOL1 kfkist=0x20 status=0x02 file=kfc.c line=10225
ERROR: cache failed to read group=DATA fn=1 blk=0 from disk(s): 0 DATAVOL1
ORA-15080: synchronous I/O operation to a disk failed
System State dumped to trace file /u01/app/grid/diag/asm/+asm/+ASM/trace/+ASM_ora_11679.trc

4. Check the amdu log:
/u01/app/grid/diag/asm/+asm/+ASM/trace/amdu_2011_04_26_17_13_28
---------------------------- SCANNING DISK N0002 -----------------------------
Disk N0002: 'ORCL:DATAVOL1'
AMDU-00407: asmlib error!! function = [asm_close], error = [0], mesg = [I/O Error]
AMDU-00200: Unable to read [262144] bytes from Disk N0002 at offset [2097152]
AMDU-00201: Disk N0002: 'ORCL:DATAVOL1'
Allocated AU's: 3
Free AU's: 0
AU's read for dump: 2
Block images saved: 512
Map lines written: 2
Heartbeats seen: 0
Corrupt metadata blocks: 0
Corrupt AT blocks: 0

5. Check dmesg:
dmesg|more

Info fld=0x1fa81d1, Current sda: sense key Medium Error
Additional sense: Data synchronization mark error
end_request: I/O error, dev sda, sector 33194449
scsi6: ERROR on channel 0, id 0, lun 0, CDB: Read (10) 00 01 fa 81 d1 00 02 00 0

6. Dump the ASM disk group with amdu:
amdu -dump 'DATA'

---------------------------- SCANNING DISK N0002 -----------------------------
Disk N0002: 'ORCL:DATAVOL1'
AMDU-00209: Corrupt block found: Disk N0002 AU [84926] block [0] type [0]
AMDU-00201: Disk N0002: 'ORCL:DATAVOL1'
AMDU-00204: Disk N0002 is in currently mounted diskgroup DATA
AMDU-00201: Disk N0002: 'ORCL:DATAVOL1'
** HEARTBEAT DETECTED **
Allocated AU's: 84927
Free AU's: 12733
AU's read for dump: 82
Block images saved: 3774
Map lines written: 82
Heartbeats seen: 1
Corrupt metadata blocks: 1
Corrupt AT blocks: 0
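
As a cross-check on the logs in steps 3 to 5, here is a minimal Python sketch that compares the numbers they report. It assumes 512-byte sectors on sda and the standard READ(10) field layout for the CDB bytes printed by dmesg (the final, truncated byte is left out). Neither assumption is stated in the logs, and parse_read10 is only an illustrative helper.

```python
SECTOR_BYTES = 512  # assumption: 512-byte sectors on sda

# Values copied from the logs above.
info_fld = int("0x1fa81d1", 16)        # dmesg "Info fld"
failed_sector = 33194449               # dmesg "end_request ... sector"
cdb_hex = "00 01 fa 81 d1 00 02 00"    # dmesg CDB bytes after "Read (10)"
asm_au, asm_offset = 2, 2097152        # alert.log AU and disk_offset(bytes)
amdu_read_bytes = 262144               # AMDU-00200 read size


def parse_read10(cdb_hex):
    """Return (lba, blocks), assuming the usual READ(10) layout after the
    opcode: 1 flags byte, 4-byte LBA, 1 group byte, 2-byte length."""
    raw = bytes.fromhex(cdb_hex.replace(" ", ""))
    return int.from_bytes(raw[1:5], "big"), int.from_bytes(raw[6:8], "big")


lba, blocks = parse_read10(cdb_hex)
au_bytes = asm_offset // asm_au  # allocation unit size implied by alert.log

print("Info fld matches failed sector:", info_fld == failed_sector)
print("CDB LBA matches failed sector: ", lba == failed_sector)
print("CDB length matches amdu read:  ", blocks * SECTOR_BYTES == amdu_read_bytes)
print("Implied AU size in bytes:      ", au_bytes)
```

If those assumptions hold, the failing SCSI read is 512 sectors (262144 bytes), the same size as the read amdu could not complete, and the alert.log offset implies a 1 MiB allocation unit. The dmesg sector is relative to sda while the ASM offsets are relative to the ASM disk, so the two positions are not directly comparable.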

I tried remap, but the problem persists:

remap DATA DATAVOL1 173928448-173928448
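
The block number passed to remap appears to be the AU from the AMDU-00209 message converted to 512-byte blocks. A minimal sketch of that conversion, assuming a 1 MiB allocation unit and 512-byte physical blocks (both are assumptions, and au_to_block_range is a made-up helper, not an ASM tool):

```python
AU_BYTES = 1024 * 1024  # assumption: 1 MiB allocation unit
BLOCK_BYTES = 512       # assumption: remap counts 512-byte physical blocks


def au_to_block_range(au, au_bytes=AU_BYTES, block_bytes=BLOCK_BYTES):
    """Return the (first, last) physical block numbers covered by one AU."""
    blocks_per_au = au_bytes // block_bytes
    first = au * blocks_per_au
    return first, first + blocks_per_au - 1


# AU reported by AMDU-00209 in step 6.
print(au_to_block_range(84926))  # (173928448, 173930495)

# AU reported by the I/O error in alert.log (step 3).
print(au_to_block_range(2))      # (4096, 6143)
```

Under these assumptions the command above covers only the first block of AU 84926, and the read error from alert.log (AU 2) would sit in blocks 4096-6143, a range this remap did not touch.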

Can anyone help?

Thanks

#2 John Hurley - 04-28-2011, 08:19 AM

Got a good rman backup?

How many databases share this disk group?

One way to approach it is to get the disk fixed at the storage
level, recreate the ASM disk group with force, and restore the
database. If you go that route, you may need to startup nomount
with a pfile copy, restore a controlfile backup, mount, and
then do an rman restore.

I for one do not store my rman disk backups in ASM disk groups.

#3 onedbguru - 04-28-2011, 08:50 PM

I would echo John's question. Do you have a good backup?

What version ASM?
RAC? Version?
What type of storage (direct-connect RAID? SCSI? SAN?)
How are the underlying devices partitioned? or are they?
What is your REDUNDANCY level? If you are using EXTERNAL with
individual direct-attached SCSI disks, you should be taken out and
shot.

I typically will partition the device such that:
p1 = the first block (block 1 to block 1)
p2 = the rest of the device (block 2 to the end)

and the partition used by ASM is p2 only.

What happens when you use the following syntax for creating the
tablespace? If you are going to use ASM, it is time to get out of
the "I gotta know what datafile my data is in..." DBA mentality. I
have used this on ELDB (V V VLDB??) environments with no performance
degradation. ASM is supposed to help make your life easier and if you
understand ASM, it will. Or you can continue to do things the hard
way.

Make sure that one of these is set:
alter system set db_create_file_dest='+DATA';
or
alter system set db_create_file_dest='+DATA/sub-dir/sub-dir'; -- if you really need to find your datafile

and then
create tablespace abc;

These are the defaults when using ASM, so there is no need to specify them:
EXTENT MANAGEMENT LOCAL AUTOALLOCATE SEGMENT SPACE MANAGEMENT AUTO

#4 lsllcm - 04-30-2011, 06:33 AM

Yes, I have a backup.

I used dd to clean the disk and recreated the disk group, and used amdu
to extract the pfile and control file.

I was just hoping for a better or quicker way to fix the issue.

Thanks for your suggestion about tablespace creation.

I use SCSI disks.

I am interested in why you partition as below:

<!-----
I typically will partition the device such that:
p1 = first block block 1 to block 1
p2 = rest of the device (block 2 to the end)
----->

Thanks

#5 lsllcm - 05-01-2011, 10:05 AM

Hi Onedbguru,

Why do you partition as below?

<!-----
I typically will partition the device such that:
p1 = first block block 1 to block 1
p2 = rest of the device (block 2 to the end)
----->

Thanks

#6 onedbguru - 05-02-2011, 09:35 PM

Some OS's use the first block to store the VTOC (the Solaris Volume
Table of Contents, for example). If you overwrite this with ASM
information, you may no longer be able to access the device. So,
I just make it a point to ensure that the OS won't do something silly
with my devices by reserving that first block.

In using ASM on a Solaris environment, when we did not reserve that
first block we would test by doing " dd if=/dev/zero of=/dev/...
bs=8192 count=10 ". The first time you do it, it works. Subsequent
attempts fail with I/O errors. Next, you have the SA re-enable the
device by reformatting it. So, bottom line is to use a standard
procedure that works on all platforms.
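
To illustrate why that dd test clobbers the label on an unpartitioned device but not when ASM only gets p2, here is a rough Python sketch. It assumes 512-byte blocks, counts blocks from 0 (so "block 1" in the partition layout quoted earlier is block 0 here), and assumes the label occupies only the first block. overwritten_blocks is a hypothetical helper.

```python
BLOCK_BYTES = 512  # assumption: 512-byte device blocks
LABEL_BLOCK = 0    # assumption: the OS label (e.g. VTOC) lives in the first block


def overwritten_blocks(bs, count, start_block=0, block_bytes=BLOCK_BYTES):
    """Absolute block numbers touched by `dd bs=<bs> count=<count>` on a
    device or partition whose first block is start_block on the disk."""
    nblocks = -(-(bs * count) // block_bytes)  # ceiling division
    return range(start_block, start_block + nblocks)


# dd against the whole device: writing starts at block 0.
whole_device = overwritten_blocks(8192, 10)

# dd against p2, which starts right after the reserved first block.
partition_p2 = overwritten_blocks(8192, 10, start_block=1)

print(len(whole_device))            # 160 blocks written in both cases
print(LABEL_BLOCK in whole_device)  # True: the label is overwritten
print(LABEL_BLOCK in partition_p2)  # False: the label is left alone
```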
