Monday, July 2, 2012

Moving ASMLib disk to block devices (Non-ASMLib) in 11gR2 RAC

This post lists the steps to migrate from ASMLib to block devices in an 11gR2 RAC environment. There's an earlier post which lists the steps to migrate from ASMLib to block devices in an 11gR2 standalone system.
The cluster uses ASM for the vote disk and OCR, and the current configuration is
crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   6d44155fe5054fb5bfd2abd3dee8a5b2 (ORCL:CLUS1) [CLUSTERDG]
 2. ONLINE   05233de65ba64fbebf13238219316963 (ORCL:CLUS2) [CLUSTERDG]
 3. ONLINE   84202daea6964f1ebf0af8c38e5a88f5 (ORCL:CLUS3) [CLUSTERDG]
Located 3 voting disk(s).
All the ASMLib disks in the system are
kfod disk=all
--------------------------------------------------------------------------------
 Disk          Size Path                                     User     Group
================================================================================
   1:       5114 Mb ORCL:CLUS1                                
   2:       5114 Mb ORCL:CLUS2                                
   3:       5114 Mb ORCL:CLUS3                                
   4:      10236 Mb ORCL:DATA                                 
   5:      10236 Mb ORCL:FLASH                                
--------------------------------------------------------------------------------
ORACLE_SID ORACLE_HOME
================================================================================
     +ASM1 /opt/app/11.2.0/grid3
     +ASM2 /opt/app/11.2.0/grid3
Disks as seen by the ASM instances
SQL> select inst_id,name,path,label from gv$asm_disk order by 1;

   INST_ID NAME       PATH       LABEL
---------- ---------- ---------- ----------
         1 DATA       ORCL:DATA  DATA
         1 CLUS3      ORCL:CLUS3 CLUS3
         1 CLUS2      ORCL:CLUS2 CLUS2
         1 CLUS1      ORCL:CLUS1 CLUS1
         1 FLASH      ORCL:FLASH FLASH
         2 DATA       ORCL:DATA  DATA
         2 CLUS3      ORCL:CLUS3 CLUS3
         2 CLUS2      ORCL:CLUS2 CLUS2
         2 CLUS1      ORCL:CLUS1 CLUS1
         2 FLASH      ORCL:FLASH FLASH
1. The ASMLib to block device migration can be done in a rolling fashion. Before proceeding, shut down the database instance on the node that's being worked on.
srvctl stop instance -d rac11g2 -i rac11g21
2. Unlike in the previous cases, it was not possible to test the migration on one node with a single disk group. The attempt with the FLASH disk group is shown below.
/etc/init.d/oracleasm querydisk -d FLASH
Disk "FLASH" is a valid ASM disk on device [8, 81]

ls -l /dev/sd*
...
brw-r----- 1 root disk 8, 80 Jun 25 13:03 /dev/sdf
brw-r----- 1 root disk 8, 81 Jun 25 13:04 /dev/sdf1
brw-r----- 1 root disk 8, 96 Jun 25 13:03 /dev/sdg
brw-r----- 1 root disk 8, 97 Jun 25 13:04 /dev/sdg1

chown oracle:asmadmin /dev/sdf1

SQL> alter diskgroup flash dismount;

Diskgroup altered.

SQL> alter system set asm_diskstring='/dev/sdf1','ORCL:CLUS1','ORCL:CLUS2','ORCL:CLUS3','ORCL:DATA' scope=memory sid='+ASM1';

System altered.
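It is important to make the parameter change at the instance level (sid='+ASM1'). Otherwise the following error will be thrown, since the other ASM instance still has the FLASH diskgroup mounted through ORCL:FLASH, which the new disk string excludes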
SQL> alter system set asm_diskstring='ORCL:CLUS*','ORCL:DATA*','/dev/sdf1' scope=memory;
alter system set asm_diskstring='ORCL:CLUS*','ORCL:DATA*','/dev/sdf1' scope=memory
*
ERROR at line 1:
ORA-32008: error while processing parameter update at instance +ASM2
ORA-02097: parameter cannot be modified because specified value is invalid
ORA-15014: path 'ORCL:FLASH' is not in the discovery set
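Mounting the diskgroup again results in the diskgroup using the ASMLib disk rather than the block device added to the asm_diskstring parameter. In the non-RAC 11gR2 and 11gR1 RAC environments the block device was picked up at this point. The ASM alert log shows ORCL:FLASH being assigned on mount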
Mon Jun 25 13:44:38 2012
SQL> alter diskgroup flash dismount
NOTE: cache dismounting (clean) group 3/0x9A384CBC (FLASH)
NOTE: messaging CKPT to quiesce pins Unix process pid: 6170, image: oracle@rac4.code.net (TNS V1-V3)
Mon Jun 25 13:44:38 2012
NOTE: LGWR doing clean dismount of group 3 (FLASH)
NOTE: LGWR closing thread 1 of diskgroup 3 (FLASH) at ABA 58.1895
NOTE: LGWR released thread recovery enqueue
...
...
Mon Jun 25 13:44:38 2012
NOTE: diskgroup resource ora.FLASH.dg is offline
NOTE: diskgroup resource ora.FLASH.dg is updated
ALTER SYSTEM SET asm_diskstring='/dev/sdf1','ORCL:CLUS1','ORCL:CLUS2','ORCL:CLUS3','ORCL:DATA' SCOPE=MEMORY SID='+ASM1';
SQL> alter diskgroup flash mount
NOTE: cache registered group FLASH number=3 incarn=0xa6c84cc2
NOTE: cache began mount (first) of group FLASH number=3 incarn=0xa6c84cc2
NOTE: Assigning number (3,0) to disk (ORCL:FLASH)
Mon Jun 25 13:44:51 2012
NOTE: GMON heartbeating for grp 3
GMON querying group 3 at 50 for pid 27, osid 6170
3. Therefore, all the ASMLib disks in the ASM instance were migrated to block devices
# /etc/init.d/oracleasm querydisk -d FLASH
Disk "FLASH" is a valid ASM disk on device [8, 81]
# /etc/init.d/oracleasm querydisk -d data
Disk "DATA" is a valid ASM disk on device [8, 65]
# /etc/init.d/oracleasm querydisk -d clus1
Disk "CLUS1" is a valid ASM disk on device [8, 17]
# /etc/init.d/oracleasm querydisk -d clus2
Disk "CLUS2" is a valid ASM disk on device [8, 33]
# /etc/init.d/oracleasm querydisk -d clus3
Disk "CLUS3" is a valid ASM disk on device [8, 49]

# chown oracle:asmadmin /dev/sdb1
# chown oracle:asmadmin /dev/sdc1
# chown oracle:asmadmin /dev/sdd1
# chown oracle:asmadmin /dev/sde1

# ls -l /dev/sd*
...
brw-r----- 1 root   disk     8, 16 Jun 25 13:03 /dev/sdb
brw-r----- 1 oracle asmadmin 8, 17 Jun 25 13:04 /dev/sdb1
brw-r----- 1 root   disk     8, 32 Jun 25 13:03 /dev/sdc
brw-r----- 1 oracle asmadmin 8, 33 Jun 25 13:04 /dev/sdc1
brw-r----- 1 root   disk     8, 48 Jun 25 13:03 /dev/sdd
brw-r----- 1 oracle asmadmin 8, 49 Jun 25 13:04 /dev/sdd1
brw-r----- 1 root   disk     8, 64 Jun 25 13:03 /dev/sde
brw-r----- 1 oracle asmadmin 8, 65 Jun 25 13:04 /dev/sde1
brw-r----- 1 root   disk     8, 80 Jun 25 13:03 /dev/sdf
brw-r----- 1 oracle asmadmin 8, 81 Jun 25 13:04 /dev/sdf1
brw-r----- 1 root   disk     8, 96 Jun 25 13:03 /dev/sdg
brw-r----- 1 root   disk     8, 97 Jun 25 13:04 /dev/sdg1
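The lookups above pair each querydisk result with a device node by hand. Below is a minimal sketch of the same matching, assuming the querydisk and ls -l output formats shown above. The function names majmin_from_querydisk and dev_for_majmin are hypothetical helpers, not part of oracleasm.

```shell
# Extract "major minor" from a querydisk line such as:
#   Disk "FLASH" is a valid ASM disk on device [8, 81]
majmin_from_querydisk() {
    sed -n 's/.*on device \[\([0-9]*\), *\([0-9]*\)].*/\1 \2/p'
}

# Read "ls -l" output on stdin and print the block device whose
# major/minor numbers match the two arguments.
# In "ls -l" output for a block device, field 5 is "major," and field 6 is the minor.
dev_for_majmin() {
    awk -v want_major="$1," -v want_minor="$2" \
        '$1 ~ /^b/ && $5 == want_major && $6 == want_minor { print $NF }'
}
```

With the listing from this system, `ls -l /dev/sd* | dev_for_majmin 8 81` would print /dev/sdf1, the partition behind the FLASH disk.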
4. Change the asm_diskstring on the ASM instance
SQL> alter system set asm_diskstring='/dev/sdb1','/dev/sdc1','/dev/sdd1','/dev/sde1','/dev/sdf1' scope=spfile sid='+ASM1';

System altered.
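Typing the quoted device list by hand is easy to get wrong. As a rough sketch, the statement can be assembled from a list of devices. The function name diskstring_sql is hypothetical, and it only prints the SQL so that it can be reviewed before being run in SQL*Plus.

```shell
# Print an "alter system set asm_diskstring" statement.
# Arguments: scope, sid (pass "" to apply to all instances), then the device paths.
diskstring_sql() {
    scope=$1
    sid=$2
    shift 2
    list=""
    for dev in "$@"; do
        # Append each device as a quoted, comma-separated entry.
        list="${list:+$list,}'$dev'"
    done
    if [ -n "$sid" ]; then
        printf "alter system set asm_diskstring=%s scope=%s sid='%s';\n" "$list" "$scope" "$sid"
    else
        printf "alter system set asm_diskstring=%s scope=%s;\n" "$list" "$scope"
    fi
}
```

Calling `diskstring_sql spfile +ASM1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1` prints the statement used in this step.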
5. Create a udev rules file on the node
# ASM OCR VOTE
KERNEL=="sdb[1]", OWNER="oracle", GROUP="asmadmin", MODE="660"
KERNEL=="sdc[1]", OWNER="oracle", GROUP="asmadmin", MODE="660"
KERNEL=="sdd[1]", OWNER="oracle", GROUP="asmadmin", MODE="660"

# ASM DATA
KERNEL=="sde[1]", OWNER="oracle", GROUP="asmadmin", MODE="660"

# ASM FLASH
KERNEL=="sdf[1]", OWNER="oracle", GROUP="asmadmin", MODE="660"
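Since the same rules are needed on every node, they could be generated rather than typed. The following is only a sketch: udev_rules is a hypothetical function name, and it writes the plain partition name (sdb1) instead of the sdb[1] pattern used above, which matches the same device.

```shell
# Print one udev rule per partition name.
# Arguments: owner, group, then kernel names of the partitions (e.g. sdb1).
udev_rules() {
    owner=$1
    group=$2
    shift 2
    for part in "$@"; do
        printf 'KERNEL=="%s", OWNER="%s", GROUP="%s", MODE="660"\n' \
            "$part" "$owner" "$group"
    done
}
```

The output of `udev_rules oracle asmadmin sdb1 sdc1 sdd1 sde1 sdf1` can be redirected into a rules file under /etc/udev/rules.d/ on each node.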
6. Shut down the clusterware stack on the node. Unlike in 11gR1 RAC, the clusterware itself depends on an ASM diskgroup, so shutting down the database instance alone is not enough to complete the migration. A full clusterware stack shutdown is required on the node that's being worked on. The other node can remain up.
# crsctl stop crs


7. Unload the oracleasm module
# /sbin/lsmod  | grep oracleasm
oracleasm              84136  1

# /etc/init.d/oracleasm stop
Dropping Oracle ASMLib disks:                              [  OK  ]
Shutting down the Oracle ASMLib driver:                    [  OK  ]

# /etc/init.d/oracleasm disable
Writing Oracle ASM library driver configuration: done
Dropping Oracle ASMLib disks:                              [  OK  ]
Shutting down the Oracle ASMLib driver:                    [  OK  ]

# /sbin/chkconfig oracleasm off
8. Start CRS and verify that the new block devices are in use by querying the voting disks and monitoring the ASM alert log
# crsctl start crs

crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   6d44155fe5054fb5bfd2abd3dee8a5b2 (/dev/sdb1) [CLUSTERDG]
 2. ONLINE   05233de65ba64fbebf13238219316963 (/dev/sdc1) [CLUSTERDG]
 3. ONLINE   84202daea6964f1ebf0af8c38e5a88f5 (/dev/sdd1) [CLUSTERDG]
Located 3 voting disk(s).

SQL> ALTER DISKGROUP ALL MOUNT /* asm agent call crs *//* {0:0:2} */
NOTE: Diskgroups listed in ASM_DISKGROUPS are
         DATA
         FLASH
NOTE: Diskgroup used for Voting files is:
         CLUSTERDG
Diskgroup with spfile:CLUSTERDG
Diskgroup used for OCR is:CLUSTERDG
NOTE: cache registered group CLUSTERDG number=1 incarn=0xeadcf7b1
NOTE: cache began mount (not first) of group CLUSTERDG number=1 incarn=0xeadcf7b1
NOTE: cache registered group DATA number=2 incarn=0xeaecf7b2
NOTE: cache began mount (not first) of group DATA number=2 incarn=0xeaecf7b2
NOTE: cache registered group FLASH number=3 incarn=0xa6ccf7b3
NOTE: cache began mount (not first) of group FLASH number=3 incarn=0xa6ccf7b3
NOTE: Assigning number (1,0) to disk (/dev/sdb1)
NOTE: Assigning number (1,1) to disk (/dev/sdc1)
NOTE: Assigning number (1,2) to disk (/dev/sdd1)
NOTE: Assigning number (2,0) to disk (/dev/sde1)
NOTE: Assigning number (3,0) to disk (/dev/sdf1)
GMON querying group 1 at 4 for pid 23, osid 7880
NOTE: cache opening disk 0 of grp 1: CLUS1 path:/dev/sdb1
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache opening disk 1 of grp 1: CLUS2 path:/dev/sdc1
NOTE: F1X0 found on disk 1 au 2 fcn 0.0
NOTE: cache opening disk 2 of grp 1: CLUS3 path:/dev/sdd1
NOTE: F1X0 found on disk 2 au 2 fcn 0.0
NOTE: cache mounting (not first) normal redundancy group 1/0xEADCF7B1 (CLUSTERDG)
...
NOTE: cache mounting group 1/0xEADCF7B1 (CLUSTERDG) succeeded
NOTE: cache ending mount (success) of group CLUSTERDG number=1 incarn=0xeadcf7b1
GMON querying group 2 at 5 for pid 23, osid 7880
NOTE: cache opening disk 0 of grp 2: DATA path:/dev/sde1
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache mounting (not first) external redundancy group 2/0xEAECF7B2 (DATA)
...
NOTE: cache mounting group 2/0xEAECF7B2 (DATA) succeeded
NOTE: cache ending mount (success) of group DATA number=2 incarn=0xeaecf7b2
GMON querying group 3 at 6 for pid 23, osid 7880
NOTE: cache opening disk 0 of grp 3: FLASH path:/dev/sdf1
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache mounting (not first) external redundancy group 3/0xA6CCF7B3 (FLASH)
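The two checks in this step can be scripted. This is a small sketch that assumes the alert log and crsctl output formats shown above; assigned_disks and votedisks_on_asmlib are hypothetical names.

```shell
# Read ASM alert log lines on stdin and print "group,disk path" for each
# "Assigning number" note, e.g. "1,0 /dev/sdb1".
assigned_disks() {
    sed -n 's/^NOTE: Assigning number (\([0-9]*,[0-9]*\)) to disk (\(.*\))[[:space:]]*$/\1 \2/p'
}

# Read "crsctl query css votedisk" output on stdin and print any voting
# disk that is still addressed through ASMLib (ORCL:...).
votedisks_on_asmlib() {
    sed -n 's/.*(\(ORCL:[^)]*\)).*/\1/p'
}
```

After a successful migration, `crsctl query css votedisk | votedisks_on_asmlib` should print nothing on the migrated node.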
9. The dynamic views confirm that instance 1 now uses the block devices while instance 2 still uses the ASMLib disks
SQL> select inst_id,name,path,label from gv$asm_disk order by 1;

   INST_ID NAME       PATH       LABEL
---------- ---------- ---------- ----------
         1 DATA       /dev/sde1
         1 CLUS3      /dev/sdd1
         1 CLUS2      /dev/sdc1
         1 CLUS1      /dev/sdb1
         1 FLASH      /dev/sdf1
         2 DATA       ORCL:DATA  DATA
         2 CLUS3      ORCL:CLUS3 CLUS3
         2 CLUS2      ORCL:CLUS2 CLUS2
         2 CLUS1      ORCL:CLUS1 CLUS1
         2 FLASH      ORCL:FLASH FLASH

10 rows selected.
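On a cluster with more nodes it may help to summarise this output. A minimal sketch, assuming the four-column layout of the query above is saved or piped as text; instances_on_asmlib is a hypothetical name.

```shell
# Read the gv$asm_disk query output on stdin and print the ids of the
# instances that still have at least one disk with an ORCL: (ASMLib) path.
instances_on_asmlib() {
    awk '$1 ~ /^[0-9]+$/ && $3 ~ /^ORCL:/ { seen[$1] = 1 }
         END { for (id in seen) print id }' | sort -n
}
```

For the output above, this would report only instance 2, which is the node still to be migrated.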
10. Now that the ASMLib to block device migration is verified to be working, make the change applicable to all ASM instances in the cluster. Remove the instance-specific asm_diskstring entry and set the entry that applies to all instances
SQL> alter system reset asm_diskstring scope=spfile sid='+ASM1';

System altered.

SQL> alter system set asm_diskstring='/dev/sdb1','/dev/sdc1','/dev/sdd1','/dev/sde1','/dev/sdf1' scope=spfile;

System altered.
11. Create the udev rules file on all the remaining nodes
# ASM OCR VOTE
KERNEL=="sdb[1]", OWNER="oracle", GROUP="asmadmin", MODE="660"
KERNEL=="sdc[1]", OWNER="oracle", GROUP="asmadmin", MODE="660"
KERNEL=="sdd[1]", OWNER="oracle", GROUP="asmadmin", MODE="660"

# ASM DATA
KERNEL=="sde[1]", OWNER="oracle", GROUP="asmadmin", MODE="660"

# ASM FLASH
KERNEL=="sdf[1]", OWNER="oracle", GROUP="asmadmin", MODE="660"
12. On each remaining node, stop the cluster stack, unload the oracleasm module and disable it from starting on reboot, then start the cluster stack
# crsctl stop crs

# /sbin/lsmod  | grep oracleasm
oracleasm              84136  1
# /etc/init.d/oracleasm stop
Dropping Oracle ASMLib disks:                              [  OK  ]
Shutting down the Oracle ASMLib driver:                    [  OK  ]

# /etc/init.d/oracleasm disable
Writing Oracle ASM library driver configuration: done
Dropping Oracle ASMLib disks:                              [  OK  ]
Shutting down the Oracle ASMLib driver:                    [  OK  ]

# /sbin/chkconfig oracleasm off

# crsctl start crs
13. Remove oracleasm libraries
# rpm -e oracleasmlib-2.0.4-1.el5
# rpm -e oracleasm-2.6.18-194.el5-2.0.5-1.el5
# rpm -e oracleasm-support-2.1.3-1.el5
This concludes the ASMLib to block device migration.

If the correct permissions and ownership are not set on the block devices, the clusterware stack will fail to start and the following can be observed in the ocssd.log
2012-06-25 14:03:35.314: [    CSSD][1093900608]clssnmReadDiscoveryProfile: voting file discovery string(/dev/sdb1,/dev/sdc1,/dev/sdd1,/dev/sde1,/dev/sdf1)
2012-06-25 14:03:35.314: [    CSSD][1093900608]clssnmvDDiscThread: using discovery string /dev/sdb1,/dev/sdc1,/dev/sdd1,/dev/sde1,/dev/sdf1 for initial discovery
2012-06-25 14:03:35.314: [   SKGFD][1093900608]Discovery with str:/dev/sdb1,/dev/sdc1,/dev/sdd1,/dev/sde1,/dev/sdf1:
2012-06-25 14:03:35.314: [   SKGFD][1093900608]UFS discovery with :/dev/sdb1:
2012-06-25 14:03:35.314: [   SKGFD][1093900608]Fetching UFS disk :/dev/sdb1:
2012-06-25 14:03:35.314: [   SKGFD][1093900608]OSS discovery with :/dev/sdb1:
2012-06-25 14:03:35.314: [   SKGFD][1093900608]Discovery advancing to nxt string :/dev/sdc1:
2012-06-25 14:03:35.314: [   SKGFD][1093900608]UFS discovery with :/dev/sdc1:
2012-06-25 14:03:35.314: [   SKGFD][1093900608]Fetching UFS disk :/dev/sdc1:
2012-06-25 14:03:35.314: [   SKGFD][1093900608]OSS discovery with :/dev/sdc1:
2012-06-25 14:03:35.314: [   SKGFD][1093900608]Discovery advancing to nxt string :/dev/sdd1:
2012-06-25 14:03:35.314: [   SKGFD][1093900608]UFS discovery with :/dev/sdd1:
2012-06-25 14:03:35.314: [   SKGFD][1093900608]Fetching UFS disk :/dev/sdd1:
2012-06-25 14:03:35.314: [   SKGFD][1093900608]OSS discovery with :/dev/sdd1:
2012-06-25 14:03:35.314: [   SKGFD][1093900608]Discovery advancing to nxt string :/dev/sde1:
2012-06-25 14:03:35.314: [   SKGFD][1093900608]UFS discovery with :/dev/sde1:
2012-06-25 14:03:35.314: [   SKGFD][1093900608]Fetching UFS disk :/dev/sde1:
2012-06-25 14:03:35.314: [   SKGFD][1093900608]OSS discovery with :/dev/sde1:
2012-06-25 14:03:35.315: [   SKGFD][1093900608]Discovery advancing to nxt string :/dev/sdf1:
2012-06-25 14:03:35.315: [   SKGFD][1093900608]UFS discovery with :/dev/sdf1:
2012-06-25 14:03:35.315: [   SKGFD][1093900608]Fetching UFS disk :/dev/sdf1:
2012-06-25 14:03:35.315: [   SKGFD][1093900608]OSS discovery with :/dev/sdf1:
2012-06-25 14:03:35.315: [    CSSD][1093900608]clssnmvDiskVerify: Successful discovery of 0 disks
Setting the correct permission and ownership and restarting the clusterware stack will resolve this issue.
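A quick check before starting the clusterware stack can catch this. The sketch below assumes GNU stat (stat -c) is available; check_dev_perms is a hypothetical name, and the expected owner, group and mode are passed in rather than assumed.

```shell
# Compare owner, group and octal mode of each device against expected values.
# Arguments: owner, group, mode (e.g. 660), then the device paths.
# Returns non-zero if any device is missing or does not match.
check_dev_perms() {
    want_owner=$1
    want_group=$2
    want_mode=$3
    shift 3
    rc=0
    for dev in "$@"; do
        actual=$(stat -c '%U:%G %a' "$dev" 2>/dev/null) || {
            echo "MISSING $dev"
            rc=1
            continue
        }
        if [ "$actual" = "$want_owner:$want_group $want_mode" ]; then
            echo "OK $dev"
        else
            echo "BAD $dev ($actual)"
            rc=1
        fi
    done
    return $rc
}
```

For the devices in this post the call would be `check_dev_perms oracle asmadmin 660 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1`, with 660 being the mode set by the udev rules.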

Related Posts
Migrating block devices using ASM instance to ASMLib
Moving ASMLib disk to block devices (Non-ASMLib) in 11gR2 Standalone
Moving ASMLib disk to block devices (Non-ASMLib) in 11gR1 RAC

Useful Metalink Notes
How To Migrate ASMLIB Devices To Block Devices (Non-ASMLIB)? [ID 567508.1]