2008/12/23

DataGuard lost archive ; How to resyncronize?

One of my customer had a huge database (about 30 TB) which had a disastery solution based on phisical stadby database using dataguard.Version is 10.2.0.3. By some reason log transfer and apply service stopped .Created archive logs on primary db filled the space . To keep db running archived logs were deleted without being set to standby side. So Dataguard is no more sync with primary side.
As we had lost archived logs , no backup and not sent to anywhere what should I do ?
Setting up dataguard from scratch costs too much space and time for such big db.
Rman helps.
Action plan is:
1-determine last SCN on standby db
2-Stop log apply and transport services.
3-Backup primary database incremental ; from SCN last applied on standby db.
4-Transfer backup sets to standby side.
5-Register backup sets to stanby db
6-Recover standby db ;
7-
Create new standby control file
8-OPTIONAL - Transfer newly created files.
9-Re-start log apply and transfer services.


Details are with commands used:

1-determine last SCN on standby db

PRIMARY
SQL> SELECT CURRENT_SCN FROM V$DATABASE;
CURRENT_SCN
-----------
3360225821

STANDBY
SQL> SELECT CURRENT_SCN FROM V$DATABASE;
CURRENT_SCN
-----------
3215410716


2-Stop log apply and transport services.


2.1 stop redo sent on primary
alter system set log_archive_dest_state_2 ='defer' scope=both ;
2.2 stop redo apply on standby
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

3-Backup primary database incremental ; from SCN last applied on standby db.
--for faster backup try with multi channel
run {
allocate channel ch1 device type disk;
allocate channel ch2 device type disk;
allocate channel ch3 device type disk;
allocate channel ch4 device type disk;
allocate channel ch5 device type disk;
allocate channel ch6 device type disk;
BACKUP INCREMENTAL FROM SCN
3215410716 DATABASE FORMAT '/intl_migration/cdrdb/backup/tmpForStandby_%U' tag 'FORSTANDBY';
release channel ch1;
release channel ch2;
release channel ch3;
release channel ch4;
release channel ch5;
release channel ch6;
}

4-Transfer backup sets to standby side.

because the incremental backup was 1 TB size ; I needed to seperate under different mount points.
Don't worry about keeping them in different folders. We will register them.
SOURCE FOLDERS
/intl_migration/cdrdb/backup/

DEST FOLDER
/medftp/backupDG
/app3/backupDG/


bin
prompt
lcd /intl_migration/cdrdb/backup/
cd /medftp/backupDG
mput tmpForStandby_rkk2lbe4_1_1 tmpForStandby_rlk2lbe5_1_1 tmpForStandby_rmk2lbe7_1_1 tmpForStandby_rnk2lbe9_1_1 tmpForStandby_rok2lbeb_1_1 tmpForStandby_rpk2lbed_1_1 tmpForStandby_rqk2m14c_1_1 tmpForStandby_rrk2m19j_1_1 tmpForStandby_rsk2m2bt_1_1 tmpForStandby_rtk2m2eu_1_1 tmpForStandby_ruk2m2km_1_1 tmpForStandby_rvk2m3m2_1_1

bin
prompt
lcd /intl_migration/cdrdb/backup/
cd /app3/backupDG/
mput tmpForStandby_s0k2mmu4_1_1 tmpForStandby_s1k2mn2d_1_1 tmpForStandby_s2k2mnlu_1_1 tmpForStandby_s3k2mnut_1_1 tmpForStandby_s4k2mob3_1_1 tmpForStandby_s5k2moee_1_1 tmpForStandby_s6k2nc22_1_1 tmpForStandby_s7k2ncda_1_1 tmpForStandby_s8k2nd6s_1_1 tmpForStandby_s9k2ne6n_1_1 tmpForStandby_sak2ne8i_1_1 tmpForStandby_sbk2nf3c_1_1 tmpForStandby_ssk2o28c_1_1


5-Register backup sets to stanby db
OnStandby db
rman target /
RMAN> CATALOG START WITH '/app3/backupDG/tmpForStandby';
RMAN> CATALOG START WITH '/medftp/backupDG/tmpForStandby';

6-Recover standby db ;
one important note ;
because this is a backup taken for only phisical standby db sync ; noredo key word is required.
See : http://download.oracle.com/docs/cd/B19306_01/backup.102/b14191/rcmdupdb.htm#sthref955

RMAN>
run {
allocate channel ch1 device type disk;
allocate channel ch2 device type disk;
allocate channel ch3 device type disk;
allocate channel ch4 device type disk;
allocate channel ch5 device type disk;
allocate channel ch6 device type disk;
allocate channel ch7 device type disk;
allocate channel ch8 device type disk;

RECOVER DATABASE NOREDO;

release channel ch1;
release channel ch2;
release channel ch3;
release channel ch4;
release channel ch5;
release channel ch6;
release channel ch7;
release channel ch8;
}


7-Create new standby control file
Before re-starting log apply service on standby db; create a new standby controlfile in primary db , copy it to standby .Creating a new controlfile is my suggestion because during non transferred and applied logs ; some chages may be done affecting controlfile like adding redo members, adding datafile, adding new tablespaces...etc

7-1 shutdown standby db instance
7-2 create new standby control file move it to standby side destinations (generally 3).
SQL> alter database create standby controlfile as '/tmp/stby.ctl'; --on primary db
scp /tmp/stby.ctl oracle@stdbyserver:/oradata/ctl<1,2,3>/ctl.dbf

7-3 start standby db in mount , and start log apply service MenagedRecoveryProcess;
SQL> startup mount;


8-OPTIONAL - Transfer newly created files.
If new datafiles were added during the time that dataguard had been stopped as it happened to me; you need to copy the newly created files .They were not included incremental backup set;
and not created cause of stopped MRP.
8-1 determine all datafiles from database (remember we have just created a new controlfile , both primary and standby has same information)
SQL> spool '/tmp/hede.txt';
SQL> select 'file ' ,name from v$datafile;
# sh /tmp/hede.txt > fileSatus.txt
# cat fileSatus.txt grep cannot
/oradata/file004.dbf : cannot open
/oradata/file005.dbf : cannot open
Means we have to copy these 2 files to standby side.

8-2 After determining missing datafiles ; backup them as image copy in primary db ,copy to standby side.
BACKUP AS COPY DATAFILE '/oradata/file004.dbf' FORMAT '/tmp/file004.dbf' TAG stdbyImgCopy;
BACKUP AS COPY DATAFILE '/oradata/file005.dbf' FORMAT '/tmp/file005.dbf' TAG stdbyImgCopy;

scp /tmp/file004.dbf oracle@stdbyserver:/oradata/file004.dbf
scp /tmp/file005.dbf oracle@stdbyserver:/oradata/file005.dbf

9-Re-start log apply and transfer services.

9.1 start redo sent on primary
alter system set log_archive_dest_state_2 ='enable' scope=both ;

9.2 start redo apply on standby
SQL> startup mount;
SQL>ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;

9-3 check if for any problems; you may encounter problems. Check alert.log and status of proceesses
SQL> SELECT PROCESS, STATUS, THREAD#, SEQUENCE#, BLOCK#, BLOCKS FROM V$MANAGED_STANDBY;

After success of this operation We were freed of time and space to re-establish all 30 TB database.
A similar workaound is documented in metalink for Oracle 9i : Doc ID:290817.1

Best Regards.



2008/12/19

Ethernet Card Changes , RAC crs stops

One of my customer had changed their Linux Server's mainboard.It was one of 2 nodes running oracle RAC. But after this change they complaint that crs was not working fine.
I connected to box and see it was really not working. After exploring I found that the ethernet card was disabled due to nonmatching MAC adress in config file and phisical hardware.This was interconnect used for priv addresses. The solution was commenting out MAC adress in config file.

[root@db1 tmp]# cat /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
BOOTPROTO=none
#HWADDR=00:17:A4:F6:DF:96 ## this line is commented out because of mainboard and embeded ethernet card change ..!
ONBOOT=yes
TYPE=Ethernet
NETMASK=255.255.255.0
IPADDR=192.168.50.11
USERCTL=no
IPV6INIT=no
PEERDNS=yes
[root@db1 network-scripts]#