Monday, November 26, 2007

Setting up RDA on RAC and collecting data from all nodes

Today I was collecting RDA data for one of my Service Requests with Oracle Support, so I thought it would be a good idea to record how the setup needs to be done.

It's always a good idea to use the latest version of RDA. If you need to set up RDA for an Oracle RAC environment, here are the simple steps:

1. Log in to Metalink with your own account (as provided by your company).
2. Read Notes 314422.1 and 359395.1.
3. Download the RDA + SCM bundle for your platform, or the standalone RDA (RDA + SCM is what Oracle Support recommends).
4. Copy RDA.zip or RDA.tar to the machine where the Oracle Database is installed.
5. Unzip/untar the RDA archive.
6. If you are on Unix, confirm that ssh/rsh is set up and that you can access all nodes from one node.
7. Run the RDA setup using the command: perl rda.pl -vX RDA::Remote setup_cluster / (the '/' is used to provide the SYSDBA login)
8. The setup asks a few questions; most of the answers can be left at their defaults if you sourced the Oracle environment before running the setup.
9. Once the setup is complete, RDA has been set up for all nodes based on the "olsnodes" information.
10. If the setup was successful, run the RDA collection for all nodes in one go using the command:
perl rda.pl -v -e REMOTE_TRACE=1
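
Step 6 can be checked up front with a small script. This is a minimal sketch and not part of RDA: check_nodes and the REMOTE_SH variable are names made up for this example, and it assumes passwordless ssh (or rsh) is what your cluster uses.

```shell
# check_nodes: run a no-op command on every node name passed as an argument
# and report which nodes could be reached without a password prompt.
# REMOTE_SH is a hypothetical override (for example "rsh"); the default is ssh
# in batch mode, which fails instead of prompting for a password.
check_nodes() {
    failed=0
    for node in "$@"; do
        if ${REMOTE_SH:-ssh -o BatchMode=yes} "$node" true </dev/null >/dev/null 2>&1; then
            echo "OK   $node"
        else
            echo "FAIL $node"
            failed=1
        fi
    done
    return "$failed"
}

# Possible usage on a RAC node, taking the node list from olsnodes:
#   check_nodes $(olsnodes)
```

If any node reports FAIL, it is probably worth fixing the ssh/rsh setup before running the RDA cluster setup in step 7.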

On average, the RDA collection for 4 nodes should not take more than 15 minutes. Because of the "REMOTE_TRACE=1" parameter on the command line, RDA shows the progress for each node as it collects the data.

The file you need to upload to Support is the ZIP file created in the RDA directory. This file contains information from all the nodes, so it can be very big, maybe 100+ MB :) depending on how many logs and other files you have kept around.

That's it, you are ready to provide the RDA collection data to Oracle Support!

Monday, November 12, 2007

Oracle OpenWorld 2007 Collateral / Papers

If you missed Oracle OpenWorld '07 like me, you can get all the collateral and papers from one link:

http://www.expobadge.com/dldev/dc/alldemogroundslist.cfm?

One good thing about Oracle OpenWorld is that you get to know what is happening in the industry that revolves around Oracle technology. If a lot of people are talking about Fusion, 11g or something else, you know it is something you should also learn in order to keep pace with the market :)

Another thing to watch is the Keynotes : http://www.oracle.com/openworld/2007/keynotes.html

Enjoy ! ..



Monday, October 8, 2007

Oracle Enterprise Manager 10g Diagnostics : EMDiag

Recently I had an issue where a lot of targets were reporting "Status Pending" after the blackout was over during scheduled maintenance. This happens sometimes, for various reasons. If it's only 1 or 2 targets, you can use the method I gave in my post on resolving issues from Enterprise Manager 10g blackouts.

But if you have many targets with issues, it is recommended to use the EMDiag kit provided by Oracle Support to diagnose and fix them. It is also good to know this utility if you are managing a large EM 10g environment, and it will be handy when you log a ticket with Oracle Support about any issue you are facing with the EM repository.

What is EMDiag
The EMDiag kit is a troubleshooting tool that will enable you to extract necessary troubleshooting data from the EM Repository Schema.

Go through Note 421638.1 (EMDiagkit - Overview) on Metalink if you are using this script for the first time.

I had more than 100 targets reporting status issues. To fix this I applied the following action plan:

EMDiag Installation
===============
1. Set ORACLE_HOME to the Oracle home of the repository database.
2. Set ORACLE_SID=sid_repository
3. Unzip the file I sent you, emdiag.zip, into the directory /emdiag
4. cd to /emdiag/cfg and create the file repvfy.cfg by copying the template:
cd /emdiag/cfg
cp repvfy.cfg.template repvfy.cfg
5. Edit repvfy.cfg and change the following lines:

#ora_tns=my_tns_alias
level=2
to
ora_tns=
level=9

If you don't have an alias, you can enter the value of the following property from the OMS_HOME/sysman/config/emoms.properties file:
oracle.sysman.eml.mntr.emdRepConnectDescriptor

Make sure you remove all the escape characters: \

6. Make the files in /emdiag/bin executable

chmod +x /emdiag/bin/*

7. Run the install command:

cd /emdiag/bin
./repvfy install
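
Steps 4 and 5 can also be scripted. The sketch below is only an illustration under stated assumptions: both function names are hypothetical (they are not part of the EMDiag kit), the template is assumed to contain the two lines shown in step 5 (#ora_tns=my_tns_alias and level=2), and the connect descriptor property is assumed to sit on a single line in emoms.properties.

```shell
# Hypothetical helpers, not part of the EMDiag kit.

# Print the repository connect descriptor found in the emoms.properties
# file given as $1, with the backslash escape characters removed.
repo_descriptor() {
    sed -n 's/^oracle\.sysman\.eml\.mntr\.emdRepConnectDescriptor=//p' "$1" | tr -d '\\'
}

# Create repvfy.cfg from repvfy.cfg.template in directory $1, setting
# ora_tns to $2 and level to 9.
# Assumption: $2 contains no '|', '&' or backslash characters, because it
# is substituted into a sed expression.
write_repvfy_cfg() {
    sed -e "s|^#ora_tns=.*|ora_tns=$2|" \
        -e 's|^level=.*|level=9|' \
        "$1/repvfy.cfg.template" > "$1/repvfy.cfg"
}

# Possible usage:
#   write_repvfy_cfg /emdiag/cfg "$(repo_descriptor "$OMS_HOME/sysman/config/emoms.properties")"
```

It is worth opening the generated repvfy.cfg afterwards to confirm that the ora_tns and level lines look the way you expect before running the install.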

VERIFY the Installation
==================
SQL> set serveroutput on
SQL> exec mgmt_diag.validate;
Repository version : 10.2.0.2.0
Repository type : CENTRAL
MGMT_DIAG version : 2007.0331
Number of repository tests : 285
Total enabled repository tests: 265
Number of object tests : 135
Total enabled object tests : 124

PL/SQL procedure successfully completed.

SQL> SELECT mgmt_diag.version FROM dual;

VERSION
----------
2007.0331

Run the Repository verification script to diagnose the issues
===========================================
[oracle@myserver bin]$ ./repvfy verify -level 9

Please enter the SYSMAN password:

Following is the output of the above command
==============================
verifyAGENTS
101. Active Agents with clock-skew problems: 1
105. Agents not uploading any data: 3
600. Agents running in the future: 1
verifyASLM
100. Beacons with tests running behind schedule: 17
verifyBLACKOUTS
101. Active blackouts with no more targets in blackout: 1
verifyCREDENTIALS
100. Credential sets not pointing to the latest host metadata version: 2
verifyECM
100. Missing ECM snapshot metadata: 4
702. Generic snapshot delete backlog: 3
verifyJOBS
100. Job backlog: 1
105. Job Executions with no valid steps: 6
111. Stale DiscardState jobs: 91
112. Active executions without active steps: 6
201. Duplicate DiscardState jobs: 13
202. System jobs running for more than 24hours: 24
701. Orphaned job output records: 67777
verifyMETRICS
002. Disabled repository collections: 1
004. Duplicate metric threshold definitions: 1
700. Outstanding cleared metric errors: 15
704. Metric errors for mismatched category properties: 4
800. Cleared cluster repository metric failures: 2
verifyPOLICIES
002. Mismatches between Violations and Availability: 2
201. Oscillating Policy/Metric violations: 1
verifyPROVISIONING
600. Unconfigured software library: 1
verifyREPOSITORY
002. Missing DBMS_JOBS: 3
100. Missing RAW partitions: 6
101. Invalid objects in repository schema: 1
601. Database Timezone mismatch: 1
700. Partitioned tables with too many partitions: 1
804. PL/SQL packages without a package body: 1
805. Unanalyzed repository tables: 4
verifySYSTEM
700. Duplicate OMS parameters: 1
verifyTARGETS
102. Targets with Response Metric Errors: 1
106. Targets in questionnable state: 117 - This is the problem
109. Target types without host credential sets defined: 4
111. Targets not uploading any data: 11
602. Groups without members: 4
701. Duplicate targets from decomissioned Agents: 1
703. Unresolved deleted targets left-overs: 169
801. Unconfigured targets: 3
verifyUSERS
200. EM Accounts not granted MGMT_USER: 1
700. Non existing DB users for GC administrators: 1
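
With this many findings it helps to rank them by count. A minimal sketch, assuming the summary has exactly the layout shown above (a verifyXXX header line followed by "id. description: count" lines) and has been saved to a file; summarize_repvfy is a hypothetical helper, not a repvfy option.

```shell
# summarize_repvfy: read a repvfy summary on stdin and print the findings
# with the highest counts as "count module test_id".
# $1 is the number of lines to print (default 5).
summarize_repvfy() {
    awk '
        # A header such as "verifyTARGETS" names the module for the tests below it.
        /^[ \t]*verify/ { module = $1; next }
        # A test line looks like "106. Some description: 117".
        /^[ \t]*[0-9]+\. / {
            n = split($0, parts, ": ")
            count = parts[n] + 0          # numeric prefix of the text after the last ": "
            id = $1
            sub(/\.$/, "", id)            # drop the trailing dot from the test id
            print count, module, id
        }
    ' | sort -rn | head -n "${1:-5}"
}

# Possible usage, with the summary saved in a file of your choosing:
#   summarize_repvfy 3 < repvfy_summary.txt
```

Run against the summary above with an argument of 3, this would list test 701 of verifyJOBS (67777), test 703 of verifyTARGETS (169) and test 106 of verifyTARGETS (117).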

Now we want to fix the Issue as per the code 106:
========================================
[oracle@myserver bin]$ ./repvfy verify targets -test 106 -fix

Please enter the SYSMAN password

Following is the output of the above command
===================================
verifyTARGETS
106. Targets in questionnable state: 117
Fix: 108 (Difference=9)

Wait for a few minutes and check the Enterprise Manager 10g console; you should see that the "Status" has been fixed for the 108 targets reported by the script. But before you run the fix, make sure you review the detailed diagnostics output by giving the following command:

./repvfy verify -level 9 -detail

This will give you a list of all targets under each category for you to review before you decide what fix to apply.

It is good to get familiar with the EMDiag kit in a TEST environment before using it on a production server.


Thursday, September 20, 2007

Resolving Issues from Oracle Enterprise Manager 10g Blackouts

There are times when you put all or some targets in blackout using EM Grid Control 10g. But sometimes, even after the blackout has ended (either manually or by expiring on its own), some or all targets show "Status Pending" in EM Grid Control. When you check the status of the Agent on those targets, it seems to be running fine.

Here is a workaround to fix this issue if only a few targets have the problem:

  • Log in to the Target whose "Status Pending" status you want to fix
  • Source the required environment variables
  • Run the command "emctl stop agent"
  • Go to $ORACLE_HOME/sysman/emd
  • mv or rm lastupld.xml and agntstmp.txt (mv to a different file name is fine if you don't want to rm)
  • Run "emctl clearstate agent"; it will show the following:
    • Oracle Enterprise Manager 10g Release 10.2.0.2.0.
      Copyright (c) 1996, 2006 Oracle Corporation. All rights reserved.
      EMD clearstate completed successfully
  • Start the agent using the command "emctl start agent"; it will show the following:
    • Oracle Enterprise Manager 10g Release 10.2.0.2.0.
      Copyright (c) 1996, 2006 Oracle Corporation. All rights reserved.
      Starting agent ............... started.
  • Go to EM Grid Control and check the Status of the Target for which you performed the above actions. The Target should show status "Up" with a green UP arrow :)
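
The steps above can be strung together in a small function. This is a sketch only: reset_agent_state is a hypothetical name, it assumes emctl is on the PATH and that ORACLE_HOME points at the agent home, and it renames the two files with a timestamp suffix rather than deleting them.

```shell
# reset_agent_state: stop the agent, move the two state files aside,
# clear the agent state and start the agent again.
reset_agent_state() {
    if [ -z "${ORACLE_HOME:-}" ]; then
        echo "ORACLE_HOME is not set" >&2
        return 1
    fi
    emd_dir="$ORACLE_HOME/sysman/emd"
    stamp=$(date +%Y%m%d%H%M%S)

    # Do not touch the files unless the agent was stopped cleanly.
    emctl stop agent || return 1

    for f in lastupld.xml agntstmp.txt; do
        if [ -f "$emd_dir/$f" ]; then
            # Keep a timestamped copy instead of removing the file.
            mv "$emd_dir/$f" "$emd_dir/$f.$stamp" || return 1
        fi
    done

    emctl clearstate agent || return 1
    emctl start agent
}
```

Source the agent environment first, then call reset_agent_state on the affected target and check the target status in EM Grid Control afterwards.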

The above procedure generally works for Agent status issues after blackouts. But what if you have 100+ targets with this issue? Then doing the above will take a lot of time and effort.

This is where a handy utility from Oracle, the EMDiag Kit, comes in.

EMDiag Kit :
Available to download from Metalink Note : 421053.1 - Downloading/installing/Using the EM Diagnostic Kit

I will post my experience with the EMDiag Kit and how well it can help you resolve issues with EM Grid Control. Especially when you are logging a Service Request with Oracle Support, having it already installed will save a lot of time!

As I always say: "Please use caution when applying any information published in this blog to your production or test databases/environments. Make sure you understand what actions you are performing. Always have backups before performing any actions."