Pages

Tuesday, February 14, 2012

Performance monitoring

http://hungrydba.com/rman_clone.aspx


Analysing current wait events

V$SESSION_WAIT can be used to find what wait events sessions are hanging on.

-- use GV$SESSION for a global view of sessions on all instances across a RAC database
SELECT * FROM GV$SESSION_WAIT
-- WHERE SID = <SID>;

From 10g wait event columns have been included in V$SESSION.  This makes it much easier to get an instant view of what events sessions are waiting on.

Dealing with Enqueue Contention

Enqueues are memory structures that serialise access to database resources and are associated with a session or transaction.  In other words an enqueue is a lock that protects a shared resource, such as data, in order to prevent processes updating the same data simultaneously.

Finding Blockers

DBA_BLOCKERS and DBA_WAITERS are two very useful views for tracking down which session is blocking another when enqueue waits occur.  They are not installed by default by can be done so by running the catblock.sql script as follows:

SQL> @?/rdbms/admin/catblock.sql

However, these only appear to work well with single instance databases and alternative scripts need to be run to find locking issues across all on instances on a RAC database:

Find blocking sessions across a RAC database.

select dl.inst_id, s.sid, p.spid, dl.resource_name1, 
decode(substr(dl.grant_level,1,8),'KJUSERNL','Null','KJUSERCR','Row-S (SS)',
'KJUSERCW','Row-X (SX)','KJUSERPR','Share','KJUSERPW','S/Row-X (SSX)',
'KJUSEREX','Exclusive',request_level) as grant_level,
decode(substr(dl.request_level,1,8),'KJUSERNL','Null','KJUSERCR','Row-S (SS)',
'KJUSERCW','Row-X (SX)','KJUSERPR','Share','KJUSERPW','S/Row-X (SSX)',
'KJUSEREX','Exclusive',request_level) as request_level, 
decode(substr(dl.lockstate,1,8),'KJUSERGR','Granted','KJUSEROP','Opening',
'KJUSERCA','Canceling','KJUSERCV','Converting') as lockstate,
s.sid, sw.event, sw.seconds_in_wait sec
from gv$dlm_locks dl, gv$process p, gv$session s, gv$session_wait sw
where blocker = 1
and (dl.inst_id = p.inst_id and dl.pid = p.spid)
and (p.inst_id = s.inst_id and p.addr = s.paddr)
and (s.inst_id = sw.inst_id and s.sid = sw.sid)
order by sw.seconds_in_wait desc;

Find waiting sessions across a RAC database:

select dl.inst_id, s.sid, p.spid, dl.resource_name1, 
decode(substr(dl.grant_level,1,8),'KJUSERNL','Null','KJUSERCR','Row-S (SS)',
'KJUSERCW','Row-X (SX)','KJUSERPR','Share','KJUSERPW','S/Row-X (SSX)',
'KJUSEREX','Exclusive',request_level) as grant_level,
decode(substr(dl.request_level,1,8),'KJUSERNL','Null','KJUSERCR','Row-S (SS)',
'KJUSERCW','Row-X (SX)','KJUSERPR','Share','KJUSERPW','S/Row-X (SSX)',
'KJUSEREX','Exclusive',request_level) as request_level, 
decode(substr(dl.lockstate,1,8),'KJUSERGR','Granted','KJUSEROP','Opening',
'KJUSERCA','Cancelling','KJUSERCV','Converting') as lockstate,
s.sid, sw.event, sw.seconds_in_wait sec
from gv$dlm_locks dl, gv$process p, gv$session s, gv$session_wait sw
where blocked = 1
and (dl.inst_id = p.inst_id and dl.pid = p.spid)
and (p.inst_id = s.inst_id and p.addr = s.paddr)
and (s.inst_id = sw.inst_id and s.sid = sw.sid)
order by sw.seconds_in_wait desc;

There are many types of enqueues; however the most common ones are:

Enqueue
Description
TX
TX Transaction Locks – are acquired when a transaction initiates its first change and is held until the transaction does a COMMIT or ROLLBACK.  It is used mainly as a queuing mechanism so that other sessions can wait for the transaction to complete.  TX locks can occur when multiple sessions want to update the same row of data; as part of enforcing integrity from unique or primary key constraints; multiple sessions trying to update the same bitmap index fragment; or insufficient ITL slots to perform the change when multiple sessions try to update data in the same block.  Insufficient ITL slots are probably the most common cause of this type of locking issue.  When a transaction needs to lock a row of data it places details regarding which row(s) are locked in the Interested Transaction Lists (ITL) section of the block header.  When another session wants to lock a row it checks what rows are already locked in the ITL.  If the relevant row is not locked it creates another ITL slot detailing the new lock details otherwise if the row is locked it waits for it to become free.  However if there is not sufficient number of ITL slots available (controlled by INITRANS and MAXTRANS settings on the Table / Index) or there is insufficient space in the block to create a new ITL slot then the transaction must wait.  This situation can be eased by increasing the value of INITRANS/MAXTRANS (recommended to leave MAXTRANS at default value) and/or increasing the PCTFREE value on the relevant Tables / Indexes. Though setting these values too high will mean space could be wasted when there is not sufficient room to put more data into the block.  Look for a high value of ITL Waits in the V$SEGSTAT table for candidates where ITL contention is occurring.
TM
DML Enqueue – occurs during DML to prevent DDL to the affected object. Most common cause of a TM enqueue is un-indexed foreign keys.  Therefore it is important to ensure that all foreign keys are indexed to avoid this type of locking issue.
ST
Space Transaction Enqueue – is used for space management and allocation for dictionary-managed tables.  Use Locally Managed Tablespaces, or try to preallocate extents to avoid this type of locking issue.
HW
High-Water Enqueue – is used with the high-water mark of a segment, manually allocating extents to the relevant object can help avoid this type of locking issue.



Identifying Open Uncommitted Transactions

select s.INST_ID, s.SID, s.SERIAL#,s.USERNAME,s.MACHINE
from gv$transaction t , gv$session s
where t.INST_ID = s.INST_ID
and t.ses_addr = s.SADDR;


select s.INST_ID, s.USERNAME,  s.machine, s.program, 'ALTER SYSTEM KILL SESSION '''||s.sid||','||s.serial#||''';' kill_sql
from gv$session s
where s.SID  in ( SELECT b.sid
                  FROM dba_objects a,
                       gv$lock b
                  WHERE a.object_type = 'TABLE'
                  and
                   a.object_name = 'OBJECT NAME'
              AND a.object_id = b.id1
                  );

Identify objects behind HW enqueue events

Identify the P3 value for the HW enqueue event:
select P3 from v$session_wait where event = 'enq: HW - contention';

Determine the correct File and Block Number using above P3 value:
select DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE(&P3) FILE#,
DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK(&P3) BLOCK#
from dual;

Determine the object to which this block belongs to using the above FILE# and BLOCK#:
select owner, segment_type, segment_name
from dba_extents
where file_id = &FILE
and &BLOCK between block_id and block_id + blocks - 1;

Additionally, if the lock contention is currrently observed, we can find out the underlying segment using the following query:
select DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE(ID2) FILE#,
DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK(ID2) BLOCK#
from v$lock
where type = 'HW';



Monitoring Latch Contention

From 10g the wait event details are greatly improved and you can easily see which latch is causing an issue; e.g. wait event in V$SESSION_WAIT will show up as something like:

            latch: cache buffers chains

In 9i the only event details that were supplied are that the session was waiting on a latch free event.  The following SQL can be used to find which latch the session is hanging on (obviously uncomment the SID bit if you want a specific session):

SELECT name, 'Child '||child#, gets, misses, sleeps
    FROM v$latch_children
   WHERE addr in (select p1raw from v$session_wait where event = 'latch free'/* and sid = 246 */)
  UNION
  SELECT name, null, gets, misses, sleeps
    FROM v$latch
   WHERE addr in (select p1raw from v$session_wait where event = 'latch free'/* and sid = 246 */);

Cache Buffer Chains

If you have "cache buffer chains" latches then this is to do with hot blocks.

The particular objects with these hot blocks can be identified using the following SQL:

In 9i:

select * from dba_objects where object_id in
(select objd from v$bh
where block# in
(SELECT dbablk
                  FROM x$bh
                 WHERE hladdr in
                 (SELECT addr
    FROM v$latch_children
   WHERE addr in (select p1raw from v$session_wait where event = 'latch free')
      and name = 'cache buffers chains'
  UNION
  SELECT addr
    FROM v$latch
   WHERE addr in (select p1raw from v$session_wait where event = 'latch free')
   and name = 'cache buffers chains')
                 and tch > 50 ));

In 10g:

select * from dba_objects where object_id in
(select objd from v$bh
where block# in
(SELECT dbablk
                  FROM x$bh
                 WHERE hladdr in
                 (SELECT addr
    FROM v$latch_children
   WHERE addr in (select p1raw from v$session where event = 'latch: cache buffers chains')
      and name = 'cache buffers chains'
  UNION
  SELECT addr
    FROM v$latch
   WHERE addr in (select p1raw from v$session where event = 'latch: cache buffers chains')
   and name = 'cache buffers chains')
                 and tch > 50 ));

The above SQL should be run multiple times; if the same object appears in the list frequently then these are the ones which are likely to have hot block contention.

Personally I like to track which SQL is being run when the cache buffer chains waits occur; for example in10g you can run:

select sql_text from v$sqlarea
where hash_value in
(select sql_hash_value from v$session
where event = 'latch: cache buffers chains');

Identifying these SQL statements and then tuning them to reduce the number of logical IOs required I have found to be the best way to tackle cache buffers chains waits.

If tuning is not possible then it is advisable to rebuild the relevant tables / indexes with a higher pct free setting to reduce the amount of data which will be stored on each block.

Also look in V$SEGSTAT to find objects with a high number of “buffer busy waits” as these are also candidates for the cause of hot block contention.





If you encounter shared pool or library cache type latches these usually point to a fault an application design issue.  Perhaps queries are being executed frequently with different literal values rather than using bind variables or you have multiple versions (e.g. same SQL but against tables in different schemas) of the identical SQL.

The following SQL is useful for identifying statements which could be causing shared pool issues:

Finding SQL using literal values:

SELECT substr(sql_text, 1, 40) "SQL", count(*), sum(executions) "TotExecs"
FROM v$sqlarea
WHERE executions < 5
GROUP BY substr(sql_text, 1, 40)
HAVING count(*) > 30
ORDER BY 2;

Finding SQL with a high version count:

SELECT address,
       hash_value,
       version_count,
       users_opening,
       users_executing,
       substr(sql_text, 1, 40) "SQL"
FROM v$sqlarea
WHERE version_count > 10;

Find Statements using a high amount of shared pool memory:

SELECT substr(sql_text, 1, 40) "Stmt",
       count(*),
       sum(sharable_mem) "Mem",
       sum(users_opening) "Open",
       sum(executions) "Exec"
FROM v$sql
GROUP BY substr(sql_text, 1, 40)
HAVING sum(sharable_mem) > 1000000
order by 3 desc;

Note: See “The problem with bind variables” before deciding whether to eliminate the use of literal values.



Library Cache Pins

Oracle uses library cache pins to manage library cache concurrency; to identify sessions waiting on a library cache pin use the following SQL:

select SID, P1RAW from v$session_wait
where event = 'library cache pin';

P1RAW is the handle of the library cache object which the waiting session wants to acquire a pin on.  The actual object being waited on can be found using the following SQL:

  SELECT kglnaown "Owner", kglnaobj "Object"
    FROM x$kglob
   WHERE kglhdadr='&P1RAW';

The below SQL is the main query to find blockers; though you will need the P1RAW value from above:

  SELECT s.sid, kglpnmod "Mode", kglpnreq "Req"
    FROM x$kglpn p, v$session s
   WHERE p.kglpnuse=s.saddr
     AND kglpnhdl='&P1RAW';

Mode is the mode in which the pin is wanted. This is a number thus:
            2 - Share mode
            3 - Exclusive mode

An X request (3) will be blocked by any pins held S mode (2) on the object.
An S request (2) will be blocked by any X mode (3) pin held, or may queue behind some other X request.

Typically you will need to kill the session holding the pin to release the blocking event.

However, frequent library cache pin waits particularly in a LIVE system could point to an application issue with objects being altered or recompiled while application users are working on the same objects.

No comments:

Post a Comment