Miladin Modrakovic’s Blog: Oraclue

July 28, 2009

Identifying PID/SPID for killed session in 11g

Filed under: Troubleshooting — oraclue @ 5:58 pm

Many times I  killed session with alter system kill session command ( or Toad )
and then session would just hang without dying.Then when I tried to find which OS process is related to killed session I could not identify it directly by using v$session and v$process view and joining them using addr column.According to Oracle this is expected.Here is explanation:

When a session is killed, the session state object(and all the child state objects under the session state object) move out from under the original parent process state object, and are placed under the pseudo process state object (which is expected, given the parent/child process mechanism on Unix). PMON will clean up all the state objects found under the pseudo process state object. That explains why PADDR changes in V$SESSION when a session is killed. New PADDR you are seeing in v$SESSION is the address of the pseudo process state object. This shows up in system state under PSEUDO PROCESS for group DEFAULT: V$PROCESS still maintains the record of the original parent process.

So to finally get rid of that session I had to use different workarounds.One of the workaround was to eliminate PSEUDO processes.

select spid, program
from v$process where program != ‘PSEUDO’
and addr not in
(select paddr from v$session);  and of course avoid killing background processes

select spid, program from v$process
where program!= ‘PSEUDO’
and addr not in (select paddr from v$session)
and addr not in (select paddr from v$bgprocess);

Thanks to 11g  there are two new columns in v$session view :

CREATOR_ADDR - state object address of creating process
CREATOR_SERIAL# - serial number of creating process

CREATOR_ADDR is the column that can be joined with the PADDR column in V$PROCESS to uniquely identify
the killed process corresponding to the former session.

So new query would look like:

select * from v$process where addr=(select creator_addr from v$session where sid=< sid used in alter system kill session command >);

In addition to these columns there are views

V$DETACHED_SESSION

V$PROCESS_GROUP

which can help in resolving this issue.

Oracle Linux Test (OLT) Kit

Filed under: Tools — oraclue @ 5:23 pm

Oracle Linux Tests are designed to verify Linux kernel functionality and stability essential for the Oracle Database.

The OLT kit can perform different tests like installation, stress and destructive tests on virtualized and non-virtulized systems.

More info at:

http://oss.oracle.com/projects/olt/

Also Oracle’s  program Oracle Validated Configurations offers pre-tested and validated achitectures with documents.

http://www.oracle.com/technology/tech/linux/validated-configurations/index.html

July 22, 2009

Accessing Fixed Tables using Direct Access

Filed under: Internals — oraclue @ 5:18 pm

Few years ago I wrote article about direct memory access.

www.petefinnigan.com/Storing_Data_Directly_From_Oracle_SGA.pdf

Data are retrieved  directly from SGA.Main advantage is speed .

The Oradebug utility has command called direct_accesswhich as names suggest access fixed tables directly.

Help command will list all options:

SQL> oradebug help direct_access
DIRECT_ACCESS  <set/enable/disable command | select query> Fixed table access
SQL> oradebug setmypid
Statement processed.

First option is to  define content type.Self explained.

SQL> oradebug direct_access set content_type = ‘text/plain’
Statement processed.
Enable writing to trace file.Query output written to trace.

SQL> oradebug direct_access enable trace
Statement processed.
Disable writing to trace file:

SQL> oradebug direct_access disable trace
Statement processed.
Disable display on screen.Otherwise you will see zillions rows flying on screen.Good for tables with few rows.

SQL> oradebug direct_access disable reply
Statement processed.
and most important option:
oradebug direct_access select * <fixed_table>

 
Here is simple query to list fixed tables

select kqftanam from x$kqfta

I did speed test using  favorite X$KSMMEM  with few  million rows (more…)

July 17, 2009

Direct path reads and serial table scans in 11g

Filed under: Performance — oraclue @ 6:02 pm

To answer comment by Frits about Doug Burns observation of using full table scans that resulted in direct path  reads.

 Here is explanation from Oracle itself:

There have been changes in 11g in the heuristics to choose between direct path reads or reads through buffer cache for serial table scans.
In 10g, serial table scans for “large” tables used to go through cache (by default) which is not the case anymore.  In 11g, this decision to read via direct path or through cache is based on the size of the table, buffer cache size and various other stats.

Direct path reads are faster than scattered reads and have less impact on other processes because they avoid latches.

References: High ‘direct path read’ waits in 11g    Note: 793845.1

July 13, 2009

Wide Table Select ( Row Shipping )

Filed under: Internals — oraclue @ 2:22 pm

In 10g Oracle  introduces a special performance feature called “row shipping” or “wide table select”.Row shipping is feature which allows row data from the datablock to be shipped directly to the client.

Aperently , this feature had some issues in earlier version of 10g  and fix was to disable the “row shipping” feature by default.Oracle introduced “fix” in version 10.2 ( that’s the lowest version I have to test anyway ) .

The fix introduces the hidden parameter “_enable_row_shipping” which now defaults to FALSE ( I am still talking about Oracle version 10g ) . The parameter can be set to TRUE nstance wide to re-enable the row shipping feature.

So default for 10.2.0.3:

SQL> select * from v$version;

BANNER
—————————————————————-
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 – 64bi
PL/SQL Release 10.2.0.3.0 – Production
CORE    10.2.0.3.0      Production
TNS for Linux: Version 10.2.0.3.0 – Production
NLSRTL Version 10.2.0.3.0 – Production

SQL> col name head NAME for a25
SQL> col value head VALUE for a20
SQL> col description head DESCRIPTION for a40
SQL> set linesize 167
SQL> Select  nam.ksppinm                           name,
2          val.ksppstvl                          value,
3          val.ksppstdf                          is_default,
4          decode(bitand(nam.ksppiflg/256,1),
5                 1,’True’,
6                   ‘False’
7          )                                     is_session_modifiable,
8          decode(bitand(nam.ksppiflg/65536,3),
9                 1,’Immediate’,
10                 2,’Deferred’ ,
11                 3,’Immediate’,
12                   ‘False’
13          )                                     is_system_modifiable,
14          decode(bitand(val.ksppstvf,7),
15                 1,’Modified’,
16                 4,’System Modified’,
17                   ‘False’
18          )                                     is_modified,
19          decode(bitand(val.ksppstvf,2),
20                 2,’True’,
21                   ‘False’
22          )                                     is_adjusted,
23          nam.ksppdesc                          description
24  from
25          x$ksppi        nam,
26          x$ksppsv       val
27  where
28          nam.indx = val.indx
and    nam.ksppinm like lower(‘%&1%’);
29  Enter value for 1: shipp
old  29: and    nam.ksppinm like lower(‘%&1%’)
new  29: and    nam.ksppinm like lower(‘%shipp%’)

NAME                      VALUE                IS_DEFAUL IS_SE IS_SYSTEM IS_MODIFIED     IS_AD DESCRIPTION
————————- ——————– ——— —– ——— ————— —– —————————————-
_enable_row_shipping      FALSE                TRUE      True  Immediate False           False use the row shipping optimization for wide table selects

Some of the workarround were special event and patch:

1. One off patches for this bug use the event 10040 to enable
/ disable row shipping instead of the parameter.
The default with this fix is NOT to use row shipping.
If 10040 is set then row shipping will be enabled again
but this should only be set instance wide and not in a session.
This fix introduces event 10040 which disables the
“row shipping” feature when set to any level > 0.

e.g.

disable wide table select (a.k.a row shipping)

alter system set events ‘10040 trace name context forever, level 1′;

2. Bug 4690401 had patches available for 10.1 whereby setting
event 10040 would DISABLE “row shipping”. The event usage
in that fix is OPPOSITE to the event usage in this fix.
Both fixes provide a means to disable the feature.

and this is what Oracle document says:

From 10.2.0.2 onwards you can use the hidden parameter _enable_row_shipping=FALSE. And the event will not be valid from 10.2.0.2 onwards.

For Oracle 11g: (more…)

July 7, 2009

Troubleshooting SQL*NET

Filed under: Troubleshooting — oraclue @ 9:57 pm

Few weeks ago I found very good white paper which explains sqlnet trace files.I wrote about it in previous blog

http://oraclue.com/2009/05/27/examining-oracle-net-trace-files/

 I was troubleshooting sqlnet  issue today and used very same article  to read trace files.It was very useful.

Also I used  tool that I completely forgot .I am talking about  Oracle Trace Assistant  or Oracle binary  trcasst

For large trace files this can be very useful.It is very easy to use:

 Usage : trcasst [options] <filename>
      [options]  default values are -odt -e0 -s
      <filename>  always last argument
    -o[c|d][u|t][q]  Net Services and TTC information
      [c]  Summary of Net Services information
      [d]  Detailed Net Services information
      [u]  Summary of TTC information
      [t]  Detailed TTC information
      [q]  SQL commands (used together with u)
    -s  Statistics
    -e[0|1|2]  Error information, default is 0
      [0]  Translate NS error numbers
      [1]  Error translation
      [2]  Error numbers without translation
    -l[a|i <connection_id>]  Connection information
      [a]  List all connections in a trace file
      [i <connection_id>]  Decode a specified connection

Most useful option in my case was error information or  ( -e )

trcasst -e  sqlnet_tracefile.trc

It gives NS & NT Errors Translation or  TNS familiar error number.Then at least you can get some idea what is going on.Oracle has explanation for Error Stack Components which  helps to narrow issue.

Do not forget to  check with your network admin and see if there is any firewall or F5 involved .

References:

Oracle Net8 Administrator’s Guide – Chapter 10 Troubleshooting

Metalink:  Network Products and Error Stack Components  Doc ID:  39662.1

July 1, 2009

Process diagnostic

Filed under: Uncategorized — oraclue @ 7:46 pm

Each Oracle process has a process state object.Process is running session  and session open transaction.Typically process has only one session object.

To dump a process state  I normally use:

alter session set events ‘immediate trace name processstate level 10′ or

oradebug dump processstate 10

This dump will produce file which has many different information about process itself like process global information, dump of memory , session wait history etc.

The oradebug  unit test harness command has option ( ksdxutdiagpid ) that will produce similar dump but smaller in size and with some information that are not included in processstate dump with level 10.

So here is my short list of commands :

oradebug setmypid
alter system flush buffer_cache;
select * from dba_extents;
oradebug unit_test_nolg ksdxutdiagpid
oradebug tracefile_name

First part of dump file has general process information like pid, sid, session serial etc

*** 2009-06-29 15:45:14.517
Process diagnostic dump for oracle@apollo (TNS V1-V3), OS id=6957,
pid: 29, proc_ser: 139, sid: 238, sess_ser: 30736
——————————————————————————-

Next section  has information about memory, swap and process.

loadavg : 0.21 0.17 0.13
memory info: free memory = 0.00M
swap info:   free = 0.00M alloc = 0.00M total = 0.00M
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
0 S oracle    5334     1  0  75   0 – 695798 -     Jun11 ?        00:00:42 ora_lgwr_test
0 S oracle    6957  6956  1  78   0 – 692473 pipe_w 15:41 ?       00:00:02 oracletest11g (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
0 S oracle   25659     1  0  75   0 – 695799 -     Jun01 ?        01:07:44 ora_lgwr_demo

Third part is a short stack dump with all Oracle functions

Short stack dump: <-ksedsts()+315<-ksdxfstk()+32<-ksdxdocmdmultex()+3456<-ksdxdocmdmult()+29<-ksudmp_proc_short_stack()+697<-ksdhng_diag_proc_int(
)+2760<-ksdhng_diag_proc()+27<-ksdhng_diag_proc_ut()+139<-ksdxutdiagpid()+114<-ksdxuth()+1249<-ksdxen_int()+5656<-ksdxen()+14<-opiodr()+1220<-ttcp
ip()+1208<-opitsk()+1449<-opiino()+1026<-opiodr()+1220<-opidrv()+580<-sou2o()+90<-opimai_real()+145<-ssthrdmain()+177<-main()+215<-__libc_start_ma
in()+244<-_start()+41

Next part has information about wait stack and wait state:

Current Wait Stack:
0: waiting for ‘process diagnostic dump’
=0, =0, =0
wait_id=22666 seq_num=22667 snap_id=1
wait times: snap=0.153272 sec, exc=0.153272 sec, total=0.153272 sec
wait times: max=30.000000 sec
wait counts: calls=0 os=0
in_wait=1 iflags=0×1a0
Wait State:
auto_close=0 flags=0×22 boundary=(nil)/-1

and last part is dedicated to session wait history and sampled session history:

Session Wait History:
0: waited for ‘SQL*Net message from client’
driver id=62657100, #bytes=1, =0
wait_id=22665 seq_num=22666 snap_id=1
wait times: snap=0.002478 sec, exc=0.002478 sec, total=0.002478 sec
wait times: max=infinite
wait counts: calls=0 os=0
occurred after 0.002582 sec of elapsed time
1: waited for ‘db file sequential read’
file#=3, block#=1b179, blocks=1
wait_id=22664 seq_num=22665 snap_id=1
wait times: snap=0.000013 sec, exc=0.000013 sec, total=0.000013 sec
wait times: max=infinite
wait counts: calls=0 os=0
occurred after 0.000127 sec of elapsed time

———-
The history is displayed in reverse chronological order.

sample interval: 1 sec, max history 120 sec
—————————————————
[1 sample,                                                          15:45:14]
waited for ‘db file sequential read’, seq_num: 22340
p1: ‘file#’=0×2
p2: ‘block#’=0×9a53
p3: ‘blocks’=0×9a53
time_waited: >= 0 sec (still in wait)
[1 sample,                                                          15:45:13]
idle wait at each sample
[1 sample,                                                          15:45:12]
waited for ‘db file sequential read’, seq_num: 18907
p1: ‘file#’=0×2
p2: ‘block#’=0×13e03
p3: ‘blocks’=0×13e03
time_waited: 0.003633 sec (sample interval: 0 sec)
[1 sample,                                                          15:45:11]
waited for ‘db file sequential read’, seq_num: 16332
p1: ‘file#’=0×1
p2: ‘block#’=0×2b01
p3: ‘blocks’=0×2b01
time_waited: 0.005140 sec (sample interval: 0 sec)
[10 samples,                                             15:45:01 - 15:45:10]

Cool thing is that one command dump all of this.

Blog at WordPress.com.