POOL and CORAL - Bugs: bug #58917, Oracle DIAGNOSTIC_DEST directory...

 
 

bug #58917: Oracle DIAGNOSTIC_DEST directory created in user $HOME on client (following ORA-24550)

Submitted by:  Andrea Valassi <valassi>
Submitted on:  2009-11-16 13:02  
 
Category: CORAL (generic)
Severity: 3 - Normal
Priority: 5 - Normal
Status: Fixed
Privacy: Public
Assigned to: Andrea Valassi <valassi>
Open/Closed: Closed
Fixed Release: None
Effort: 0.00
Planned Release:


2011-11-25 17:25, comment #12:

For the record: the Oracle signal handler producing these ORA-24550 errors can probably be completely disabled (e.g. if one wants to use the ROOT signal handler) by setting DIAG_SIGHANDLER_ENABLED=FALSE. Thanks to Luca for the suggestion.
Andrea

Andrea Valassi <valassi>
Project Administrator, in charge of this item.
2010-08-19 15:18, comment #11:

Renamed the bug to mention ORA-24550.

Andrea Valassi <valassi>
Project Administrator, in charge of this item.
2009-12-08 16:06, comment #10:

I have created an sqlnet.ora file in /afs/cern.ch/sw/lcg/external/oracle/11.2.0.1.0/admin/
with the two lines to disable the ADR.

Oracle recommends using different sqlnet.ora files for 10g and 11g clients, so I did not modify the central sqlnet.ora, which so far was used for all client applications (9i and 10g).

Thanks to Ana, the LCG_Interface for Oracle has been modified: http://lcgcmt.cvs.cern.ch/cgi-bin/l...
This is used for LCGCMT-preview (for the next 57x release?), but not (yet?) for the 56 patches used for 56d.

Bug closed.
Andrea

Andrea Valassi <valassi>
Project Administrator, in charge of this item.
2009-11-19 16:58, comment #9:

I can also confirm that the ORA-24550 error (and the oradiag dump from ADR) only happens if the client (random cycler test) is running Oracle 11g. With Oracle 10g the failure is less verbose, which makes it look less catastrophic, although it probably is not. Actually, the extra info in Oracle 11g is probably very useful!
Andrea

Andrea Valassi <valassi>
Project Administrator, in charge of this item.
2009-11-19 16:38, comment #8:

I keep this open at lowest priority for the moment. That is to say: for the moment we keep it as it is (ADR dumps will go to $HOME).

Andrea Valassi <valassi>
Project Administrator, in charge of this item.
2009-11-19 16:36, comment #7:

I am also following up with Oracle on whether the DIAG output from ADR can be controlled other than through sqlnet.ora. I discussed a possible enhancement request (ER).

"My request for controlling the DIAG programmatically through OCI, or through environment variables, rather than only through sqlnet.ora, is not a blocker. Of course if sqlnet.ora is the only way to control this, then I will find a way to use sqlnet.ora. The reason why I consider this to be a limitation is that it may force us to have several different sqlnet.ora and tnsnames.ora files for different applications. Presently all applications at CERN in completely different domains (administration, physics, accelerator control etc) and using different technologies (OCI, OCCI etc) and client versions (10g, 11g, maybe still 9i somewhere) generally use the same TNS_ADMIN with a single shared tnsnames.ora (which is updated very frequently, several times a week, by several teams in parallel). We also have an sqlnet.ora there, but it contains almost no information. Most of the application-specific controls are handled by each application.

So far I have never had to touch sqlnet.ora for the set of applications I am responsible for. If I have to change it to handle DIAG, I see two ways. Option 1, I modify the central sqlnet.ora in our central TNS_ADMIN, after discussing with various other teams. Option 2, I change my application to use an application-specific TNS_ADMIN, with a tnsnames.ora that references/includes the central tnsnames.ora, and an application-specific sqlnet.ora. In any case I will have to try one of these options (probably option 2) until/unless the ER is implemented."

In any case we will probably have to implement one of the two options above for the time being. The third option is to do nothing, and then users may sometimes see oradiag_<user> directories appear in their $HOME. But frankly I imagine this should happen relatively infrequently. It is now relatively well understood what caused this in the nightlies (see bug #59026): a misconfigured coral server (missing Oracle libs) and a probably-buggy random cycler test (not catching exceptions properly).
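For illustration, option 2 could look roughly like the sketch below. Both directories are temporary stand-ins (not the real CERN locations), the dummy central tnsnames.ora exists only to make the sketch self-contained, and the use of the IFILE directive to include the central tnsnames.ora is an assumption about how the "references/includes" part might be done.

```shell
#!/bin/sh
# Sketch of "option 2": an application-specific TNS_ADMIN.
# Placeholders: both directories are temporary stand-ins for the real
# central and application-specific locations.
central=$(mktemp -d)    # stands in for the shared, central TNS_ADMIN
app=$(mktemp -d)        # application-specific TNS_ADMIN

# Dummy central tnsnames.ora so that the sketch is self-contained.
echo '# central aliases maintained by other teams' > "$central/tnsnames.ora"

# Application-specific sqlnet.ora with the two lines that disable ADR.
cat > "$app/sqlnet.ora" <<'EOF'
DIAG_ADR_ENABLED=FALSE
DIAG_DDE_ENABLED=FALSE
EOF

# Assumption: IFILE is used to pull in the central tnsnames.ora.
echo "IFILE=$central/tnsnames.ora" > "$app/tnsnames.ora"

# Point the Oracle client at the application-specific directory.
export TNS_ADMIN="$app"
ls "$TNS_ADMIN"
```

With this layout the central tnsnames.ora can keep changing several times a week without the application-specific directory needing an update, since it only holds a reference to it.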

Andrea

Andrea Valassi <valassi>
Project Administrator, in charge of this item.
2009-11-19 16:24, comment #6:

I am attaching the two files. The interesting thing is that they contain two disturbing sets of messages:

First:
[AutoCreate Relation]: following error encountered and ignored:
DIA-48316: Message 48316 not found; No message file for product=RDBMS, facility=DIA; arguments: [ADR_CONTROL]
DIA-48210: Message 48210 not found; No message file for product=RDBMS, facility=DIA
DIA-48166: Message 48166 not found; No message file for product=RDBMS, facility=DIA; arguments: [/afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/metadata/ADR_CONTROL.ams] [0]
DIA-48122: Message 48122 not found; No message file for product=RDBMS, facility=DIA; arguments: [/afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/metadata/ADR_CONTROL.ams] [0]
DIA-27040: Message 27040 not found; No message file for product=RDBMS, facility=DIA
-> Can this mean that some error messages are missing in the instant client ociicus?

Second:
Directory does not exist for read/write [/afs/cern.ch/sw/lcg/external/oracle/11.2.0.1.0/slc4_ia32_gcc34/lib/log] []
-> Can this mean that it is expecting to be able to write in the Oracle lib installation directory?

Andrea

(file #11469, file #11470)

Andrea Valassi <valassi>
Project Administrator, in charge of this item.
2009-11-19 16:19, comment #5:

For the record, I was able to reproduce the oradiag creation in bug #59026. I copy the same post I made there.

I can now also confirm that the observed (Oracle) client crash is responsible for the oradiag dump described in bug #58917.

This is how to reproduce it:
- on the client, \rm -rf ~/oradiag_avalassi
- on the server, mv away OracleAccess.so
- on the client, qmtest run integration.randomcycler_oracle-coral

This will fail with the following errors on the client:

CORAL/RelationalPlugins/coral Success Coral server technology: coral
CORAL/RelationalPlugins/coral Success Coral server protocol:
CORAL/RelationalPlugins/coral Success Coral server host: coralserver
CORAL/RelationalPlugins/coral Success Coral server port: 40007
CORAL/RelationalPlugins/coral Error Exception caught in Session::startUserSession: Connection on "oracle://lcg_coral_nightly/lcg_coral_nightly" cannot be established ( CORAL : "ConnectionPool::getSessionFromNewConnection" from "CORAL/Services/ConnectionService" )
ORA-24550: signal received: [si_signo=6] [si_errno=0] [si_code=-6] [si_int=0] [si_ptr=(nil)] [si_addr=0x6913]
sskgds_getcall: WARNING! *** STACK TRACE ABORTED ***
sskgds_getcall: WARNING! *** UNREADABLE FRAME FOUND ***
sskgds_getcall: invalid fp = 0x1
kpedbg_dmp_stack()+255<-kpeDbgCrash()+66<-kpeDbgSignalHandler()+151<-skgesig_sigactionHandler()+214<-__kernel_vsyscall()+16<-00000000<-00974B5E<-0000503E
CORAL/RelationalPlugins/coral Error Exception caught in Session::startUserSession: Connection on "oracle://lcg_coral_nightly/lcg_coral_nightly" cannot be established ( CORAL : "ConnectionPool::getSessionFromNewConnection" from "CORAL/Services/ConnectionService" )

At the same time, on the client the directory ~/oradiag_avalassi will appear. This contains:

ls -l ~/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/*
/afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/alert:
total 2
-rw-r----- 1 avalassi zg 1371 Nov 19 16:10 log.xml

/afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/cdump:
total 0

/afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/incident:
total 0

/afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/incpkg:
total 0

/afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/lck:
total 0
-rw-r----- 1 avalassi zg 0 Nov 19 16:10 AM_3216668543_3129272988.lck

/afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/metadata:
total 0

/afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/stage:
total 0

/afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/sweep:
total 0

/afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/trace:
total 1
-rw-r----- 1 avalassi zg 955 Nov 19 16:10 sqlnet.log

The two files contain interesting info (about a possible misconfig of the client lib, which I will follow up with Oracle).
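A side note on the ORA-24550 line in the client output above: si_signo is a plain signal number, and on Linux signal 6 is SIGABRT, while si_code=-6 is SI_TKILL, the code seen when a process raises a signal on itself. That would be consistent with the client aborting, e.g. after an exception that is not caught. A small sketch for translating the number, using only the line copied from the output above:

```shell
#!/bin/sh
# Translate the si_signo value from the ORA-24550 line into a signal name.
ora_line='ORA-24550: signal received: [si_signo=6] [si_errno=0] [si_code=-6] [si_int=0] [si_ptr=(nil)] [si_addr=0x6913]'

# Extract the number that follows "si_signo=".
signo=$(printf '%s\n' "$ora_line" | sed -n 's/.*\[si_signo=\([0-9][0-9]*\)].*/\1/p')

# "kill -l <number>" prints the signal name without the SIG prefix.
signame=$(kill -l "$signo")
echo "si_signo=$signo is SIG$signame"
```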

Andrea

Andrea Valassi <valassi>
Project Administrator, in charge of this item.
2009-11-17 10:22, comment #4:

Ok, thanks again for the info. If needed I can also attach a volume to the lcgnight afs area where we can store the logs if they are of any use for us.

Cheers

Stefan

Stefan Roiser <roiser>
Project Member
2009-11-17 10:12, comment #3:

Thanks Stefan.

I got an answer from Oracle. The directories are due to the new ADR feature [Automatic Diagnostic Repository].

The feature can be controlled from the sqlnet.ora file. It can be turned off by setting the following parameters in sqlnet.ora
DIAG_ADR_ENABLED=FALSE
DIAG_DDE_ENABLED=FALSE
The trace file location is controlled by:
ADR_BASE=/xxx/yyy/zzz
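As a rough sketch of how these settings might be put in place, the script below generates a client-side sqlnet.ora in two variants: one that switches ADR and DDE off, and one that leaves ADR on but points ADR_BASE away from $HOME. The helper name write_sqlnet_ora and all directories are made up for illustration, and creating the ADR_BASE directory up front is a precaution, not something stated in the answer from Oracle.

```shell
#!/bin/sh
# Sketch: generate a client-side sqlnet.ora that either disables ADR
# or redirects its output. All paths here are placeholders.

# write_sqlnet_ora DIR [ADR_BASE]   (hypothetical helper)
#   without ADR_BASE: turn ADR and DDE off (the two lines quoted above)
#   with ADR_BASE:    leave ADR on but move its files out of $HOME
write_sqlnet_ora() {
    dir=$1
    adr_base=${2:-}
    mkdir -p "$dir" || return 1
    if [ -z "$adr_base" ]; then
        printf '%s\n' 'DIAG_ADR_ENABLED=FALSE' 'DIAG_DDE_ENABLED=FALSE' > "$dir/sqlnet.ora"
    else
        # Precaution: create the target directory before pointing ADR at it.
        mkdir -p "$adr_base" || return 1
        printf 'ADR_BASE=%s\n' "$adr_base" > "$dir/sqlnet.ora"
    fi
}

work=$(mktemp -d)
write_sqlnet_ora "$work/adr_off"
write_sqlnet_ora "$work/adr_moved" "$work/diag"
cat "$work/adr_off/sqlnet.ora" "$work/adr_moved/sqlnet.ora"
```

The second variant keeps the diagnostic output, which may be worth having given how little a failing client otherwise reports.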

OCI-documentation of the feature is here:
http://download.oracle.com/docs/cd/...

I will continue to follow up with Oracle to understand if we can also control this using environment variables. Otherwise I will have to follow up with the Physics Database team, because if I understand correctly the sqlnet.ora file is taken from $TNS_ADMIN, and this usually points to a central place in AFS (i.e. we should change the default sqlnet.ora for all users).

Andrea

Andrea Valassi <valassi>
Project Administrator, in charge of this item.
2009-11-17 10:04, comment #2:

Removed, thanks for following up

Stefan

Stefan Roiser <roiser>
Project Member
2009-11-17 09:58, comment #1:

Stefan, you can definitely remove them.

As for the creation, I am trying to understand. I saw that most dates are around October 21 in my case. I would imagine that this is an Oracle 11 issue, and maybe it is only in Oracle 11.1 and is solved in 11.2. Have you seen files created recently (with 11.2), or do they look like old files (created in the few weeks we tested 11.1)?

An interesting link:
http://forums.oracle.com/forums/thr...

There seems to be very little documentation around for this, so I opened an Oracle SR.

Andrea

Andrea Valassi <valassi>
Project Administrator, in charge of this item.
2009-11-16 13:02, original submission:

Stefan reported that a lot of log files are created in the client's $HOME in a directory oradiag_<username>. I also saw this, for instance on the coralserver node.

Date: Mon, 16 Nov 2009 10:59:27 +0100
From: Stefan Roiser <-unavailable->
To: Andrea Valassi <-unavailable->
Subject: log files

Hi Andrea,

I found the lcgnight afs home volume full today. It seems there are a lot of log files from oracle? E.g.

oradiag_lcgnight/diag/clients/user_lcgnight/host_2334089177_11/cdump/core_30211

around 450MB in total. Two questions

- Can I remove them?
- Could the creation of those in $HOME be avoided, if they stem from db tests?

Thanks

Stefan
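To see how much space such directories take before removing them, a check along the following lines may help. This is only a sketch: the helper name report_oradiag is made up, and the demo runs on a throwaway directory that mimics the path quoted in the mail, not on a real home volume.

```shell
#!/bin/sh
# Sketch: report disk usage (in kB) of every oradiag_* directory under a home area.

# report_oradiag DIR   (hypothetical helper)
#   prints one "size<TAB>path" line per oradiag_<user> directory found in DIR
report_oradiag() {
    for d in "$1"/oradiag_*; do
        [ -d "$d" ] && du -sk "$d"
    done
    return 0
}

# Self-contained demo on a throwaway directory instead of a real $HOME.
demo_home=$(mktemp -d)
mkdir -p "$demo_home/oradiag_lcgnight/diag/clients/user_lcgnight/host_2334089177_11/cdump"
echo 'dummy core' > "$demo_home/oradiag_lcgnight/diag/clients/user_lcgnight/host_2334089177_11/cdump/core_30211"
report_oradiag "$demo_home"
```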

Andrea Valassi <valassi>
Project Administrator, in charge of this item.

 

Attached Files
file #11469:  log.xml added by valassi (1kB - text/xml)
file #11470:  sqlnet.log added by valassi (955B - application/octet-stream)

 

Depends on the following items: None found

Items that depend on this one: None found

 


    Follow 13 latest changes.

    Date              Changed by  Updated field  Previous value => Replaced by
    2011-11-25 17:25  valassi     Carbon-Copy    - => Added -unavailable-
    2010-08-19 15:18  valassi     Summary        Oracle DIAGNOSTIC_DEST directory created in user $HOME on client => Oracle DIAGNOSTIC_DEST directory created in user $HOME on client (following ORA-24550)
    2009-12-08 16:06  valassi     Severity       1 - Wish => 3 - Normal
                                  Priority       1 - Later => 5 - Normal
                                  Status         None => Fixed
                                  Assigned to    None => valassi
                                  Open/Closed    Open => Closed
                                  Closed on      2009-12-08 16:06 => 2009-12-08 16:06
    2009-11-19 16:38  valassi     Priority       5 - Normal => 1 - Later
                                  Severity       3 - Normal => 1 - Wish
    2009-11-19 16:24  valassi     Attached File  - => Added log.xml, #11469
                                  Attached File  - => Added sqlnet.log, #11470
    2009-11-16 13:02  valassi     Carbon-Copy    - => Added roiser