## POOL and CORAL - Bugs: bug #58917, Oracle DIAGNOSTIC_DEST directory...


## bug #58917: Oracle DIAGNOSTIC_DEST directory created in user $HOME on client (following ORA-24550)

Submitted by: Andrea Valassi
Submitted on: 2009-11-16 13:02
Category: CORAL (generic)
Severity: 3 - Normal
Priority: 5 - Normal
Status: Fixed
Privacy: Public
Assigned to: Andrea Valassi
Open/Closed: Closed
Fixed Release: None
Effort: 0.00
Planned Release:

### Discussion

2011-11-25 17:25, comment #12:

For the record: the Oracle signal handler producing these ORA-24550 can probably be completely disabled (e.g. if one wants to use the ROOT signal handler) by setting DIAG_SIGHANDLER_ENABLED=FALSE. Thanks to Luca for the suggestion.

Andrea

Andrea Valassi <valassi>
2010-08-19 15:18, comment #11:

Rename to mention ORA-24550.

Andrea Valassi <valassi>
2009-12-08 16:06, comment #10:

I have created an sqlnet.ora file in /afs/cern.ch/sw/lcg/external/oracle/11.2.0.1.0/admin/ with the two lines to disable the ADR. Oracle recommends using different sqlnet.ora for 10g and 11g clients, so I did not modify the central sqlnet.ora which so far was used for all client applications (9i and 10g).

Thanks to Ana, the LCG_Interface for oracle has been modified, http://lcgcmt.cvs.cern.ch/cgi-bin/l... This is used for LCGCMT-preview (for the next 57x release?), but not (yet?) for the 56 patches used for 56d.

Bug closed.

Andrea

Andrea Valassi <valassi>
2009-11-19 16:58, comment #9:

I can also confirm that the ORA-24550 (and the oradiag dump from ADR) only happen if the client (random cycler test) is running oracle 11g. With oracle 10g the failure is less verbose (which makes it look less catastrophic, although it probably is not - actually the extra info in oracle 11g is probably very useful!).

Andrea

Andrea Valassi <valassi>
2009-11-19 16:38, comment #8:

I keep this open at lowest priority for the moment. That is to say: for the moment we keep it as it is (ADR dumps will go to $HOME).

Andrea Valassi <valassi>
2009-11-19 16:36, comment #7:

I am also following up with Oracle on whether the DIAG from ADR can be controlled other than through sqlnet.ora. I discussed a possible enhancement request.

"My request for controlling the DIAG programmatically through OCI, or through environment variables, rather than only through sqlnet.ora, is not a blocker. Of course if sqlnet.ora is the only way to control this, then I will find a way to use sqlnet.ora. The reason why I consider this to be a limitation is that it may force us to have several different sqlnet.ora and tnsnames.ora for different applications. Presently all applications at CERN in completely different domains (administration, physics, accelerator control etc) and using different technologies (OCI, OCCI etc) and client versions (10g, 11g, maybe still 9i somewhere) generally use the same TNS_ADMIN with a single shared tnsnames.ora (which is updated very frequently, several times a week, by several teams in parallel). We also have an sqlnet.ora there, but essentially it contains almost no information. Most of the application-specific controls are handled by each application.

So far I have never had to touch sqlnet.ora for the set of applications I am responsible for. If I have to change it to handle DIAG, I see two ways. Option 1, I modify the central sqlnet.ora in our central TNS_ADMIN, after discussing with various other teams. Option 2, I change my application to use an application-specific TNS_ADMIN, with a tnsnames.ora that references/includes the central tnsnames.ora, and an application-specific sqlnet.ora. In any case I will have to try one of these options (probably option 2) until/unless the ER is implemented."

In any case we will probably have to implement one of the two options above for the time being. The third option is to do nothing, and then users may sometimes see oradiag_<user> directories appear in their $HOME. But frankly I imagine this should happen relatively infrequently. It is now relatively well understood what caused this in the nightlies (see bug #59026): a misconfigured coral server (missing Oracle libs) and a probably-buggy random cycler test (not catching exceptions properly).

Andrea

Andrea Valassi <valassi>
2009-11-19 16:24, comment #6:

I am attaching the two files. The interesting thing is that they contain two disturbing sets of messages:

First:

    [AutoCreate Relation]: following error encountered and ignored:
    DIA-48316: Message 48316 not found; No message file for product=RDBMS, facility=DIA; arguments: [ADR_CONTROL]
    DIA-48210: Message 48210 not found; No message file for product=RDBMS, facility=DIA
    DIA-48166: Message 48166 not found; No message file for product=RDBMS, facility=DIA; arguments: [/afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/metadata/ADR_CONTROL.ams] [0]
    DIA-48122: Message 48122 not found; No message file for product=RDBMS, facility=DIA; arguments: [/afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/metadata/ADR_CONTROL.ams] [0]
    DIA-27040: Message 27040 not found; No message file for product=RDBMS, facility=DIA

-> Can this mean that some error messages are missing in the instant client ociicus?

Second:

    Directory does not exist for read/write [/afs/cern.ch/sw/lcg/external/oracle/11.2.0.1.0/slc4_ia32_gcc34/lib/log] []

-> Can this mean that it is expecting to be able to write in the Oracle lib installation directory?

Andrea

Andrea Valassi <valassi>
2009-11-19 16:19, comment #5:

For the record, I was able to reproduce the oradiag creation in bug #59026. I copy the same post I made there.

I can now also confirm that the observed (Oracle) client crash is responsible for the oradiag dump described in bug #58917. This is how to reproduce it:
- on the client, \rm -rf ~/oradiag_avalassi
- on the server, mv away OracleAccess.so
- on the client, qmtest run integration.randomcycler_oracle-coral

This will fail with the following errors on the client:

    CORAL/RelationalPlugins/coral Success Coral server technology: coral
    CORAL/RelationalPlugins/coral Success Coral server protocol:
    CORAL/RelationalPlugins/coral Success Coral server host: coralserver
    CORAL/RelationalPlugins/coral Success Coral server port: 40007
    CORAL/RelationalPlugins/coral Error Exception caught in Session::startUserSession: Connection on "oracle://lcg_coral_nightly/lcg_coral_nightly" cannot be established ( CORAL : "ConnectionPool::getSessionFromNewConnection" from "CORAL/Services/ConnectionService" )
    ORA-24550: signal received: [si_signo=6] [si_errno=0] [si_code=-6] [si_int=0] [si_ptr=(nil)] [si_addr=0x6913]
    sskgds_getcall: WARNING! *** STACK TRACE ABORTED ***
    sskgds_getcall: WARNING! *** UNREADABLE FRAME FOUND ***
    sskgds_getcall: invalid fp = 0x1
    kpedbg_dmp_stack()+255<-kpeDbgCrash()+66<-kpeDbgSignalHandler()+151<-skgesig_sigactionHandler()+214<-__kernel_vsyscall()+16<-00000000<-00974B5E<-0000503E
    CORAL/RelationalPlugins/coral Error Exception caught in Session::startUserSession: Connection on "oracle://lcg_coral_nightly/lcg_coral_nightly" cannot be established ( CORAL : "ConnectionPool::getSessionFromNewConnection" from "CORAL/Services/ConnectionService" )

At the same time, on the client the directory ~/oradiag_avalassi will appear. This contains:

    ls -l ~/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/*
    /afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/alert:
    total 2
    -rw-r----- 1 avalassi zg 1371 Nov 19 16:10 log.xml
    /afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/cdump:
    total 0
    /afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/incident:
    total 0
    /afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/incpkg:
    total 0
    /afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/lck:
    total 0
    -rw-r----- 1 avalassi zg 0 Nov 19 16:10 AM_3216668543_3129272988.lck
    /afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/metadata:
    total 0
    /afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/stage:
    total 0
    /afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/sweep:
    total 0
    /afs/cern.ch/user/a/avalassi/oradiag_avalassi/diag/clients/user_avalassi/host_1039731343_76/trace:
    total 1
    -rw-r----- 1 avalassi zg 955 Nov 19 16:10 sqlnet.log

The two files contain interesting info (about a possible misconfig of the client lib, which I will follow up with Oracle).

Andrea

Andrea Valassi <valassi>
2009-11-17 10:22, comment #4:

Ok, thanks again for the info. If needed I can also attach a volume to the lcgnight afs area where we can store the logs if they are of any use for us.

Cheers

Stefan

Stefan Roiser <roiser>
2009-11-17 10:12, comment #3:

Thanks Stefan. I got an answer from Oracle. The directories are due to the new ADR feature [Automatic Diagnostic Repository]. The feature can be controlled from the sqlnet.ora file.

It can be turned off by setting the following parameters in sqlnet.ora:

    DIAG_ADR_ENABLED=FALSE
    DIAG_DDE_ENABLED=FALSE

The trace file location is controlled by:

    ADR_BASE=/xxx/yyy/zzz

OCI documentation of the feature is here: http://download.oracle.com/docs/cd/...

I will continue to follow up with Oracle to understand if we can also control this using environment variables. Otherwise I will have to follow up with the Physics Database team, because if I understand correctly the sqlnet.ora file is taken from $TNS_ADMIN, and this usually points to a central place in AFS (i.e. we should change the default sqlnet.ora for all users).

Andrea

Andrea Valassi <valassi>
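
Editorial sketch (not part of the original thread): the sqlnet.ora settings quoted in comments #3 and #12, combined with the application-specific TNS_ADMIN of "option 2" in comment #7, could be generated roughly as below. The function names and paths are hypothetical, and including the central tnsnames.ora through an `IFILE=` line is an assumption that should be checked against the Oracle Net documentation for the client version in use.

```python
import tempfile
from pathlib import Path

# Parameters quoted in this thread (comment #3).
ADR_OFF = {
    "DIAG_ADR_ENABLED": "FALSE",
    "DIAG_DDE_ENABLED": "FALSE",
}


def render_sqlnet_ora(disable_sighandler=False, adr_base=None):
    """Return the text of an sqlnet.ora that turns the client ADR off."""
    params = dict(ADR_OFF)
    if disable_sighandler:
        # Comment #12: reported to "probably" disable the Oracle signal handler.
        params["DIAG_SIGHANDLER_ENABLED"] = "FALSE"
    if adr_base is not None:
        # Comment #3: ADR_BASE controls the trace file location.
        params["ADR_BASE"] = str(adr_base)
    return "".join(f"{key}={value}\n" for key, value in params.items())


def make_private_tns_admin(target_dir, central_tnsnames, **options):
    """Create target_dir holding its own sqlnet.ora plus a tnsnames.ora
    that refers to the central one (assumption: via an IFILE= line)."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    (target / "sqlnet.ora").write_text(render_sqlnet_ora(**options))
    (target / "tnsnames.ora").write_text(f"IFILE={central_tnsnames}\n")
    return target


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    with tempfile.TemporaryDirectory() as tmp:
        admin = make_private_tns_admin(
            Path(tmp) / "coral_tns_admin", "/path/to/central/tnsnames.ora"
        )
        print((admin / "sqlnet.ora").read_text(), end="")
```

The application would then be started with TNS_ADMIN pointing at the generated directory, leaving the shared central files untouched.
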
2009-11-17 10:04, comment #2:

Removed, thanks for following up

Stefan

Stefan Roiser <roiser>
2009-11-17 09:58, comment #1:

Stefan, you can definitely remove them.

As for the creation, I am trying to understand. I saw that most dates are around October 21 in my cases. I would imagine that this is an Oracle 11 issue, and maybe it is only oracle 11.1 and is solved in 11.2. Have you seen files created recently (with 11.2), or do they look like old files (created in the few weeks we tested 11.1)?

http://forums.oracle.com/forums/thr...

There seems to be very little documentation around for this; I opened an Oracle SR.

Andrea

Andrea Valassi <valassi>
2009-11-16 13:02, original submission:

Stefan reported that a lot of log files are created in the client's $HOME in a directory oradiag_<username>. I also saw this, for instance on the coralserver node.

    Date: Mon, 16 Nov 2009 10:59:27 +0100
    From: Stefan Roiser <-unavailable->
    To: Andrea Valassi <-unavailable->
    Subject: log files

Hi Andrea,

I found the lcgnight afs home volume full today. It seems there are a lot of log files from oracle? E.g. oradiag_lcgnight/diag/clients/user_lcgnight/host_2334089177_11/cdump/core_30211 around 450MB in total.

Two questions
- Can I remove them?
- Could the creation of those in $HOME be avoided, if they stem from db tests?

Thanks

Stefan

Andrea Valassi <valassi>
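
Editorial sketch (not part of the original thread): before removing an oradiag_<username> directory as discussed above, its footprint can be measured with a few lines of Python. The helper name `oradiag_usage` is hypothetical; only the directory naming pattern comes from the report.

```python
from pathlib import Path


def oradiag_usage(home, user):
    """Return (file_count, total_bytes) for <home>/oradiag_<user>."""
    root = Path(home) / f"oradiag_{user}"
    if not root.is_dir():
        # No ADR directory has been created for this user.
        return 0, 0
    files = [p for p in root.rglob("*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


if __name__ == "__main__":
    import getpass

    count, size = oradiag_usage(Path.home(), getpass.getuser())
    print(f"{count} files, {size / 1e6:.1f} MB")
```
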

Attached Files
file #11469:  log.xml added by valassi (1kB - text/xml)
file #11470:  sqlnet.log added by valassi (955B - application/octet-stream)

Depends on the following items: None found

Items that depend on this one: None found

Carbon-Copy List
• -unavailable- added by roiser (Posted a comment)
• -unavailable- added by valassi (Submitted the item)


| Date | Changed By | Updated Field | Previous Value | Replaced By |
|---|---|---|---|---|
| 2010-08-19 15:18 | valassi | Summary | Oracle DIAGNOSTIC_DEST directory created in user $HOME on client | Oracle DIAGNOSTIC_DEST directory created in user $HOME on client (following ORA-24550) |
| 2009-12-08 16:06 | valassi | Severity | 1 - Wish | 3 - Normal |
| | | Priority | 1 - Later | 5 - Normal |
| | | Status | None | Fixed |
| | | Assigned to | None | valassi |
| | | Open/Closed | Open | Closed |
| | | Closed on | 2009-12-08 16:06 | 2009-12-08 16:06 |
| 2009-11-19 16:38 | valassi | Priority | 5 - Normal | 1 - Later |
| | | Severity | 3 - Normal | 1 - Wish |