logo
Another error spotted…
Agent is going down after a few minutes / hours.

2014-04-22 14:56:50,337 [40:54F9BB47:GCThread-11] INFO - uncaughtException handler with
java.lang.OutOfMemoryError: Java heap space

001. Agent is down

001. When you start an agent and check his status, everything looks fine.
But after some time (minutes to hours) agent will fail and go down.
You can start it again, but this situation will repeat shortly.
Time between this is not static.

In the agent log You can find something like that:

 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at oracle.sysman.gcagent.task.TaskFutureImpl.run1(TaskFutureImpl.java:357)
 at oracle.sysman.gcagent.task.TaskFutureImpl.run(TaskFutureImpl.java:322)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
2014-04-22 14:56:50,321 [1:main] ERROR - Critical error:
java.lang.OutOfMemoryError: Java heap space
 at java.io.BufferedWriter.<init>(BufferedWriter.java:87)
 at java.io.BufferedWriter.<init>(BufferedWriter.java:70)
 at oracle.sysman.gcagent.common.FileUtils.setContents(FileUtils.java:153)
 at oracle.sysman.gcagent.common.FileUtils.safeSetContents(FileUtils.java:260)
 at oracle.sysman.gcagent.common.FileUtils.safeSetContents(FileUtils.java:210)
 at oracle.sysman.gcagent.common.FileUtils.safeSetContents(FileUtils.java:182)
 at oracle.sysman.gcagent.tmmain.AgentStatus.writeStatusMessage(AgentStatus.java:130)
 at oracle.sysman.gcagent.tmmain.TMMain.writeCurrentStatus(TMMain.java:622)
 at oracle.sysman.gcagent.tmmain.TMMain.waitForSignal(TMMain.java:655)
 at oracle.sysman.gcagent.tmmain.TMMain.agentMain(TMMain.java:570)
 at oracle.sysman.gcagent.tmmain.TMMain.main(TMMain.java:519)
2014-04-22 14:56:50,336 [1:3305B9] INFO - Handling OOME (FCCounter [count=1, duration=300000, maxInDuration=0, timestamp=1398171410336],
6289778)
2014-04-22 14:56:50,337 [40:54F9BB47:GCThread-11] INFO - uncaughtException handler with
java.lang.OutOfMemoryError: Java heap space
 at java.util.Arrays.copyOfRange(Arrays.java:3209)
 at java.lang.String.<init>(String.java:271)
 at java.lang.StringBuilder.toString(StringBuilder.java:455)
 at oracle.sysman.gcagent.upload.UploadStoreForward$dequeuer.xferFile(UploadStoreForward.java:2729)
 at oracle.sysman.gcagent.upload.UploadStoreForward$dequeuer._run(UploadStoreForward.java:3562)
 at oracle.sysman.gcagent.upload.UploadStoreForward$dequeuer.run(UploadStoreForward.java:3452)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at oracle.sysman.gcagent.util.system.GCAThread$RunnableWrapper.run(GCAThread.java:184)
 at java.lang.Thread.run(Thread.java:662)
2014-04-22 14:56:50,338 [40:54F9BB47] INFO - uncaughtException: in the finally block with bounce of false
2014-04-22 14:56:50,351 [170:65C9DFE7:GC.SysExecutor.7 (oracle_emd:SERVERNAME:3873:z#UploadSystemStats) (oracle_emd:SERVERNAME:3873:z#UploadSystemS
tats:z#UploadSystemStats)] INFO - Cancelled task oracle_emd:SERVERNAME:3873:z#UploadSystemStats:z#UploadSystemStats due to []
2014-04-22 14:56:50,353 [169:B6D626C4:GC.SysExecutor.6 (Ping OMS)] INFO - attempting another heartbeat
2014-04-22 14:56:50,358 [175:BAD8DA75:GC.SysExecutor.9 (oracle_emd:SERVERNAME:3873:EMDStatusCollection) (oracle_emd:SERVERNAME:3873:EMDStatusCollec
tion:EMDStatus)] INFO - Cancelled task oracle_emd:SERVERNAME:3873:EMDStatusCollection:EMDStatus due to [oracle.sysman.gcagent.task.TaskTimeout
Exception: task timeout: 120000 MILLISECONDS for oracle_emd:SERVERNAME:3873:EMDStatusCollection]
2014-04-22 14:56:50,335 [159:B5B7037C:GC.SysExecutor.4 (oracle_emd:SERVERNAME:3873:z#QueuePerformance) (oracle_emd:SERVERNAME:3873:z#QueuePerforman
ce:z#QueuePerformance)] INFO - Cancelled task oracle_emd:SERVERNAME:3873:z#QueuePerformance:z#QueuePerformance due to []
2014-04-22 14:56:53,140 [57:323D9CCF:GC.Executor.1 (oracle_database:SWDDBP92 na SERVERNAME:Response) (oracle_database:SWDDBP92 na SERVERNAME:Respon
se:Response)] WARN - Marshalling to output file stream /u01/app/oracle/product/12.1.0.2.0/agent_inst/sysman/emd/upload/upload/sending/upl
B_P0_S1_I21.xml.tmp failed
java.io.InterruptedIOException:
 at java.io.FileOutputStream.writeBytes(Native Method)
 at java.io.FileOutputStream.write(FileOutputStream.java:282)

001a. Other cases, but source of problem is still same:

log4j:ERROR Failed to flush writer,
java.io.InterruptedIOException:
 at java.io.FileOutputStream.writeBytes(Native Method)
 at java.io.FileOutputStream.write(FileOutputStream.java:282)
 at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:202)
 at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:272)
 at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:276)
 at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:122)
 at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:212)
 at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:49)
 at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:309)
 at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:294)
 at org.apache.log4j.WriterAppender.append(WriterAppender.java:157)
 at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
 at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:57)
 at org.apache.log4j.Category.callAppenders(Category.java:255)
 at org.apache.log4j.Category.forcedLog(Category.java:445)
 at org.apache.log4j.Category.log(Category.java:844)
 at oracle.sysman.gcagent.util.logging.Logger.logMsg(Logger.java:242)
 at oracle.sysman.gcagent.util.logging.Logger.logErrorImpl(Logger.java:231)
 at oracle.sysman.gcagent.util.logging.Logger.error(Logger.java:186)
 at oracle.sysman.gcagent.task.TaskFutureImpl$WrappedTask.call(TaskFutureImpl.java:637)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at oracle.sysman.gcagent.task.TaskFutureImpl.run1(TaskFutureImpl.java:340)
 at oracle.sysman.gcagent.task.TaskFutureImpl.run(TaskFutureImpl.java:305)
 at oracle.sysman.gcagent.task.CompositeTask.runSubtask(CompositeTask.java:70)
 at oracle.sysman.gcagent.task.CompositeTask.run(CompositeTask.java:77)
 at oracle.sysman.gcagent.metadata.impl.collection.CollectionItem$CollectionItemTask.run(CollectionItem.java:1679)
 at oracle.sysman.gcagent.task.AbstractTemplateTask.call(AbstractTemplateTask.java:185)
 at oracle.sysman.gcagent.task.AbstractTemplateTask.call(AbstractTemplateTask.java:47)
 at oracle.sysman.gcagent.task.scheduler.DispatchingTaskScheduler$ReschedulingHelper$ReschedulingTask.call(DispatchingTaskScheduler.java:421)
 at oracle.sysman.gcagent.task.scheduler.DispatchingTaskScheduler$ReschedulingHelper$ReschedulingTask.call(DispatchingTaskScheduler.java:383)
 at oracle.sysman.gcagent.task.executor.DiagWrappedTask.call(DiagWrappedTask.java:60)
 at oracle.sysman.gcagent.task.TaskFutureImpl$WrappedTask.accountedCall(TaskFutureImpl.java:548)
 at oracle.sysman.gcagent.task.TaskFutureImpl$WrappedTask.call(TaskFutureImpl.java:587)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at oracle.sysman.gcagent.task.TaskFutureImpl.run1(TaskFutureImpl.java:340)
 at oracle.sysman.gcagent.task.TaskFutureImpl.run(TaskFutureImpl.java:305)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at oracle.sysman.gcagent.util.system.GCAThread$RunnableWrapper.run(GCAThread.java:141)
 at java.lang.Thread.run(Thread.java:662)

or….

----- Thu Apr 17 04:26:16 2014::18124::Checking status of EMAgent : 25622 -----
----- Thu Apr 17 04:26:47 2014::18124::Checking status of EMAgent : 25622 -----
log4j:ERROR Failed to flush writer,
java.io.InterruptedIOException:
 at java.io.FileOutputStream.writeBytes(Native Method)
 at java.io.FileOutputStream.write(FileOutputStream.java:282)
 at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:202)
----- Thu Apr 17 04:27:18 2014::18124::Checking status of EMAgent : 25622 -----
 at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:272)
 at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:276)
 at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:122)
----- Thu Apr 17 04:27:51 2014::18124::Checking status of EMAgent : 25622 -----
 at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:212)
 at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:49)
 at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:309)
 at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:294)
----- Thu Apr 17 04:28:23 2014::18124::Checking status of EMAgent : 25622 -----
 at org.apache.log4j.WriterAppender.append(WriterAppender.java:157)
 at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
 at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:57)
 at org.apache.log4j.Category.callAppenders(Category.java:255)
 at org.apache.log4j.Category.forcedLog(Category.java:445)
 at org.apache.log4j.Category.log(Category.java:844)
 at oracle.sysman.gcagent.util.logging.Logger.logMsg(Logger.java:242)
 at oracle.sysman.gcagent.util.logging.Logger.logWarnImpl(Logger.java:226)
 at oracle.sysman.gcagent.util.logging.Logger.warn(Logger.java:169)
 at oracle.sysman.gcagent.util.logging.Logger.warn(Logger.java:159)
----- Thu Apr 17 04:28:54 2014::18124::Checking status of EMAgent : 25622 -----
 at oracle.sysman.gcagent.metadata.impl.collection.IntervalSchedule.getNextUnaligned(IntervalSchedule.java:318)
 at oracle.sysman.gcagent.metadata.impl.collection.IntervalSchedule.getNext(IntervalSchedule.java:383)
 at oracle.sysman.gcagent.task.scheduler.DispatchingTaskScheduler$ReschedulingHelper.rescheduleIfNecessary(DispatchingTaskScheduler.java:482)
 at oracle.sysman.gcagent.task.scheduler.DispatchingTaskScheduler$ReschedulingHelper.access$900(DispatchingTaskScheduler.java:381)
 at oracle.sysman.gcagent.task.scheduler.DispatchingTaskScheduler$ReschedulingHelper$ReschedulingTask.call(DispatchingTaskScheduler.java:433)
----- Thu Apr 17 04:29:25 2014::18124::Checking status of EMAgent : 25622 -----
 at oracle.sysman.gcagent.task.scheduler.DispatchingTaskScheduler$ReschedulingHelper$ReschedulingTask.call(DispatchingTaskScheduler.java:383)
 at oracle.sysman.gcagent.task.executor.DiagWrappedTask.call(DiagWrappedTask.java:60)
 at oracle.sysman.gcagent.task.TaskFutureImpl$WrappedTask.accountedCall(TaskFutureImpl.java:548)
 at oracle.sysman.gcagent.task.TaskFutureImpl$WrappedTask.call(TaskFutureImpl.java:587)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at oracle.sysman.gcagent.task.TaskFutureImpl.run1(TaskFutureImpl.java:340)
 at oracle.sysman.gcagent.task.TaskFutureImpl.run(TaskFutureImpl.java:305)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at oracle.sysman.gcagent.util.system.GCAThread$RunnableWrapper.run(GCAThread.java:141)
 at java.lang.Thread.run(Thread.java:662)
Agent is going down due to an OutOfMemoryError
----- Thu Apr 17 04:29:56 2014::18124::Checking status of EMAgent : 25622 -----
----- Thu Apr 17 04:31:10 2014::18124::Checking status of EMAgent : 25622 -----
----- Thu Apr 17 04:31:10 2014::18124::EMAgent exited at Thu Apr 17 04:31:10 2014 with return value 55. -----
----- Thu Apr 17 04:31:10 2014::18124::EMAgent has exited due to initialization failure. -----
----- Thu Apr 17 04:31:10 2014::18124::Stopping other components. -----
----- Thu Apr 17 04:31:10 2014::18124::Commiting Process death. -----
----- Thu Apr 17 04:31:10 2014::18124::Generating additional diagnostics for 25622 -----
----- Thu Apr 17 04:31:10 2014::18124::Generating openfiles report in /u01/app/oracle/product/12.1.0.2/agent_inst/sysman/log/lsof_25622 -----
----- Thu Apr 17 04:31:29 2014::18124::Generating netstat report in /u01/app/oracle/product/12.1.0.2/agent_inst/sysman/log/netstat_25622 -----
----- Thu Apr 17 04:31:32 2014::18124::Generating all host processes report in /u01/app/oracle/product/12.1.0.2/agent_inst/sysman/log/allprocesses_25622 -----
----- Thu Apr 17 04:31:32 2014::18124::Exiting watchdog loop
 -----

002. Solution – in the agent configuration – file like:

-rw-r—– 1 oracle oinstall 6571 Apr 17 04:41 emd.properties
located in path like this:
/u01/app/oracle/product/12.1.0.2/agent_inst/sysman/config/emd.properties
increase the Xmx memory limit.
Before this operation shutdown agent (emctl stop agent). And start him again after change.

#
# These are the optional Java flags for the agent
#
agentJavaDefines=-Xmx140M -XX:MaxPermSize=96M

and increase Xmx parameter, like:

agentJavaDefines=-Xmx256M -XX:MaxPermSize=96M

 

BTW: There are an auto-tune option enabled (at default) in the agent settings, so in log You can see this part, and this is normal (agent will try to increase amount of used memory themself), but if You see this, think about increase memory at start.
If problem still happen, put larger value.

----- Thu Apr 17 04:14:32 2014::18124::Checking status of EMAgent : 18215 -----
----- Thu Apr 17 04:14:42 2014::18124::Checking status of EMAgent : 18215 -----
----- Thu Apr 17 04:14:42 2014::18124::EMAgent exited at Thu Apr 17 04:14:42 2014 with return value 57. -----
----- Thu Apr 17 04:14:42 2014::18124::EMAgent will be restarted because of an Out of Memory Exception. -----
----- Thu Apr 17 04:14:42 2014::18124::Restarting EMAgent. -----
----- Thu Apr 17 04:14:43 2014::18124::Auto tuning the agent at time Thu Apr 17 04:14:43 2014 -----
UploadMaxBytesXML=50.0
_SchedulePersistTimer=30
MaxThreads=10
agentJavaDefines=-Xmx140M -XX:MaxPermSize=96M
SchedulerRandomSpreadMins=30
UploadMaxNumberXML=5000
Auto tuning was successful
----- Thu Apr 17 04:14:54 2014::18124::Finished auto tuning the agent at time Thu Apr 17 04:14:54 2014 -----
----- Thu Apr 17 04:14:54 2014::18124::Launching the JVM with following options: -Xmx140M -XX:MaxPermSize=96M -server -d64 -Djava.security.egd=file:///dev/./urandom -Dsun.lang.ClassLoader.allowArraySyntax=true -XX:+UseConcMarkSweepGC -----
----- Thu Apr 17 04:14:54 2014::18124::Agent Launched with PID 25622 at time Thu Apr 17 04:14:54 2014 -----
----- Thu Apr 17 04:14:54 2014::25622::Execing EMAgent process is taking longer than expected 120 secs -----
----- Thu Apr 17 04:14:54 2014::25622::Time elapsed between Launch of Watchdog process and execing EMAgent is 739 secs -----
2014-04-17 04:14:55,287 [1:main] WARN - Missing filename for log handler 'wsm'
2014-04-17 04:14:55,305 [1:main] WARN - Missing filename for log handler 'opss'
2014-04-17 04:14:55,307 [1:main] WARN - Missing filename for log handler 'opsscfg'

 

.