jobs stuck in waiting status

slehner

Member
We have been experiencing a problem in the last few weeks where jobs get
submitted to the server to run and sit in waiting status.
After a period of time (no action taken), varying from 5 minutes to 30 minutes,
the jobs will then start processing again.

Originally we thought that there was a relation between certain jobs on the
system causing things to hang up, but this idea has pretty much been eliminated.
Disk space and processor utilization/memory are not issues.

We have 12 multithreaded queues, and all jobs go to these (they all share the same name).

Jobs that are running on the console as RUNUBE run fine, and process normally.

The jdenet and jdeque services are set up to start with a domain administrator
account.

Stopping and starting services also successfully "resets" the queues so that
Runbatch.exe processes are handled.

Any help would be greatly appreciated.

Thanks,

XE SP13, Win2k-Oracle 8.1.7, MetaFrame

Steve Lehner
Global Systems Administrator
Global Information Systems
Westcon.Net
email: [email protected]
phone: 914 829 7457
 
First, please tell me that you don't have *all* UBEs running in multithreaded queues. There is a list of jobs that should not be run multithreaded.

Second, how many processors does your server(s) have? The general rule of thumb is to have no more than two actively processing UBEs per processor. If you have twelve multithreaded queues, with a minimum of two threads per queue, that would equal 24 possible active UBEs. Did you mean you have twelve threads in your x number of queues?
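
The arithmetic behind that rule of thumb can be sketched in a few lines of Python. This is only an illustration: the function names are made up for the example, and the processor count is a placeholder to be replaced with the real value for the server in question.

```python
def max_active_ubes(processors: int, per_processor: int = 2) -> int:
    """Rule-of-thumb ceiling on concurrently processing UBEs."""
    return processors * per_processor


def possible_active_ubes(queues: int, threads_per_queue: int) -> int:
    """Worst-case number of UBEs that could be active at once."""
    return queues * threads_per_queue


# Twelve multithreaded queues with at least two threads each, as described above.
possible = possible_active_ubes(queues=12, threads_per_queue=2)

# Placeholder value (an assumption for this sketch, not a figure from the post).
processors = 4
ceiling = max_active_ubes(processors)

print(f"possible active UBEs: {possible}, rule-of-thumb ceiling: {ceiling}")
if possible > ceiling:
    print("the queue setup allows more active UBEs than the rule of thumb suggests")
```

If the twelve entries turn out to be twelve threads in a single queue, the first number drops to 12 and the comparison changes accordingly.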



--------------------------
 
I could tell you that, but I'd be lying...

I guess a required step is to reassign the jobs that need their own queue,
as soon as those UBEs are identified.

We are running on a Compaq DL760 with eight 700 MHz Xeon processors and 8 GB of
memory.

We have Oracle split across two 203 GB drive arrays, and have plenty of free
space, both on application and boot drives.

The Queue setup is as follows:

UBEQueues=12
UBEQueue1=QB7333
UBEQueue2=QB7333
UBEQueue3=QB7333
UBEQueue4=QB7333
UBEQueue5=QB7333
UBEQueue6=QB7333
UBEQueue7=QB7333
UBEQueue8=QB7333
UBEQueue9=QB7333
UBEQueue10=QB7333
UBEQueue11=QB7333
UBEQueue12=QB7333

This has been working fine for months, but I am working on adding
single-threaded queues anyway.
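
A quick way to sanity-check settings like the ones above is to count how many entries each queue name gets and compare the total with the declared UBEQueues value. The following Python sketch does that. It assumes each repeated UBEQueueN entry stands for one concurrent job slot in that queue, and the helper name parse_queue_settings is invented for the example.

```python
from collections import Counter

# Settings copied from the post above.
INI_TEXT = """
UBEQueues=12
UBEQueue1=QB7333
UBEQueue2=QB7333
UBEQueue3=QB7333
UBEQueue4=QB7333
UBEQueue5=QB7333
UBEQueue6=QB7333
UBEQueue7=QB7333
UBEQueue8=QB7333
UBEQueue9=QB7333
UBEQueue10=QB7333
UBEQueue11=QB7333
UBEQueue12=QB7333
"""


def parse_queue_settings(text):
    """Return (declared entry count, Counter of entries per queue name)."""
    declared = None
    names = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "UBEQueues":
            declared = int(value)
        elif key.startswith("UBEQueue") and key[len("UBEQueue"):].isdigit():
            names.append(value)
    return declared, Counter(names)


declared, per_queue = parse_queue_settings(INI_TEXT)
listed = sum(per_queue.values())

print(f"declared entries: {declared}, listed entries: {listed}")
for name, count in per_queue.items():
    print(f"{name}: {count} entries")
```

A mismatch between the declared and listed counts, or a queue name that appears only once when several entries were intended, would be worth a closer look before adding the single-threaded queues.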

If it were a problem of a job holding the queues, then I would see something
in either S or P status, but this is not the case. Everything aside from the
RUNUBE jobs is sitting in W.
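
That check can be written down as a small Python sketch. The job list here is a hypothetical snapshot made up for illustration, not real output from the system; only the status codes W, S and P come from the description above.

```python
from collections import Counter


def summarize_statuses(jobs):
    """Count jobs per status code from (job_name, status) pairs."""
    return Counter(status for _, status in jobs)


def queue_held_by_job(jobs):
    """True if any job is in S or P status, i.e. something occupies a queue slot."""
    return any(status in {"S", "P"} for _, status in jobs)


# Hypothetical snapshot: every submitted job is sitting in W.
jobs = [("job-1", "W"), ("job-2", "W"), ("job-3", "W")]

print(dict(summarize_statuses(jobs)))
if queue_held_by_job(jobs):
    print("at least one job is in S or P; a job may be holding the queue")
else:
    print("nothing in S or P; the jobs are waiting without a job holding the queue")
```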


Steve Lehner



From: brother_of_kara <brother_of_karamazov@yahoo.com>
Date: 01/18/2002 12:29 AM
Please respond to: jdelist
To: Stephen Lehner/Westchester/IS/US/WestconGroup@WestconGrp
cc:
Subject: Re: jobs stuck in waiting status






--------------------------
 
Steve,
an issue similar to yours has been discussed here on the list.
Some UBEs would lock the rdaspec file and not release it until the UBE
was done. One of the jobs would go into S status and stay there for a
long time. There was a fix for that problem. I do not remember the ES or
SAR number.
Otherwise I agree with brother_of_kara: you should try to create queues
different from the default queue. There is documentation on the KG about
which reports should not run multithreaded.

Thanks, Gerd
 
Gerd:

Maybe the SAR number is 3696736 (RDA TAM tables locked)

I hope this helps.

Regards!!
 