|Posted on Tuesday, September 04, 2012 - 9:27 am: |
I am using WebMO Pro, 12.1.004p
Every job submitted terminates this way, for every user on the system except me.
This behavior has only been observed in the last week and I believe I have been excluded because I have had jobs continuously running for the last month. I recently deleted some old user accounts and added some new user accounts. The error message occurs for both the new users and continuing old users.
I'll attach the diagnose.html for the system, and see if stopping/re-starting the daemon will solve the problem.
Post Number: 295
|Posted on Tuesday, September 04, 2012 - 10:30 am: |
Please log in as admin. Go to the 'User Manager'. Pick a user who exhibits this problem, and click the 'edit' icon next to that user.
On the resulting 'Permissions' tab, there should be a field for 'Total time limit' and 'Job time limit'. (These may be greyed out, see below.) The former specifies the TOTAL aggregate amount of CPU time allocated to that user (should say N/A or -1 for unlimited; if it is 0, change it!). The latter specifies the time allowed for a given job (N/A or -1 for unlimited).
If those are greyed out, and 'Use group settings' is checked, then these settings are read from the corresponding user GROUP. To modify those, go to the 'Group Manager' (under admin). Those same time limit settings can be modified there as well. Since the issue affects MANY users, I suspect the problem lies there.
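To make the rules above concrete, here is a rough Python sketch of how the limits are described as behaving. This is only an illustration of the logic in this post, not WebMO's actual code: the function names and the `used` argument are made up for the example, and the limits are in whatever unit the admin fields use.

```python
from typing import Optional

UNLIMITED = -1  # the post above describes N/A or -1 as "unlimited"


def effective_limit(user_limit: Optional[int],
                    group_limit: int,
                    use_group_settings: bool) -> int:
    """Pick the limit that applies to a user (hypothetical helper).

    When 'Use group settings' is checked, or the user has no value of
    their own, the value comes from the user's group.
    """
    if use_group_settings or user_limit is None:
        return group_limit
    return user_limit


def job_allowed(limit: int, used: int = 0) -> bool:
    """Return True if a user with this limit may still run a job.

    -1 means unlimited; a limit of 0 leaves no time at all, which is
    why a 0 in the 'Total time limit' field should be changed.
    """
    if limit == UNLIMITED:
        return True
    return used < limit
```

So a group limit of 0 with 'Use group settings' checked blocks every member of that group, while -1 lets everyone through, which is why a problem affecting many users at once points at the group.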
|Posted on Tuesday, September 25, 2012 - 12:38 pm: |
I stopped/re-started the daemon and the problem went away. I did not pursue your solution because all of our user accounts are identical, but only one account was able to submit a job.
I suspect the issue is related to the daemon running while users are added/deleted. At least, that is the only way I can explain the behavior on my system. During the summer there was essentially always an active job running. After the addition/deletion of user accounts, the only user account that could successfully submit jobs was the account under which there were active jobs during the addition/deletion of user accounts.
Post Number: 308
|Posted on Tuesday, September 25, 2012 - 1:13 pm: |
Ahh, this could be the case. I will look into that hypothesis.
|Posted on Wednesday, June 04, 2014 - 8:02 am: |
I am currently using 14.0.006e, and have been using/administering WebMO E (and just WebMO at the beginning) for 10 years now.
This latest version does not respond when I disable a user ID, or set the Job time limit in the group settings to 0, or set the Job time limit in a user's account to 0. The disabled user can still submit jobs, and users from groups with a job time limit of 0 can still have jobs executed. I have huge classes and I need to be able to control CPU usage.
What might be wrong here?
Post Number: 400
|Posted on Wednesday, June 04, 2014 - 1:12 pm: |
This is likely related to adding/deleting a user while the daemon is running. This will be resolved in the next release. Stopping/starting the daemon should solve this issue.
|Posted on Friday, February 01, 2019 - 1:16 pm: |
We are currently using Version: 18.0.002e WebMO Enterprise. We are running it on an HPC cluster that runs SLURM. Our users have experienced some timeouts because the default queue on our cluster has a short default time limit. We are not sure how to increase the time limit so that the WebMO interface will work with our SLURM scheduler and users can run longer jobs.
Any suggestions? I am totally new to WebMO. Please forgive me if I am not using the correct terms for things.
Post Number: 639
|Posted on Monday, February 04, 2019 - 1:35 pm: |
It sounds like SLURM is enforcing a time limit set on the "default" queue. This would need to be increased by changing the setting within the SLURM config -- this is outside of the control of WebMO, and cannot be changed from within WebMO. Once the time limit is increased from SLURM, the jobs should finish.
Note that you may have other queues available in SLURM. You can add those alternative queues in the 'Batch Queue' manager and have students submit to these queues. Assuming those queues have a longer time limit, this should also solve the problem.
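To see which queues actually have a longer limit, your cluster admin can list the partitions and their time limits (for example with `sinfo -o "%P %l"`). SLURM prints limits in forms such as `30:00`, `2:00:00`, `1-00:00:00`, or `infinite`. Below is a small Python sketch, not part of WebMO, that converts those strings to seconds and picks out the queues long enough for a job. The function names and the example queue names are hypothetical.

```python
def slurm_time_to_seconds(text):
    """Convert a SLURM time-limit string to seconds (None = no limit).

    Handles: minutes, minutes:seconds, hours:minutes:seconds,
    days-hours, days-hours:minutes, days-hours:minutes:seconds.
    """
    text = text.strip()
    if text.lower() in ("infinite", "unlimited"):
        return None
    days = 0
    if "-" in text:
        # days-hours[:minutes[:seconds]]
        day_part, rest = text.split("-", 1)
        days = int(day_part)
        parts = [int(p) for p in rest.split(":")]
        parts += [0] * (3 - len(parts))
        hours, minutes, seconds = parts
    else:
        parts = [int(p) for p in text.split(":")]
        if len(parts) == 1:      # minutes
            hours, minutes, seconds = 0, parts[0], 0
        elif len(parts) == 2:    # minutes:seconds
            hours, minutes, seconds = 0, parts[0], parts[1]
        else:                    # hours:minutes:seconds
            hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def queues_that_fit(limits, needed_seconds):
    """Return the queue names whose limit covers needed_seconds.

    `limits` maps a queue name to its SLURM time-limit string.
    """
    fits = []
    for name, limit_text in limits.items():
        limit = slurm_time_to_seconds(limit_text)
        if limit is None or limit >= needed_seconds:
            fits.append(name)
    return fits
```

For example, `queues_that_fit({"default": "30:00", "long": "2-00:00:00"}, 6 * 3600)` would leave only the made-up `long` queue for a six-hour job; that is the queue to add in WebMO.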