Background & Summary
VASP has been added as an interface in WebMO and a small test job has been submitted. However, VASP appears to hang when it is executed on the assigned node of the HPC cluster that WebMO runs on.
The required POTCAR file is empty (an issue we will have to resolve later).
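For reference, a minimal sketch of how an empty POTCAR could be detected before a job is launched. The helper name `potcar_ok` is hypothetical and not part of WebMO or VASP; it relies only on the standard `-s` file test (file exists and has a size greater than zero).

```shell
#!/bin/sh
# Hypothetical helper (not part of WebMO or VASP):
# succeeds only if the named file exists and is non-empty.
potcar_ok() {
    [ -s "$1" ]
}

# Assumed usage from inside a job directory containing a file named POTCAR.
if potcar_ok POTCAR; then
    echo "POTCAR present and non-empty"
else
    echo "POTCAR missing or empty" >&2
fi
```

Run in the job directory shown below, this would take the second branch, since POTCAR has a size of 0 bytes.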
Behavior
These are the contents of run_log:
[webmo@pople-n005 18]$ cat run_log
Executing script: ./run_vasp.cgi
Creating working directory: /tmp/webmo-16380/18
Script execution node: pople-n005.cluster
Job execution node(s): pople-n005.cluster
Executing command: /home/support/apps/intel/15.0.6/impi/5.0.3.049/intel64/bin/mpirun -np 1 -machinefile /tmp/hvfk1MvL6t /home/users/webmo/vasp/vasp.5.4.4_pople01_parallel_complex_extended
This is the listing of the job directory:
-rw-r--r-- 1 webmo webmo 51 Jul 23 17:32 zmatrix
-rw-r--r-- 1 webmo webmo 201 Jul 23 17:32 input.xyz
-rw-r--r-- 1 webmo webmo 15 Jul 23 17:32 charges
drwxr-xr-x 19 webmo webmo 168 Jul 23 17:32 ..
-rw-r--r-- 1 webmo webmo 234 Jul 23 17:32 job_options
-rw-r--r-- 1 webmo webmo 184 Jul 23 17:32 summary
-rw-r--r-- 1 webmo webmo 0 Jul 23 17:32 notes
-rw-r--r-- 1 webmo webmo 221 Jul 23 17:32 input.poscar
-rw-r--r-- 1 webmo webmo 47 Jul 23 17:32 input.kpoints
-rw-r--r-- 1 webmo webmo 59 Jul 23 17:32 input.inp
lrwxrwxrwx 1 webmo webmo 53 Jul 23 17:32 POSCAR -> /usr/local/webmo/private/webmo/graeme/18/input.poscar
lrwxrwxrwx 1 webmo webmo 54 Jul 23 17:32 KPOINTS -> /usr/local/webmo/private/webmo/graeme/18/input.kpoints
lrwxrwxrwx 1 webmo webmo 50 Jul 23 17:32 INCAR -> /usr/local/webmo/private/webmo/graeme/18/input.inp
-rw-r--r-- 1 webmo webmo 0 Jul 23 17:32 POTCAR
-rw-r--r-- 1 webmo webmo 1.9K Jul 23 17:32 pbs_script.sh
-rw-r--r-- 1 webmo webmo 0 Jul 23 17:32 pbs_stdout
-rw-r--r-- 1 webmo webmo 353 Jul 23 17:32 run_log
-rw-r--r-- 1 webmo webmo 0 Jul 23 17:32 output.out.stdout
lrwxrwxrwx 1 webmo webmo 58 Jul 23 17:32 output.out -> /usr/local/webmo/private/webmo/graeme/18/output.out.stdout
-rw-r--r-- 1 webmo webmo 96 Jul 23 17:33 output.out.stderr
-rw-r--r-- 1 webmo webmo 255 Jul 24 15:07 pbs_stderr
-rw-rw-r-- 1 webmo webmo 0 Jul 24 15:08 CHGCAR
-rw-rw-r-- 1 webmo webmo 0 Jul 24 15:08 WAVECAR
-rw-rw-r-- 1 webmo webmo 0 Jul 24 15:08 EIGENVAL
-rw-rw-r-- 1 webmo webmo 0 Jul 24 15:08 CONTCAR
-rw-rw-r-- 1 webmo webmo 0 Jul 24 15:08 DOSCAR
-rw-rw-r-- 1 webmo webmo 0 Jul 24 15:08 OSZICAR
-rw-rw-r-- 1 webmo webmo 0 Jul 24 15:08 PCDAT
-rw-rw-r-- 1 webmo webmo 0 Jul 24 15:08 XDATCAR
-rw-rw-r-- 1 webmo webmo 0 Jul 24 15:08 REPORT
-rw-rw-r-- 1 webmo webmo 0 Jul 24 15:08 CHG
drwxr-xr-x 2 webmo webmo 4.0K Jul 24 15:08 .
-rw-rw-r-- 1 webmo webmo 401 Jul 24 15:14 OUTCAR
-rw-rw-r-- 1 webmo webmo 746 Jul 24 15:14 vasprun.xml
These are the relevant processes running on the node in the cluster:
$ ps auxww | grep webmo
webmo 5929 0.0 0.0 23928 1596 ? S Jul23 0:00 /bin/sh /var/spool/slurmd/job21586/slurm_script
webmo 6094 0.0 0.0 34300 5188 ? S Jul23 0:00 /usr/bin/perl ./run_vasp.cgi 18 graeme compute
webmo 6097 0.0 0.0 23936 1556 ? S Jul23 0:00 /bin/sh /home/support/apps/intel/15.0.6/impi/5.0.3.049/intel64/bin/mpirun -np 1 -machinefile /tmp/hvfk1MvL6t /home/users/webmo/vasp/vasp.5.4.4_pople01_parallel_complex_extended
webmo 6103 0.0 0.0 20300 1544 ? S Jul23 0:00 mpiexec.hydra -np 1 -machinefile /tmp/hvfk1MvL6t /home/users/webmo/vasp/vasp.5.4.4_pople01_parallel_complex_extended
webmo 6104 0.0 0.0 0 0 ? Z Jul23 0:00 [srun] <defunct>
root 11664 0.0 0.0 103320 868 pts/1 S+ 15:07 0:00 grep --color=auto webmo
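The `Z` state and `<defunct>` tag on the srun entry mark it as a zombie process. As a rough sketch, such entries could be picked out together with their parent PIDs, which `ps auxww` does not show. The function name `list_zombies` is hypothetical; it assumes input in the column order produced by `ps -eo pid,ppid,stat,comm`.

```shell
#!/bin/sh
# Hypothetical filter: reads `ps -eo pid,ppid,stat,comm` style input on stdin,
# skips the header line, and prints "PID PPID COMMAND" for every process
# whose state column starts with Z (zombie).
list_zombies() {
    awk 'NR > 1 && $3 ~ /^Z/ { print $1, $2, $4 }'
}

# Assumed usage on the compute node:
#   ps -eo pid,ppid,stat,comm | list_zombies
```

The parent PID column shows which process has not yet reaped the defunct child.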
Expected behavior
The job should fail outright rather than hang: the POTCAR file is empty, so VASP should abort. However, the job launched by WebMO never gets that far and hangs indefinitely.
Running the same mpirun command by hand, both through bash and directly, produces the expected error:
[webmo@pople-n005 18]$ bash /home/support/apps/intel/15.0.6/impi/5.0.3.049/intel64/bin/mpirun -np 1 -machinefile /tmp/hvfk1MvL6t /home/users/webmo/vasp/vasp.5.4.4_pople01_parallel_complex_extended
running on 1 total cores
distrk: each k-point on 1 cores, 1 groups
distr: one band on 1 cores, 1 groups
using from now: INCAR
vasp.5.4.4.18Apr17-6-g9f103f2a35 (build Nov 07 2019 11:29:47) complex
POSCAR found : 1 types and 2 ions
scaLAPACK will be used
ERROR: number of potentials on File POTCAR incompatible with number of species
INCAR : 1 POTCAR: 0
[webmo@pople-n005 18]$ /home/support/apps/intel/15.0.6/impi/5.0.3.049/intel64/bin/mpirun -np 1 -machinefile /tmp/hvfk1MvL6t /home/users/webmo/vasp/vasp.5.4.4_pople01_parallel_complex_extended
running on 1 total cores
distrk: each k-point on 1 cores, 1 groups
distr: one band on 1 cores, 1 groups
using from now: INCAR
vasp.5.4.4.18Apr17-6-g9f103f2a35 (build Nov 07 2019 11:29:47) complex
POSCAR found : 1 types and 2 ions
scaLAPACK will be used
ERROR: number of potentials on File POTCAR incompatible with number of species
INCAR : 1 POTCAR: 0
Any pointers would be very much appreciated.
Many thanks in advance.
Sean