= 2009-10-15 =

 * [http://www.summit09.ca/ Summit'09] - OGF 27

== Torque ==

 * Write a PBS script
{{{
jazz@bio037:~$ cat myscript
#!/bin/bash
### Job name
#PBS -N mytest
### Output files
#PBS -e /home/jazz/mytest.err
#PBS -o /home/jazz/mytest.log
###================================================
# Show directory and time information
echo Working directory is $PBS_O_WORKDIR
cd $PBS_O_WORKDIR
echo Running on host `hostname`
echo Time is `date`
echo Directory is `pwd`
# Run the command
date
}}}
 * Submit the job
{{{
jazz@bio037:~$ qsub < myscript
30.bio037
}}}
 * Trace the job's execution
{{{
jazz@bio037:~$ tracejob 30
/var/spool/torque/mom_logs/20091015: No matching job records located

Job: 30.bio037

10/15/2009 00:38:59 S enqueuing into batch, state 1 hop 1
10/15/2009 00:38:59 S Job Queued at request of jazz@bio037, owner = jazz@bio037, job name = mytest, queue = batch
10/15/2009 00:38:59 S Job Modified at request of Scheduler@bio037
10/15/2009 00:38:59 S Exit_status=0 resources_used.cput=00:00:00 resources_used.mem=0kb resources_used.vmem=0kb resources_used.walltime=00:00:00
10/15/2009 00:38:59 L Job Run
10/15/2009 00:38:59 S Job Run at request of Scheduler@bio037
10/15/2009 00:38:59 A queue=batch
10/15/2009 00:38:59 A user=jazz group=jazz jobname=mytest queue=batch ctime=1255538339 qtime=1255538339 etime=1255538339 start=1255538339 owner=jazz@bio037 exec_host=bio013/0 Resource_List.neednodes=1 Resource_List.nodect=1 Resource_List.nodes=1
10/15/2009 00:38:59 A user=jazz group=jazz jobname=mytest queue=batch ctime=1255538339 qtime=1255538339 etime=1255538339 start=1255538339 owner=jazz@bio037 exec_host=bio013/0 Resource_List.neednodes=1 Resource_List.nodect=1 Resource_List.nodes=1 session=5366 end=1255538339 Exit_status=0 resources_used.cput=00:00:00 resources_used.mem=0kb resources_used.vmem=0kb resources_used.walltime=00:00:00
10/15/2009 00:39:07 S Post job file processing error
10/15/2009 00:39:07 S dequeuing from batch, state COMPLETE
}}}
 * For any job, the job ID can be used to look up which hosts ran it; they are listed in the exec_host field
{{{
etime=1255538339 start=1255538339 owner=jazz@bio037 exec_host=bio013/0
}}}
 * The error message at the top of the tracejob output shows that every job a pbs_mom has run is recorded in its /var/spool/torque/mom_logs/<date> file (date written as YYYYMMDD)
{{{
/var/spool/torque/mom_logs/20091015
}}}
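
 * The exec_host lookup and the per-day log path described above can be sketched in shell. This is only a minimal sketch: `extract_exec_host` and `mom_log_path` are hypothetical helper names, not Torque commands, the sample line is abbreviated from the tracejob output above, and the /var/spool/torque prefix is assumed to match this installation.

```shell
#!/bin/bash
# extract_exec_host (hypothetical helper, not part of Torque):
# reads tracejob-style text on stdin and prints each distinct
# exec_host value once.
extract_exec_host() {
    grep -o 'exec_host=[^ ]*' | cut -d= -f2 | sort -u
}

# mom_log_path (hypothetical helper): builds the per-day pbs_mom log
# path from a YYYYMMDD date, assuming the /var/spool/torque prefix
# seen in the notes above.
mom_log_path() {
    echo "/var/spool/torque/mom_logs/$1"
}

# Sample accounting line, abbreviated from the tracejob output above.
sample='10/15/2009 00:38:59 A user=jazz jobname=mytest queue=batch owner=jazz@bio037 exec_host=bio013/0 Resource_List.nodes=1'

echo "$sample" | extract_exec_host
mom_log_path 20091015
```

 On a live server the same filter could presumably be fed directly, e.g. `tracejob 30 | extract_exec_host`, and the resulting path would then be checked on the host named in exec_host rather than on the submit host.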