Changes between Initial Version and Version 1 of chwhs/DataChallenge


Timestamp:
Jul 1, 2008, 2:59:59 PM
Author:
chwhs
Comment:

--

  • chwhs/DataChallenge

    v1 v1  
     1
     2== '''''Data Challenge''''' ==
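     3 * (Goal?): First devise a more favorable mechanism and, through design, adjust how Torque, GXP, and Gfarm operate, so that data movement is minimized and jobs run on the best available resources. In addition, when a resource fails, its work should be handed to another resource as quickly as possible to continue where it left off, rather than re-running the job, so that jobs complete seamlessly and finish sooner.
     4 * Simple test examples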
     5   * '''$ gxpc e --help'''
     6{{{
     7Usage:
     8  gxpc e  [OPTION ...] CMD
     9  gxpc mw [OPTION ...] CMD
     10  gxpc ep [OPTION ...] file
     11
     12Description:
     13  Execute the command on the selected nodes.
     14
     15Option (for mw only):
     16  --master 'command'
     17    equivalent to e --updown '3:4:command' ...
     18  if --master is not given, it is equivalent to e --updown 3:4 ...
     19
     20Options (for e, mw, and ep):
     21  --withmask,-m MASK
     22    execute on a set of nodes saved by savemask or pushmask
     23  --withhostmask,-h HOSTMASK
     24    execute on a set of nodes whose names match regexp HOSTMASK
     25  --withhostnegmask,-H HOSTMASK
     26    execute on a set of nodes whose names do not match regexp HOSTMASK
     27  --up FD0[:FD1]
     28    collect output from FD0 of CMD, and output them to FD1 of gxpc.
     29    if :FD1 is omitted, it is treated as if FD1 == FD0
     30  --down FD0[:FD1]
     31    broadcast input to FD0 of gxpc to FD1 of CMD.
     32    if :FD1 is omitted, it is treated as if FD1 == FD0
     33  --updown FD1:FD2[:MASTER]
     34    if :MASTER is omitted, collect output from FD1 of CMD,
     35    and broadcast them to FD2 of CMD.
     36    if :MASTER is given, run MASTER on the local host, collect
     37    output from FD1 of CMD, feed them to stdin of the MASTER.
     38    broadcast stdout of the MASTER to FD1 of CMD.
     39  --pty
     40    assign pseudo tty for stdin/stdout/stderr of CMD
     41
     42By default,
     43
     44- stdin of gxpc are broadcast to stdin of CMD
     45- stdout of CMD are output to stdout of gxpc
     46- stderr of CMD are output to stderr of gxpc
     47
     48This is as if `--down 0 --up 1 --up 2' are specified.  In this
     49case, stdout/stderr are block-buffered by default.  You may need
     50to do setbuf in your program or flush stdout/err, to display
     51CMD's output without delay.  --pty overrides this and turns them
     52to line-buffered (by default).  Both stdout/err of CMD now go to
     53stdout of gxpc (they are merged).  CMD's stdout/err should appear
     54as soon as they are newlined.
     55
     56See Also:
     57  smask savemask pushmask rmask restoremask popmask
     58}}}
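The buffering note in the help text above means that a long-running CMD launched through `gxpc e` without `--pty` may show nothing for a while. A minimal Python sketch of a worker that flushes each line explicitly; the names `format_progress` and `report_progress` and the message text are hypothetical and not part of GXP:

```python
import sys


def format_progress(i, steps):
    """Build the one-line status message for step i of steps."""
    return f"step {i}/{steps} done"


def report_progress(steps, out=None):
    """Write one line per step and flush after each one.

    When stdout is a pipe it is block-buffered by default, so without
    the explicit flush the lines may only appear once the buffer fills
    or the process exits.
    """
    out = sys.stdout if out is None else out
    for i in range(1, steps + 1):
        out.write(format_progress(i, steps) + "\n")
        out.flush()  # push this line out now instead of waiting for the buffer


if __name__ == "__main__":
    report_progress(3)
```

Starting the interpreter with `python -u` is another way to get unbuffered stdout without changing the program.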
     59   * $ '''gxpc use --help'''
     60{{{
     61Usage:
     62  gxpc use          [--as USER] RSH_NAME SRC [TARGET]
     63  gxpc use --delete [--as USER] RSH_NAME SRC [TARGET]
     64  gxpc use
     65  gxpc use --delete [idx]
     66
     67Description:
     68
     69  Configure rsh-like commands used to login targets matching a
     70particular pattern from hosts matching a particular pattern. The
     71typical usage is `gxpc use RSH_NAME SRC TARGET', which says gxp can
     72use an rsh-like command RSH_NAME for SRC to login TARGET. gxpc
     73remembers these facts to decide which hosts should issue which
     74commands to login which hosts, when explore command is issued. See the
     75section of explore command and the tutorial section of the manual.
     76
     77Examples:
     78  gxpc use           ssh abc000.def.com pqr.xyz.ac.jp
     79  gxpc use           ssh abc000 pqr
     80  gxpc use           ssh abc
     81  gxpc use           rsh abc
     82  gxpc use --as taue ssh abc000 pqr
     83  gxpc use qrsh      abc
     84  gxpc use qrsh_host abc
     85  gxpc use sge       abc
     86  gxpc use torque    abc
     87
     88The first line says that, if gxpc is told to login pqr.xyz.ac.jp by
     89explore command, hosts named abc000.def.com can use `ssh' method to do
     90so.  How it translates into the actual ssh command line can be shown
     91by `show_explore' command (try `gxpc help show_explore') and can be
     92configured by `rsh' command (try `gxpc help rsh').
     93
     94SRC and TARGET are actually regular expressions, so the line like the
     95first one can often be written like the second one.  The first line
     96is equivalent to the second line as long as there is only one host
     97beginning with abc000 and there is only one target beginning with pqr.
     98In general, the specification:
     99
     100  gxpc use RSH_NAME SRC TARGET
     101
     102is read: if gxpc is told to login a target matching regular
     103expression TARGET, a host matching regular expression SRC can use
     104RSH_NAME to do so.
     105
     106Note that the effect of use command is NOT to specify which target
     107gxpc should login, but to specify HOW it can do so, if it is told
     108to. It is the role of explore command to specify which target hosts it
     109should login.
     110
     111If the TARGET argument is omitted as in the third line, it is
     112treated as if TARGET expression is SRC. That is, the third line
     113is equivalent to:
     114
     115  gxpc use ssh abc abc
     116
     117This is often useful to express that ssh login is possible
     118between hosts within a single cluster, which typically have a
     119common prefix in their host names. If the traditional rsh command
     120is allowed within a single cluster, the fourth line may be useful
     121too.
     122
     123If --as user option is given, login is issued using an explicit user
     124name. The fifth line says when gxp attempts to login pqr from abc000,
     125the explicit user name `taue' should be given. You do not need this as
     126long as the underlying rsh-like command will complement it by a
     127configuration file. e.g., ssh will read ~/.ssh/config to complement
     128user name used to login a particular host.
     129
     130qrsh_host uses command qrsh, with an explicit hostname argument
     131to login a particular host (i.e., qrsh -l hostname=...).  This is
     132useful in environments where direct ssh is discouraged or
     133disallowed and qrsh is preferred.
     134
     135qrsh also uses qrsh, but without an explicit hostname. The host
     136is selected by the scheduler. Therefore it does not make sense to
     137try to specify a particular hostname as TARGET.  Thus, the
     138effect of the line
     139
     140  gxpc use qrsh abc
     141
     142is that, if targets beginning with abc are given (upon explore command),
     143a host beginning with abc will issue qrsh, and get whichever host
     144is allocated by the scheduler.
     145
     146See Also:
     147  explore conf_explore
     148}}}
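The matching rule described in the help text above (a rule applies when SRC matches the issuing host and TARGET matches the host to log in to, with TARGET defaulting to SRC) can be sketched in Python. This is an illustration, not GXP's implementation: `UseRule`, `make_rule`, and `login_methods` are hypothetical names, and anchoring the patterns at the start of the host name with `re.match` is an assumption based on the "hosts beginning with abc" wording.

```python
import re
from typing import List, NamedTuple, Optional


class UseRule(NamedTuple):
    rsh_name: str  # e.g. "ssh", "rsh", "qrsh"
    src: str       # regular expression for the host issuing the login
    target: str    # regular expression for the host being logged in to


def make_rule(rsh_name: str, src: str, target: Optional[str] = None) -> UseRule:
    """Build a rule; an omitted TARGET is treated as if it were SRC."""
    return UseRule(rsh_name, src, src if target is None else target)


def login_methods(rules: List[UseRule], src_host: str, target_host: str) -> List[str]:
    """Return the RSH_NAMEs that src_host may use to log in to target_host.

    Assumption: patterns are matched from the start of the name.
    """
    return [
        r.rsh_name
        for r in rules
        if re.match(r.src, src_host) and re.match(r.target, target_host)
    ]


rules = [
    make_rule("ssh", "abc000", "pqr"),  # gxpc use ssh abc000 pqr
    make_rule("ssh", "abc"),            # gxpc use ssh abc  (same as: ssh abc abc)
]

print(login_methods(rules, "abc000.def.com", "pqr.xyz.ac.jp"))  # ['ssh']
print(login_methods(rules, "abc001", "abc002"))                 # ['ssh']
print(login_methods(rules, "abc001", "pqr.xyz.ac.jp"))          # []
```

The last call returns an empty list because no rule lets abc001 reach pqr: the rules only say how a login can be done, and only for the pairs they cover.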
     149   * $ '''gxpc explore --help'''
     150{{{
     151Usage:
     152  gxpc explore [OPTIONS] TARGET TARGET ...
     153
     154Description:
     155  Login target hosts specified by OPTIONS and TARGET.
     156
     157Options:
     158  --dry
     159    dryrun. only show target hosts
     160  --hostfile,-h HOSTS_FILE
     161    give known hosts by file
     162  --hostcmd HOSTS_CMD
     163    give known hosts by command output
     164  --targetfile,-t TARGETS_FILE
     165    give target hosts by file
     166  --targetcmd TARGETS_CMD
     167    give target hosts by command output
     168  --timeout SECONDS
     169    specify the time to wait for a remote host's response
     170    until gxp considers it dead
     171  --children_soft_limit N (>= 2)
     172    control the shape of the explore tree. if this value is N, gxpc
     173    tries to keep the number of children of a single host no more than N,
     174    unless it is absolutely necessary to reach requested nodes.
     175  --children_hard_limit N
     176    control the shape of the explore tree. if this value is N, gxpc
     177    keeps the number of children of a single host no more than N, in any event.
     178  --verbosity N (0 <= N <= 2)
     179    set verbosity level (the larger the more verbose)
     180  --set_default
     181    if you set this option, options specified in this explore become the default.
     182    for example, if you say --timeout 20.0 and --set_default, timeout is set to
     183    20.0 in subsequent explores, even if you do not specify --timeout.
     184  --reset_default
     185    reset the default values set by --set_default.
     186  --show_settings
     187    show effective explore options, considering those given by command line and
     188    those specified as default values.
     189
     190Execution of an explore command will conceptually consist of the
     191following three steps.
     192
     193(1) Known Hosts: Know names of existing hosts, either by
     194--hostfile, --hostcmd, or a default rule. These are called
     195'known hosts.' -h is an acronym of --hostfile.
     196
     197(2) Targets: Extract login targets from known hosts. They are
     198extracted by regular expressions given either by --targetfile,
     199--targetcmd, or directly by command line arguments. -t is
     200an acronym of --targetfile.
     201
     202(3) gxpc will attempt to login these targets according to the
     203rules specified by `use' commands.
     204
     205Known hosts are specified by a file using --hostfile option, or
     206by output of a command using --hostcmd. Formats of the two are
     207common and very simple. In the simplest format, a single line
     208contains a single hostname. For example,
     209
     210   hongo001
     211   hongo002
     212   hongo004
     213   hongo005
     214   hongo006
     215   hongo007
     216   hongo008
     217
     218is a valid HOSTS_FILE. If you specify a command that outputs
     219a list of hosts in the above format, the effect is the same
     220as giving a file having the list by --hostfile. For example,
     221
     222  --hostcmd 'for i in 1 2 4 5 6 7 8 ; do printf "hongo%03d\n" $i ; done'
     223
     224has the same effect as giving the above file to --hostfile.
     225
     226The format of a HOSTS_FILE is actually a so-called /etc/hosts
     227format, each line of which may contain several aliases of the
     228same host, as well as their IP address. gxpc simply regards them
     229as aliases of a single host, without giving any significance to
     230which columns they are in. Anything after `#' on each line is a
     231comment and ignored. Lines not containing any name, such as
     232empty lines, are also ignored.  The above simple format is
     233obviously a special case of this.
     234
     235It is sometimes convenient to specify /etc/hosts as an argument
     236to --hostfile or to specify `ypcat hosts' as an argument to
     237--hostcmd. As a matter of fact, if you do not specify any of
     238--hostfile, --hostcmd, --targetfile, and --targetcmd, it is
     239treated as if --hostfile /etc/hosts is given.
     240
     241Login targets are specified by a file using --targetfile option,
     242--targetcmd option, or by directly listing targets in the command
     243line. Their formats are common and only slightly different from
     244HOSTS_FILE.  The format of the list of targets in the command
     245line is as follows.
     246
     247   TARGET_REGEXP [N] TARGET_REGEXP [N] TARGET_REGEXP [N] ...
     248
     249where N is an integer and TARGET_REGEXP is any string that cannot
     250be parsed as an integer. That is, it is a list of regular
     251expressions, each item of which may optionally be followed by an
     252integer. The integer indicates how many logins should occur to
     253the target matching TARGET_REGEXP. The following is a valid
     254command line.
     255
     256  gxpc explore -h hosts_file hongo00
     257
     258which says you want to target all hosts beginning with hongo00,
     259among all hosts listed in hosts_file.  If, for example, you have
     260specified by `use' command that the local host can login these
     261hosts by ssh, you will reach hosts whose names begin with
     262hongo00.  If you instead say
     263
     264  gxpc explore -h hosts_file hongo00 2
     265
     266you will get two processes on each of these hosts.
     267
     268If you do not give any of --targetfile, --targetcmd, and command
     269line targets, it is treated as if a regular expression matching
     270any string is given as the command line target. That is, all
     271known hosts are targets.
     272
     273Format of TARGETS_FILE is simply a list of lines each of which
     274is like the list of arguments just explained above. Thus, the
     275following is a valid TARGETS_FILE.
     276
     277  hongo00 2
     278  chiba0
     279  istbs
     280  sheep
     281
     282which says you want to get two processes on each host beginning
     283with hongo00 and one process on each host beginning with chiba0,
     284istbs, or sheep. Just to illustrate the syntax, the same thing
     285can be alternatively written with different arrangement into
     286lines.
     287
     288  hongo00 2 chiba0
     289  istbs sheep
     290
     291Similar to hosts_file, you may instead specify a command line
     292producing the output conforming to the format of TARGETS_FILE.
     293
     294We have so far explained that target_regexp is matched against a
     295pool of known hosts to generate the actual list of targets.
     296There is an exception to this. If TARGET_REGEXP does not match
     297any host in the pool of known hosts, it is treated as if the
     298TARGET_REGEXP is itself a known host. Thus,
     299
     300  gxpc explore hongo000 hongo001
     301
     302will login hongo000 and hongo001, because neither hosts_file nor
     303hosts_cmd hosts are given so these expressions obviously won't
     304match any known host. Using this rule, you may have a file that
     305explicitly lists all hosts and solely use it to specify targets
     306without using separate HOSTS_FILE. For example, if you have a
     307long TARGETS_FILE called targets like:
     308
     309  abc000
     310  abc001
     311    ...
     312  abc099
     313  def000
     314  def001
     315    ...
     316  def049
     317  pqr000
     318  pqr001
     319    ...
     320  pqr149
     321
     322and say
     323
     324  gxpc explore -t targets
     325
     326you say you want to get these 300 targets using whatever methods
     327you specified by `use' commands.
     328
     329Unlike HOSTS_FILE, an empty line in TARGETS_FILE is treated as if
     330it is the end of file. By inserting an empty line, you can easily
     331let gxpc ignore the rest of the file. This rule is sometimes
     332convenient when targeting a small number of hosts within a
     333TARGETS_FILE.
     334
     335Here are some examples.
     336
     3371.
     338
     339  gxpc explore -h hosts_file chiba hongo
     340
     341Hosts beginning with chiba or hongo in hosts_file
     342become the targets.
     343
     3442.
     345
     346  gxpc explore -h hosts_file -t targets_file
     347
     348Hosts matching any regular expression in targets_file become
     349the targets.
     350
     3513.
     352
     353  gxpc explore -h hosts_file
     354
     355All hosts in hosts_file become the targets.  Equivalent to `gxpc
     356explore -h hosts_file .'  (`.' is a regular expression matching
     357any non-empty string).
     358
     3594.
     360
     361  gxpc explore -t targets_file
     362
     363All hosts in targets_file become the targets. This is similar to the
     364previous case, but the file format is different.  Note that in
     365this case, strings in targets_file won't be matched against
     366anything, so they should be literal target names.
     367
     3685.
     369
     370  gxpc explore chiba000 chiba001 chiba002 chiba003
     371
     372chiba000, chiba001, chiba002, and chiba003 become the targets.
     373
     3746.
     375
     376  gxpc explore chiba0
     377
     378Equivalent to `gxpc explore -h /etc/hosts chiba0'. That is, hosts
     379beginning with chiba0 in /etc/hosts become the targets. Useful
     380when you use a single cluster and all necessary hosts are listed
     381in that file.
     382
     3837.
     384
     385  gxpc explore
     386
     387Equivalent to `gxpc explore -h /etc/hosts' which is in turn
     388equivalent to `gxpc explore -h /etc/hosts .'  That is, all hosts
     389in /etc/hosts become the targets.  This will be rarely useful
     390because /etc/hosts typically includes hosts you don't want to
     391use.
     392}}}
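The target rules in the help text above (an optional integer after each TARGET_REGEXP, an empty line ending a TARGETS_FILE, and the fallback that treats an unmatched expression as a literal host name) can be sketched in Python. This is a rough illustration under stated assumptions, not GXP's code: the function names are hypothetical, a whitespace-only line is assumed to count as empty, and patterns are assumed to match from the start of the host name.

```python
import re
from typing import Dict, List, Tuple


def parse_targets(tokens: List[str]) -> List[Tuple[str, int]]:
    """Turn ['hongo00', '2', 'chiba0'] into [('hongo00', 2), ('chiba0', 1)]."""
    targets: List[Tuple[str, int]] = []
    for tok in tokens:
        try:
            n = int(tok)
        except ValueError:
            targets.append((tok, 1))  # not an integer: a TARGET_REGEXP, one login
            continue
        if not targets:
            raise ValueError("a count must follow a TARGET_REGEXP")
        regexp, _ = targets[-1]
        targets[-1] = (regexp, n)     # integer: login count for the previous regexp
    return targets


def parse_targets_file(text: str) -> List[Tuple[str, int]]:
    """Parse TARGETS_FILE text; an empty line is treated as end of file."""
    tokens: List[str] = []
    for line in text.splitlines():
        if not line.strip():
            break
        tokens.extend(line.split())
    return parse_targets(tokens)


def resolve(targets: List[Tuple[str, int]], known_hosts: List[str]) -> Dict[str, int]:
    """Map each target host to the number of processes requested on it."""
    result: Dict[str, int] = {}
    for regexp, n in targets:
        matched = [h for h in known_hosts if re.match(regexp, h)]
        if not matched:
            matched = [regexp]  # no known host matches: treat it as a host name
        for host in matched:
            result[host] = n
    return result


known = ["hongo001", "hongo002", "chiba000"]
text = "hongo00 2 chiba0\nistbs sheep\n\nignored\n"
targets = parse_targets_file(text)
print(targets)
print(resolve(targets, known))
```

With the sample input, `ignored` is never read because it follows the empty line, and `istbs` and `sheep` end up as literal targets because nothing in `known` matches them.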
     393   * Example within the chiba cluster (the machines are chiba000-157; chiba000-003 are used here):
     394{{{
     395$ dach005@chiba000:~$ gxpc e hostname (use gxpc e to run the command "hostname"; checked first, before logging in to any other hosts)
     396chiba000 
     397
     398$ dach005@chiba000:~$ gxpc use ssh chiba (declares that ssh can be used to log in to hosts in the chiba cluster, i.e. both the source hosts and the target hosts are in this cluster)
     399
     400$ dach005@chiba000:~$ gxpc use (shows the current list of login rules)
     4010 : use ssh chiba chiba
     402
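     403$ dach005@chiba000:~$ gxpc explore chiba00[[ 1-3 ]] (the explore command performs the actual login to the remote machines) (the spaces are shown only because of a markup syntax issue; omit them when typing the command)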
     404reached : chiba001
     405reached : chiba002
     406reached : chiba003
     407
     408$ dach005@chiba000:~$ gxpc e hostname (run it again to see the list of machines that are now reachable)
     409chiba000
     410chiba003
     411chiba002
     412chiba001
     413
     414$ dach005@chiba000:~$ qsub test_3.sh (with Torque alone, without GXP, the command can only be issued on the local machine)
     41564.chiba000.intrigger.nii.ac.jp
     416
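     417$ dach005@chiba000:~$ gxpc e qsub test_3.sh (combined with Torque, the command is issued on the local machine and the remote machines at once --> chiba000-003 are the nodes running GXP, not the nodes executing the Torque jobs)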
     41867.chiba000.intrigger.nii.ac.jp
     41968.chiba000.intrigger.nii.ac.jp
     42065.chiba000.intrigger.nii.ac.jp
     42166.chiba000.intrigger.nii.ac.jp
     422}}}
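The four job IDs printed by `gxpc e qsub test_3.sh` arrive in whatever order the nodes answer. A small Python sketch for sorting and summarizing them, assuming each line has the form "sequence number, dot, server name" as in the output above; `parse_job_id` is a hypothetical helper, and the sample text is copied from the session:

```python
from collections import Counter


def parse_job_id(line):
    """Split 'SEQ.SERVER' into (int(SEQ), SERVER)."""
    seq, server = line.strip().split(".", 1)
    return int(seq), server


# Output of `gxpc e qsub test_3.sh` from the session above.
output = """\
67.chiba000.intrigger.nii.ac.jp
68.chiba000.intrigger.nii.ac.jp
65.chiba000.intrigger.nii.ac.jp
66.chiba000.intrigger.nii.ac.jp
"""

jobs = sorted(parse_job_id(line) for line in output.splitlines() if line.strip())
servers = Counter(server for _, server in jobs)

print([seq for seq, _ in jobs])  # [65, 66, 67, 68]
print(dict(servers))             # {'chiba000.intrigger.nii.ac.jp': 4}
```

One job was submitted per reached node (chiba000-003), and all four IDs name the same server, chiba000, which is consistent with the note that these hosts are GXP nodes rather than Torque execution nodes.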