2010-05-04

  • CloudBurst test procedure
    • Using the Sample Results dataset
      jazz@hadoop:~$ wget "http://downloads.sourceforge.net/project/cloudburst-bio/cloudburst/CloudBurst-1.0.1/CloudBurst-1.0.1.tgz?use_mirror=nchc"
      jazz@hadoop:~$ tar zxvf CloudBurst-1.0.1.tgz
      jazz@hadoop:~$ cd CloudBurst-1.0.1
      jazz@hadoop:~/CloudBurst-1.0.1$ wget "http://downloads.sourceforge.net/project/cloudburst-bio/cloudburst-data/CloudBurst-sample-data/CloudBurst-small-sample.tgz?use_mirror=nchc"
      jazz@hadoop:~/CloudBurst-1.0.1$ tar zxvf CloudBurst-small-sample.tgz
      jazz@hadoop:~/CloudBurst-1.0.1$ hadoop fs -mkdir cloudburst
      jazz@hadoop:~/CloudBurst-1.0.1$ hadoop fs -put CloudBurst-small-sample/100k.br cloudburst/
      jazz@hadoop:~/CloudBurst-1.0.1$ hadoop fs -put CloudBurst-small-sample/s_suis.br cloudburst/
      jazz@hadoop:~/CloudBurst-1.0.1$ hadoop fs -lsr
      drwxr-xr-x   - jazz supergroup          0 2010-04-30 10:55 /user/jazz/cloudburst
      -rw-r--r--   2 jazz supergroup    4493593 2010-04-30 10:55 /user/jazz/cloudburst/100k.br
      -rw-r--r--   2 jazz supergroup     579773 2010-04-30 10:55 /user/jazz/cloudburst/s_suis.br
      jazz@hadoop:~/CloudBurst-1.0.1$ hadoop jar CloudBurst.jar cloudburst/s_suis.br cloudburst/100k.br results 36 3 0 1 240 48 24 24 128 16 >& cloudburst.err
      jazz@hadoop:~/CloudBurst-1.0.1$ tail -n 1 cloudburst.err
      Total Running time:  102.68
      jazz@hadoop:~/CloudBurst-1.0.1$ hadoop fs -get results .
      jazz@hadoop:~/CloudBurst-1.0.1$ java -jar PrintAlignments.jar results | sort -nk4 > 100k.3.txt
      Printing results
      
    • Execution time: about 1 min 13 sec
    • According to the official website, the final output (100k.3.txt) is in a format intended for the UCSC Genome Browser.
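
The two checks done by hand in the session above (reading the last line of cloudburst.err and sorting the alignments on column 4) could be wrapped in small helpers for repeated runs. The following is only a sketch: the function names `running_time` and `is_sorted_by_col4` are hypothetical and not part of CloudBurst, and it assumes the log contains a line starting with "Total Running time:" as shown in the session.

```shell
#!/bin/sh
# Hypothetical helpers for re-checking a CloudBurst run; names are illustrative only.

# Print the value from the last "Total Running time:" line of a log file.
# Prints nothing if no such line exists.
running_time() {
    awk -F': *' '/^Total Running time/ { t = $2 } END { if (t != "") print t }' "$1"
}

# Succeed only if the file is already ordered the way `sort -nk4` orders it,
# i.e. numerically on the key that starts at the fourth field.
is_sorted_by_col4() {
    sort -c -n -k4 "$1" 2>/dev/null
}
```

After sourcing these definitions, `running_time cloudburst.err` would print the figure from the log's last matching line, and `is_sorted_by_col4 100k.3.txt && echo sorted` would confirm that the output file kept the order produced by `sort -nk4`.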
Last modified on May 5, 2010, 2:41:38 AM