Version 4 (modified by jazz, 14 years ago)

--

CloudBurst

Installation and Test Procedure

  • Using the Sample Results dataset
    jazz@hadoop:~$ wget http://nchc.dl.sourceforge.net/project/cloudburst-bio/cloudburst/CloudBurst-1.1.0/CloudBurst-1.1.0.tgz
    jazz@hadoop:~$ tar zxvf CloudBurst-1.1.0.tgz
    jazz@hadoop:~$ cd CloudBurst-1.1.0
    jazz@hadoop:~/CloudBurst-1.1.0$ wget "http://nchc.dl.sourceforge.net/project/cloudburst-bio/cloudburst-data/CloudBurst-sample-data/CloudBurst-small-sample.tgz"
    jazz@hadoop:~/CloudBurst-1.1.0$ tar zxvf CloudBurst-small-sample.tgz
    jazz@hadoop:~/CloudBurst-1.1.0$ hadoop fs -mkdir cloudburst
    jazz@hadoop:~/CloudBurst-1.1.0$ hadoop fs -put CloudBurst-small-sample/100k.br cloudburst/
    jazz@hadoop:~/CloudBurst-1.1.0$ hadoop fs -put CloudBurst-small-sample/s_suis.br cloudburst/
    jazz@hadoop:~/CloudBurst-1.1.0$ hadoop fs -lsr
    drwxr-xr-x   - jazz supergroup          0 2010-04-30 10:55 /user/jazz/cloudburst
    -rw-r--r--   2 jazz supergroup    4493593 2010-04-30 10:55 /user/jazz/cloudburst/100k.br
    -rw-r--r--   2 jazz supergroup     579773 2010-04-30 10:55 /user/jazz/cloudburst/s_suis.br
    jazz@hadoop:~/CloudBurst-1.1.0$ hadoop jar CloudBurst.jar cloudburst/s_suis.br cloudburst/100k.br results 36 3 0 1 240 48 24 24 128 16 >& cloudburst.err
    jazz@hadoop:~/CloudBurst-1.1.0$ tail -n 1 cloudburst.err
    Total Running time:  102.68
    jazz@hadoop:~/CloudBurst-1.1.0$ hadoop fs -get results .
    jazz@hadoop:~/CloudBurst-1.1.0$ java -jar PrintAlignments.jar results | sort -nk4 > 100k.3.txt
    Printing results
    
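  • In the 20-node environment at hadoop.nchc.org.tw, the run takes about 1 minute 13 seconds.
  • According to the official website, the final output file (100k.3.txt) is in the format used by the UCSC Genome Browser.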
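
The two check steps in the session above (reading the last line of cloudburst.err and producing the sorted 100k.3.txt) can be wrapped in small helpers. This is only a sketch: the function names running_time and summarize_alignments are made up for illustration, the "Total Running time" value is assumed to be in seconds, and the alignment file is assumed to hold one alignment per line with a numeric fourth column, as the `sort -nk4` step suggests.

```shell
# Hypothetical helper: print the "Total Running time" value from the last
# line of a CloudBurst log as minutes and seconds.
# Assumption: the logged value is a number of seconds.
running_time() {
  tail -n 1 "$1" | awk -F: '{
    s = $2 + 0                      # text after the colon, as a number
    m = int(s / 60)
    printf "%d min %.2f s\n", m, s - 60 * m
  }'
}

# Hypothetical helper: count the lines of an alignment file and report
# whether it is already ordered the way "sort -nk4" would order it.
# Assumption: one alignment per line, numeric fourth column.
summarize_alignments() {
  echo "alignments: $(wc -l < "$1" | tr -d ' ')"
  if sort -c -n -k4 "$1" 2>/dev/null; then
    echo "sorted by column 4: yes"
  else
    echo "sorted by column 4: no"
  fi
}
```

With the log shown above, `running_time cloudburst.err` would print `1 min 42.68 s` under the seconds assumption, and `summarize_alignments 100k.3.txt` gives a quick sanity check before the file is loaded elsewhere.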

Reference