  • Devaraj Das visit
    • 09:30 - 10:30 Meeting with the center director
    • 11:00 - 12:30 Public Talk Session: "Introduction to Hadoop and Cloud Computing" @ 北群多媒體
    • 14:00 - 17:00 Hands-on Labs (1): "Basics of DFS commands + How to develop MapReduce program using Hadoop?" @ 北群多媒體
  • Hadoop Hands-on Labs (1)
    • Download hadoop-0.18.2
      $ cd ~
      $ wget
      $ tar zxvf hadoop-0.18.2.tar.gz
    • 1. Hadoop uses SSH for its internal connections, so an SSH key exchange is required
      ~$ ssh-keygen
      ~$ cp ~/.ssh/ ~/.ssh/authorized_keys
    • 2. The JAVA_HOME environment variable is required to run hadoop namenode
      $ echo "export JAVA_HOME=/usr/lib/jvm/java-6-sun" >> ~/.bash_profile
      $ cd ~/hadoop-0.18.2
    • 3. Edit conf/ (set HADOOP_HOME to your Hadoop installation directory)
      export JAVA_HOME=/usr/lib/jvm/java-6-sun
      export HADOOP_HOME=/home/jazz/hadoop-0.18.2/
      export HADOOP_CONF_DIR=$HADOOP_HOME/conf
    • 4. Edit conf/hadoop-site.xml and add the following settings inside the configuration section
          The name of the default file system. Either the literal string
          "local" or a host:port for NDFS.
          The host and port that the MapReduce job tracker runs at. If
          "local", then jobs are run in-process as a single map and
          reduce task.
    • 5. The two commands that start Hadoop
      ~/hadoop-0.18.2$ bin/hadoop namenode -format
      ~/hadoop-0.18.2$ bin/
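Step 4 above shows only the property descriptions. Those descriptions match the `fs.default.name` and `mapred.job.tracker` entries in Hadoop's default configuration, so the edited file might look roughly like the sketch below. It writes to a scratch directory rather than the real conf/ directory. The `localhost:9000` and `localhost:9001` values and the `conf_dir` variable are assumptions for a single-machine setup, not values taken from these notes.

```shell
# Sketch: write a minimal hadoop-site.xml into a scratch directory.
# The host:port values below are assumed placeholders for a single-node setup.
conf_dir=$(mktemp -d)

cat > "$conf_dir/hadoop-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- default file system: "local" or a host:port -->
  <property>
    <name>fs.default.name</name>
    <value>localhost:9000</value>
  </property>
  <!-- job tracker address: "local" or a host:port -->
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
EOF

echo "wrote $conf_dir/hadoop-site.xml"
```

After reviewing the generated file, the two property elements can be copied into the configuration section of the real conf/hadoop-site.xml.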
  • 6. Once startup finishes, the following three web pages are available
  • You can also put something onto HDFS and take a look
    ~/hadoop-0.18.2$ bin/hadoop dfs -put conf conf
    ~/hadoop-0.18.2$ bin/hadoop dfs -ls
    Found 1 items
    drwxr-xr-x   - jazz supergroup          0 2008-11-04 15:56 /user/jazz/conf
    ~/hadoop-0.18.2$ bin/hadoop jar /home/jazz/hadoop-0.18.2/hadoop-0.18.2-examples.jar wordcount
    ERROR: Wrong number of parameters: 0 instead of 2.
    wordcount [-m <maps>] [-r <reduces>] <input> <output>
    Generic options supported are
    -conf <configuration file>     specify an application configuration file
    -D <property=value>            use value for given property
    -fs <local|namenode:port>      specify a namenode
    -jt <local|jobtracker:port>    specify a job tracker
    -files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
    -libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
    -archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.
    The general command line syntax is
    bin/hadoop command [genericOptions] [commandOptions]
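The error above says that wordcount needs both an <input> and an <output> argument. As a minimal sketch, a small wrapper can check the argument count before calling Hadoop. `run_wordcount` and `DRY_RUN` are hypothetical names introduced here. `conf` is the directory uploaded earlier, and `output` is an assumed name for an HDFS directory that does not exist yet. The relative jar path assumes the command is run from ~/hadoop-0.18.2.

```shell
# Hypothetical wrapper around the wordcount example job.
# With DRY_RUN=1 it only prints the command instead of running it.
run_wordcount() {
  if [ "$#" -ne 2 ]; then
    echo "usage: run_wordcount <input> <output>" >&2
    return 1
  fi
  # Build the full command line as positional parameters
  set -- bin/hadoop jar hadoop-0.18.2-examples.jar wordcount "$1" "$2"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Show the command that would be run for the uploaded conf directory
DRY_RUN=1 run_wordcount conf output
```

Without DRY_RUN the same call submits the job. The result can then be inspected with `bin/hadoop dfs -ls output`.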