Building a Nutch deb package

Introduction

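  • This page describes how to build a deb package for Nutch.
  • It builds on the previous note, "Deb package build workflow" [which in turn follows Jazz's packaging method].
  • Motivation: installing Nutch by hand is tedious, and a mistake in a configuration file is hard to debug. Often you only discover after a Nutch run has finished that nothing was fetched at all, with no hint of where the problem lies. With a deb package, the user only needs a little extra configuration after installation to get started.
  • Goal: installing this package leaves Nutch fully installed, with its configuration files in place.
  • Ultimate goal: integrate three complex pieces of software: hadoop, nutch, and tomcat.
  • Future work: once packaging works, design a simple configuration workflow for nutch, similar to /opt/drbl/sbin/dcs.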

Notes from testing

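  • The target installation directory matters. Plan the file layout up front; otherwise many files have to be changed later and it is easy to get confused.
  • Generate your own gpg key beforehand; it is needed at the end, when the deb file is built (the gpa GUI is the easier way to create it). Afterwards, gpg --list-key |grep D/ shows the eight-character key ID, e.g. B35CE8C3.
  • Editing the files under conf in nutch-1.0 directly seems to trigger warnings, so it is better to leave everything in the original nutch-1.0 directory untouched and put such changes in debian/nutch.postinst instead.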
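The key lookup in the note above can be scripted. The sketch below assumes the older GnuPG 1.x listing format, where a public-key line looks like `pub   1024D/B35CE8C3 2009-05-12`. The helper name `extract_key_id` is made up for illustration, and newer GnuPG releases may print a different layout.

```shell
#!/bin/sh
# extract_key_id: read `gpg --list-keys` style output on stdin and print the
# short key ID of every "pub" line.
# Assumed line format: pub   1024D/XXXXXXXX <date>
extract_key_id() {
  sed -n 's|^pub  *[0-9]*[A-Za-z]/\([0-9A-Fa-f]\{8\}\) .*|\1|p'
}

# Usage with a real keyring (not run here):
#   gpg --list-keys | extract_key_id
```

The printed ID is the value passed to dpkg-buildpackage with the -k option, prefixed with 0x.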

1. Obtain and unpack nutch-1.0.tar.gz

2. In the nutch-1.0 directory, run dh_make -f ../nutch_1.0.tar.gz

3. Delete the template files inside debian/ (rm *.ex *.EX dirs, etc.), then edit rules and control
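The cleanup in step 3 can be wrapped in a small helper. This is only a sketch: the function name `clean_dh_make_templates` is hypothetical, and it assumes the files to delete are the `*.ex` / `*.EX` examples plus the `dirs` file that dh_make generates.

```shell
#!/bin/sh
# clean_dh_make_templates DIR: delete dh_make's example templates from DIR.
# Assumption: the templates are *.ex, *.EX and the generated "dirs" file.
clean_dh_make_templates() {
  dir=${1:-debian}
  [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
  # rm -f stays quiet when a pattern matches nothing
  rm -f "$dir"/*.ex "$dir"/*.EX "$dir"/dirs
}

# Usage, from inside nutch-1.0 after running dh_make:
#   clean_dh_make_templates debian
```

Files such as control, rules, changelog and copyright are left alone, so they can be edited as listed below.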

  • changelog
nutch (1.0-1) unstable; urgency=low

  * Initial release (Closes: #nnnn)  <nnnn is the bug number of your ITP>

 -- Wei-Yu Chen <waue0920@gmail.com>  Tue, 12 May 2009 11:15:51 +0800
  • compat

5
  • control

Source: nutch
Section: devel
Priority: extra
Maintainer: Wei-Yu Chen <waue0920@gmail.com>
Build-Depends: debhelper (>= 5)
Standards-Version: 3.7.2

Package: nutch
Architecture: any
Depends: ${shlibs:Depends}, ${misc:Depends}, sun-java6-jre, sun-java6-bin
Suggests: sun-java6-jdk, tomcat6
Description: Apache Nutch
   Apache Nutch is software that crawls and searches the internet.
  • copyright
This package was debianized by Wei-Yu Chen <waue0920@gmail.com> on
Tue, 12 May 2009 11:15:51 +0800.

It was downloaded from <url://example.com>

Upstream Author(s): 

    <put author's name and email here>
    <likewise for another author>

Copyright: 

    <Copyright (C) YYYY Name OfAuthor>
    <likewise for another author>

License:

    <Put the license of the package here indented by 4 spaces>

The Debian packaging is (C) 2009, Wei-Yu Chen <waue0920@gmail.com> and
is licensed under the GPL, see `/usr/share/common-licenses/GPL'.

# Please also look if there are files or directories which have a
# different copyright/license attached and list them here.
  • docs

CHANGES.txt
LICENSE.txt 
NOTICE.txt
README.txt
  • rules (note: every dh_xxxx line in this file must be indented with exactly one tab character)
#!/usr/bin/make -f
export DH_VERBOSE=0

all:

install:
	dh_testdir
	dh_testroot
	dh_install -Xlicense.txt
	dh_installdocs
	dh_installchangelogs
	#dh_installexamples
	dh_compress
	dh_fixperms
	dh_installdeb
	dh_link
	dh_gencontrol
	dh_md5sums
	dh_builddeb

clean:
	dh_clean

binary: install

build:
binary-arch:
binary-indep:
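
Because make rejects recipe lines that begin with spaces, it can help to check debian/rules before building. The helper below is a sketch (the name `find_space_indented` is made up here). It prints every line that begins with a space, prefixed with its line number, and returns non-zero if it finds any.

```shell
#!/bin/sh
# find_space_indented FILE: print "LINE_NUMBER:LINE" for every line of FILE
# that begins with a space instead of a tab.
# Returns 0 if none are found, 1 if some are found, 2 if FILE is unreadable.
find_space_indented() {
  [ -r "$1" ] || { echo "cannot read: $1" >&2; return 2; }
  if grep -n '^ ' "$1"; then
    return 1
  fi
  return 0
}

# Usage:
#   find_space_indented debian/rules || echo "fix the indentation first"
```

This is worth running after copying the rules file from a web page, since browsers tend to turn tabs into spaces.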

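4. Add the nutch.[install,docs,links,postinst,postrm,prerm] files

  • The NAME part of NAME.postinst, NAME.install, NAME.postrm, etc. must match the name of the deb package, e.g. nutch.install
  • The directories in nutch.install must map correctly: the source path is on the left, and the destination directory inside the package is on the right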
  • nutch.install
conf/*          etc/nutch
bin             opt/nutch
lib             opt/nutch
webapps         opt/nutch
tomcat          opt/nutch
plugins         opt/nutch
urls            opt/nutch
*.jar           opt/nutch
*.job           opt/nutch
*.xml           opt/nutch
default.properties      opt/nutch
  • nutch.postrm
#!/bin/sh

echo "$1"

if [ "$1" != remove ]
then
  exit 0
fi

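# Remove the hdfsadm account and the files left behind by Nutch/Hadoop.
# Nothing is deleted unless that account exists.
remove_hdfsadm_user() {
  if ! getent passwd hdfsadm >/dev/null; then
    echo "no account found: 'hdfsadm'."
  else
    userdel hdfsadm
    rm -rf /home/hdfsadm
    rm -rf /opt/nutch
    rm -rf /tmp/hadoop*
    rm -rf /tmp/hsperfdata*
  fi
}

remove_hdfsadm_user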
  • nutch.postinst
#!/bin/sh

echo "$1"

if [ "$1" != configure ]
then
  exit 0
fi

show_message() {
  echo "To get started quickly, do the following [in /opt/nutch/, with root privileges]:"
  echo "(1) Edit urls/urls.txt and list the URLs to crawl, one site per line."
  echo "(2) Run \"bin/nutch crawl urls -dir search -depth 4 -topN 50\" to crawl the web."
  echo "(3) Run \"tomcat/bin/startup.sh\" and open http://localhost:8080/ in a browser to check the result."
  echo "Enjoy!"
}
show_message
  • nutch.links
etc/nutch  opt/nutch/conf
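
The naming rule in this step can be checked mechanically. The sketch below reads the first Package field from debian/control and reports helper files whose prefix differs from it. The function name `check_helper_names` is hypothetical, and the list of suffixes is simply the set of file names mentioned in this step.

```shell
#!/bin/sh
# check_helper_names DIR: compare helper file names in DIR (a debian/ directory)
# with the first "Package:" field of DIR/control.
# Prints each mismatching file; returns 1 if any mismatch was found.
check_helper_names() {
  dir=$1
  pkg=$(sed -n 's/^Package:[[:space:]]*//p' "$dir/control" | head -n 1)
  [ -n "$pkg" ] || { echo "no Package field in $dir/control" >&2; return 2; }
  status=0
  for suffix in install docs links postinst postrm prerm; do
    for f in "$dir"/*."$suffix"; do
      [ -e "$f" ] || continue          # the glob matched nothing
      name=$(basename "$f" ".$suffix")
      if [ "$name" != "$pkg" ]; then
        echo "mismatch: $f (expected $pkg.$suffix)"
        status=1
      fi
    done
  done
  return $status
}

# Usage, from inside nutch-1.0:
#   check_helper_names debian
```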

5. Write a Makefile in the nutch-1.0 directory

  • This file is optional, but having it makes the build more convenient to run
  • Makefile
    VERSION = 0.19.1
    all: help
    deb:
    	@sudo dpkg-buildpackage -rfakeroot -ai386 -k0xB35CE8C3
    clean:
    	@sudo debian/rules clean
    source: 
    	@chmod a+x ./bin/*
    help:
    	@echo "Usage:"
    	@echo "make deb     - Build Debian Package."
    	@echo "make clean   - Clean up temporary Debian packaging files."
    	@echo "make source  - make the scripts in ./bin executable."
    	@echo "make help    - show Makefile options."
    	@echo " "
    	@echo "Example:"
    	@echo "$$ make source; make deb; make clean"
    

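6. Run sudo dpkg-buildpackage -rfakeroot -k0xB35CE8C3

  • If you wrote the Makefile, you can run sudo make deb instead
  • After a successful build, the nutch-1.0.deb file appears in the parent directory, and a new nutch directory appears under debian/. Its contents are exactly what was packed into the deb file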
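
When looking for the result, keep in mind that dpkg-buildpackage normally names binary packages `<package>_<version>_<architecture>.deb`. The sketch below derives that name from the first line of debian/changelog. The function name `expected_deb_name` is illustrative, and the architecture has to be supplied by the caller (i386 in the Makefile above).

```shell
#!/bin/sh
# expected_deb_name CHANGELOG ARCH: print the conventional .deb file name
# (<package>_<version>_<arch>.deb) for the first changelog entry.
# Assumes a header line such as "nutch (1.0-1) unstable; urgency=low"
# and a version without an epoch.
expected_deb_name() {
  head -n 1 "$1" |
    sed -n "s/^\([^ ]*\) (\([^)]*\)).*/\1_\2_$2.deb/p"
}

# Usage, from inside nutch-1.0 after the build:
#   ls ../"$(expected_deb_name debian/changelog i386)"
```

The package contents can then be listed with dpkg-deb -c and compared against the debian/nutch directory.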
Last modified on May 14, 2009, 8:49:49 AM