{{{ #!html
Building a Nutch deb package
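}}}
[[PageOutline]]
= Introduction =
 * This page describes how to build a deb package for Nutch.
 * It builds on the previous page, [wiki:waue/2009/0511 Deb packaging notes], which follows Jazz's packaging method.
 * Motivation: installing Nutch is tedious, and a mistake in a configuration file is hard to debug. Often you only learn after a Nutch run has finished that nothing was fetched at all, with no hint of where the problem lies. With a deb package, the user only needs to adjust a few settings after installation to get started.
 * Goal: once this package is installed, Nutch is installed and its configuration files are in place.
 * Ultimate goal: integrate three complex pieces of software: hadoop, nutch, and tomcat.
 * Future work: once packaging works, the next step is to design a simple configuration flow for nutch, similar to /opt/drbl/sbin/dcs.

= Test notes =
 * Which system directory the package installs into matters. Plan the file layout in advance; otherwise many files have to be changed later and it is easy to get confused.
 * Generate your own gpg key beforehand, because it is needed when the deb file is produced at the end (the GUI tool gpa is the more convenient way to do this). Afterwards, '''gpg --list-key |grep D/''' shows the eight-character key ID, for example B35CE8C3.
 * Editing the files under conf in nutch-1.0 directly seems to trigger warnings, so avoid touching any file that originally ships in the nutch-1.0 directory and put such changes in debian/nutch.postinst instead.

== 1. Locate and unpack nutch-1.0.tar.gz ==

== 2. In the nutch-1.0 directory, run ''' dh_make -f ../nutch_1.0.tar.gz ''' ==

== 3. In debian/, remove the template files ('''rm *.ex *.EX dir ''') and edit rules and control ==
 * changelog
{{{
nutch (1.0-1) unstable; urgency=low

  * Initial release (Closes: #nnnn)

 -- Wei-Yu Chen  Tue, 12 May 2009 11:15:51 +0800
}}}
 * compat
{{{
5
}}}
 * control
{{{
Source: nutch
Section: devel
Priority: extra
Maintainer: Wei-Yu Chen
Build-Depends: debhelper (>= 5)
Standards-Version: 3.7.2

Package: nutch
Architecture: any
Depends: ${shlibs:Depends}, ${misc:Depends}, sun-java6-jre, sun-java6-bin
Suggests: sun-java6-jdk, tomcat6
Description: Apache Nutch
 .
 Apache Nutch is software that crawls and searches the web.
}}}
 * copyright
{{{
This package was debianized by Wei-Yu Chen on
Tue, 12 May 2009 11:15:51 +0800.

It was downloaded from

Upstream Author(s):

Copyright:

License:

The Debian packaging is (C) 2009, Wei-Yu Chen and
is licensed under the GPL, see `/usr/share/common-licenses/GPL'.

# Please also look if there are files or directories which have a
# different copyright/license attached and list them here.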
}}}
 * docs
{{{
CHANGES.txt
LICENSE.txt
NOTICE.txt
README.txt
}}}
 * rules (note: every dh_xxxx command in this file must be preceded by exactly one tab character)
{{{
#!/usr/bin/make -f
export DH_VERBOSE=0

all:

install:
	dh_testdir
	dh_testroot
	dh_install -Xlicense.txt
	dh_installdocs
	dh_installchangelogs
	#dh_installexamples
	dh_compress
	dh_fixperms
	dh_installdeb
	dh_link
	dh_gencontrol
	dh_md5sums
	dh_builddeb

clean:
	dh_clean

binary: install

build:
binary-arch:
binary-indep:
}}}

== 4. Add the nutch.[install,docs,links,postinst, postrm, prerm] files ==
 * In NAME.postinst, NAME.install, NAME.postrm and so on, NAME must match the name of the deb package, for example nutch.install.
 * The directories listed in nutch.install must map to the intended install locations.
 * nutch.install
{{{
conf/* etc/nutch
bin opt/nutch
lib opt/nutch
webapps opt/nutch
tomcat opt/nutch
plugins opt/nutch
urls opt/nutch
*.jar opt/nutch
*.job opt/nutch
*.xml opt/nutch
default.properties opt/nutch
}}}
 * nutch.postrm
{{{
#!/bin/sh
echo "$1"
# Only clean up when the package is actually being removed.
if [ "$1" != remove ]
then
    exit 0
fi

setup_hdfsadm_user() {
    if ! getent passwd hdfsadm >/dev/null; then
        echo "no account found: 'hdfsadm'."
    else
        userdel hdfsadm
        rm -rf /home/hdfsadm
        rm -rf /opt/nutch
        rm -rf /tmp/hadoop*
        rm -rf /tmp/hsperfdata*
    fi
}

setup_hdfsadm_user
}}}
 * nutch.postinst
{{{
#!/bin/sh
echo "$1"
# Only print the quick-start message when the package is being configured.
if [ "$1" != configure ]
then
    exit 0
fi

show_message() {
    echo "You can get started quickly as follows [in /opt/nutch/ with root privilege]:"
    echo "(1) Edit urls/urls.txt and list the URLs to crawl, one site per line."
    echo "(2) Run \"bin/nutch crawl urls -dir search -depth 4 -topN 50\" to crawl the web."
    echo "(3) Run \" tomcat/bin/startup.sh \" and check the result in a browser at http://localhost:8080/"
    echo "Enjoy !"
}

show_message
}}}
 * nutch.links
{{{
etc/nutch opt/nutch/conf
}}}

== 5. Edit a Makefile in the nutch-1.0 directory ==
 * This file is optional, but having it makes the build more convenient to run.
 * Makefile
{{{
VERSION = 0.19.1
all: help
deb:
	@sudo dpkg-buildpackage -rfakeroot -ai386 -k0xB35CE8C3
clean:
	@sudo debian/rules clean
source:
	@chmod a+x ./bin/*
help:
	@echo "Usage:"
	@echo "make deb - Build Debian Package."
	@echo "make clean - Clean up Debian Package temporary files."
	@echo "make source - download source tarball from hadoop mirror site."
	@echo "make help - show Makefile options."
	@echo " "
	@echo "Example:"
	@echo "$$ make source; make deb; make clean"
}}}

== 6. Run ''' sudo dpkg-buildpackage -rfakeroot -k0xB35CE8C3''' ==
 * If you wrote the Makefile, you can run '''sudo make deb''' instead.
 * After a successful build, the nutch deb file (named along the lines of nutch_1.0-1_i386.deb) appears in the parent directory, and a nutch directory appears under debian/. The contents of that directory are exactly what gets packed into the deb file.
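
The rules file in step 3 only works if each dh_xxxx command starts with a tab, and that is easy to break when copying the file out of a wiki page. The following is a minimal sketch of a pre-build check, not part of the original procedure; the function name check_rules_tabs is hypothetical, and it only looks at lines whose command begins with dh_.

```shell
#!/bin/sh
# check_rules_tabs FILE  (hypothetical helper, not part of debhelper)
# Print, with line numbers, every line of FILE where a dh_* command is
# indented starting with a space instead of a tab. Returns 1 if any such
# line is found, 0 otherwise.
check_rules_tabs() {
    # '^ [[:space:]]*dh_' matches lines whose first character is a space
    # and whose first word starts with dh_.
    if grep -nE '^ [[:space:]]*dh_' "$1"; then
        echo "error: the lines above are indented with spaces; make requires a tab" >&2
        return 1
    fi
    return 0
}
```

Assuming the function is defined in the current shell, running `check_rules_tabs debian/rules` from the nutch-1.0 directory before step 6 would list any offending lines.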