= 2010-06-21 =

== Linux ==

 * [http://www.liamdelahunty.com/tips/linux_ls_awk_totals.php Linux - Sum using awk] - Sum file sizes with awk. The fifth column of `ls -l` is the size in bytes, so dividing the total by 1024 twice prints it in MB.
{{{
$ ls -lh php*; ls -l php* | awk '{ SUM += $5} END { print SUM/1024/1024 }'
-rw-rw-r--  1 admin    site13  1013 Aug 12 11:01 php_checkbox_check.php
-rw-rw-r--  1 admin    site13  8.9k Nov 11 23:15 php_contact_form.php
-rw-rw-r--  1 liamdela site13   560 Oct  1  2002 php_convert_url_to_link.php
-rw-r--r--  1 root     site13   948 Dec  7 21:56 php_gethostbyaddr.php
-rw-rw-r--  1 liamdela site13  2.1k Feb 25  2002 php_list_a_directory.php
-rw-rw-r--  1 liamdela site13  1.7k May 28  2003 php_multi_dimensional_arrays.php
-rw-rw-r--  1 admin    site13  2.9k Nov 14 13:15 php_prev_next_array.php
-rw-r--r--  1 admin    site13  1.9k Feb  3 00:55 php_uk_mailing_list.php
0.0194302
}}}
 * [http://ubuntuforums.org/showthread.php?t=424366 rsync resume downloads] - Use rsync to resume an interrupted file transfer. `--partial` keeps the partially transferred file so the next run can continue from it.
{{{
$ rsync --partial --progress --rsh=ssh localfile remotehost:directory/
}}}

== Hadoop ==

 * Today someone on the forum asked about Google keeping its search index in memory. Thinking it through in Hadoop and HBase terms, the data flow would look roughly like this:
{{{
#!graphviz
digraph cloud_stack {
  size="16,10";
  node[shape=box,width=2.0];
  "Crawlers" -> "HTML files stored in Google File System" -> "Map/Reduce Job to analyze keywords into index" -> "(search keyword, URLs) key-value pairs stored in BigTable";
}
}}}
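
 * The Map/Reduce step in the data flow above can be illustrated with a toy inverted index. This is only a rough sketch that runs in memory in plain Python. It does not use the Hadoop or HBase APIs, and the names `map_phase`, `reduce_phase`, `build_index` and the example.com pages are all made up for illustration. The resulting dictionary stands in for the (search keyword, URLs) key-value pairs that would be stored in the table.
{{{
#!python
from collections import defaultdict


def map_phase(url, html_text):
    """Map step: emit one (keyword, url) pair per distinct word on a page."""
    for word in set(html_text.lower().split()):
        yield word, url


def reduce_phase(pairs):
    """Reduce step: group the URLs by keyword to form the inverted index."""
    index = defaultdict(set)
    for keyword, url in pairs:
        index[keyword].add(url)
    # Sort the URLs so the output is deterministic.
    return {keyword: sorted(urls) for keyword, urls in index.items()}


def build_index(pages):
    """Run the map step over every crawled page, then reduce the result."""
    pairs = []
    for url, text in pages.items():
        pairs.extend(map_phase(url, text))
    return reduce_phase(pairs)


# Hypothetical crawled pages: URL -> extracted text.
pages = {
    "http://example.com/a": "hadoop stores files",
    "http://example.com/b": "hbase stores tables on hadoop",
}
index = build_index(pages)
print(index["hadoop"])
}}}
 In a real cluster the map and reduce functions would run in parallel on many nodes, and the framework would do the grouping by key between the two steps. Here a single dictionary plays that role.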