Timestamp: Jun 21, 2010, 6:20:09 PM
Author: jazz
v4 | v5
1 | 1 | = 2010-06-21 = |
2 | | |
3 | | {{{ |
4 | | #!graphviz |
5 | | digraph cloud_stack { size="16,10"; node[shape=box,width=2.0]; |
6 | | "Crawlers" -> "HTML files stored in Google File System" -> "Map/Reduce job to analyze keywords into an index" -> "(search keyword, URLs) key-value pairs stored in BigTable";
7 | | } |
8 | | }}} |
9 | 2 | |
10 | 3 | == Linux == |
… |
… |
|
31 | 24 | $ rsync --partial --progress --rsh=ssh localfile remotehost:directory/ |
32 | 25 | }}} |
| 26 | |
| 27 | == Hadoop == |
| 28 | |
| 29 | * Today on the forum someone asked: if Google's query index is kept in memory, what would the data flow look like? Reasoned through with Hadoop and HBase, it is roughly this:
| 30 | {{{ |
| 31 | #!graphviz |
| 32 | digraph cloud_stack { size="16,10"; node[shape=box,width=2.0]; |
| 33 | "Crawlers" -> "HTML files stored in Google File System" -> "Map/Reduce job to analyze keywords into an index" -> "(search keyword, URLs) key-value pairs stored in BigTable";
| 34 | } |
| 35 | }}} |
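The diagram above compresses the whole pipeline into four stages. As a rough illustration of the indexing stage only — a hypothetical sketch in plain Python, not an actual Hadoop MapReduce job, with made-up URLs and page texts — the map and reduce steps that turn crawled pages into (search keyword, URLs) key-value pairs could look like this:

```python
# Hypothetical sketch of the Map/Reduce indexing stage from the diagram:
# map each crawled page to (keyword, url) pairs, then reduce them into
# keyword -> sorted list of URLs, as they might be stored in BigTable/HBase.
from collections import defaultdict

def map_phase(url, html_text):
    """Mapper: emit a (keyword, url) pair for every word on the page."""
    for word in html_text.lower().split():
        yield (word, url)

def reduce_phase(pairs):
    """Reducer: group URLs by keyword into the final key-value pairs."""
    index = defaultdict(set)
    for keyword, url in pairs:
        index[keyword].add(url)
    return {keyword: sorted(urls) for keyword, urls in index.items()}

# Made-up crawl results standing in for "HTML files stored in GFS".
crawled = {
    "http://a.example": "hadoop stores data",
    "http://b.example": "hbase stores key value data",
}
pairs = [p for url, text in crawled.items() for p in map_phase(url, text)]
index = reduce_phase(pairs)
print(index["stores"])  # → ['http://a.example', 'http://b.example']
```

In a real deployment the mapper and reducer would run as separate Hadoop tasks over files in HDFS, and the reducer output would be written as rows into HBase rather than held in a Python dict; serving queries from memory would then be a matter of HBase's block cache, not application code.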