= 2011-10-31 =

== Crawlzilla ==

 * I recently came across Google+ Ripples. It is the kind of analysis I have long thought Crawlzilla could do: finding the relationships between articles.
 * [[Image(http://sociable.co/wp-content/uploads/2011/10/google-ripples.png)]]

== Hadoop ==

 * The relationship between Hadoop and MySQL: Hadoop takes in unstructured data through HDFS, processes it with MapReduce, and writes the resulting structured data to MySQL.
{{{
#!graphviz
digraph g {
  rankdir=LR;
  subgraph cluster_1 {
    node [shape=box,width=1.0,style=filled,color=red];
    label = "Hadoop\nData Processing Framework";
    "HDFS"->"MapReduce";
    color=blue;
  }
  subgraph cluster_2 {
    node [shape=box,width=1.0,style=filled,color=green];
    label = "MySQL\nDatabase";
    "Structured\nData";
    color=red;
  }
  node [shape=box,width=1.0,style=filled,color=gray];
  "Unstructured\nData" -> "HDFS";
  "MapReduce" -> "Structured\nData";
}
}}}
 * The relationship between Hadoop and HBase: HBase is a distributed datastore that keeps its structured data on HDFS, so the MapReduce output stays inside the Hadoop stack.
{{{
#!graphviz
digraph g {
  rankdir=LR;
  subgraph cluster_2 {
    node [shape=box,width=1.0,style=filled,color=green];
    label = "HBase\nDistributed Datastore";
    color=red;
    subgraph cluster_3 {
      node [shape=box,width=1.0,style=filled];
      label = "HDFS";
      "Structured\nData";
      node [shape=box,width=1.0,style=filled,color=red];
      "HDFS";
    }
  }
  subgraph cluster_1 {
    node [shape=box,width=1.0,style=filled,color=red];
    label = "Hadoop\nData Processing Framework";
    "HDFS"->"MapReduce";
    color=blue;
  }
  node [shape=box,width=1.0,style=filled,color=gray];
  "Unstructured\nData" -> "HDFS";
  "MapReduce" -> "Structured\nData";
}
}}}
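
The Hadoop-to-MySQL flow in the first diagram (unstructured data, then MapReduce, then structured rows in a database) can be sketched in plain Python. This is only an illustration under stated assumptions, not Hadoop code: the sample log lines, the `hits` table, and every function name are hypothetical, the map/shuffle/reduce steps run in a single process, and SQLite stands in for MySQL so the sketch needs nothing beyond the standard library.

```python
import sqlite3
from collections import defaultdict

# Hypothetical unstructured input: raw text lines as they might sit in HDFS.
RAW_LINES = [
    "GET /index.html 200",
    "GET /about.html 200",
    "GET /index.html 404",
    "malformed line",
]


def map_phase(line):
    """Emit (path, 1) for each line that looks like a request; skip the rest."""
    parts = line.split()
    if len(parts) == 3 and parts[0] == "GET":
        yield parts[1], 1


def shuffle(pairs):
    """Group values by key, as a MapReduce framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single structured row."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in-process and return rows sorted by key."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return sorted(reduce_phase(k, v) for k, v in shuffle(pairs).items())


def load_rows(rows):
    """Load reduced rows into a relational table (SQLite stands in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hits (path TEXT PRIMARY KEY, count INTEGER)")
    conn.executemany("INSERT INTO hits VALUES (?, ?)", rows)
    conn.commit()
    return conn


rows = run_job(RAW_LINES)
conn = load_rows(rows)
```

Once the rows are loaded, the result can be queried like any other relational table, for example `conn.execute("SELECT path, count FROM hits ORDER BY count DESC")`. In the HBase variant from the second diagram, the load step would write to a table stored on HDFS instead of to a separate database server.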