Changes between Initial Version and Version 1 of jazz/10-10-20

Timestamp: Oct 20, 2010, 7:54:19 PM
Author: jazz
Comment: --

jazz/10-10-20

                       v1
+= 2010-10-20 =
+== Hadoop ==
+ * http://github.com/tomwhite/hadoop-book/ - example code for Tom White's "Hadoop: The Definitive Guide"
+ * [http://www.sys-con.com/node/1573226/print NoHadoop: Big Data Requires Not Only Hadoop]
+   * This article covers a number of Hadoop alternatives, each aimed at a different use case:
+     * [http://cloudscale.com CloudScale] - for realtime data warehouse requirements
+       * !CloudScale calls this Redoop, per [http://www.sys-con.com/node/1572508/print Hadoop and Realtime Cloud Computing] (2010-10-15)
+     * MPI, BSP - for supercomputing requirements
+     * Pregel (architecture only, no code, Google) - for running graph algorithms at large scale (graph computing requirements)
+     * Percolator (architecture only, no code, Google) - for when you must incrementally update the analytics on a massive data set continuously
+ * It looks like a lot happened at this year's Hadoop World NYC.
+ * First: Oracle and Hadoop integration - a new project called [http://www.questsoftware.in/Ora-Oop/ OraOop]
+  * [http://www.ctoedge.com/content/helping-oracle-get-along-hadoop Helping Oracle Get Along with Hadoop]
+  * [[Image(http://img.itbe.com/ctoedge/quest5.gif)]]
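+ * Second: Membase and Hadoop integration - could this make Hadoop more real-time? It is a lot like the approach I first had in mind (e.g. combining Hadoop with ActiveMQ). Come to think of it, memcache is not a bad choice either; it might even work for passing parameters and holding global variables.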
+   * [http://www.smartbrief.com/news/aaaa/industryMW-detail.jsp?id=838FF0EC-C664-4F16-9F77-F61CA011BB57 Membase-Cloudera Integration Joins Leading Hadoop Distribution and Real-Time NoSQL Database]
+   * [http://blog.membase.com/membase-cloudera-integration Membase and Cloudera Integration]
+   * [http://www.h-online.com/open/news/item/Bi-directional-connection-for-Membase-and-Cloudera-Hadoop-1106525.html Bi-directional connection for Membase and Cloudera Hadoop]
+{{{
+The first consists of a Membase NodeCode module that streams data from Membase to CDH in real-time,
+while the second consists of a Sqoop-derived batch loader utility that allows for the loading of data
+to and from Membase and CDH.
+}}}
+ * Third: Twitter and Hadoop
+   * [http://siliconangle.com/blog/2010/10/12/hadoop-is-a-big-part-of-twitter%E2%80%99s-ecosystem/ Hadoop is a "Big part of Twitter’s ecosystem"]
+{{{
+For something like Hadoop, the presence of a robust, public tool really helps
+to build a prosperous ecosystem.
+}}}
+   * [http://www.computerworld.com/s/article/print/9191098/Twitter_solves_its_data_formatting_challenge?taxonomyName=Storage&taxonomyId=19 Twitter solves its data formatting challenge]
+{{{
+While primary copies of user Tweets are kept in MySQL and Cassandra databases,
+the company is also building a second data repository, running on Hadoop, that
+can be used for analytics and applications.
+}}}