= 2009-04-14 =

 * Hadoop cluster
   * Upgraded to Lenny.
   * Re-collected the MAC addresses so that the hostnames follow the KVM port order as closely as possible.
   * Installed Power Management. The machine room limits the total current on a single outlet to 12 A, so not all eight power sockets can be used.
   * [Fire safety / information security] The current rating of the Power KVM's own power cord must also be checked.

|| hostname || hadoop107 || hadoop108 || hadoop109 || hadoop110 || hadoop || hadoop111 || hadoop112 ||
|| KVM || 9 || 10 || 11 || 12 || 13 || 14 || 15 ||
|| Power KVM || 1G || 2A || 2B || 2C || 2D || 2E || 2F ||

|| hostname || hadoop106 || hadoop105 || hadoop104 || hadoop103 || hadoop102 || hadoop101 ||
|| KVM || 6 || 5 || 4 || 3 || 2 || 1 ||
|| Power KVM || 1F || 1E || 1D || 1C || 1B || 1A ||

== Hadoop ==

 * In Y!TW 蔡奕楷's [http://www.openfoundry.org/index.php?option=com_docman&Itemid=112&gid=255&lang=en&task=cat_view slides for the 12/06 talk "Hadoop - Open source grid computing platform"] I noticed an interesting command called hod, which reminded me of [http://hadoop.apache.org/core/docs/r0.18.3/hod.html Hadoop On Demand] on the official Hadoop site. From a quick skim of the documentation, HOD is a grid/cluster deployment tool written mainly in Python that uses Torque or Maui as its resource manager and scheduler. Although the documentation speaks of a Virtual Hadoop Cluster, I found no requirement for, or description of, virtualization technology, so it presumably just carves out some nodes of a physical cluster to form a Hadoop cluster.

== Virtualization ==

 * [http://www.networkworld.com/cgi-bin/mailto/x.cgi?pagetosend=/export/home/httpd/htdocs/news/2009/040909-users-warned-of-virtualisations-dark.html&pagename=/news/2009/040909-users-warned-of-virtualisations-dark.html&pageurl=http://www.networkworld.com/news/2009/040909-users-warned-of-virtualisations-dark.html&site=printpage Users warned of virtualization's 'dark side']

== MapReduce ==

 * An ongoing collection of MapReduce implementations in different languages.
   * [http://www.quora.com/What-are-some-promising-open-source-alternatives-to-Hadoop-MapReduce-for-map-reduce What are some promising open-source alternatives to Hadoop MapReduce for map/reduce?] (added 2010-08-24)
   * R
     * [http://www.nabble.com/The-R-Project-and-Map-Reduce-td22328530.html The R-Project and Map Reduce]
     * http://ml.stat.purdue.edu/rhipe/ - '''RHIPE - R and Hadoop Integrated Processing v.0.1'''. Combining the two fits our current direction very well.
     * http://cran.r-project.org/web/packages/mapReduce/ - the mapReduce package on CRAN: "mapReduce - flexible mapReduce algorithm for parallel computation"
     * Even more surprising, [http://aws.amazon.com/elasticmapreduce/ Amazon Web Services also supports R]:
{{{
#!sh
Develop your data processing application authored in your choice of Java, Ruby, Perl, Python, PHP, R, or C++.
}}}
   * Java
     * [http://www.gridgain.com/ GridGain] - a MapReduce framework written in Java
     * [http://mirror.facebook.com/facebook/hive/ Hive] - built on top of Hadoop; a project led by Facebook
     * [http://code.google.com/p/cloudmapreduce Cloud MapReduce] - A MapReduce implementation on Amazon Cloud OS
   * C/C++
     * [http://csl.stanford.edu/~christos/sw/phoenix/ Phoenix]
       * Talk uploaded on 2007/3/1 - Evaluating MapReduce for Multi-core and Multiprocessor Systems
         * [http://csl.stanford.edu/%7Echristos/publications/2007.cmp_mapreduce.hpca.pdf Slides]
         * [http://video.google.com/videoplay?docid=5795534100478091031 Video of the talk]
     * [http://www.galagosearch.org/ Galago TupleFlow]
   * Perl
     * [http://search.cpan.org/~drrho/Parallel-MapReduce-0.09/lib/Parallel/MapReduce.pm Parallel::MapReduce]
   * OCaml
     * [http://projects.camlcity.org/projects/plasma.html PlasmaFS] - implements the map/reduce framework on a compute cluster
   * Python
     * [http://mfisk.github.com/filemap/ FileMap] - [http://github.com/mfisk/filemap/tree/master source code]
     * [http://discoproject.org/ Disco] - the core is written in Erlang; jobs can be written in Python.
     * [http://wiki.github.com/klbostee/dumbo dumbo] - very closely tied to Hadoop, because the project is a Python layer on top of Hadoop Streaming
     * [http://wiki.github.com/goossaert/prince/ Prince] - API for Hadoop/MapReduce in Python, 2010 ([wiki:jazz/10-05-12 2010-05-12])
     * [http://code.google.com/p/octopy/ octopy] - Easy MapReduce for Python (2010-08-24)
     * [http://code.google.com/p/httpmr/ httpmr] - A scalable data processing framework for people with web clusters.
       (2010-08-24) - runs on top of Google App Engine
     * [http://www.cs.ucr.edu/~jdou/misco/ misco] - A Mobile MapReduce Framework
   * Ruby
     * [http://skynet.rubyforge.org/ Skynet]
     * [http://www.rufy.com/starfish/doc/ Starfish] - Open source Ruby implementation
     * [http://rubygems.org/gems/mapredus mapredus] - simple mapreduce framework using redis and resque (2010-08-24)
   * Erlang
     * [http://wiki.basho.com/display/RIAK/Riak Riak] : An Open Source Internet-Scale Data Store
   * CUDA
     * [http://www.cse.ust.hk/gpuqp/Mars.html Mars] - A MapReduce Framework on Graphics Processors; an option for running MapReduce on GPUs
   * Qt
     * [http://labs.trolltech.com/page/Projects/Threads/QtConcurrent QtConcurrent]
       * Open Source C++ MapReduce (non-distributed) implementation from Trolltech
       * The page says it targets shared-memory (non-distributed) systems.
   * bash
     * [http://blog.last.fm/2009/04/06/mapreduce-bash-script Mapreduce Bash Script] - MapReduce written as a bash shell script - [http://github.com/erikfrey/bashreduce/tree/master source code]
   * JavaScript
     * [http://www.igvita.com/2009/03/03/collaborative-map-reduce-in-the-browser/ Collaborative Map-Reduce in the Browser] - the idea is similar in spirit to SETI@Home: harness the power of the crowd to build a distributed cluster that uses HTTP as its standard protocol.
   * .NET
     * [http://qizmt.myspace.com/ Qizmt] - !MySpace just released a MapReduce framework for .NET called Qizmt as an open source project.
       - [http://channel9.msdn.com/shows/Communicating/MySpace-Qizmt-a-NET-MapReduce-Framework/ introduction video] - [http://code.google.com/p/qizmt/ source code download]
     * [http://research.microsoft.com/en-us/projects/Dryad/ Dryad] - [http://connect.microsoft.com/site/sitehome.aspx?SiteID=891 DryadLINQ] (2010-08-24)
   * MPI
     * [http://www.sandia.gov/~sjplimp/mapreduce/doc/Manual.html MapReduce-MPI Library] - ([http://cxwangyi.blogspot.com/2010/05/mpi-based-mapreduce-implementation.html 2010-05-16: MPI-based MapReduce Implementation])
   * MySQL
     * [http://gearman.org/ Gearman] - [http://www.youtube.com/watch?hl=en&v=xDRDu1g270c&gl=US Map/Reduce and Queues for MySQL Using Gearman (Video)]
   * http://mapreduce.net/

== HBase / CouchDB ==

 * While reading Wikipedia, I noticed that CouchDB is classified under both [http://en.wikipedia.org/wiki/Column-oriented_DBMS Column-oriented DBMS] and [http://en.wikipedia.org/wiki/Document-oriented_database Document-oriented database].
 * [http://labs.mudynamics.com/2009/04/03/interactive-couchdb/ Interactive CouchDB] - uses JavaScript to demonstrate CouchDB together with MapReduce implementations for different purposes.
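
For orientation, the map/reduce pattern over a set of documents can be sketched in a few lines of plain Python. This is only an in-memory illustration of the idea, not CouchDB's API or that of any library listed above; the sample documents, their field names, and the helper `run_view` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample documents; the field names are illustrative only.
DOCS = [
    {"_id": "1", "lang": "Python", "project": "Disco"},
    {"_id": "2", "lang": "Python", "project": "dumbo"},
    {"_id": "3", "lang": "Ruby", "project": "Skynet"},
]


def map_doc(doc):
    """Map step: emit one (key, value) pair per document."""
    yield doc["lang"], 1


def reduce_values(key, values):
    """Reduce step: combine all values emitted for one key."""
    return sum(values)


def run_view(docs, map_fn, reduce_fn):
    """Group the emitted pairs by key, then reduce each group."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


# Count the sample projects per implementation language.
counts = run_view(DOCS, map_doc, reduce_values)
print(counts)
```

A distributed framework runs the same two steps, but spreads the map calls and the per-key grouping across many nodes.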