wiki:HyperTable

Version 8 (modified by sunny, 17 years ago)


Origins of Hypertable

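The search engine company Zvents launched Hypertable, an open-source distributed data storage system. It is based on the design of Bigtable (BT for short, the storage system used internally at Google), which nine Google researchers described in 2006 in the paper [FYI] "Bigtable: A Distributed Storage System for Structured Data". Hypertable is written in C++, can run on top of HDFS and KFS, and is designed to scale to 1000 nodes. The 0.9 alpha release has been tested on 10 nodes. Although the project is still at an early stage, it already performs well: when writing 28M rows of data, each node reached a write rate of 7 MB/s and a read rate of 1M cells/s.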

Introduction to Hypertable

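In short, Hypertable is a high-performance, distributed, open-source, column-oriented database that can store and process large volumes of structured and unstructured data on a cluster of machines. It provides a C++ API and HQL (Hypertable Query Language) for clients to access database contents. Hypertable is not meant to replace traditional database management systems such as MySQL and Oracle DB; its purpose is to store and manage very large data sets. A traditional relational database management system (RDBMS) is transaction-oriented and offers many advanced features for querying structured data. To achieve elastic scalability and high throughput, Hypertable gives up features common in an RDBMS, such as joins and other query capabilities. An RDBMS such as MySQL is row-oriented, which better suits write-heavy workloads; Hypertable is column-oriented, which better suits read-heavy workloads.

Hypertable is built on top of a distributed file system (DFS). Many DFS implementations exist, and Hypertable can be deployed on any of them. A DFS makes many machines look like a single virtual disk and provides fault tolerance and replication. It can therefore pool the storage resources of many clustered machines to deliver fast access to large volumes of data, and Hypertable, running on top of a DFS, can in turn offer fast, high-capacity database storage.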
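The row-oriented versus column-oriented distinction above can be illustrated with a toy example. The following Python sketch is only a generic illustration: the table, its field names, and the two in-memory layouts are hypothetical and do not represent Hypertable's actual storage format or API.

```python
# Illustrative only: a toy table stored two ways. This is NOT Hypertable's
# storage format; the data and all names here are hypothetical.

rows = [
    {"user": "alice", "query": "hadoop", "clicks": 3},
    {"user": "bob", "query": "bigtable", "clicks": 5},
    {"user": "carol", "query": "hdfs", "clicks": 2},
]

# Row-oriented layout: all fields of one record are kept together,
# so adding a record is a single append.
row_store = [tuple(r.values()) for r in rows]

# Column-oriented layout: all values of one column are kept together,
# so reading one column does not touch the other columns.
column_store = {name: [r[name] for r in rows] for name in rows[0]}


def total_clicks_row_store(store):
    """Sum the third field; every full record is visited."""
    return sum(record[2] for record in store)


def total_clicks_column_store(store):
    """Sum one column; only that column's values are visited."""
    return sum(store["clicks"])


# Number of stored values each layout has to read for the same query.
values_read_row = sum(len(record) for record in row_store)
values_read_col = len(column_store["clicks"])

print(total_clicks_row_store(row_store), values_read_row)
print(total_clicks_column_store(column_store), values_read_col)
```

Both layouts return the same total, but in this toy case the column layout reads a third as many values for a single-column scan, while the row layout needs only one append per new record.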

How it works

Performance

  • PerformanceTestAOLQueryLog
    • Machine Profile
      • 8 data nodes
        • each node
          • 1 x 1.8GHz Dual-core Opteron Processor 2210
          • 4 GB RAM
          • 4 x 7200 RPM SATA drives (mounted JBOD)
    • The AOL query logs were inserted into an 8-node Hypertable cluster. The average size of each row key was ~7 bytes and each value was ~15 bytes. The insert rate (with 4 simultaneous insert processes) was approximately 410K inserts/s. The table was scanned at a rate of approximately 671K cells/s.
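As a rough sanity check, the reported rates can be converted into payload throughput. The sketch below is back-of-the-envelope arithmetic only. It assumes that the reported rates are cluster-wide totals and that each cell carries just the ~7-byte row key and the ~15-byte value; column names, timestamps, and storage overhead are ignored, so real on-disk figures would differ.

```python
# Back-of-the-envelope arithmetic from the figures reported above.
# Assumptions (not from the source): rates are cluster-wide totals, and
# a cell is counted as row key + value only, with no other overhead.

NODES = 8
KEY_BYTES = 7
VALUE_BYTES = 15
INSERTS_PER_SEC = 410_000
SCAN_CELLS_PER_SEC = 671_000


def payload_mb_per_sec(cells_per_sec, bytes_per_cell):
    """Convert a cell rate into payload megabytes per second (1 MB = 10**6 bytes)."""
    return cells_per_sec * bytes_per_cell / 1_000_000


bytes_per_cell = KEY_BYTES + VALUE_BYTES
insert_mb = payload_mb_per_sec(INSERTS_PER_SEC, bytes_per_cell)
scan_mb = payload_mb_per_sec(SCAN_CELLS_PER_SEC, bytes_per_cell)

print(f"payload per cell: {bytes_per_cell} bytes")
print(f"insert: {insert_mb:.2f} MB/s total, {insert_mb / NODES:.2f} MB/s per node")
print(f"scan:   {scan_mb:.2f} MB/s total, {scan_mb / NODES:.2f} MB/s per node")
```

Under these assumptions the insert rate corresponds to roughly 9 MB/s of key and value payload across the cluster, and the scan rate to roughly 15 MB/s.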

How we use it

Similar Project

  • HBase: Bigtable-like structured storage for Hadoop HDFS
    • Hypertable is based very closely on the design of Bigtable, with a few modifications. Hypertable is designed for speed and is written in C++, while HBase is written in Java.
  • Thrudb
    • Thrudb is a set of simple services built on top of Facebook’s Thrift framework that provides indexing and document storage services for building and scaling websites. Its purpose is to offer web developers flexible, fast and easy-to-use services that can enhance or replace traditional data storage and access layers.
  • DistStore
    • DistStore is a family of lightweight Thrift-based web services which are extremely scalable in both throughput and dataset size. Its purpose is to offer a simple and flexible data storage solution that can grow with your project from inception to millions of users.

How To Install

Reference

Related News
