
2012-06-02

File System & Kernel Page Size

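  • A while ago we were discussing how to speed up file transfers between remote sites. Thomas mentioned that the mount command does not support block sizes larger than 4096 bytes (4K), and that this limit comes from the CPU architecture, because the kernel page size depends on the CPU architecture. I did a quick search today and am recording the findings here. We often have to investigate disk I/O speed, and looking at the whole hardware and software stack, we ultimately need to find the optimal parameters. Only by understanding the problem better can we figure out how to squeeze out more speed.
  • Conclusion (1): The maximum file system block size equals the Linux kernel page size (the virtual memory page size).
    • [Reference 1] mount XFS 64KB block size cause "Function not implemented"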

      XFS: A high-performance journaling filesystem
      http://oss.sgi.com/projects/xfs/
      Filesystem Block Size
      The minimum filesystem block size is 512 bytes. The maximum filesystem block size is the page size of the kernel, which is 4K on x86 architecture and is set as a kernel compile option on the IA64 architecture (up to 64 kilobyte pages). So, XFS supports filesystem block sizes up to 64 kilobytes (from 512 bytes, in powers of 2), when the kernel page size allows it.
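The rule quoted above (powers of two from 512 bytes up to the kernel page size) can be sketched in Python. This is only an illustration of the quoted constraint, not XFS's own logic; the helper name `valid_block_sizes` is hypothetical.

```python
import os


def valid_block_sizes(page_size, minimum=512):
    """Return the power-of-two block sizes from `minimum` up to `page_size`."""
    sizes = []
    size = minimum
    while size <= page_size:
        sizes.append(size)
        size *= 2
    return sizes


if __name__ == "__main__":
    # Page size of the running kernel in bytes (4096 on typical x86 systems).
    page_size = os.sysconf("SC_PAGE_SIZE")
    print(page_size, valid_block_sizes(page_size))
```

With a 4K page the largest candidate is 4096; a 64K page would allow sizes up to 65536, which matches the XFS statement above.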

  • Conclusion (2): The hardware architectures that support page sizes larger than 4K are ia64, mips, pa-risc, powerpc, sh, sparc64
    • [Reference 2] Mount ext4 partition with >4KiB block size
      From page_types.h under arch\x86\include\asm
      
      /* PAGE_SHIFT determines the page size */
      #define PAGE_SHIFT  12
      #define PAGE_SIZE   (_AC(1,UL) << PAGE_SHIFT)
      #define PAGE_MASK   (~(PAGE_SIZE-1))
      
      Here is a list of the archs that support 64KiB or greater page sizes: 
      ia64, mips, pa-risc, powerpc, sh, sparc64. So it looks like my best bet 
      it to find an old-PPC Mac.
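To make the macros in Reference 2 concrete, here is the same arithmetic as a small Python sketch. The constants mirror the quoted x86 header; the helper names `page_align_down` and `page_size_for_shift` are hypothetical and only for illustration.

```python
PAGE_SHIFT = 12                 # value from the x86 header quoted above
PAGE_SIZE = 1 << PAGE_SHIFT     # 1 << 12 = 4096 bytes
PAGE_MASK = ~(PAGE_SIZE - 1)    # clears the low PAGE_SHIFT bits of an address


def page_align_down(addr):
    """Round an address down to the start of its page."""
    return addr & PAGE_MASK


def page_size_for_shift(shift):
    """Page size in bytes for a given PAGE_SHIFT value."""
    return 1 << shift


if __name__ == "__main__":
    print(PAGE_SIZE, hex(page_align_down(0x12345)), page_size_for_shift(16))
```

A PAGE_SHIFT of 16 would give the 64 KiB pages mentioned for the other architectures.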
      
  • So how do we find out the current kernel page size?
    • [Reference 3] Linux Find Out Virtual Memory PAGESIZE
      ~$ getconf PAGESIZE
      OR
      ~$ getconf PAGE_SIZE
      
    • How to get linux kernel page size programatically
      jazz@jazzbook:~$ cat /proc/meminfo | grep Mapped
      Mapped:            74764 kB
      jazz@jazzbook:~$ cat /proc/vmstat | grep "nr_mapped"
      nr_mapped 18691
      jazz@jazzbook:~$ A=$(cat /proc/meminfo | grep Mapped | awk '{ print $2 }')
      jazz@jazzbook:~$ B=$(cat /proc/vmstat | grep "nr_mapped" | awk '{ print $2 }')
      jazz@jazzbook:~$ echo $(($A/$B))
      4
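The same lookup can be done from Python's standard library, assuming a Unix-like system. The sketch below also repeats the Mapped / nr_mapped division from the shell session above, using the numbers shown there; the function names are hypothetical.

```python
import mmap
import os
import resource


def page_size_bytes():
    """Kernel page size in bytes, as reported by sysconf."""
    return os.sysconf("SC_PAGE_SIZE")


def page_kb_from_counters(mapped_kb, nr_mapped_pages):
    """Page size in kB: mapped memory (kB) divided by mapped pages."""
    return mapped_kb // nr_mapped_pages


if __name__ == "__main__":
    # Three standard-library views of the same value.
    print(page_size_bytes(), mmap.PAGESIZE, resource.getpagesize())
    # Numbers taken from the /proc/meminfo and /proc/vmstat output above.
    print(page_kb_from_counters(74764, 18691), "kB")
```

Note that the shell result "4" is in kB, since /proc/meminfo reports Mapped in kB.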
      
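  • So is there a way to increase the kernel page size? The kernel currently provides hugetlbpage support.
  • [Reference] Linux kernel documentation: hugetlbpage - the mechanism that lets the i386 architecture support both 4K and 4M (2M in PAE mode) page sizes.
  • Conclusion (3): Besides increasing the block size, the most common approach is to enable read ahead (using a cache to improve the sequential read performance of the disk).
    • [Reference 4] Changing PAGESIZE in kernel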
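On a kernel with hugetlbpage support, /proc/meminfo reports the huge page size and pool counters. A minimal sketch for reading them follows; the function name `parse_meminfo` is hypothetical, and `SAMPLE` holds made-up illustrative values used only when /proc/meminfo cannot be read.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo key to its integer value (kB where a unit is shown)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


# Illustrative values only, not measurements from a real machine.
SAMPLE = """\
HugePages_Total:       0
HugePages_Free:        0
Hugepagesize:       2048 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            info = parse_meminfo(f.read())
    except OSError:
        info = parse_meminfo(SAMPLE)
    print(info.get("Hugepagesize"), "kB per huge page")
```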
      AIX vio ioo is a read ahead cache system to improve speed of transfers when doing sequential reads.
      
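    • [Reference 5] Performance increase through bigger readahead buffers - the example below raises the default RA (counted in 512-byte sectors) from 256 to 1024 (note that this is set on the whole disk, not on a particular partition)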
      The default value for the readahead buffer in Linux is 256, 
      so by increasing this to 1024 we can get a better read performance in sequential reads from the disks.
      To set this you need to do this for each disk (not partition) of your system. eg.:
      
      ~$ blockdev --report
      RO    RA   SSZ   BSZ   StartSec            Size   Device
      rw   256   512  4096          0    500107862016   /dev/sda
      rw   256   512  4096       2048    497999151104   /dev/sda1
      rw   256   512  4096  972656640      2106589184   /dev/sda2
      ~$ sudo blockdev --setra 1024 /dev/sda
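The RA column of blockdev is counted in 512-byte sectors, so the change above is easier to judge once converted to KiB. A small sketch of that arithmetic (the helper name `readahead_kib` is hypothetical):

```python
SECTOR_BYTES = 512  # blockdev --getra / --setra count in 512-byte sectors


def readahead_kib(sectors):
    """Convert a read-ahead value in sectors to KiB."""
    return sectors * SECTOR_BYTES // 1024


if __name__ == "__main__":
    for ra in (256, 1024):
        print(ra, "sectors =", readahead_kib(ra), "KiB")
```

So the default of 256 corresponds to 128 KiB of read ahead, and 1024 to 512 KiB. On systems that expose /sys/block/<device>/queue/read_ahead_kb, that file should show the same setting already expressed in KiB.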
      
  • Conclusion (4): Use a file system that supports extents to reserve larger contiguous regions for writes.
    • [Reference 6] mounting ext4 fs with block size of 65536
      I'd say use 4kb blocks for compatibility and use extents and 
      stripe stride-width options on create/mount to get block allocations 
      in accordance with erase block size of your flash media.
      Make sure also that partition is aligned to this erase block size too!
      
    • [Reference 7] ubuntu create partition block more than 4k
      You can use fallocate function and any fs with extents: ext4, btrfs, JFS, XFS. 
      Even with 4kb block size: "A single extent in ext4 can map up to 128 MiB of 
      contiguous space".
      
      An extent is a contiguous area of storage in a computer file system, reserved for a file. 
      When a process creates a file, file-system management software allocates a whole extent.
      
      The following systems support extents:
      
          ASM - Automatic Storage Management - Oracle's database-oriented filesystem.
          BFS - BeOS, Zeta and Haiku operating systems.
          Btrfs - GPL'd extent based file storage for Linux.
          Ext4 - Linux filesystem (when the configuration enables extents — the default in Linux since version 2.6.23).
          Files-11 - Digital Equipment Corporation (subsequently Hewlett-Packard) OpenVMS filesystem.
          HFS and HFS Plus - Hierarchical File System - Apple Macintosh filesystems.
          HPFS - High Performance File System - OS/2 and eComStation.
          JFS - Journaled File System - Used by AIX, OS/2/eComStation and Linux operating systems.
          Microsoft SQL Server - Versions 2000-2008 supports extents of up to 64KB [1].
          Multi-Programming Executive - Filesystem by Hewlett-Packard.
          NTFS - Microsoft's latest-generation file system [1]
          OCFS2 - Oracle Cluster File System - a shared disk file system for Linux.
          Reiser4 - Linux filesystem (in "extents" mode).
          SINTRAN III - File system used by early computer company Norsk Data.
          UDF - Universal Disk Format - Standard for optical media.
          VERITAS File System - Enabled via the pre-allocation API and CLI.
          XFS - SGI's second generation file system.[2]
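As a rough illustration of Reference 7, the sketch below preallocates space for a file with `os.posix_fallocate` and reproduces the "128 MiB per extent" figure for 4 KiB blocks. It assumes a Linux system; the constant 32768 is derived from the quoted figure (128 MiB / 4 KiB), and the helper names are hypothetical. Whether the space really ends up in one contiguous extent depends on the file system and on free-space fragmentation.

```python
import os
import tempfile

# Blocks covered by one ext4 extent, assumed from the quoted 128 MiB at 4 KiB blocks.
EXT4_MAX_EXTENT_BLOCKS = 32768


def max_extent_bytes(block_size):
    """Largest contiguous range a single extent can map, in bytes."""
    return EXT4_MAX_EXTENT_BLOCKS * block_size


def preallocate(path, length):
    """Reserve `length` bytes for `path` up front and return the resulting file size."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.posix_fallocate(fd, 0, length)
    finally:
        os.close(fd)
    return os.path.getsize(path)


if __name__ == "__main__":
    print(max_extent_bytes(4096) // (1024 * 1024), "MiB per extent")
    with tempfile.TemporaryDirectory() as tmp:
        print(preallocate(os.path.join(tmp, "demo.bin"), 1024 * 1024), "bytes reserved")
```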
      

Hadoop & OCR

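  • The Hadoop course introduces a use case: "The New York Times used Hadoop on 100 EC2 instances to run an OCR job converting TIFF files to PDF in one day." However, there is no actual code or example to look at. Today I came across an open source OCR program that runs on Linux, called "tesseract-ocr". For English text, this software should make large-scale recognition of image files fairly easy.
  • Project home page: http://code.google.com/p/tesseract-ocr/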
  • An OCR Engine that was developed at HP Labs between 1985 and 1995... and now at Google.
  • Debian package: http://packages.debian.org/tesseract

ttysnoop

Last modified on Jun 3, 2012, 10:31:20 PM