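= 2012-06-02 =
== File System & Kernel Page Size ==
 * A while ago we were discussing how to speed up file transfers between remote sites. Thomas mentioned that the mount command cannot handle block sizes larger than 4096 bytes (4K), and that this limit comes from the CPU architecture, because the kernel page size depends on the CPU architecture. I did some quick reading today and am writing it down here. We often have to look into disk I/O speed, and given how the hardware and software layers stack up, in the end we still have to find the best tuning parameters. Only by understanding the problem better can we figure out how to squeeze out more speed.
 * Conclusion (1): '''The maximum file system block size equals the Linux kernel page size (the virtual memory page size)'''
  * [Reference 1] [http://mkl-note.blogspot.tw/2010/11/mount-xfs-64kb-block-size-cause.html mount XFS 64KB block size cause "Function not implemented"]
{{{
#!html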

XFS: A high-performance journaling filesystem
http://oss.sgi.com/projects/xfs/
Filesystem Block Size
The minimum filesystem block size is 512 bytes. The maximum filesystem block size is the page size of the kernel, which is 4K on x86 architecture and is set as a kernel compile option on the IA64 architecture (up to 64 kilobyte pages). So, XFS supports filesystem block sizes up to 64 kilobytes (from 512 bytes, in powers of 2), when the kernel page size allows it.

}}}
 * Conclusion (2): '''The hardware architectures that support page sizes larger than 4K are ia64, mips, pa-risc, powerpc, sh, sparc64'''
  * [Reference 2] [http://superuser.com/questions/291228/mount-ext4-partition-with-4kib-block-size Mount ext4 partition with >4KiB block size]
{{{
#!text
From page_types.h under arch\x86\include\asm

/* PAGE_SHIFT determines the page size */
#define PAGE_SHIFT 12
#define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT)
#define PAGE_MASK (~(PAGE_SIZE-1))
}}}
{{{
#!text
Here is a list of the archs that support 64KiB or greater page sizes: ia64, mips, pa-risc, powerpc, sh, sparc64. So it looks like my best bet it to find an old-PPC Mac.
}}}
 * So how do we find out the current kernel page size?
  * [Reference 3] [http://www.cyberciti.biz/faq/linux-check-the-size-of-pagesize/ Linux Find Out Virtual Memory PAGESIZE]
{{{
~$ getconf PAGESIZE
OR
~$ getconf PAGE_SIZE
}}}
 * Conclusion (3): '''Besides increasing the block size, the most common approach is to enable read ahead (using a cache to improve the disk's sequential read performance)'''
  * [Reference 4] [http://forums.opensuse.org/forums/english/get-technical-help-here/install-boot-login/437078-changing-pagesize-kernel.html#post2152312 Changing PAGESIZE in kernel]
{{{
#!text
AIX vio ioo is a read ahead cache system to improve speed of transfers when doing sequential reads.
}}}
  * [Reference 5] [http://www.linuxpinguin.de/2011/01/performance-increase-through-bigger-readahead-buffers/ Performance increase through bigger readahead buffers] - the example below raises the default RA from 256 to 1024 (note that this is set on the '''whole disk''', not on a specific partition)
{{{
The default value for the readahead buffer in linux 256, so by increasing this to 1024 we can get a better read performance in sequential reads from the disks. To set this you need to do this for each disk (not partition) of your system. eg.:
}}}
{{{
~$ blockdev --report
RO    RA   SSZ   BSZ   StartSec            Size   Device
rw   256   512  4096          0    500107862016   /dev/sda
rw   256   512  4096       2048    497999151104   /dev/sda1
rw   256   512  4096  972656640      2106589184   /dev/sda2
~$ sudo blockdev --setra 1024 /dev/sda
}}}
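
As a rough aid for the numbers above, here is a small Python sketch; it is not taken from any of the references. It reads the page size, which should match what `getconf PAGESIZE` prints, checks a block size against the rule in conclusion (1), and converts a read-ahead value to KiB, since `blockdev` counts RA in 512-byte sectors. The function names are made up for illustration, and the block size check only encodes the rule as quoted from the XFS page (a power of 2, from 512 bytes up to the page size); other file systems may have different minimums.

```python
import os

SECTOR_BYTES = 512  # blockdev reports and sets RA in 512-byte sectors


def page_size_bytes() -> int:
    """Kernel page size in bytes, as reported by sysconf."""
    return os.sysconf("SC_PAGE_SIZE")


def block_size_allowed(block_size: int, page_size: int) -> bool:
    """Hypothetical helper: the rule quoted from the XFS page.

    A block size is accepted when it is a power of 2 between
    512 bytes and the kernel page size.
    """
    is_power_of_two = block_size > 0 and (block_size & (block_size - 1)) == 0
    return is_power_of_two and 512 <= block_size <= page_size


def readahead_kib(ra_sectors: int) -> int:
    """Convert a read-ahead value in 512-byte sectors to KiB."""
    return ra_sectors * SECTOR_BYTES // 1024


if __name__ == "__main__":
    page = page_size_bytes()
    print("page size:", page)
    print("64 KiB block allowed:", block_size_allowed(65536, page))
    print("RA 256 =", readahead_kib(256), "KiB, RA 1024 =", readahead_kib(1024), "KiB")
```

So the default RA of 256 means 128 KiB of read ahead, and raising it to 1024 means 512 KiB.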