= 2012-06-02 =
== File System & Kernel Page Size ==
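* A while ago we were discussing how to speed up file transfers between remote sites. Thomas mentioned that the mount command does not support block sizes larger than 4096 bytes (4K), and that this limit comes from the CPU architecture, because the kernel page size is architecture-dependent. I did some quick research today and am writing the results down here. We often have to look into disk I/O speed, and given the stack of hardware and software layers involved, we ultimately need to find the best-tuned parameters. Only by understanding the problem better can we know how to squeeze out more speed.
* Conclusion (1): '''The maximum file system block size equals the Linux kernel page size (the virtual memory page size).'''
* [Reference 1] [http://mkl-note.blogspot.tw/2010/11/mount-xfs-64kb-block-size-cause.html mount XFS 64KB block size cause "Function not implemented"]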
{{{
#!html
XFS: A high-performance journaling filesystem
http://oss.sgi.com/projects/xfs/
Filesystem Block Size
The minimum filesystem block size is 512 bytes. The maximum filesystem block size is the page size of the kernel, which is 4K on x86 architecture and is set as a kernel compile option on the IA64 architecture (up to 64 kilobyte pages). So, XFS supports filesystem block sizes up to 64 kilobytes (from 512 bytes, in powers of 2), when the kernel page size allows it.
}}}
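* Conclusion (2): '''Hardware architectures that support page sizes larger than 4K (64 KiB or greater, according to the list quoted below) are ia64, mips, pa-risc, powerpc, sh, sparc64.'''
* [Reference 2] [http://superuser.com/questions/291228/mount-ext4-partition-with-4kib-block-size Mount ext4 partition with >4KiB block size]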
{{{
#!text
From page_types.h under arch\x86\include\asm
/* PAGE_SHIFT determines the page size */
#define PAGE_SHIFT 12
#define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT)
#define PAGE_MASK (~(PAGE_SIZE-1))
}}}
{{{
#!text
Here is a list of the archs that support 64KiB or greater page sizes:
ia64, mips, pa-risc, powerpc, sh, sparc64. So it looks like my best bet
it to find an old-PPC Mac.
}}}
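* Worked example (my own sketch, not from the references): the `PAGE_SHIFT` definition quoted above means the page size is simply `1 << PAGE_SHIFT`. The helper name `page_size_from_shift` and the shift values 14 and 16 are illustrative assumptions; only `PAGE_SHIFT 12` comes from the quoted x86 header.

```shell
#!/bin/sh
# page_size_from_shift is a hypothetical helper that mirrors
# the kernel macro PAGE_SIZE = 1 << PAGE_SHIFT.
page_size_from_shift() {
    echo $(( 1 << $1 ))
}

# 12 is the x86 value from page_types.h; 14 and 16 are example
# shifts that would give 16 KiB and 64 KiB pages.
for shift in 12 14 16; do
    echo "PAGE_SHIFT=$shift -> PAGE_SIZE=$(page_size_from_shift "$shift") bytes"
done
```

With `PAGE_SHIFT` fixed at 12, the x86 page size works out to 4096 bytes, which matches the 4K mount limit discussed above.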
* So how do we find the current kernel page size?
* [Reference 3] [http://www.cyberciti.biz/faq/linux-check-the-size-of-pagesize/ Linux Find Out Virtual Memory PAGESIZE]
{{{
~$ getconf PAGESIZE
OR
~$ getconf PAGE_SIZE
}}}
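* Sketch (my own, not from the references): combining `getconf PAGESIZE` with the rule quoted in Reference 1 (at least 512 bytes, at most the kernel page size, a power of two) gives a quick pre-check before formatting or mounting. `is_valid_block_size` is a hypothetical helper name, and the optional second argument exists only so that other page sizes can be tried out.

```shell
#!/bin/sh
# is_valid_block_size BLOCK_SIZE [PAGE_SIZE]
# Hypothetical helper: returns 0 when BLOCK_SIZE satisfies the rule
# quoted from the XFS notes: 512 <= block size <= page size, power of two.
# PAGE_SIZE defaults to the running kernel's value from getconf.
is_valid_block_size() {
    bs=$1
    page=${2:-$(getconf PAGESIZE)}
    # Range check: at least 512 bytes, no larger than one page.
    [ "$bs" -ge 512 ] && [ "$bs" -le "$page" ] || return 1
    # A power of two has exactly one bit set, so bs & (bs - 1) is 0.
    [ $(( bs & (bs - 1) )) -eq 0 ]
}

for bs in 512 4096 65536; do
    if is_valid_block_size "$bs"; then
        echo "$bs: within the page size limit on this kernel"
    else
        echo "$bs: rejected (out of range or not a power of two)"
    fi
done
```

On a machine with 4K pages the 65536 case should be rejected, which corresponds to the "Function not implemented" error in the title of Reference 1.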
* Conclusion (3): '''Besides increasing the block size, the most common approach is to enable read-ahead (using a cache to improve sequential disk read efficiency).'''
* [Reference 4] [http://forums.opensuse.org/forums/english/get-technical-help-here/install-boot-login/437078-changing-pagesize-kernel.html#post2152312 Changing PAGESIZE in kernel]
{{{
#!text
AIX vio ioo is a read ahead cache system to improve speed of transfers when doing sequential reads.
}}}
* [http://stackoverflow.com/questions/4888067/how-to-get-linux-kernel-page-size-programatically How to get linux kernel page size programatically]
{{{
jazz@jazzbook:~$ cat /proc/meminfo | grep Mapped
Mapped: 74764 kB
jazz@jazzbook:~$ cat /proc/vmstat | grep "nr_mapped"
nr_mapped 18691
jazz@jazzbook:~$ A=$(cat /proc/meminfo | grep Mapped | awk '{ print $2 }')
jazz@jazzbook:~$ B=$(cat /proc/vmstat | grep "nr_mapped" | awk '{ print $2 }')
jazz@jazzbook:~$ echo $(($A/$B))
4
}}}
* [Reference 5] [http://www.linuxpinguin.de/2011/01/performance-increase-through-bigger-readahead-buffers/ Performance increase through bigger readahead buffers] - The example below raises the default RA from 256 to 1024 (note that this is set on the '''whole disk''', not on a specific partition)
{{{
The default value for the readahead buffer in linux 256,
so by increasing this to 1024 we can get a better read performance in sequential reads from the disks.
To set this you need to do this for each disk (not partition) of your system. eg.:
}}}
{{{
~$ blockdev --report
RO RA SSZ BSZ StartSec Size Device
rw 256 512 4096 0 500107862016 /dev/sda
rw 256 512 4096 2048 497999151104 /dev/sda1
rw 256 512 4096 972656640 2106589184 /dev/sda2
~$ sudo blockdev --setra 1024 /dev/sda
}}}
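* Sketch (my own, not from the references): the RA value used by `blockdev` is counted in 512-byte sectors, so 256 corresponds to 128 KiB and 1024 to 512 KiB. The helper name `ra_to_kib` is hypothetical. The loop reads the same setting in KiB from sysfs, which does not need root. A value set with `--setra` is a runtime setting, so it has to be reapplied after a reboot.

```shell
#!/bin/sh
# ra_to_kib is a hypothetical helper: converts a read-ahead value
# given in 512-byte sectors (as used by blockdev) into KiB.
ra_to_kib() {
    echo $(( $1 * 512 / 1024 ))
}

echo "RA 256  = $(ra_to_kib 256) KiB"
echo "RA 1024 = $(ra_to_kib 1024) KiB"

# Show the current read-ahead of every whole disk, in KiB, via sysfs.
for f in /sys/block/*/queue/read_ahead_kb; do
    [ -r "$f" ] || continue        # skip if the glob matched nothing
    dev=${f#/sys/block/}           # strip the leading path
    dev=${dev%%/*}                 # keep only the device name
    echo "$dev: $(cat "$f") KiB"
done
```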
* Conclusion (4): '''A file system that supports [http://en.wikipedia.org/wiki/Extent_%28file_systems%29 extents] can be used to reserve larger regions for writes in advance.'''
{{{
#!text
The following systems support extents:
ASM - Automatic Storage Management - Oracle's database-oriented filesystem.
BFS - BeOS, Zeta and Haiku operating systems.
Btrfs - GPL'd extent based file storage for Linux.
Ext4 - Linux filesystem (when the configuration enables extents — the default in Linux since version 2.6.23).
Files-11 - Digital Equipment Corporation (subsequently Hewlett-Packard) OpenVMS filesystem.
HFS and HFS Plus - Hierarchical File System - Apple Macintosh filesystems.
HPFS - High Performance File System - OS/2 and eComStation.
JFS - Journaled File System - Used by AIX, OS/2/eComStation and Linux operating systems.
Microsoft SQL Server - Versions 2000-2008 supports extents of up to 64KB [1].
Multi-Programming Executive - Filesystem by Hewlett-Packard.
NTFS - Microsoft's latest-generation file system [1]
OCFS2 - Oracle Cluster File System - a shared disk file system for Linux.
Reiser4 - Linux filesystem (in "extents" mode).
SINTRAN III - File system used by early computer company Norsk Data.
UDF - Universal Disk Format - Standard for optical media.
VERITAS File System - Enabled via the pre-allocation API and CLI.
XFS - SGI's second generation file system.[2]
}}}
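* Sketch (my own, not from the references): on Linux, one way to reserve space up front is the `fallocate` command from util-linux. This assumes the underlying file system supports preallocation; the helper name `preallocate` and the 1 MiB size are only illustrative.

```shell
#!/bin/sh
# preallocate FILE BYTES
# Hypothetical helper: asks the file system to reserve BYTES for FILE
# before any data is written. Fails if preallocation is not supported.
preallocate() {
    fallocate -l "$2" "$1"
}

f=$(mktemp)
# Reserve 1 MiB, then show allocated blocks next to the file size.
preallocate "$f" 1048576 && ls -ls "$f"
# On extent-based file systems the resulting layout can be inspected
# with: filefrag -v "$f"
rm -f "$f"
```

Whether the reservation ends up as one large extent depends on the file system and on how fragmented the free space is, so the result is worth checking with `filefrag` rather than assuming.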