Version 4 (modified by jazz, 12 years ago)
2012-06-02
File System & Kernel Page Size
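- A while ago we were discussing how to speed up file transfers between remote sites. Thomas pointed out that the mount command does not support block sizes larger than 4096 bytes (4K), and that this limit depends on the CPU architecture, because the kernel page size is architecture-dependent. I did a quick search today and am recording the findings here. We often have to investigate disk I/O speed, and whether we look at the hardware stack or the software stack, in the end we still have to find the optimal parameters. Only by understanding the problem better can we work out how to squeeze more speed out of the system.
- Conclusion (1): The maximum file system block size equals the Linux kernel page size (the virtual memory page size).
- [Reference 1] mount XFS 64KB block size cause "Function not implemented"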
XFS: A high-performance journaling filesystem
http://oss.sgi.com/projects/xfs/
Filesystem Block Size
The minimum filesystem block size is 512 bytes. The maximum filesystem block size is the page size of the kernel, which is 4K on x86 architecture and is set as a kernel compile option on the IA64 architecture (up to 64 kilobyte pages). So, XFS supports filesystem block sizes up to 64 kilobytes (from 512 bytes, in powers of 2), when the kernel page size allows it.
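The rule quoted above (a power of two, at least 512 bytes, at most the kernel page size) can be sketched as a small shell check. This is only an illustration: the function name `xfs_block_size_ok` is hypothetical, and the page size defaults to whatever `getconf PAGESIZE` reports on the current machine.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a block size satisfies the rule quoted
# from the XFS documentation (power of two, >= 512, <= kernel page size).
# Usage: xfs_block_size_ok BLOCK_SIZE [PAGE_SIZE]
xfs_block_size_ok() {
    bs=$1
    # Use the second argument if given, otherwise ask the running system.
    page=${2:-$(getconf PAGESIZE)}
    # Reject empty or non-numeric input.
    case $bs in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$bs" -ge 512 ] || return 1
    [ "$bs" -le "$page" ] || return 1
    # A power of two has exactly one bit set, so bs & (bs - 1) is 0.
    [ $(( bs & (bs - 1) )) -eq 0 ]
}
```

For instance, `xfs_block_size_ok 65536 4096` fails, which is consistent with a 64 KB block size not mounting on a kernel with 4K pages, while `xfs_block_size_ok 65536 65536` succeeds.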
- Conclusion (2): The hardware architectures that support page sizes larger than 4K are ia64, mips, pa-risc, powerpc, sh, and sparc64.
- [Reference 2] Mount ext4 partition with >4KiB block size
From page_types.h under arch/x86/include/asm:
    /* PAGE_SHIFT determines the page size */
    #define PAGE_SHIFT 12
    #define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT)
    #define PAGE_MASK (~(PAGE_SIZE-1))
Here is a list of the archs that support 64KiB or greater page sizes: ia64, mips, pa-risc, powerpc, sh, sparc64. So it looks like my best bet is to find an old-PPC Mac.
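The macros quoted above are plain bit arithmetic, so the same calculation can be reproduced in shell. A minimal sketch: the function names are made up for this illustration, and a shift of 16 is simply the value that corresponds to a 64 KiB page.

```shell
#!/bin/sh
# Illustrative only: mirrors the PAGE_SIZE / PAGE_MASK macros quoted above.

# page_size_from_shift SHIFT -> prints 1 << SHIFT
page_size_from_shift() {
    echo $(( 1 << $1 ))
}

# page_align_down ADDRESS SHIFT -> prints ADDRESS & PAGE_MASK,
# i.e. the address rounded down to the start of its page.
page_align_down() {
    size=$(( 1 << $2 ))
    echo $(( $1 & ~(size - 1) ))
}
```

With these definitions, `page_size_from_shift 12` prints 4096 (the x86 value) and `page_size_from_shift 16` prints 65536.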
- So how do we find out the current kernel page size?
- [Reference 3] Linux Find Out Virtual Memory PAGESIZE
    ~$ getconf PAGESIZE
OR
    ~$ getconf PAGE_SIZE
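- Conclusion (3): Besides increasing the block size, the most common approach is to enable read-ahead (using a cache to improve the efficiency of sequential disk reads).
- [Reference 4] Changing PAGESIZE in kernel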
AIX vio ioo is a read ahead cache system to improve speed of transfers when doing sequential reads.
- How to get linux kernel page size programmatically
    jazz@jazzbook:~$ cat /proc/meminfo | grep Mapped
    Mapped: 74764 kB
    jazz@jazzbook:~$ cat /proc/vmstat | grep "nr_mapped"
    nr_mapped 18691
    jazz@jazzbook:~$ A=$(cat /proc/meminfo | grep Mapped | awk '{ print $2 }')
    jazz@jazzbook:~$ B=$(cat /proc/vmstat | grep "nr_mapped" | awk '{ print $2 }')
    jazz@jazzbook:~$ echo $(($A/$B))
    4
- [Reference 5] Performance increase through bigger readahead buffers - the example below raises the default RA from 256 to 1024 (note that the setting applies to the whole disk, not to a specific partition)
The default value for the readahead buffer in Linux is 256, so by increasing this to 1024 we can get a better read performance in sequential reads from the disks. To set this you need to do this for each disk (not partition) of your system. eg.:
    ~$ blockdev --report
    RO    RA   SSZ   BSZ   StartSec            Size   Device
    rw   256   512  4096          0    500107862016   /dev/sda
    rw   256   512  4096       2048    497999151104   /dev/sda1
    rw   256   512  4096  972656640      2106589184   /dev/sda2
    ~$ sudo blockdev --setra 1024 /dev/sda
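`blockdev --setra` takes its value in 512-byte sectors, so the change above raises the read-ahead window from 128 KiB to 512 KiB. The sketch below does that conversion and, as an assumption about how one might apply the setting to every disk, loops over the whole devices reported by `lsblk`. Both function names are hypothetical, and `set_readahead_all_disks` needs root; it is only defined here, not run.

```shell
#!/bin/sh
# Convert a read-ahead value in 512-byte sectors (the unit used by
# blockdev --setra / --getra) to KiB.
ra_sectors_to_kib() {
    echo $(( $1 * 512 / 1024 ))
}

# Hypothetical helper: set the read-ahead of every whole disk (not
# partition) to the given number of sectors. Requires root.
set_readahead_all_disks() {
    sectors=$1
    # lsblk -d lists whole devices only; -n drops the header line.
    lsblk -dn -o NAME,TYPE | while read -r name type; do
        [ "$type" = "disk" ] || continue
        blockdev --setra "$sectors" "/dev/$name"
        echo "/dev/$name: read-ahead now $(blockdev --getra "/dev/$name") sectors"
    done
}
```

A value set with `blockdev --setra` does not survive a reboot, so it would normally have to be reapplied at boot time.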
- Conclusion (4): A file system that supports extents can be used to reserve larger contiguous regions for writes.
The following systems support extents:
  - ASM - Automatic Storage Management - Oracle's database-oriented filesystem.
  - BFS - BeOS, Zeta and Haiku operating systems.
  - Btrfs - GPL'd extent based file storage for Linux.
  - Ext4 - Linux filesystem (when the configuration enables extents, the default in Linux since version 2.6.23).
  - Files-11 - Digital Equipment Corporation (subsequently Hewlett-Packard) OpenVMS filesystem.
  - HFS and HFS Plus - Hierarchical File System - Apple Macintosh filesystems.
  - HPFS - High Performance File System - OS/2 and eComStation.
  - JFS - Journaled File System - Used by AIX, OS/2/eComStation and Linux operating systems.
  - Microsoft SQL Server - Versions 2000-2008 supports extents of up to 64KB [1].
  - Multi-Programming Executive - Filesystem by Hewlett-Packard.
  - NTFS - Microsoft's latest-generation file system [1]
  - OCFS2 - Oracle Cluster File System - a shared disk file system for Linux.
  - Reiser4 - Linux filesystem (in "extents" mode).
  - SINTRAN III - File system used by early computer company Norsk Data.
  - UDF - Universal Disk Format - Standard for optical media.
  - VERITAS File System - Enabled via the pre-allocation API and CLI.
  - XFS - SGI's second generation file system.[2]
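On an extent-based file system, space for a large file can be reserved before it is written, for example with `fallocate` from util-linux. A rough sketch, assuming a hypothetical helper name `preallocate`; the fallback to `truncate` only creates a sparse file and reserves no blocks, and is there so the script still runs on file systems without fallocate support.

```shell
#!/bin/sh
# Hypothetical helper: preallocate FILE to SIZE (e.g. 1G, 512M).
# Usage: preallocate FILE SIZE
preallocate() {
    file=$1
    size=$2
    if fallocate -l "$size" "$file" 2>/dev/null; then
        echo "reserved $size for $file"
    else
        # Assumption: fall back to a sparse file when fallocate is not
        # supported; this sets the length but reserves no disk blocks.
        truncate -s "$size" "$file"
        echo "fallocate unavailable, created sparse $file of $size"
    fi
}
```

After a call such as `preallocate /tmp/big.img 1G` (the path is only an example), `filefrag -v` from e2fsprogs can be used to see how many extents the file occupies.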