Version 6 (modified by jazz, 16 years ago)
BUGFIX: jfbterm (1)
- [DRBL] jfbterm Bug
- https://bugs.launchpad.net/ubuntu/+source/jfbterm/+bug/253163
- http://launchpadlibrarian.net/16414131/jfbterm-segfault.txt
- http://launchpadlibrarian.net/16414135/jfbterm-segfault-strace.txt
- [Test]
- Install Ubuntu 8.10 Server AMD64
- Test steps on Ubuntu 8.10:
    root@intrepid:~# apt-get update
    root@intrepid:~# apt-get upgrade
    root@intrepid:~# reboot
    root@intrepid:~# uname -a
    Linux intrepid 2.6.27-7-server #1 SMP Tue Nov 4 20:16:57 UTC 2008 x86_64 GNU/Linux
    root@intrepid:~# apt-get install jfbterm v86d
    root@intrepid:~# reboot
    root@intrepid:~# dpkg -S chvt
    kbd: /usr/share/man/man1/chvt.1.gz
    kbd: /bin/chvt
    root@intrepid:~# chvt 1
    root@intrepid:~# rmmod uvesafb
    ERROR: Module uvesafb does not exist in /proc/modules
    root@intrepid:~# modprobe uvesafb mode_option=1024x768
    root@intrepid:~# screen
    root@intrepid:~# jfbterm -e ls
    ... (omitted) ...
    color 15 : ffff, ffff
    cannot mmap(mmio) : Invalid argument
    Segmentation fault
- [Note] Without modprobe uvesafb, /dev/fb0 does not exist, so running jfbterm prints the following error:
    ENCODING: locale = UTF-8
    FONT : (4) [iso10646.1]/pcf/U:///usr/share/fonts/X11/misc/unifont.pcf.gz/
    encoding : UTF-8,iso10646.1
    exec : ls
    open /dev/fb0: No such file or directory
- [Note] If v86d is not installed, running modprobe uvesafb leaves the following in dmesg:
    [ 703.839376] uvesafb: failed to execute /sbin/v86d
    [ 703.840224] uvesafb: make sure that the v86d helper is installed and executable
    [ 703.841494] uvesafb: Getting VBE info block failed (eax=0x4f00, err=-2)
    [ 703.842307] uvesafb: vbe_init() failed with -22
    [ 703.843019] uvesafb: probe of uvesafb.0 failed with error -22
- [Note] If v86d is installed, running modprobe uvesafb leaves the following in dmesg:
    [ 268.848377] uvesafb: VMware, IncVMware virtual machine2.0, VMware virtual machine2.0, 2.0, OEM: V M ware, Inc. VBE support 2.0VMware, IncVMware virtual machine2.0, VBE v2.0
    [ 268.884099] uvesafb: no monitor limits have been set, default refresh rate will be used
    [ 268.885202] uvesafb: VBE state buffer size cannot be determined (eax=0x0, err=0)
    [ 268.885265] uvesafb: scrolling: redraw
    [ 268.896617] mtrr: your processor doesn't support write-combining
    [ 268.909303] Console: switching to colour frame buffer device 128x48
    [ 269.938085] uvesafb: framebuffer at 0xf0000000, mapped to 0xffffc20000180000, using 16384k, total 16384k
    [ 269.938101] fb0: VESA VGA frame buffer device
- Editing GRUB's menu.lst avoids having to run modprobe after every boot.
menu.lst
    @@ -126,5 +126,5 @@
     title Ubuntu 8.10, kernel 2.6.27-7-server
     uuid 574912ac-8bd6-4cc7-90c6-8c5362033fec
    -kernel /boot/vmlinuz-2.6.27-7-server root=UUID=574912ac-8bd6-4cc7-90c6-8c5362033fec ro quiet splash
    +kernel /boot/vmlinuz-2.6.27-7-server root=UUID=574912ac-8bd6-4cc7-90c6-8c5362033fec ro quiet splash vga=0x305
     initrd /boot/initrd.img-2.6.27-7-server
     quiet
- Inspecting the uvesafb module on the Ubuntu 8.10 kernel 2.6.27-7-server with modinfo shows these parameters: scroll, vgapal, pmipal, mtrr, blank, nocrtc, noedid, vram_remap, vram_total, maxclk, maxhf, maxvf, mode_option, vbemode, v86d. There is no parameter named mode.
    root@intrepid:~# modinfo uvesafb
    filename:       /lib/modules/2.6.27-7-server/kernel/drivers/video/uvesafb.ko
    description:    Framebuffer driver for VBE2.0+ compliant graphics boards
    author:         Michal Januszewski <spock@gentoo.org>
    license:        GPL
    srcversion:     21EDEFDED06E0673208A0D5
    depends:
    vermagic:       2.6.27-7-server SMP mod_unload modversions
    parm:           scroll:Scrolling mode, set to 'redraw', 'ypan', or 'ywrap' (scroll)
    parm:           vgapal:Set palette using VGA registers (invbool)
    parm:           pmipal:Set palette using PMI calls (bool)
    parm:           mtrr:Memory Type Range Registers setting. Use 0 to disable. (uint)
    parm:           blank:Enable hardware blanking (bool)
    parm:           nocrtc:Ignore CRTC timings when setting modes (bool)
    parm:           noedid:Ignore EDID-provided monitor limits when setting modes (bool)
    parm:           vram_remap:Set amount of video memory to be used [MiB] (uint)
    parm:           vram_total:Set total amount of video memoery [MiB] (uint)
    parm:           maxclk:Maximum pixelclock [MHz], overrides EDID data (ushort)
    parm:           maxhf:Maximum horizontal frequency [kHz], overrides EDID data (ushort)
    parm:           maxvf:Maximum vertical frequency [Hz], overrides EDID data (ushort)
    parm:           mode_option:Specify initial video mode as "<xres>x<yres>[-<bpp>][@<refresh>]" (charp)
    parm:           vbemode:VBE mode number to set, overrides the 'mode' option (ushort)
    parm:           v86d:Path to the v86d userspace helper. (string)
    root@intrepid:~# LANG=C apt-cache policy v86d
    v86d:
      Installed: 0.1.5-1ubuntu2
      Candidate: 0.1.5-1ubuntu2
      Version table:
     *** 0.1.5-1ubuntu2 0
            500 http://tw.archive.ubuntu.com intrepid/universe Packages
            100 /var/lib/dpkg/status
    root@intrepid:~# LANG=C apt-cache policy jfbterm
    jfbterm:
      Installed: 0.4.7-7.2
      Candidate: 0.4.7-7.2
      Version table:
     *** 0.4.7-7.2 0
            500 http://tw.archive.ubuntu.com intrepid/universe Packages
            100 /var/lib/dpkg/status
- Fetch the source of the jfbterm deb package for tracing
    root@intrepid:~# apt-get build-dep jfbterm
    root@intrepid:~# apt-get install dpkg-dev
    root@intrepid:~# apt-get source jfbterm
- The uvesafb author's guess:
As to why the console becomes unresponsive after exiting jfbterm -- jfbterm sets KD_GRAPHICS mode on the console on which it is started, and apparently fails to set it back to KD_TEXT before segfaulting. This leaves the console in the broken state.
- Looking for plausible suspects (1):
    root@intrepid:~/jfbterm-0.4.7# grep "KD_GRAPHICS" *
    vterm.c: ioctl(0, KDSETMODE, KD_GRAPHICS);
    root@intrepid:~/jfbterm-0.4.7# grep "KD_TEXT" *
    main.c: if (mode == KD_TEXT) {
    vterm.c: ioctl(0, KDSETMODE, KD_TEXT);
BUGFIX: jfbterm (2)
- Looking for plausible suspects (2):
    root@intrepid:~/jfbterm-0.4.7# grep "cannot mmap" *
    fbcommon.c: die("cannot mmap(smem)");
    fbcommon.c: die("cannot mmap(mmio)");
    fbcommon.c: print_message("cannot mmap(mmio) : %s\n", strerror(errno));
- Enable the DEBUG flag, build jfbterm, and trace it with gdb
    root@intrepid:~/jfbterm-0.4.7# ./configure --enable-debug
    root@intrepid:~/jfbterm-0.4.7# make
    root@intrepid:~/jfbterm-0.4.7# gdb ./jfbterm
    (gdb) run
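- Even under gdb the console does not return to text mode, so I went straight to reading the source.
- [Afterword] The uvesafb author's guess is indeed reasonable: once the console has entered graphics mode, no input or output is displayed properly until it is switched back to text mode.
- According to the error message, the failure is reported at line 572 of fbcommon.c. Tracing back, the cause is the mmap() call at line 566, inside the tfbm_open() function.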
    < fbcommon.c >
    482 void tfbm_open(TFrameBufferMemory* p)
    566     p->mmio = (u_char*)mmap(NULL, p->mlen, PROT_READ|PROT_WRITE,
    567                             MAP_SHARED, p->fh, p->slen);
    568     if ((long)p->mmio == -1) {
    569 #ifdef JFB_MMIO_CHECK
    570         die("cannot mmap(mmio)");
    571 #else
    572         print_message("cannot mmap(mmio) : %s\n", strerror(errno));
    573 #endif
    < main.c >
    368 int main(int argc, char *argv[])
    431     tfbm_open(&gFramebuffer);
- Install the manpages-dev package and read the man pages for mmap, strerror and errno. The error message "cannot mmap(mmio) : Invalid argument" tells us that errno equals EINVAL. mmap fails with EINVAL in three cases, and the first one looked the most likely: We don't like addr, length, or offset (e.g., they are too large, or not aligned on a page boundary).
- Note that mmap's offset argument must be a multiple of the page size that sysconf reports for _SC_PAGE_SIZE. (offset must be a multiple of the page size as returned by sysconf(_SC_PAGE_SIZE).)
    root@intrepid:~/jfbterm-0.4.7# apt-get install manpages-dev
    root@intrepid:~/jfbterm-0.4.7# man errno
    EINVAL Invalid argument (POSIX.1)

    void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);

    offset must be a multiple of the page size as returned by sysconf(_SC_PAGE_SIZE).

    ERRORS
    EINVAL We don't like addr, length, or offset (e.g., they are too large, or not aligned on a page boundary).
    EINVAL (since Linux 2.6.12) length was 0.
    EINVAL flags contained neither MAP_PRIVATE or MAP_SHARED, or contained both of these values.
- Use gdb to set a breakpoint at line 566 of fbcommon.c and inspect the variables
    root@intrepid:~/jfbterm-0.4.7# gdb
    (gdb) file jfbterm
    Reading symbols from /root/jfbterm-0.4.7/jfbterm...done.
    (gdb) set args -e ls
    (gdb) show args
    Argument list to give program being debugged when it is started is "-e ls".
    (gdb) break fbcommon.c:566
    Breakpoint 1 at 0x4034ba: file fbcommon.c, line 566.
    (gdb) run
    ... (omitted) ...
    Breakpoint 1, tfbm_open (p=0x6146e0) at fbcommon.c:566
    566     p->mmio = (u_char*)mmap(NULL, p->mlen, PROT_READ|PROT_WRITE,
    (gdb) l
    561     }
    562     p->smem = (char *)p->smem + p->soff;
    563
    564     p->moff = (u_long)(fb_fix.mmio_start) & (~PAGE_MASK);
    565     p->mlen = (fb_fix.mmio_len + p->moff + ~PAGE_MASK) & PAGE_MASK;
    566     p->mmio = (u_char*)mmap(NULL, p->mlen, PROT_READ|PROT_WRITE,
    567                             MAP_SHARED, p->fh, p->slen);
    568     if ((long)p->mmio == -1) {
    569 #ifdef JFB_MMIO_CHECK
    570         die("cannot mmap(mmio)");
    (gdb) p fb_fix.mmio_len
    $1 = 0
    (gdb) p p->moff
    $2 = 0
    (gdb) p p->mlen
    $3 = 0
    (gdb) p p->fh
    $4 = 6
    (gdb) p p->slen
    $5 = 1572864
    (gdb) p fb_fix.smem_len
    $6 = 1572864
BUGFIX: jfbterm (3)
- mmap's offset argument must be a multiple of the page size that sysconf reports for _SC_PAGE_SIZE. (offset must be a multiple of the page size as returned by sysconf(_SC_PAGE_SIZE).)
- With a gdb breakpoint at line 566 of fbcommon.c, inspecting the variables shows that the offset argument passed to mmap is 1572864.
    (gdb) p p->slen
    $5 = 1572864
- To check whether the offset argument is an integer multiple of _SC_PAGESIZE, first write a test program that uses sysconf to read the system's _SC_PAGESIZE value.
    root@intrepid:~# cat > get_sc_pagesize.c << EOF
    #include <unistd.h>
    #include <stdio.h>
    int main()
    {
        printf("_SC_PAGESIZE = %ld\n",sysconf(_SC_PAGESIZE));
    }
    EOF
    root@intrepid:~# gcc get_sc_pagesize.c -o get_sc_pagesize
    root@intrepid:~# ./get_sc_pagesize
    _SC_PAGESIZE = 4096
- According to the mmap man page, another cause of EINVAL is a length of zero (since Linux 2.6.12). This confirms the root cause: the length argument is p->mlen, which is 0.
    MMAP(2): void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
    MMAP(2): EINVAL (since Linux 2.6.12) length was 0.
    566     p->mmio = (u_char*)mmap(NULL, p->mlen, PROT_READ|PROT_WRITE,
    567                             MAP_SHARED, p->fh, p->slen);
    (gdb) p p->mlen
    $3 = 0
- Fix: avoid calling mmap with a zero length
jfbterm-0.4.
    diff -Naur jfbterm-0.4.7/fbcommon.c jfbterm-0.4.7-dev/fbcommon.c
    --- jfbterm-0.4.7/fbcommon.c
    +++ jfbterm-0.4.7-dev/fbcommon.c
    @@ -564,7 +564,12 @@
         p->moff = (u_long)(fb_fix.mmio_start) & (~PAGE_MASK);
         p->mlen = (fb_fix.mmio_len + p->moff + ~PAGE_MASK) & PAGE_MASK;
    -    p->mmio = (u_char*)mmap(NULL, p->mlen, PROT_READ|PROT_WRITE,
    +    if(p->mlen == 0)
    +    {
    +        p->mmio = 0;
    +    } else {
    +        p->mmio = (u_char*)mmap(NULL, p->mlen, PROT_READ|PROT_WRITE,
                                 MAP_SHARED, p->fh, p->slen);
    +    }
         if ((long)p->mmio == -1) {
     #ifdef JFB_MMIO_CHECK
             die("cannot mmap(mmio)");
BUGFIX: jfbterm (4)
- While reading fbcommon.c, I noticed that another mmap call runs before the failing one, so I printed the arguments of both for comparison.
- gdb debugging procedure
    file jfbterm
    set args -e ls
    show args
    break fbcommon.c:500
    break fbcommon.c:557
    break fbcommon.c:566
    run
    p fb_fix
    c
    p fb_fix
    c
    p fb_fix.smem_len
    p p->soff
    p p->slen
    p fb_fix.mmio_len
    p p->moff
    p p->mlen

    root@intrepid:~/jfbterm-0.4.7-dev# gdb
    GNU gdb 6.8-debian
    Copyright (C) 2008 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law. Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "x86_64-linux-gnu".
    (gdb) file jfbterm
    Reading symbols from /root/jfbterm-0.4.7-dev/jfbterm...done.
    (gdb) set args -e ls
    (gdb) show args
    Argument list to give program being debugged when it is started is "-e ls".
    (gdb) break fbcommon.c:500
    Breakpoint 1 at 0x40303d: file fbcommon.c, line 500.
    (gdb) break fbcommon.c:557
    Breakpoint 2 at 0x40327e: file fbcommon.c, line 557.
    (gdb) break fbcommon.c:566
    Breakpoint 3 at 0x403301: file fbcommon.c, line 566.
    (gdb) run
    Starting program: /root/jfbterm-0.4.7-dev/jfbterm -e ls
    ... (omitted) ...
    Breakpoint 1, tfbm_open (p=0x6146e0) at fbcommon.c:501
    501     if (fb_var.yres_virtual != fb_var.yres) {
    (gdb) p fb_fix
    $1 = {id = "▒\204▒▒Q\177\000\000\006\000\000\000\000\000\000", smem_start = 21037808, smem_len = 4269392626, type = 32593, type_aux = 0, visual = 0, xpanstep = 65535, ypanstep = 65535, ywrapstep = 65535, line_length = 1, mmio_start = 15, mmio_len = 21076240, accel = 0, reserved = {53071, 64, 0}}
    (gdb) c
    Continuing.
    ... (omitted) ...
    Breakpoint 2, tfbm_open (p=0x6146e0) at fbcommon.c:557
    557     p->smem = (u_char*)mmap(NULL, p->slen, PROT_READ|PROT_WRITE,
    (gdb) p fb_fix
    $2 = {id = "VESA VGA\000\000\000\000\000\000\000", smem_start = 4026531840, smem_len = 1572864, type = 0, type_aux = 0, visual = 3, xpanstep = 0, ypanstep = 0, ywrapstep = 0, line_length = 1024, mmio_start = 0, mmio_len = 0, accel = 0, reserved = {0, 0, 0}}
    (gdb) c
    Continuing.
    Breakpoint 3, tfbm_open (p=0x6146e0) at fbcommon.c:566
    566     if(p->mlen == 0)
    (gdb) p fb_fix.smem_len
    $3 = 1572864
    (gdb) p p->soff
    $4 = 0
    (gdb) p p->slen
    $5 = 1572864
    (gdb) p fb_fix.mmio_len
    $6 = 0
    (gdb) p p->moff
    $7 = 0
    (gdb) p p->mlen
    $8 = 0
fb_fix.smem_len = 1572864 | p->soff = 0 | p->slen = 1572864 | mmap succeeds |
fb_fix.mmio_len = 0 | p->moff = 0 | p->mlen = 0 | mmap fails (because length = 0) |
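- Judging from the data, fb_fix.mmio_len = 0 is the main cause, whether logged in over SSH or working on the local tty. However, the code in fbcommon.h has no comment explaining what mmio stands for.
- Tracing fb_fix.mmio_len back to its source: the function tfbm_get_fix_screen_info is what fills in the fb_fix structure. It gets the data by querying the kernel through the FBIOGET_FSCREENINFO ioctl.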
    <fbcommon.c>
    509     tfbm_get_fix_screen_info(p->fh, &fb_fix);
    244 static void tfbm_get_fix_screen_info(int fh, struct fb_fix_screeninfo *fix)
    245 {
    246     if (ioctl(fh, FBIOGET_FSCREENINFO, fix)) {
    247         print_strerror_and_exit("ioctl FBIOGET_FSCREENINFO");
    248     }
    249 }
- The ioctl handler for FBIOGET_FSCREENINFO as defined in Kernel 2.6.11/drivers/video/fbmem.c
    804     case FBIOGET_FSCREENINFO:
    805         return copy_to_user(argp, &info->fix,
    806                             sizeof(fix)) ? -EFAULT : 0;
- The ioctl handler for FBIOGET_FSCREENINFO as defined in Kernel 2.6.27/drivers/video/fbmem.c
    1247     case FBIOGET_FSCREENINFO:
    1248         ret = fb_get_fscreeninfo(inode, file, cmd, arg);
    1249         break;
- fb_get_fscreeninfo calls fb_ioctl with cmd = FBIOGET_FSCREENINFO, so the code that finally runs appears to be the same as in 2.6.11.
    1013 static int
    1014 fb_ioctl(struct inode *inode, struct file *file, unsigned int cmd,
    1015          unsigned long arg)
    1018     struct fb_info *info = registered_fb[fbidx];
    1046     case FBIOGET_FSCREENINFO:
    1047         return copy_to_user(argp, &info->fix,
    1048                             sizeof(fix)) ? -EFAULT : 0;
- From the definitions in fb.h, we can confirm that mmio_len comes from the fix member (type fb_fix_screeninfo) of the info structure (type fb_info), which is copied back to the user.
    <linux-2.6.27.7/include/linux/fb.h>
    152 struct fb_fix_screeninfo {
    156     __u32 smem_len;               /* Length of frame buffer mem */
    164     unsigned long mmio_start;     /* Start of Memory Mapped I/O */
    165                                   /* (physical address) */
    166     __u32 mmio_len;               /* Length of Memory Mapped I/O */
    808 struct fb_info {
    811     struct fb_var_screeninfo var; /* Current var */
    812     struct fb_fix_screeninfo fix; /* Current fix */
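- The current suspicion is that uvesafb intercepts the FBIOGET_FSCREENINFO ioctl and returns incorrect fb_fix_screeninfo contents. So the next step is to write a test program that fetches the fb_fix_screeninfo contents through the FBIOGET_FSCREENINFO ioctl.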