
Version 6 (modified by jazz, 15 years ago)

--

BUGFIX: jfbterm (1)

  • [DRBL] jfbterm Bug
  • [Note] Without modprobe uvesafb, /dev/fb0 does not exist, so running jfbterm fails with the following error message
    ENCODING: locale = UTF-8
    FONT : (4) [iso10646.1]/pcf/U:///usr/share/fonts/X11/misc/unifont.pcf.gz/
    encoding : UTF-8,iso10646.1
    exec : ls open /dev/fb0: No such file or directory
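  • [Note] A minimal sketch of checking for the framebuffer device before starting jfbterm. The function name probe_fb_device is hypothetical and is not part of jfbterm; it only reports the errno that open() produces.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if `path` can be opened read/write,
 * otherwise the errno value reported by open(). */
int probe_fb_device(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR);
    if (fd < 0) {
        int err = errno;
        /* Same style of message as the jfbterm output above. */
        fprintf(stderr, "open %s: %s\n", path, strerror(err));
        return err;
    }
    close(fd);
    return 0;
}
```

    Calling probe_fb_device("/dev/fb0") on a machine where uvesafb is not loaded should return ENOENT, which corresponds to the "No such file or directory" message.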
    
  • [Note] If v86d is not installed, running modprobe uvesafb leaves the following messages in dmesg
    [  703.839376] uvesafb: failed to execute /sbin/v86d
    [  703.840224] uvesafb: make sure that the v86d helper is installed and executable
    [  703.841494] uvesafb: Getting VBE info block failed (eax=0x4f00, err=-2)
    [  703.842307] uvesafb: vbe_init() failed with -22
    [  703.843019] uvesafb: probe of uvesafb.0 failed with error -22
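  • [Note] The dmesg hint ("make sure that the v86d helper is installed and executable") can be checked from user space with access(). This is only a sketch: helper_is_executable is a hypothetical name, and the check runs with the caller's permissions, not the kernel's.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if `path` exists and the calling
 * user may execute it, otherwise 0. */
int helper_is_executable(const char *path)
{
    assert(path != NULL);
    return access(path, X_OK) == 0;
}
```

    For the case above, the path to test would be /sbin/v86d, the path named in the dmesg output.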
    
  • [Note] If v86d is installed, running modprobe uvesafb leaves the following messages in dmesg
    [  268.848377] uvesafb: VMware, IncVMware virtual machine2.0, VMware virtual machine2.0, 2.0, OEM: V M ware, Inc. VBE support 2.0VMware, IncVMware virtual machine2.0, VBE v2.0
    [  268.884099] uvesafb: no monitor limits have been set, default refresh rate will be used
    [  268.885202] uvesafb: VBE state buffer size cannot be determined (eax=0x0, err=0)
    [  268.885265] uvesafb: scrolling: redraw
    [  268.896617] mtrr: your processor doesn't support write-combining
    [  268.909303] Console: switching to colour frame buffer device 128x48
    [  269.938085] uvesafb: framebuffer at 0xf0000000, mapped to 0xffffc20000180000, using 16384k, total 16384k
    [  269.938101] fb0: VESA VGA frame buffer device
    
  • Editing GRUB's menu.lst removes the need to run modprobe by hand after every boot
    • menu.lst

      old  new
      125  125
      126  126     title          Ubuntu 8.10, kernel 2.6.27-7-server
      127  127     uuid           574912ac-8bd6-4cc7-90c6-8c5362033fec
      128       -  kernel         /boot/vmlinuz-2.6.27-7-server root=UUID=574912ac-8bd6-4cc7-90c6-8c5362033fec ro quiet splash
           128  +  kernel         /boot/vmlinuz-2.6.27-7-server root=UUID=574912ac-8bd6-4cc7-90c6-8c5362033fec ro quiet splash vga=0x305
      129  129     initrd         /boot/initrd.img-2.6.27-7-server
      130  130     quiet
  • Running modinfo on the uvesafb module of Ubuntu 8.10 (kernel 2.6.27-7-server) shows the parameters scroll, vgapal, pmipal, mtrr, blank, nocrtc, noedid, vram_remap, vram_total, maxclk, maxhf, maxvf, mode_option, vbemode, and v86d. There is no parameter named mode.
    root@intrepid:~# modinfo uvesafb
    filename:       /lib/modules/2.6.27-7-server/kernel/drivers/video/uvesafb.ko
    description:    Framebuffer driver for VBE2.0+ compliant graphics boards
    author:         Michal Januszewski <spock@gentoo.org>
    license:        GPL
    srcversion:     21EDEFDED06E0673208A0D5
    depends:
    vermagic:       2.6.27-7-server SMP mod_unload modversions
    parm:           scroll:Scrolling mode, set to 'redraw', 'ypan', or 'ywrap' (scroll)
    parm:           vgapal:Set palette using VGA registers (invbool)
    parm:           pmipal:Set palette using PMI calls (bool)
    parm:           mtrr:Memory Type Range Registers setting. Use 0 to disable. (uint)
    parm:           blank:Enable hardware blanking (bool)
    parm:           nocrtc:Ignore CRTC timings when setting modes (bool)
    parm:           noedid:Ignore EDID-provided monitor limits when setting modes (bool)
    parm:           vram_remap:Set amount of video memory to be used [MiB] (uint)
    parm:           vram_total:Set total amount of video memoery [MiB] (uint)
    parm:           maxclk:Maximum pixelclock [MHz], overrides EDID data (ushort)
    parm:           maxhf:Maximum horizontal frequency [kHz], overrides EDID data (ushort)
    parm:           maxvf:Maximum vertical frequency [Hz], overrides EDID data (ushort)
    parm:           mode_option:Specify initial video mode as "<xres>x<yres>[-<bpp>][@<refresh>]" (charp)
    parm:           vbemode:VBE mode number to set, overrides the 'mode' option (ushort)
    parm:           v86d:Path to the v86d userspace helper. (string)
    
    root@intrepid:~# LANG=C apt-cache policy v86d
    v86d:
      Installed: 0.1.5-1ubuntu2
      Candidate: 0.1.5-1ubuntu2
      Version table:
     *** 0.1.5-1ubuntu2 0
            500 http://tw.archive.ubuntu.com intrepid/universe Packages
            100 /var/lib/dpkg/status
    
    root@intrepid:~# LANG=C apt-cache policy jfbterm
    jfbterm:
      Installed: 0.4.7-7.2
      Candidate: 0.4.7-7.2
      Version table:
     *** 0.4.7-7.2 0
            500 http://tw.archive.ubuntu.com intrepid/universe Packages
            100 /var/lib/dpkg/status
    
  • Fetch the source of the jfbterm deb package in order to trace it
    root@intrepid:~# apt-get build-dep jfbterm
    root@intrepid:~# apt-get install dpkg-dev
    root@intrepid:~# apt-get source jfbterm
    
  • The uvesafb author's hypothesis:
    As to why the console becomes unresponsive after exiting jfbterm --
    jfbterm sets KD_GRAPHICS mode on the console on which it is started,
    and apparently fails to set it back to KD_TEXT before segfaulting.  This
    leaves the console in the broken state.
    
  • Looking for reasonable suspects (1):
    root@intrepid:~/jfbterm-0.4.7# grep "KD_GRAPHICS" *
    vterm.c:        ioctl(0, KDSETMODE, KD_GRAPHICS);
    
    root@intrepid:~/jfbterm-0.4.7# grep "KD_TEXT" *
    main.c:        if (mode == KD_TEXT) {
    vterm.c:        ioctl(0, KDSETMODE, KD_TEXT);
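  • [Note] A sketch of the pattern the uvesafb author describes: switch to KD_GRAPHICS and make sure KD_TEXT is restored on exit. This is not jfbterm's code; the function names are hypothetical. Note that atexit() handlers do not run after a segfault, so a real fix would also need a signal handler.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kd.h>

/* Console fd to restore at exit; -1 means nothing to restore. */
static int g_console_fd = -1;

/* Sets the console mode; only the two modes discussed above are accepted. */
static int set_console_mode(int fd, int mode)
{
    assert(mode == KD_TEXT || mode == KD_GRAPHICS);
    return ioctl(fd, KDSETMODE, mode);
}

/* atexit() hook: put the console back into text mode. */
static void restore_text_mode(void)
{
    if (g_console_fd >= 0)
        set_console_mode(g_console_fd, KD_TEXT);
}

/* Hypothetical entry point: enter graphics mode on `fd` and register
 * the restore hook. Returns 0 on success, -1 on failure. */
int enter_graphics_mode(int fd)
{
    if (set_console_mode(fd, KD_GRAPHICS) < 0)
        return -1;
    g_console_fd = fd;
    return atexit(restore_text_mode) == 0 ? 0 : -1;
}
```

    The design choice here is to record the fd only after the mode switch succeeds, so the exit hook never touches a console that was left in text mode.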
    

BUGFIX: jfbterm (2)

  • Looking for reasonable suspects (2):
    root@intrepid:~/jfbterm-0.4.7# grep "cannot mmap" *
    fbcommon.c:             die("cannot mmap(smem)");
    fbcommon.c:             die("cannot mmap(mmio)");
    fbcommon.c:             print_message("cannot mmap(mmio) : %s\n", strerror(errno));
    
  • Enable the debug flag, build jfbterm, and trace it with gdb
    root@intrepid:~/jfbterm-0.4.7# ./configure --enable-debug
    root@intrepid:~/jfbterm-0.4.7# make
    root@intrepid:~/jfbterm-0.4.7# gdb ./jfbterm
    (gdb) run
    
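  • Even under gdb the console never returns to text mode, so the next step is to read the source directly.
    • [Afterword] The uvesafb author's guess is reasonable: once the console has entered graphics mode, no input or output is displayed correctly until it is switched back to text mode.
  • According to the error message, the failure is reported at line 572 of fbcommon.c. Tracing backwards, the cause is the mmap() call at line 566, inside the tfbm_open() function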
    < fbcommon.c >
    
    482 void tfbm_open(TFrameBufferMemory* p)
    
    566         p->mmio = (u_char*)mmap(NULL, p->mlen, PROT_READ|PROT_WRITE,
    567                                 MAP_SHARED, p->fh, p->slen);
    568         if ((long)p->mmio == -1) {
    569 #ifdef JFB_MMIO_CHECK
    570                 die("cannot mmap(mmio)");
    571 #else
    572                 print_message("cannot mmap(mmio) : %s\n", strerror(errno));
    573 #endif
    
    < main.c >
    
    368 int main(int argc, char *argv[])
    431         tfbm_open(&gFramebuffer);
    
  • Install the manpages-dev package and read the man pages for mmap, strerror, and errno. The message cannot mmap(mmio) : Invalid argument means that errno is EINVAL. mmap returns EINVAL in three cases, and at this point the first looks the most likely: We don't like addr, length, or offset (e.g., they are too large, or not aligned on a page boundary).
  • Note the requirement on mmap's offset argument: offset must be a multiple of the page size as returned by sysconf(_SC_PAGE_SIZE).
    root@intrepid:~/jfbterm-0.4.7# apt-get install manpages-dev
    
    root@intrepid:~/jfbterm-0.4.7# man errno
    
           EINVAL          Invalid argument (POSIX.1)
    
    void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
    
    offset must be a multiple of the page size as returned by sysconf(_SC_PAGE_SIZE).
    
    ERRORS
           EINVAL We don't like addr, length, or offset (e.g., they are too large, or not aligned on a page boundary).
           EINVAL (since Linux 2.6.12) length was 0.
           EINVAL flags contained neither MAP_PRIVATE or MAP_SHARED, or contained both of these values.
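  • [Note] The first EINVAL case can be reproduced outside jfbterm. The sketch below maps one page of an ordinary file at a given offset; try_mmap_file is a hypothetical name, and a regular file stands in for the framebuffer device.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: maps one page of `path` at `offset`.
 * Returns 0 on success, otherwise the errno of the failing call. */
int try_mmap_file(const char *path, off_t offset)
{
    assert(path != NULL);

    long page = sysconf(_SC_PAGESIZE);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    void *p = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, offset);
    int err = (p == MAP_FAILED) ? errno : 0;
    if (p != MAP_FAILED)
        munmap(p, (size_t)page);
    close(fd);
    return err;
}
```

    An offset of 0 is page-aligned and should map; an offset of 1 is not a multiple of the page size and should yield EINVAL.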
    
  • Set a gdb breakpoint at line 566 of fbcommon.c and inspect the variables
    root@intrepid:~/jfbterm-0.4.7# gdb
    (gdb) file jfbterm
    Reading symbols from /root/jfbterm-0.4.7/jfbterm...done.
    (gdb) set args -e ls
    (gdb) show args
    Argument list to give program being debugged when it is started is "-e ls".
    (gdb) break fbcommon.c:566
    Breakpoint 1 at 0x4034ba: file fbcommon.c, line 566.
    (gdb) run
    ... (omitted) ...
    Breakpoint 1, tfbm_open (p=0x6146e0) at fbcommon.c:566
    566             p->mmio = (u_char*)mmap(NULL, p->mlen, PROT_READ|PROT_WRITE,
    (gdb) l
    561             }
    562             p->smem = (char *)p->smem + p->soff;
    563
    564             p->moff = (u_long)(fb_fix.mmio_start) & (~PAGE_MASK);
    565             p->mlen = (fb_fix.mmio_len + p->moff + ~PAGE_MASK) & PAGE_MASK;
    566             p->mmio = (u_char*)mmap(NULL, p->mlen, PROT_READ|PROT_WRITE,
    567                                     MAP_SHARED, p->fh, p->slen);
    568             if ((long)p->mmio == -1) {
    569     #ifdef JFB_MMIO_CHECK
    570                     die("cannot mmap(mmio)");
    (gdb) p fb_fix.mmio_len
    $1 = 0
    (gdb) p p->moff
    $2 = 0
    (gdb) p p->mlen
    $3 = 0
    (gdb) p p->fh
    $4 = 6
    (gdb) p p->slen
    $5 = 1572864
    (gdb) p fb_fix.smem_len
    $6 = 1572864
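  • [Note] The rounding on lines 564-565 can be replayed with the values gdb printed. This sketch assumes 4096-byte pages and assumes PAGE_MASK is defined as ~(page size - 1), as in the kernel headers; the DEMO_ names and mmio_map_len are hypothetical.

```c
#include <assert.h>

/* Assumption: 4096-byte pages and PAGE_MASK == ~(PAGE_SIZE - 1). */
#define DEMO_PAGE_SIZE 4096UL
#define DEMO_PAGE_MASK (~(DEMO_PAGE_SIZE - 1))

/* Same arithmetic as fbcommon.c lines 564-565: take the offset of
 * mmio_start inside its page, then round the length up to whole pages. */
unsigned long mmio_map_len(unsigned long mmio_start, unsigned long mmio_len)
{
    /* Mask arithmetic is only valid for power-of-two page sizes. */
    assert((DEMO_PAGE_SIZE & (DEMO_PAGE_SIZE - 1)) == 0);

    unsigned long moff = mmio_start & ~DEMO_PAGE_MASK;
    return (mmio_len + moff + ~DEMO_PAGE_MASK) & DEMO_PAGE_MASK;
}
```

    With mmio_start = 0 and mmio_len = 0 the result is 0, which matches p->mlen in the gdb session; any non-zero mmio_len would be rounded up to at least one page.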
    

BUGFIX: jfbterm (3)

  • mmap's offset argument must be a multiple of the page size as returned by sysconf(_SC_PAGE_SIZE).
  • Breaking at line 566 of fbcommon.c in gdb shows that the offset argument passed to mmap is 1572864.
    (gdb) p p->slen
    $5 = 1572864
    
  • To check whether offset is an integer multiple of _SC_PAGESIZE, first write a test program that uses sysconf to read the system's _SC_PAGESIZE value.
    root@intrepid:~# cat > get_sc_pagesize.c << EOF
    #include <unistd.h>
    #include <stdio.h>
    
    int main(void)
    {
      /* sysconf() returns a long, hence %ld */
      printf("_SC_PAGESIZE = %ld\n",sysconf(_SC_PAGESIZE));
      return 0;
    }
    EOF
    root@intrepid:~# gcc get_sc_pagesize.c -o get_sc_pagesize
    root@intrepid:~# ./get_sc_pagesize
    _SC_PAGESIZE = 4096
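  • [Note] With a page size of 4096, the offset 1572864 equals 384 × 4096, so it is page-aligned. A small sketch of that check, using a hypothetical helper name:

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if `offset` is a multiple of the
 * system page size, otherwise 0. */
int is_page_aligned(long offset)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    return offset % page == 0;
}
```

    Calling is_page_aligned(1572864) should return 1, which rules out the offset as the source of EINVAL.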
    
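  • Since 1572864 = 384 × 4096, the offset is page-aligned and is not the problem. According to the mmap man page, another cause of EINVAL is a length of zero (since Linux 2.6.12). This confirms the root cause: the length argument, p->mlen, is 0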
    MMAP(2): void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
    MMAP(2): EINVAL (since Linux 2.6.12) length was 0.
    
    566             p->mmio = (u_char*)mmap(NULL, p->mlen, PROT_READ|PROT_WRITE,
    567                                     MAP_SHARED, p->fh, p->slen);
    (gdb) p p->mlen
    $3 = 0
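  • [Note] The zero-length case is easy to reproduce in isolation. The sketch below uses an anonymous private mapping instead of the framebuffer, which is an assumption made so that it runs without /dev/fb0; try_mmap_len is a hypothetical name.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: requests an anonymous mapping of `len` bytes.
 * Returns 0 on success, otherwise the errno reported by mmap(). */
int try_mmap_len(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return errno;

    /* On Linux >= 2.6.12 a successful mapping implies len > 0. */
    assert(len > 0);
    munmap(p, len);
    return 0;
}
```

    A length of 0 should produce EINVAL, the same errno jfbterm reports as "Invalid argument", while a one-page request should succeed.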
    
  • A fix that skips the mmap call when the length is zero
    • jfbterm-0.4.

      diff -Naur jfbterm-0.4.7/fbcommon.c jfbterm-0.4.7-dev/fbcommon.c
      old  new
      563  563
      564  564          p->moff = (u_long)(fb_fix.mmio_start) & (~PAGE_MASK);
      565  565          p->mlen = (fb_fix.mmio_len + p->moff + ~PAGE_MASK) & PAGE_MASK;
      566       -       p->mmio = (u_char*)mmap(NULL, p->mlen, PROT_READ|PROT_WRITE,
           566  +       if(p->mlen == 0)
           567  +       {
           568  +         p->mmio = 0;
           569  +       } else {
           570  +         p->mmio = (u_char*)mmap(NULL, p->mlen, PROT_READ|PROT_WRITE,
      567  571                                  MAP_SHARED, p->fh, p->slen);
           572  +       }
      568  573          if ((long)p->mmio == -1) {
      569  574  #ifdef JFB_MMIO_CHECK
      570  575                  die("cannot mmap(mmio)");
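  • [Note] The same guard can be written as a small helper that compares against MAP_FAILED instead of casting the pointer to long. This is a sketch, not the patch applied to jfbterm; map_optional_region is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

/* Hypothetical helper: maps an optional region of `fd`.
 * Returns NULL when there is nothing to map (len == 0),
 * MAP_FAILED if mmap() fails, or the mapped address otherwise. */
void *map_optional_region(int fd, size_t len, off_t offset)
{
    assert(offset >= 0);

    if (len == 0)
        return NULL;   /* the driver reported no MMIO window */

    return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);
}
```

    Callers then distinguish three outcomes: NULL (no region), MAP_FAILED (a real error worth reporting), and a usable pointer.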

BUGFIX: jfbterm (4)

  • While reading fbcommon.c, I noticed that another mmap call runs before the one that fails, so I printed the parameters of both for comparison.
  • gdb debugging procedure
    file jfbterm
    set args -e ls
    show args
    break fbcommon.c:500
    break fbcommon.c:557
    break fbcommon.c:566
    run
    p fb_fix
    c
    p fb_fix
    c
    p fb_fix.smem_len
    p p->soff
    p p->slen
    p fb_fix.mmio_len
    p p->moff
    p p->mlen
    
    root@intrepid:~/jfbterm-0.4.7-dev# gdb
    GNU gdb 6.8-debian
    Copyright (C) 2008 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "x86_64-linux-gnu".
    (gdb) file jfbterm
    Reading symbols from /root/jfbterm-0.4.7-dev/jfbterm...done.
    (gdb) set args -e ls
    (gdb) show args
    Argument list to give program being debugged when it is started is "-e ls".
    (gdb) break fbcommon.c:500
    Breakpoint 1 at 0x40303d: file fbcommon.c, line 500.
    (gdb) break fbcommon.c:557
    Breakpoint 2 at 0x40327e: file fbcommon.c, line 557.
    (gdb) break fbcommon.c:566
    Breakpoint 3 at 0x403301: file fbcommon.c, line 566.
    (gdb) run
    Starting program: /root/jfbterm-0.4.7-dev/jfbterm -e ls
    ... (omitted) ...
    Breakpoint 1, tfbm_open (p=0x6146e0) at fbcommon.c:501
    501             if (fb_var.yres_virtual != fb_var.yres) {
    (gdb) p fb_fix
    $1 = {id = "▒\204▒▒Q\177\000\000\006\000\000\000\000\000\000", smem_start = 21037808, 
      smem_len = 4269392626, type = 32593, type_aux = 0, visual = 0,
      xpanstep = 65535, ypanstep = 65535, ywrapstep = 65535, line_length = 1, mmio_start = 15, 
      mmio_len = 21076240, accel = 0, reserved = {53071, 64, 0}}
    (gdb) c
    Continuing.
    ... (omitted) ...
    Breakpoint 2, tfbm_open (p=0x6146e0) at fbcommon.c:557
    557             p->smem = (u_char*)mmap(NULL, p->slen, PROT_READ|PROT_WRITE,
    (gdb) p fb_fix
    $2 = {id = "VESA VGA\000\000\000\000\000\000\000", smem_start = 4026531840, 
      smem_len = 1572864, type = 0, type_aux = 0, visual = 3, xpanstep = 0,
      ypanstep = 0, ywrapstep = 0, line_length = 1024, mmio_start = 0, 
      mmio_len = 0, accel = 0, reserved = {0, 0, 0}}
    (gdb) c
    Continuing.
    
    Breakpoint 3, tfbm_open (p=0x6146e0) at fbcommon.c:566
    566             if(p->mlen == 0)
    (gdb) p fb_fix.smem_len
    $3 = 1572864
    (gdb) p p->soff
    $4 = 0
    (gdb) p p->slen
    $5 = 1572864
    (gdb) p fb_fix.mmio_len
    $6 = 0
    (gdb) p p->moff
    $7 = 0
    (gdb) p p->mlen
    $8 = 0
    
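    fb_fix.smem_len = 1572864   p->soff = 0   p->slen = 1572864   mmap succeeds
    fb_fix.mmio_len = 0         p->moff = 0   p->mlen = 0         mmap fails (because length = 0)

  • Judging from the data, fb_fix.mmio_len = 0 is the main cause, whether jfbterm is started over SSH or on a local tty. However, fbcommon.h has no comments explaining what mmio stands for.
  • Tracing fb_fix.mmio_len back to its source: the tfbm_get_fix_screen_info function updates the fb_fix structure. It obtains the data by issuing the FBIOGET_FSCREENINFO ioctl to the kernel.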
    <fbcommon.c>
    
    509         tfbm_get_fix_screen_info(p->fh, &fb_fix);
    
    244 static void tfbm_get_fix_screen_info(int fh, struct fb_fix_screeninfo *fix)
    245 {
    246         if (ioctl(fh, FBIOGET_FSCREENINFO, fix)) {
    247                 print_strerror_and_exit("ioctl FBIOGET_FSCREENINFO");
    248         }
    249 }
    
  • The FBIOGET_FSCREENINFO ioctl handler defined in drivers/video/fbmem.c of kernel 2.6.11
     804        case FBIOGET_FSCREENINFO:
     805                return copy_to_user(argp, &info->fix,
     806                                    sizeof(fix)) ? -EFAULT : 0;
    
  • The FBIOGET_FSCREENINFO ioctl handler defined in drivers/video/fbmem.c of kernel 2.6.27
    1247        case FBIOGET_FSCREENINFO:
    1248                ret = fb_get_fscreeninfo(inode, file, cmd, arg);
    1249                break;
    
    • fb_get_fscreeninfo calls fb_ioctl with cmd = FBIOGET_FSCREENINFO, so the code that finally runs appears to be the same as in 2.6.11.
      1013 static int 
      1014 fb_ioctl(struct inode *inode, struct file *file, unsigned int cmd,
      1015          unsigned long arg)
      
      1018        struct fb_info *info = registered_fb[fbidx];
      
      1046        case FBIOGET_FSCREENINFO:
      1047                return copy_to_user(argp, &info->fix,
      1048                                    sizeof(fix)) ? -EFAULT : 0;
      
  • From the definitions in fb.h, we can confirm that mmio_len comes from the fix member (of type fb_fix_screeninfo) of the info structure (of type fb_info), which is copied back to the user.
    <linux-2.6.27.7/include/linux/fb.h>
    
     152 struct fb_fix_screeninfo {
    
     156        __u32 smem_len;                 /* Length of frame buffer mem */
    
     164        unsigned long mmio_start;       /* Start of Memory Mapped I/O   */
     165                                        /* (physical address) */
     166        __u32 mmio_len;                 /* Length of Memory Mapped I/O  */
    
     808 struct fb_info {
    
     811        struct fb_var_screeninfo var;   /* Current var */
     812        struct fb_fix_screeninfo fix;   /* Current fix */
    
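  • The current suspicion is that uvesafb intercepts the FBIOGET_FSCREENINFO ioctl and returns incorrect fb_fix_screeninfo contents. The next step is therefore to write a test program that retrieves the fb_fix_screeninfo contents through the FBIOGET_FSCREENINFO ioctl.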
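  • [Note] A minimal sketch of such a test program, assuming the framebuffer is at /dev/fb0 and the kernel headers (linux/fb.h) are installed. The function names read_fix_info and print_fix_info are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fb.h>

/* Hypothetical helper: fills *fix from the framebuffer device at `path`.
 * Returns 0 on success, -1 if the device cannot be opened or queried. */
int read_fix_info(const char *path, struct fb_fix_screeninfo *fix)
{
    assert(path != NULL && fix != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    memset(fix, 0, sizeof(*fix));
    int rc = ioctl(fd, FBIOGET_FSCREENINFO, fix);
    close(fd);
    return rc == 0 ? 0 : -1;
}

/* Prints the fields that were compared in the gdb session. */
void print_fix_info(const struct fb_fix_screeninfo *fix)
{
    printf("id         = %.16s\n", fix->id);
    printf("smem_start = %lu\n", fix->smem_start);
    printf("smem_len   = %u\n", fix->smem_len);
    printf("mmio_start = %lu\n", fix->mmio_start);
    printf("mmio_len   = %u\n", fix->mmio_len);
}
```

    Calling read_fix_info("/dev/fb0", &fix) followed by print_fix_info(&fix) from a small main() would show whether mmio_len is also 0 outside jfbterm.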

BUGFIX: jfbterm (5)