How can I set breakpoint in GDB for open(2) syscall returning -1

OS: GNU/Linux
Distro: OpenSuSe 13.1
Arch: x86-64
GDB version: 7.6.50.20130731-cvs
Program language: mostly C with minor bits of assembly

Imagine that I've got a rather big program that sometimes fails to open a file. Is it possible to set a breakpoint in GDB in such a way that it stops after the open(2) syscall returns -1?

Of course, I can grep through the source code and find all open(2) invocations and narrow down the faulting open() call but maybe there's a better way.

I tried to use "catch syscall open" and then "condition N if $rax==-1", but it never got hit.
BTW, is it possible to distinguish between a call to a syscall (e.g. open(2)) and a return from a syscall (e.g. open(2)) in GDB?

As a current workaround I do the following:

  1. Run the program in question under GDB
  2. From another terminal, launch a systemtap script:

    stap -g -v -e 'probe process("PATH to the program run under GDB").syscall.return { if( $syscall == 2 && $return <0) raise(%{ SIGSTOP %}) }'
    
  3. After open(2) returns -1 I receive SIGSTOP in GDB session and I can debug the issue.

TIA.

Best regards,
alexz.

UPD: Even though I had tried the approach suggested by n.m before and wasn't able to make it work, I decided to give it another try. After 2 hours it now works as intended, but with some weird workarounds:

  1. I still can't distinguish between a call to and a return from a syscall
  2. If I use finish in commands, I can't use continue after it, which is expected according to the GDB docs,
    i.e. the following drops to the gdb prompt on each break:

    gdb> comm
    gdb> finish
    gdb> printf "rax is %d\n",$rax
    gdb> cont
    gdb> end
    
  3. Actually I can avoid using finish and check %rax in commands, but in this case I have to check for -errno rather than -1, e.g. if it's "Permission denied" then I have to check for -13, and if it's "No such file or directory" then for -2. That just doesn't feel right.

  4. So the only way to make it work for me was to define a custom command and use it in the following way:

    (gdb) catch syscall open
    Catchpoint 1 (syscall 'open' [2])
    (gdb) define mycheck
    Type commands for definition of "mycheck".
    End with a line saying just "end".
    >finish
    >finish
    >if ($rax != -1)
     >cont
     >end
    >printf "rax is %d\n",$rax
    >end
    (gdb) comm
    Type commands for breakpoint(s) 1, one per line.
    End with a line saying just "end".
    >mycheck
    >end
    (gdb) r
    The program being debugged has been started already.
    Start it from the beginning? (y or n) y
    Starting program: /home/alexz/gdb_syscall_test/main
    .....
    Catchpoint 1 (returned from syscall open), 0x00007ffff7b093f0 in __open_nocancel () from /lib64/libc.so.6
    0x0000000000400756 in main (argc=1, argv=0x7fffffffdb18) at main.c:24
    24                      fd = open(filenames[i], O_RDONLY);
    Opening test1
    fd = 3 (0x3)
    Successfully opened test1
    
    Catchpoint 1 (call to syscall open), 0x00007ffff7b093f0 in __open_nocancel () from /lib64/libc.so.6
    rax is -38
    
    Catchpoint 1 (returned from syscall open), 0x00007ffff7b093f0 in __open_nocancel () from /lib64/libc.so.6
    0x0000000000400756 in main (argc=1, argv=0x7fffffffdb18) at main.c:24
    ---Type <return> to continue, or q <return> to quit---
    24                      fd = open(filenames[i], O_RDONLY);
    rax is -1
    (gdb) bt
    #0  0x0000000000400756 in main (argc=1, argv=0x7fffffffdb18) at main.c:24
    (gdb) step
    26                      printf("Opening %s\n", filenames[i]);
    (gdb) info locals
    i = 1
    fd = -1
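
For item 3 above, the negative values seen in %rax can be mapped back to errno numbers and messages with a few lines of C. This is only a sketch: raw_result_errno and describe_raw_result are made-up helper names, and the 4095 limit is an assumption about the Linux convention that raw syscall results in the range -4095..-1 denote errors.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Assumption: Linux reports syscall errors as raw values in [-4095, -1]. */
#define MAX_ERRNO 4095

/* Return 0 if `raw` is a successful result, otherwise the positive errno. */
static int raw_result_errno(long raw)
{
    if (raw < 0 && raw >= -MAX_ERRNO)
        return (int)-raw;
    return 0;
}

/* Hypothetical helper: print what a raw %rax value means. */
static void describe_raw_result(long raw)
{
    int err = raw_result_errno(raw);

    /* Sanity check on the helper above, not on the kernel. */
    assert(err >= 0 && err <= MAX_ERRNO);

    if (err)
        printf("rax = %ld: failed, errno %d (%s)\n", raw, err, strerror(err));
    else
        printf("rax = %ld: success\n", raw);
}
```

Calling describe_raw_result(-13) or describe_raw_result(-2) reports the two errors mentioned in item 3. As far as I understand, -38 is -ENOSYS, which Linux on x86-64 leaves in rax at syscall entry, so the "rax is -38" line in the log above would belong to the "call to syscall" stop rather than to a failed open.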
    
Smoker answered 22/9, 2014 at 11:19 Comment(5)
Hardly any program is going to directly implement a syscall - most will go through a C library wrapper function. So I'd be tempted to put a conditional breakpoint in the return path of the library wrapper. - Donner
Another possibility might be to strace(1) your application. - Lair
BTW, compiling a recent (7.8) gdb from source code, with both Python & Guile extensibility, could be worthwhile. - Lair
Without return value check for write: #8235936 - Payne
On 32-bit, use eax rather than rax. - Rodrigues

Is it possible to set a breakpoint in GDB in such a way that it stops after the open(2) syscall returns -1?

It's hard to do better than n.m.'s answer for this narrow question, but I would argue that the question is posed incorrectly.

Of course, I can grep through the source code and find all open(2) invocations

That is part of your confusion: when you call open in a C program, you are not in fact executing the open(2) system call directly. Rather, you are invoking an open(3) "stub" from your libc, and that stub executes the open(2) system call for you.

And if you want to set a breakpoint when the stub is about to return -1, that is very easy.

Example:

/* t.c */
#include <sys/stat.h>
#include <fcntl.h>

int main()
{
  int fd = open("/no/such/file", O_RDONLY);
  return fd == -1 ? 0 : 1;
}

$ gcc -g t.c; gdb -q ./a.out
(gdb) start
Temporary breakpoint 1 at 0x4004fc: file t.c, line 6.
Starting program: /tmp/a.out

Temporary breakpoint 1, main () at t.c:6
6     int fd = open("/no/such/file", O_RDONLY);
(gdb) s
open64 () at ../sysdeps/unix/syscall-template.S:82
82  ../sysdeps/unix/syscall-template.S: No such file or directory.

Here we've reached the glibc system call stub. Let's disassemble it:

(gdb) disas
Dump of assembler code for function open64:
=> 0x00007ffff7b01d00 <+0>: cmpl   $0x0,0x2d74ad(%rip)        # 0x7ffff7dd91b4 <__libc_multiple_threads>
   0x00007ffff7b01d07 <+7>: jne    0x7ffff7b01d19 <open64+25>
   0x00007ffff7b01d09 <+0>: mov    $0x2,%eax
   0x00007ffff7b01d0e <+5>: syscall
   0x00007ffff7b01d10 <+7>: cmp    $0xfffffffffffff001,%rax
   0x00007ffff7b01d16 <+13>:    jae    0x7ffff7b01d49 <open64+73>
   0x00007ffff7b01d18 <+15>:    retq
   0x00007ffff7b01d19 <+25>:    sub    $0x8,%rsp
   0x00007ffff7b01d1d <+29>:    callq  0x7ffff7b1d050 <__libc_enable_asynccancel>
   0x00007ffff7b01d22 <+34>:    mov    %rax,(%rsp)
   0x00007ffff7b01d26 <+38>:    mov    $0x2,%eax
   0x00007ffff7b01d2b <+43>:    syscall
   0x00007ffff7b01d2d <+45>:    mov    (%rsp),%rdi
   0x00007ffff7b01d31 <+49>:    mov    %rax,%rdx
   0x00007ffff7b01d34 <+52>:    callq  0x7ffff7b1d0b0 <__libc_disable_asynccancel>
   0x00007ffff7b01d39 <+57>:    mov    %rdx,%rax
   0x00007ffff7b01d3c <+60>:    add    $0x8,%rsp
   0x00007ffff7b01d40 <+64>:    cmp    $0xfffffffffffff001,%rax
   0x00007ffff7b01d46 <+70>:    jae    0x7ffff7b01d49 <open64+73>
   0x00007ffff7b01d48 <+72>:    retq
   0x00007ffff7b01d49 <+73>:    mov    0x2d10d0(%rip),%rcx        # 0x7ffff7dd2e20
   0x00007ffff7b01d50 <+80>:    xor    %edx,%edx
   0x00007ffff7b01d52 <+82>:    sub    %rax,%rdx
   0x00007ffff7b01d55 <+85>:    mov    %edx,%fs:(%rcx)
   0x00007ffff7b01d58 <+88>:    or     $0xffffffffffffffff,%rax
   0x00007ffff7b01d5c <+92>:    jmp    0x7ffff7b01d48 <open64+72>
End of assembler dump.

Here you can see that the stub behaves differently depending on whether the program has multiple threads or not. This has to do with asynchronous cancellation.

There are two syscall instructions, and in the general case we'd need to set a breakpoint after each one (but see below).

But this example is single-threaded, so I can set a single conditional breakpoint:

(gdb) b *0x00007ffff7b01d10 if $rax < 0
Breakpoint 2 at 0x7ffff7b01d10: file ../sysdeps/unix/syscall-template.S, line 82.
(gdb) c
Continuing.

Breakpoint 2, 0x00007ffff7b01d10 in __open_nocancel () at ../sysdeps/unix/syscall-template.S:82
82  in ../sysdeps/unix/syscall-template.S
(gdb) p $rax
$1 = -2

Voila, the open(2) system call returned -2, which the stub will translate into setting errno to ENOENT (which is 2 on this system) and returning -1.
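
To make the raw-versus-stub distinction concrete, here is a minimal C sketch. It is not glibc's code: raw_open and stub_like_open are made-up names, the inline assembly assumes x86-64 Linux (other platforms take a fallback path that only imitates the raw convention), and the error check simply mirrors the cmp $0xfffffffffffff001,%rax / jae pair from the disassembly above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

#if defined(__linux__) && defined(__x86_64__)
#include <sys/syscall.h>
#endif

/* Raw open(2): returns the kernel's value unchanged (an fd, or -errno). */
static long raw_open(const char *path, int flags, int mode)
{
    assert(path != NULL);
#if defined(__linux__) && defined(__x86_64__)
    long ret;
    /* x86-64 Linux convention: syscall number in rax, arguments in
       rdi, rsi, rdx; the syscall instruction clobbers rcx and r11. */
    __asm__ volatile("syscall"
                     : "=a"(ret)
                     : "a"((long)SYS_open), "D"(path),
                       "S"((long)flags), "d"((long)mode)
                     : "rcx", "r11", "memory");
    return ret;
#else
    /* Fallback (assumption for portability): rebuild the raw
       convention from the libc wrapper's result. */
    int fd = open(path, flags, mode);
    return fd < 0 ? -(long)errno : fd;
#endif
}

/* What the stub's error path does, written in C. */
static int stub_like_open(const char *path, int flags, int mode)
{
    long ret = raw_open(path, flags, mode);

    /* Unsigned compare: true exactly for raw values -4095..-1. */
    if ((unsigned long)ret >= (unsigned long)-4095L) {
        errno = (int)-ret;   /* e.g. 2 (ENOENT) for a raw result of -2 */
        return -1;
    }
    return (int)ret;
}
```

With "/no/such/file" and O_RDONLY, raw_open should give the same negative value that $rax shows at the breakpoint, while stub_like_open gives what the C caller sees: -1 with errno set.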

If open(2) had succeeded, the condition $rax < 0 would have been false, and GDB would have kept going.

That is precisely the behavior one usually wants from GDB when looking for one failing system call among many succeeding ones.

Update:

As Chris Dodd points out, there are two syscall instructions, but on error they both branch to the same error-handling code (the code that sets errno). Thus, we can set an unconditional breakpoint on *0x00007ffff7b01d49, and that breakpoint will fire only on failure.

This is much better, because conditional breakpoints slow down execution quite a lot when the condition is false (GDB has to stop the inferior, evaluate the condition, and resume the inferior if the condition is false).

Isagoge answered 24/9, 2014 at 3:4 Comment(1)
@ChrisDodd You are quite right. Answer updated. Thanks! - Isagoge

This gdb script does what's requested:

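# GDB stops twice per syscall: once on entry and once on return.
# $outside records which stop this is; 1 means "not inside a syscall".
set $outside = 1
catch syscall open
commands
  silent
  # Toggle on every stop: 0 after entry, 1 after return.
  set $outside = ! $outside
  # Returned with a non-negative result: keep going.
  if ( $outside && $rax >= 0)
    continue
  end
  # This was the entry stop: keep going.
  if ( !$outside )
    continue
  end
  echo `open' returned a negative value\n
end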

The $outside variable is needed because gdb stops both at syscall enter and syscall exit. We need to ignore enter events and check $rax only at exit.

Carrot answered 22/9, 2014 at 17:6 Comment(5)
A much simpler solution is to set a breakpoint inside the libc open stub, rather than on the system call itself. You wouldn't have to play the inside/outside game anymore. This can miss direct system calls which don't go through the libc stub, but, given that OP has a C program, and one of his calls to open is failing, we can assume that breaking in the stub is sufficient. - Isagoge
@n.m: Thank you. Your solution is more elegant than mine. - Smoker
@Chris Stratton, @Employed Russian: Thanks guys. I have to investigate your suggestion a little bit further. Let's clarify it: you propose to use b open, right? - Smoker
@Basile Starynkevitch: Thank you. Could you please explain how exactly using GDB 7.8 helps in this case? - Smoker
@AlexZ No, b open will stop whether open succeeds or not, which doesn't answer your question. I supplied a full answer. - Isagoge
