Why are many system calls (getpid) captured only once using strace?

Asked 23/4, 2011 at 18:49 Answered 26/12, 2020 at 13:17

Solved linux-kernel system-calls libc strace

I call getpid() many times in a program (to measure the cost of system calls), but when I trace the program with strace, only one getpid() call is captured.

The code is simple:

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

void print_usage(){
    printf("Usage: program count\n");
    exit(-1);
}

int main(int argc, char** argv){
    if(argc != 2)
        print_usage();
    int cnt = atoi(argv[1]);
    int i = 0;
    while(i++<cnt)
        getpid();
    return 0;
}

I used gdb and got this:

(gdb) disasse
Dump of assembler code for function getpid:
0xb76faac0 <getpid+0>:  mov    %gs:0x4c,%edx
0xb76faac7 <getpid+7>:  cmp    $0x0,%edx
0xb76faaca <getpid+10>: mov    %edx,%eax
0xb76faacc <getpid+12>: jle    0xb76faad0 <getpid+16>
0xb76faace <getpid+14>: repz ret 
0xb76faad0 <getpid+16>: jne    0xb76faadc <getpid+28>
0xb76faad2 <getpid+18>: mov    %gs:0x48,%eax
0xb76faad8 <getpid+24>: test   %eax,%eax
0xb76faada <getpid+26>: jne    0xb76faace <getpid+14>
0xb76faadc <getpid+28>: mov    $0x14,%eax
0xb76faae1 <getpid+33>: call   *%gs:0x10
0xb76faae8 <getpid+40>: test   %edx,%edx
0xb76faaea <getpid+42>: mov    %eax,%ecx
0xb76faaec <getpid+44>: jne    0xb76faace <getpid+14>
0xb76faaee <getpid+46>: mov    %ecx,%gs:0x48
0xb76faaf5 <getpid+53>: ret

I don't quite understand the assembly code, so a detailed explanation of it would also be helpful. From what I observed, "call *%gs:0x10" (which jumps into the vDSO) is executed only for the first getpid() call, which may be why the subsequent getpid() calls are not captured. But I don't know why.

Environment: Linux kernel 2.6.24-29, gcc (GCC) 4.2.4, libc 2.7.

Thanks!

Hysteresis answered 23/4, 2011 at 18:49 Comment(0)

Glibc caches the result, since it can't change between calls. See the source code here for instance.

So the real syscall only gets executed once. The other calls just read from the cache. (The code is not very simple because it takes care of doing the Right Thing with threads.)

Bryna answered 23/4, 2011 at 18:51 Comment(6)

Great. I have another question, which may not be relevant: does even the first getpid() call avoid the user-kernel mode switch by going through the vDSO, as gettimeofday does? – Hysteresis 23/4, 2011 at 18:59

Another question: how can I single-step into "call *%gs:0x10" using gdb? – Hysteresis 23/4, 2011 at 19:0

I'm not sure there's a single answer to that. Syscalls are handled differently depending on the platform (even 32-bit x86 and 64-bit x86_64 have different syscall mechanisms). But maybe I'm wrong. You should probably post a separate question for that, and specify which architecture(s) you're interested in, and which syscalls if there are some you're especially interested in. (I don't know gdb very well at all.) – Bryna 23/4, 2011 at 19:2

@SIFE: not that I'm aware of and I don't see any reason they would provide that feature - a process's PID never changes. – Bryna 8/2, 2013 at 16:4

@Bryna But that doesn't answer my question. – Teeny 8/2, 2013 at 16:48

@SIFE: As I said, I don't think there is. – Bryna 9/2, 2013 at 5:50

glibc caches the pid value. The first time you call getpid, it asks the kernel for the pid; after that, it just returns the value it got from the first getpid syscall.

glibc code:

pid_t
__getpid (void)
{
#ifdef NOT_IN_libc
  INTERNAL_SYSCALL_DECL (err);
  pid_t result = INTERNAL_SYSCALL (getpid, err, 0);
#else
  pid_t result = THREAD_GETMEM (THREAD_SELF, pid);
  if (__builtin_expect (result <= 0, 0))
    result = really_getpid (result);
#endif
  return result;
}

If you want to test the overhead of syscalls, gettimeofday() is often used to do just that: the work done by the kernel is very small, and neither the compiler nor the C library can optimize away calls to it.

Gewgaw answered 23/4, 2011 at 18:55 Comment(1)

You cannot use the gettimeofday() function to measure syscall overhead, since gettimeofday is not a standard syscall on Linux: it is optimized through the vDSO mechanism. On my laptop, a standard syscall takes about 225 ns, whereas gettimeofday takes only 20 ns. – Ader 9/3, 2018 at 14:35

Nowadays glibc no longer caches the PID. The cache was dropped following the introduction of PID namespaces and the numerous bugs found in applications, either on signal receipt or when they created child processes by calling syscall() directly instead of fork(), vfork() and clone(). The manual points this out:

From glibc version 2.3.4 up to and including version 2.24, the
glibc wrapper function for getpid() cached PIDs, with the goal of
avoiding additional system calls when a process calls getpid()
repeatedly. Normally this caching was invisible, but its correct
operation relied on support in the wrapper functions for fork(2),
vfork(2), and clone(2): if an application bypassed the glibc
wrappers for these system calls by using syscall(2), then a call
to getpid() in the child would return the wrong value (to be
precise: it would return the PID of the parent process). In
addition, there were cases where getpid() could return the wrong
value even when invoking clone(2) via the glibc wrapper function.
(For a discussion of one such case, see BUGS in clone(2).)
Furthermore, the complexity of the caching code had been the
source of a few bugs within glibc over the years.

Eclipse answered 26/12, 2020 at 13:17 Comment(0)
