Strange behavior of clone

Asked 20/7, 2016 at 20:37 Answered 20/7, 2016 at 23:7

Solved c linux multithreading clone lightweight-processes

This is fairly simple application which creates a lightweight process (thread) with clone() call.

#define _GNU_SOURCE

#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <errno.h>
#include <stdlib.h>
#include <time.h>

#define STACK_SIZE 1024*1024

int func(void* param) {
    printf("I am func, pid %d\n", getpid());    
    return 0;
}

int main(int argc, char const *argv[]) {
    printf("I am main, pid %d\n", getpid());
    void* ptr = malloc(STACK_SIZE);

    printf("I am calling clone\n");             
    int res = clone(func, ptr + STACK_SIZE, CLONE_VM, NULL);
    // works fine with sleep() call
    // sleep(1);

    if (res == -1) {
        printf("clone error: %d", errno);       
    } else {
        printf("I created child with pid: %d\n", res);      
    }

    printf("Main done, pid %d\n", getpid());        
    return 0;
}

Here are results:

Run 1:

➜  LFD401 ./clone
I am main, pid 10974
I am calling clone
I created child with pid: 10975
Main done, pid 10974
I am func, pid 10975

Run 2:

➜  LFD401 ./clone
I am main, pid 10995
I am calling clone
I created child with pid: 10996
I created child with pid: 10996
I am func, pid 10996
Main done, pid 10995

Run 3:

➜  LFD401 ./clone
I am main, pid 11037
I am calling clone
I created child with pid: 11038
I created child with pid: 11038
I am func, pid 11038
I created child with pid: 11038
I am func, pid 11038
Main done, pid 11037

Run 4:

➜  LFD401 ./clone
I am main, pid 11062
I am calling clone
I created child with pid: 11063
Main done, pid 11062
Main done, pid 11062
I am func, pid 11063

What is going on here? Why "I created child" message is sometimes printed several times?

Also I noticed that adding a delay after clone call "fixes" the problem.

Drawer answered 20/7, 2016 at 20:37 Comment(3)

Note that on a rare system where int/unsigned is 16-bit, yet size_t is 32-bit: #define STACK_SIZE 1024*1024 void* ptr = malloc(STACK_SIZE); is a problem. Anytime math is done in a "constant", watch out for overflow. Suggest #define STACK_SIZE ((size_t)1024*1024) – Halfhearted 20/7, 2016 at 23:10

Since it is apparently difficult to reproduce this problem, perhaps you would be so kind as to tell us some more about your architecture. Eg. OS and version; glibc version; gcc version. (Also any non-default compiler flags you use) – Kolnick 20/7, 2016 at 23:14

Ubuntu 16.04, GLIBC 2.23-0ubuntu3, gcc 5.4.0. CPU i7-5600U. No additional compiler flags. – Drawer 21/7, 2016 at 8:59

You have a race condition (i.e.) you don't have the implied thread safety of stdio.

The problem is even more severe. You can get duplicate "func" messages.

The problem is that using clone does not have the same guarantees as pthread_create. (i.e.) You do not get the thread safe variants of printf.

I don't know for sure, but, IMO the verbiage about stdio streams and thread safety, in practice, only applies when using pthreads.

So, you'll have to handle your own interthread locking.

Here is a version of your program recoded to use pthread_create. It seems to work without incident:

#define _GNU_SOURCE

#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <errno.h>
#include <stdlib.h>
#include <time.h>
#include <pthread.h>

#define STACK_SIZE 1024*1024

void *func(void* param) {
    printf("I am func, pid %d\n", getpid());
    return (void *) 0;
}

int main(int argc, char const *argv[]) {
    printf("I am main, pid %d\n", getpid());
    void* ptr = malloc(STACK_SIZE);

    printf("I am calling clone\n");

    pthread_t tid;
    pthread_create(&tid,NULL,func,NULL);
    //int res = clone(func, ptr + STACK_SIZE, CLONE_VM, NULL);
    int res = 0;

    // works fine with sleep() call
    // sleep(1);

    if (res == -1) {
        printf("clone error: %d", errno);
    } else {
        printf("I created child with pid: %d\n", res);
    }

    pthread_join(tid,NULL);
    printf("Main done, pid %d\n", getpid());
    return 0;
}

Here is a test script I've been using to check for errors [it's a little rough, but should be okay]. Run against your version and it will abort quickly. The pthread_create version seems to pass just fine

#!/usr/bin/perl
# clonetest -- clone test
#
# arguments:
#   "-p0" -- suppress check for duplicate parent messages
#   "-c0" -- suppress check for duplicate child messages
#   1 -- base name for program to test (e.g. for xyz.c, use xyz)
#   2 -- [optional] number of test iterations (DEFAULT: 100000)

master(@ARGV);
exit(0);

# master -- master control
sub master
{
    my(@argv) = @_;
    my($arg,$sym);

    while (1) {
        $arg = $argv[0];
        last unless (defined($arg));

        last unless ($arg =~ s/^-(.)//);
        $sym = $1;

        shift(@argv);

        $arg = 1
            if ($arg eq "");

        $arg += 0;
        ${"opt_$sym"} = $arg;
    }

    $opt_p //= 1;
    $opt_c //= 1;
    printf("clonetest: p=%d c=%d\n",$opt_p,$opt_c);

    $xfile = shift(@argv);
    $xfile //= "clone1";
    printf("clonetest: xfile='%s'\n",$xfile);

    $itermax = shift(@argv);
    $itermax //= 100000;
    $itermax += 0;
    printf("clonetest: itermax=%d\n",$itermax);

    system("cc -o $xfile -O2 $xfile.c -lpthread");
    $code = $? >> 8;
    die("master: compile error\n")
        if ($code);

    $logf = "/tmp/log";

    for ($iter = 1;  $iter <= $itermax;  ++$iter) {
        printf("iter: %d\n",$iter)
            if ($opt_v);
        dotest($iter);
    }
}

# dotest -- perform single test
sub dotest
{
    my($iter) = @_;
    my($parcnt,$cldcnt);
    my($xfsrc,$bf);

    system("./$xfile > $logf");

    open($xfsrc,"<$logf") or
        die("dotest: unable to open '$logf' -- $!\n");

    while ($bf = <$xfsrc>) {
        chomp($bf);

        if ($opt_p) {
            while ($bf =~ /created/g) {
                ++$parcnt;
            }
        }

        if ($opt_c) {
            while ($bf =~ /func/g) {
                ++$cldcnt;
            }
        }
    }

    close($xfsrc);

    if (($parcnt > 1) or ($cldcnt > 1)) {
        printf("dotest: fail on %d -- parcnt=%d cldcnt=%d\n",
            $iter,$parcnt,$cldcnt);
        system("cat $logf");
        exit(1);
    }
}

UPDATE:

Were you able to recreate OPs problem with clone?

Absolutely. Before I created the pthreads version, in addition to testing OP's original version, I also created versions that:

(1) added setlinebuf to the start of main

(2) added fflush just before the clone and __fpurge as the first statement of func

(3) added an fflush in func before the return 0

Version (2) eliminated the duplicate parent messages, but the duplicate child messages remained

If you'd like to see this for yourself, download OP's version from the question, my version, and the test script. Then, run the test script on OP's version.

I posted enough information and files so that anyone can recreate the problem.

Note that due to differences between my system and OP's, I couldn't at first reproduce the problem on just 3-4 tries. So, that's why I created the script.

The script does 100,000 test runs and usually the problem will manifest itself within 5000-15000.

Unalienable answered 20/7, 2016 at 23:1 Comment(3)

Were you able to recreate OPs problem with clone? – Columbuscolumbyne 20/7, 2016 at 23:13

@Columbuscolumbyne Yes, absolutely. See my updated answer for more details. – Unalienable 20/7, 2016 at 23:48

Very cool +1. I wasn't planning on using clone(), but now I have a reason. – Columbuscolumbyne 21/7, 2016 at 0:3

I can't recreate OP's issue, but I don't think the printf's are actually a problem.

glibc docs:

The POSIX standard requires that by default the stream operations are atomic. I.e., issuing two stream operations for the same stream in two threads at the same time will cause the operations to be executed as if they were issued sequentially. The buffer operations performed while reading or writing are protected from other uses of the same stream. To do this each stream has an internal lock object which has to be (implicitly) acquired before any work can be done.

Edit:

Even though the above is true for threads, as rici points out, there is a comment on sourceware:

Basically, there's nothing you can safely do with CLONE_VM unless the child restricts itself to pure computation and direct syscalls (via sys/syscall.h). If you use any of the standard library, you risk the parent and child clobbering each other's internal states. You also have issues like the fact that glibc caches the pid/tid in userspace, and the fact that glibc expects to always have a valid thread pointer which your call to clone is unable to initialize correctly because it does not know (and should not know) the internal implementation of threads.

Apparently, glibc isn't designed to work with clone if CLONE_VM is set but CLONE_THREAD|CLONE_SIGHAND are not.

Columbuscolumbyne answered 20/7, 2016 at 22:16 Comment(0)

Your processes both use the same stdout (that is, the C standard library FILE struct), which includes an accidentally shared buffer. That's undoubtedly causing problems.

Kolnick answered 20/7, 2016 at 21:29 Comment(5)

glibc printf is supposed to be thread safe – Columbuscolumbyne 20/7, 2016 at 21:42

@Columbuscolumbyne However, clone() doesn't "officially" create a thread. So there may be some additional setup required that isn't happening. – Wilk 20/7, 2016 at 22:5

I suppose, but I don't see the difference. The glibc docs confirm that there is no problem with multiple threads writing to output: "yes, the same job could be done with one fprintf call but it is sometimes not possible". Incidentally, I can't recreate OPs problem. – Columbuscolumbyne 20/7, 2016 at 22:9

@evaitl: Yeah, I can't reproduce it either (now that I have a computer in front of me to try it with.) It's possible that this comment by Rich Felker is relevant: sourceware.org/bugzilla/show_bug.cgi?id=10311#c6 – Kolnick 20/7, 2016 at 22:55

@Kolnick I wish he had linked to some kind of authoritative source. Also I wish the problem were reproducible. – Columbuscolumbyne 20/7, 2016 at 23:0

Ass everyone suggests: it really seems to be a problem with, how shall I put it in case of clone(), process-safety? With a rough sketch of a locking version of printf (using write(2)) the output is as expected.

#define _GNU_SOURCE

#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <errno.h>
#include <stdlib.h>
#include <time.h>

#define STACK_SIZE 1024*1024

// VERY rough attempt at a thread-safe printf
#include <stdarg.h>
#define SYNC_REALLOC_GROW 64
int sync_printf(const char *format, ...)
{
  int n, all = 0;
  int size = 256;
  char *p, *np;
  va_list args;

  if ((p = malloc(size)) == NULL)
    return -1;

  for (;;) {
    va_start(args, format);
    n = vsnprintf(p, size, format, args);
    va_end(args);
    if (n < 0)
      return -1;
    all += n;
    if (n < size)
      break;
    size = n + SYNC_REALLOC_GROW;
    if ((np = realloc(p, size)) == NULL) {
      free(p);
      return -1;
    } else {
      p = np;
    }
  }
  // write(2) shoudl be threadsafe, so just in case
  flockfile(stdout);
  n = (int) write(fileno(stdout), p, all);
  fflush(stdout);
  funlockfile(stdout);
  va_end(args);
  free(p);
  return n;
}


int func(void *param)
{
  sync_printf("I am func, pid %d\n", getpid());
  return 0;
}

int main()
{

  sync_printf("I am main, pid %d\n", getpid());
  void *ptr = malloc(STACK_SIZE);

  sync_printf("I am calling clone\n");
  int res = clone(func, ptr + STACK_SIZE, CLONE_VM, NULL);
  // works fine with sleep() call
  // sleep(1);

  if (res == -1) {
    sync_printf("clone error: %d", errno);
  } else {
    sync_printf("I created child with pid: %d\n", res);
  }
  sync_printf("Main done, pid %d\n\n", getpid());
  return 0;
}

For the third time: it's only a sketch, no time for a robust version, but that shouldn't hinder you to write one.

Pocketknife answered 20/7, 2016 at 22:20 Comment(0)

As evaitl points out printf is documented to be thread-safe by glibc's documentation. BUT, this typically assumes that you are using the designated glibc function to create threads (that is, pthread_create()). If you do not, then you are on your own.

The lock taken by printf() is recursive (see flockfile). This means that if the lock is already taken, the implementation checks the owner of the lock against the locker. If the locker is the same as the owner, the locking attempt succeeds.

To distinguish between different threads, you need to setup properly TLS, which you do not do, but pthread_create() does. What I'm guessing happens is that in your case the TLS variable that identifies the thread is the same for both threads, so you end up taking the lock.

TL;DR: please use pthread_create()

Gilmagilman answered 20/7, 2016 at 23:7 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags