I have run into a strange difference in how a program using pthreads behaves on Linux and on Mac OS X.
Consider the following program, which can be compiled with "gcc -pthread -o threadtest threadtest.c":
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define N_WORKERS 4

/* Each worker prints a message, runs "sleep 1" through the shell, and prints again. */
static void *worker(void *t)
{
    int i = *(int *)t;
    printf("Thread %d started\n", i);
    system("sleep 1");
    printf("Thread %d ends\n", i);
    return NULL;
}

int main(void)
{
    pthread_t workers[N_WORKERS];
    int args[N_WORKERS];
    int i;

    /* Start all workers; each gets a pointer to its own index. */
    for (i = 0; i < N_WORKERS; ++i)
    {
        args[i] = i;
        pthread_create(&workers[i], NULL, worker, args + i);
    }

    /* Wait for every worker to finish. */
    for (i = 0; i < N_WORKERS; ++i)
    {
        pthread_join(workers[i], NULL);
    }
    return 0;
}
Running the resulting executable on a 4-core Mac OS X machine produces the following output:
$ time ./threadtest
Thread 0 started
Thread 2 started
Thread 1 started
Thread 3 started
Thread 0 ends
Thread 1 ends
Thread 2 ends
Thread 3 ends
real 0m4.030s
user 0m0.006s
sys 0m0.008s
Note that the number of cores is probably irrelevant here, since the time is spent entirely in the "sleep 1" shell command, with no computation involved. The threads are clearly started in parallel, because all four "Thread ... started" messages appear immediately after the program starts. Nevertheless, the whole run takes about four seconds, as if the four sleeps ran one after another.
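To see where each thread actually spends its time, the system() call can be wrapped in a small timer. The following is only a sketch: timed_system() is a hypothetical helper, not part of the program above, and it uses gettimeofday() for wall-clock time.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

/* Hypothetical helper (not in the original program): runs `command` with
   system() and stores the elapsed wall-clock time, in seconds, in *elapsed.
   Returns whatever system() returned. */
static int timed_system(const char *command, double *elapsed)
{
    struct timeval start, end;
    int status;

    gettimeofday(&start, NULL);
    status = system(command);
    gettimeofday(&end, NULL);

    *elapsed = (double)(end.tv_sec - start.tv_sec)
             + (double)(end.tv_usec - start.tv_usec) / 1e6;
    return status;
}
```

In worker(), system("sleep 1") would be replaced by timed_system("sleep 1", &secs) and secs printed next to the "ends" message. If the calls are being serialized somewhere inside system(), I would expect the threads to report increasing times (roughly 1, 2, 3 and 4 seconds) rather than about one second each.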
Running the same test program on a Linux machine gives the result that I expect:
$ time ./threadtest
Thread 0 started
Thread 3 started
Thread 1 started
Thread 2 started
Thread 1 ends
Thread 2 ends
Thread 0 ends
Thread 3 ends
real 0m1.010s
user 0m0.008s
sys 0m0.013s
Four processes are started in parallel, each sleeps for one second, and the whole run takes roughly one second.
If I remove the system() call and put actual computation into the worker() function instead, I see the expected speedup on Mac OS X as well.
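For reference, a compute-only variant of the worker might look roughly like this. It is a minimal sketch: busy_work() and worker_compute() are hypothetical names, and the iteration count is arbitrary, chosen only so that each thread runs long enough to be measurable.

```c
#include <stdio.h>

/* Hypothetical CPU-bound loop standing in for "actual computation". */
static unsigned long busy_work(unsigned long iterations)
{
    /* volatile keeps the compiler from optimizing the loop away */
    volatile unsigned long acc = 0;
    unsigned long k;

    for (k = 0; k < iterations; ++k)
        acc += k % 7;
    return acc;
}

/* Same shape as worker(), with the system() call swapped for computation. */
static void *worker_compute(void *t)
{
    int i = *(int *)t;
    printf("Thread %d started\n", i);
    busy_work(200000000UL);   /* arbitrary amount of work */
    printf("Thread %d ends\n", i);
    return NULL;
}
```

Passing worker_compute instead of worker to pthread_create() is the only change needed in main().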
So the question is: why does calling system() from a thread effectively serialize the execution of the threads on Mac OS X, and how can that be prevented?
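One thing that could be tried is to bypass system() and start the shell directly. This assumes the serialization happens inside the C library's system() implementation rather than in the thread scheduler, which is not something the runs above establish. The sketch below uses posix_spawn() and waitpid(); run_shell_command() is a hypothetical helper name.

```c
#include <errno.h>
#include <spawn.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical replacement for system(): runs `command` via "/bin/sh -c"
   and returns the wait status of the child, or -1 if spawning or waiting
   failed. Unlike system(), it does not touch any signal dispositions. */
static int run_shell_command(const char *command)
{
    pid_t pid;
    int status;
    char *argv[] = { "sh", "-c", (char *)command, NULL };

    /* posix_spawn() returns an error number directly instead of setting errno */
    int err = posix_spawn(&pid, "/bin/sh", NULL, NULL, argv, environ);
    if (err != 0) {
        fprintf(stderr, "posix_spawn: %s\n", strerror(err));
        return -1;
    }

    /* Wait only for the child started above; retry if interrupted by a signal */
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return status;
}
```

In worker(), system("sleep 1") would become run_shell_command("sleep 1"). Note that this is not a drop-in equivalent: system() also ignores SIGINT and SIGQUIT and blocks SIGCHLD while the command runs, and this sketch does none of that.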