
Thursday, November 1, 2012

How much load do many new processes put on my system

It is an engineering art to design and write a good application that performs well under stress and doesn't overuse and kill your operating system.

Problem

How much load is put on an operating system when your application creates and destroys a large number of processes?

Simulation

To simulate a condition where an application creates a massive number of processes I'm going to use the parallel tool [3]. With its help we will run a test that repeatedly creates and destroys processes.

Basic pre-configuration

# we don't want any IO to the disk [1]
# mkdir -p /tmp/tmpfs.d/
# mount -t tmpfs -o size=10m tmpfs /tmp/tmpfs.d/
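
A quick check (my own addition, not part of the original steps) to confirm that the RAM-backed filesystem is really mounted before the test starts:

# df -h /tmp/tmpfs.d/
# mount | grep tmpfs.d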

The command below simulates a condition where 100 processes run simultaneously. Every process lives only for a short time before it finishes and dies. Once a process terminates, a new one is started to replace it.
 
# cd /tmp/tmpfs.d/
# time seq 1 10000 | parallel --joblog /tmp/tmpfs.d/joblog.txt -j100 echo '{} $(date)' ">" log.{}.txt
real    4m33.859s
user    0m28.870s
sys     1m39.590s
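
Once the run finishes, a quick sanity check (my own addition, assuming all jobs completed) confirms that every job produced a log file and that the joblog recorded one line per job plus a header:

# ls /tmp/tmpfs.d/ | grep -c '^log\.'
# wc -l /tmp/tmpfs.d/joblog.txt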

We can monitor the resource utilization on the system with the help of these two simple commands.

Number of processes
 
# while true; do ps -AHf | wc -l ; sleep .1; done | uniq 
178
176
177
176
...
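
Another way to watch the churn (not used in the original run) is vmstat: the procs columns show the run queue and the cs column shows context switches per second, both of which should rise noticeably while parallel keeps forking.

# vmstat 1
# vmstat -f    # total number of forks since boot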

CPU utilization
 
# top | grep Cpu
...
# the average results were 
Cpu(s): 23.4%us, 76.6%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
...
# the stats during the last seconds 
Cpu(s): 21.5%us, 78.2%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu(s): 26.2%us, 73.8%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu(s): 24.8%us, 75.2%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu(s): 25.1%us, 74.9%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu(s): 23.1%us, 76.9%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu(s): 22.3%us, 77.7%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu(s): 22.8%us, 77.2%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu(s): 22.4%us, 77.2%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu(s): 12.3%us, 42.9%sy,  0.0%ni, 44.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.3%st
Cpu(s):  0.3%us,  2.3%sy,  0.0%ni, 97.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu(s):  1.0%us,  2.3%sy,  0.0%ni, 96.3%id,  0.3%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu(s):  0.3%us,  2.0%sy,  0.0%ni, 97.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu(s):  0.3%us,  2.0%sy,  0.0%ni, 97.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
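
top reports an aggregate across all CPUs; for a per-core view (an optional extra that requires the sysstat package) mpstat can be run alongside the test:

# mpstat -P ALL 1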


Analysis

The above stats show that the CPU was primarily busy executing operating system level code. The user and sys times confirm that the processor spent only about 29s executing user level code, while roughly 100s were spent in the kernel (99.6s sys out of 128.5s of total CPU time is about 78%, which matches the ~77% sy reported by top). The majority of the time the Linux kernel was performing the tasks necessary to create and destroy processes.

Looking at the system calls we can see that the most expensive calls were 'clone' and 'close'. Interestingly, 'close' took longer than 'clone', but this may be because the process we were starting over and over was only 'echo $(date)', or because there was some synchronization within the parallel tool itself.
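
A rough way to cross-check how expensive a fork+exec cycle is on its own, without parallel in the picture (a sketch of mine, the numbers will differ per machine), is to time a loop that starts a trivial external program:

# every iteration forks the shell and execs /bin/true, so most of the reported time should again be sys
# time for i in $(seq 1 1000); do /bin/true; done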

Collect system syscalls
 
# strace -r -o strace.r.txt parallel --ungroup -j1 echo '{} $(date)' ">" log.{}.txt  ::: 1 2

Syscall relative timestamps
 
# cat strace.r.txt | egrep -i -A1 '0\.00[2-9]'
--
     0.004394 --- SIGCHLD (Child exited) @ 0 (0) ---
     0.000042 rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7fdaa838d4c0}, NULL, 8) = 0
--
     0.003015 close(7)                  = 0
     0.000076 rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x7fdaa838d4c0}, {SIG_DFL, [], SA_RESTORER, 0x7fdaa838d4c0}, 8) = 0
--
     0.005035 --- SIGCHLD (Child exited) @ 0 (0) ---
     0.000043 rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7fdaa838d4c0}, NULL, 8) = 0
--
     0.002960 close(7)                  = 0
     0.000074 rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER, 0x7fdaa838d4c0}, {SIG_DFL, [], SA_RESTORER, 0x7fdaa838d4c0}, 8) = 0
--
     0.003542 --- SIGCHLD (Child exited) @ 0 (0) ---
     0.000042 rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7fdaa838d4c0}, NULL, 8) = 0

     # truncated


# cat strace.r.txt | egrep -i -A1 'clone'
     0.000102 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fdaa8ca89d0) = 25086
     0.003318 close(7)                  = 0
--
     0.000085 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fdaa8ca89d0) = 25087
     0.002620 close(7)                  = 0
--
     0.000085 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fdaa8ca89d0) = 25088
     0.002617 close(7)                  = 0
--
     0.000097 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fdaa8ca89d0) = 25089
     0.003037 close(7)                  = 0

     # truncated


# strace -c parallel --ungroup -j1 echo '{} $(date)' log.{}.txt  ::: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 72.01    0.108059        2573        42           clone    ### on average 65%
 26.97    0.040474          33      1211         3 wait4
  0.92    0.001379           1      1152         1 select
  0.02    0.000033           1        33           chmod
  0.02    0.000029           0       268        12 open
  0.02    0.000025           1        48           gettimeofday
  0.02    0.000024           0       498        73 lseek
  0.02    0.000024           0       233         1 rt_sigaction
  0.02    0.000024           0       294           fcntl
  0.00    0.000000           0       213           read
  0.00    0.000000           0         1           write
  0.00    0.000000           0       401           close
...
------ ----------- ----------- --------- --------- ----------------
100.00    0.150071                  5978       975 total
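
Note that the summary above counts only the syscalls made by the parallel process itself. Adding the -f option (not done in the original run) would also follow the forked children, so the execve and exit work of every short-lived job would show up in the totals as well:

# strace -c -f parallel --ungroup -j1 echo '{} $(date)' log.{}.txt  ::: 1 2 3 4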


References
  1. http://www.thegeekstuff.com/2008/11/overview-of-ramfs-and-tmpfs-on-linux/
  2. http://www.thegeekstuff.com/2011/11/strace-examples/
  3. http://rtomaszewski.blogspot.co.uk/2012/10/how-to-install-parallel-linux-tool-on.html
