How can I profile a Bash shell script slow startup?

My Bash shell takes up to 3-4 seconds to start up, while if I start it with --norc it runs immediately.

I started "profiling" /etc/bash.bashrc and ~/.bashrc by manually inserting return statements and seeking for speed improvements, but it is not a quantitative process and it is not efficient.

How can I profile my Bash scripts, and see which commands take the most time to execute?

Naucratis answered 16/2, 2011 at 9:43 Comment(5)
I profiled the scripts, and most of the time was spent during the setup of bash_completion.Naucratis
That's not surprising since that's pretty big. You could speed that up by removing the parts you know you'll never need if you want to go to the trouble of maintaining your changes across updates, etc.Usable
You could compare time bash -c 'exit' and time bash -i -c 'exit', and maybe play with --norc and --noprofile.Love
See also this answer (disclaimer: it's mine). Not exactly what you're asking, but definitely related: unix.stackexchange.com/a/555510/384864Jespersen
Depending on how 'efficient' a mechanism people need, you might just add something like echo $EPOCHREALTIME (Bash 5 or later) before/after commands that seem expensive (anything to do with bash_completion, pyenv, etc). https://mcmap.net/q/65394/-linux-command-to-get-time-in-millisecondsInfidelity

If you have GNU date (or another version that can output nanoseconds), do this at the beginning of /etc/bash.bashrc (or wherever you'd like to begin a trace in any Bash script):

PS4='+ $(date "+%s.%N")\011 '
exec 3>&2 2>/tmp/bashstart.$$.log
set -x

add

set +x
exec 2>&3 3>&-

at the end of ~/.bashrc (or at the end of the section of any Bash script you'd like tracing to stop). The \011 is an octal tab character.

You should get a trace log in /tmp/bashstart.PID.log that shows the seconds.nanoseconds timestamp of each command that was executed. The difference from one time to the next is the amount of time that the intervening step took.

As you narrow things down, you can move set -x later and set +x earlier (or bracket several sections of interest selectively).

Although it's not as fine-grained as GNU date's nanoseconds, Bash 5 includes a variable which gives the time in microseconds. Using it saves you from spawning an external executable for every line and works on Macs or elsewhere that doesn't have GNU date - as long as you have Bash 5, of course. Change the setting of PS4:

PS4='+ $EPOCHREALTIME\011 '

As pointed out by @pawamoy, you can use BASH_XTRACEFD to send the output of the trace to a separate file descriptor if you have Bash 4.1 or later. From this answer:

#!/bin/bash

exec 5> command.txt
BASH_XTRACEFD="5"

echo -n "hello "

set -x
echo -n world
set +x

echo "!"

This will cause the trace output to go to the file command.txt, leaving stdout and stderr to be output normally (or redirected separately).

Making sense of the results

We can see the time each command took, as well as a running total, using OpenOffice / LibreOffice Calc, making it easy to spot that in the example below, lesspipe is taking 6ms.

(Image: color-coded Calc spreadsheet of the trace timings)

  1. Copy-paste the resulting log file into a new spreadsheet.
  2. Insert a new column after the timestamp column for the diff to the next timestamp; with timestamps in column B and the new column C, put =B2-B1 in cell C1, then copy-paste that cell down the whole C column.
  3. Instead of absolute timestamps, we can start at 0 by copying the first timestamp, selecting the timestamp column, and using Ctrl+Shift+V (Paste Special > Subtract).
  4. You can color values higher than X in red to highlight long-running commands.
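If you'd rather skip the spreadsheet, here is a minimal command-line sketch (an addition, not part of the original answer) that prints how long each traced command took, assuming the timestamped log format produced by the PS4/date setup above (replace PID with the actual process ID):

# Each trace line looks like "+ 1234567890.123456789<TAB>command", so field 2 is the
# timestamp; the difference to the next line's timestamp is the duration of the
# command on the previous line. Note: anything the traced commands themselves write
# to stderr lands in the same file and would confuse this sketch.
awk 'NR > 1 { printf "%.6f\t%s\n", $2 - prev, cmd }
     { prev = $2; cmd = $0 }' /tmp/bashstart.PID.log | sort -n

Sorting numerically puts the slowest commands at the bottom of the output.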
Usable answered 16/2, 2011 at 10:20 Comment(15)
Is it normal that the shell prompt is invisible and that my commands are not echoed back? However, I got the trace so I can start the analysis.. thanks a lot!Naucratis
@AndreaSpadaccini: The final exec should return fd2 to normal so you should get the prompt back.Usable
There's a pretty heavy performance penalty from running all the date commands here. If you're not going for a lot of resolution, why not put \t in a PS4 string?Scotsman
...actually, with bash 4.2, one can do better -- using \D{...} in PS4 allows completely arbitrary time format strings to be expanded without the performance overhead of launching date as a subprocess.Scotsman
@CharlesDuffy: Those are both really cool. However, GNU date understands %N and Bash 4.2 doesn't (because strftime(3) doesn't) on a GNU system - so arbitrary, within limits. Your point about performance versus resolution is a good one and a user should make the choice wisely, keeping in mind that the performance hit is temporary, only during debugging (and only when set -x is in effect).Usable
I believe an upcoming version of Bash will support finer time intervals on systems where they're available.Usable
If testing on MacOS, use gdate (GNU) instead of date (BSD).Interstice
With Bash 4, one can also use the BASH_XTRACEFD variable to redirect the debug output to a file descriptor other than the default one (2, or stderr). It helps immensely when the time comes to analyze the output (the profiling data), as one doesn't have to untangle stderr and set -x output anymore (so many edge cases).Antirachitic
@pawamoy: Thanks for pointing that out. I've added some information to my answer.Usable
For me on Ubuntu 20, the method atop the answer adds several seconds to startup time, most of which was in fzf's bash completion script. Using $EPOCHREALTIME as mentioned lower down fixed that.Distinguished
Note that $EPOCHREALTIME can either contain a . (dot) or a , (comma) depending on $LC_NUMERIC.Epiglottis
Thanks for the script! Now, how do we interpret this log file? Any program that could consume it and show a flamegraph?Belting
@CiprianTomoiagă: You should post a separate question.Usable
@DennisWilliamson Yes, it's probably better. It is here. ThanksBelting
I disagree that it should be a separate question; 1) how would users find that, seeing it's very specific to the output given in this particular answer? 2) this answer is much less useful without being able to interpret the results; 3) the other question on its own will be useless if users find that one first.Distinguished

Profiling Bash (four answers, and more...)

Reading this, and because profiling is an important step, I've done some tests and research on this whole Stack Overflow question and the answers already posted.

Here are more than four answers:

  • The first is based on @DennisWilliamson's idea, but with a lot less resource consumption.

  • The second was my own (prior to this one ;).

  • The third is based on @fgm's answer, but more accurate.

  • The last uses script, scriptreplay and a timing file.

  • Finally, a little comparison of performance at the end.

1. Using set -x and date but with limited forks

Taking @DennisWilliamson's idea, but with the following syntax, there will be only one initial fork to three commands:

exec 3>&2 2> >(tee /tmp/sample-time.$$.log |
                 sed -u 's/^.*$/now/' |
                 date -f - +%s.%N >/tmp/sample-time.$$.tim)
set -x

Doing this will run date only once. Here is a quick demo/test to show how it works:

for i in {1..4};do echo now;sleep .05;done| date -f - +%N

Sample script:

#!/bin/bash

exec 3>&2 2> >( tee /tmp/sample-$$.log |
                  sed -u 's/^.*$/now/' |
                  date -f - +%s.%N >/tmp/sample-$$.tim)
set -x

for ((i=3;i--;));do sleep .1;done

for ((i=2;i--;))
do
    tar -cf /tmp/test.tar -C / bin
    gzip /tmp/test.tar
    rm /tmp/test.tar.gz
done

set +x
exec 2>&3 3>&-

By running this script, you create two files: /tmp/sample-XXXX.log and /tmp/sample-XXXX.tim (where XXXX is the process ID of the running script).

You can present them side by side using paste:

paste /tmp/sample-XXXX.{tim,log}

Or you can even compute the time differences:

paste <(
    while read tim ;do
        crt=000000000$((${tim//.}-10#0$last))
        printf "%12.9f\n" ${crt:0:${#crt}-9}.${crt:${#crt}-9}
        last=${tim//.}
      done < sample-time.24804.tim
  ) sample-time.24804.log

 1388487534.391309713        + (( i=3 ))
 0.000080807        + (( i-- ))
 0.000008312        + sleep .1
 0.101304843        + (( 1 ))
 0.000032616        + (( i-- ))
 0.000007124        + sleep .1
 0.101251684        + (( 1 ))
 0.000033036        + (( i-- ))
 0.000007054        + sleep .1
 0.104013813        + (( 1 ))
 0.000026959        + (( i-- ))
 0.000006915        + (( i=2 ))
 0.000006635        + (( i-- ))
 0.000006844        + tar -cf /tmp/test.tar -C / bin
 0.022655107        + gzip /tmp/test.tar
 0.637042668        + rm /tmp/test.tar.gz
 0.000823649        + (( 1 ))
 0.000011314        + (( i-- ))
 0.000006915        + tar -cf /tmp/test.tar -C / bin
 0.016084482        + gzip /tmp/test.tar
 0.627798263        + rm /tmp/test.tar.gz
 0.001294946        + (( 1 ))
 0.000023187        + (( i-- ))
 0.000006845        + set +x

or on two columns:

paste <(
    while read tim ;do
        [ -z "$last" ] && last=${tim//.} && first=${tim//.}
        crt=000000000$((${tim//.}-10#0$last))
        ctot=000000000$((${tim//.}-10#0$first))
        printf "%12.9f %12.9f\n" ${crt:0:${#crt}-9}.${crt:${#crt}-9} \
                                 ${ctot:0:${#ctot}-9}.${ctot:${#ctot}-9}
        last=${tim//.}
      done < sample-time.24804.tim
  ) sample-time.24804.log

May render:

 0.000000000  0.000000000   + (( i=3 ))
 0.000080807  0.000080807   + (( i-- ))
 0.000008312  0.000089119   + sleep .1
 0.101304843  0.101393962   + (( 1 ))
 0.000032616  0.101426578   + (( i-- ))
 0.000007124  0.101433702   + sleep .1
 0.101251684  0.202685386   + (( 1 ))
 0.000033036  0.202718422   + (( i-- ))
 0.000007054  0.202725476   + sleep .1
 0.104013813  0.306739289   + (( 1 ))
 0.000026959  0.306766248   + (( i-- ))
 0.000006915  0.306773163   + (( i=2 ))
 0.000006635  0.306779798   + (( i-- ))
 0.000006844  0.306786642   + tar -cf /tmp/test.tar -C / bin
 0.022655107  0.329441749   + gzip /tmp/test.tar
 0.637042668  0.966484417   + rm /tmp/test.tar.gz
 0.000823649  0.967308066   + (( 1 ))
 0.000011314  0.967319380   + (( i-- ))
 0.000006915  0.967326295   + tar -cf /tmp/test.tar -C / bin
 0.016084482  0.983410777   + gzip /tmp/test.tar
 0.627798263  1.611209040   + rm /tmp/test.tar.gz
 0.001294946  1.612503986   + (( 1 ))
 0.000023187  1.612527173   + (( i-- ))
 0.000006845  1.612534018   + set +x

2. Using trap DEBUG and /proc/timer_list on recent GNU/Linux kernels, without forks

Under recent GNU/Linux kernels, you may find a /proc file named timer_list:

grep 'now at\|offset' /proc/timer_list
now at 5461935212966259 nsecs
  .offset:     0 nsecs
  .offset:     1383718821564493249 nsecs
  .offset:     0 nsecs

The current time is the sum of 5461935212966259 + 1383718821564493249, in nanoseconds.

So for computing elapsed time, there is no need to know the offset.
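As an illustration only (not part of the original answer), here is a minimal pure-Bash sketch of reading that counter; the function name is made up, and note that on newer kernels /proc/timer_list may only be readable by root:

# Read the current nanosecond counter into the variable named by $1, using only
# Bash builtins (no external command is spawned).
now_ns() {                                   # usage: now_ns varname
    local _line
    while read -r _line; do
        if [[ $_line == "now at "* ]]; then
            _line=${_line#now at }
            printf -v "$1" '%s' "${_line%% nsecs*}"
            return
        fi
    done < /proc/timer_list
}

now_ns t0
sleep .1
now_ns t1
echo "elapsed: $(( (t1 - t0) / 1000 )) µs"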

For this kind of job, I wrote elap.bash (V2), which can be sourced with the following syntax:

source elap.bash-v2

or

. elap.bash-v2 init

(See comments for the full syntax)

So you could simply add this line at the top of your script:

. elap.bash-v2 trap2

A little sample:

#!/bin/bash

. elap.bash-v2 trap

for ((i=3;i--;));do sleep .1;done

elapCalc2
elapShowTotal \\e[1mfirst total\\e[0m

for ((i=2;i--;))
do
    tar -cf /tmp/test.tar -C / bin
    gzip /tmp/test.tar
    rm /tmp/test.tar.gz
done

trap -- debug
elapTotal \\e[1mtotal time\\e[0m

This renders on my host:

 0.000947481 Starting
 0.000796900 ((i=3))
 0.000696956 ((i--))
 0.101969242 sleep .1
 0.000812478 ((1))
 0.000755067 ((i--))
 0.103693305 sleep .1
 0.000730482 ((1))
 0.000660360 ((i--))
 0.103565001 sleep .1
 0.000719516 ((1))
 0.000671325 ((i--))
 0.000754856 elapCalc2
 0.316018113 first total
 0.000754787 elapShowTotal \e[1mfirst total\e[0m
 0.000711275 ((i=2))
 0.000683408 ((i--))
 0.075673816 tar -cf /tmp/test.tar -C / bin
 0.596389329 gzip /tmp/test.tar
 0.006565188 rm /tmp/test.tar.gz
 0.000830217 ((1))
 0.000759466 ((i--))
 0.024783966 tar -cf /tmp/test.tar -C / bin
 0.604119903 gzip /tmp/test.tar
 0.005172940 rm /tmp/test.tar.gz
 0.000952299 ((1))
 0.000827421 ((i--))
 1.635788924 total time
 1.636657204 EXIT

Using trap2 instead of trap as an argument to the source command:

#!/bin/bash

. elap.bash-v2 trap2
...

will render two columns: the time of the last command and the running total:

 0.000894541      0.000894541 Starting
 0.001306122      0.002200663 ((i=3))
 0.001929397      0.004130060 ((i--))
 0.103035812      0.107165872 sleep .1
 0.000875613      0.108041485 ((1))
 0.000813872      0.108855357 ((i--))
 0.104954517      0.213809874 sleep .1
 0.000900617      0.214710491 ((1))
 0.000842159      0.215552650 ((i--))
 0.104846890      0.320399540 sleep .1
 0.000899082      0.321298622 ((1))
 0.000811708      0.322110330 ((i--))
 0.000879455      0.322989785 elapCalc2
 0.322989785 first total
 0.000906692      0.323896477 elapShowTotal \e[1mfirst total\e[0m
 0.000820089      0.324716566 ((i=2))
 0.000773782      0.325490348 ((i--))
 0.024752613      0.350242961 tar -cf /tmp/test.tar -C / bin
 0.596199363      0.946442324 gzip /tmp/test.tar
 0.003007128      0.949449452 rm /tmp/test.tar.gz
 0.000791452      0.950240904 ((1))
 0.000779371      0.951020275 ((i--))
 0.030519702      0.981539977 tar -cf /tmp/test.tar -C / bin
 0.584155405      1.565695382 gzip /tmp/test.tar
 0.003058674      1.568754056 rm /tmp/test.tar.gz
 0.000955093      1.569709149 ((1))
 0.000919964      1.570629113 ((i--))
 1.571516599 total time
 0.001723708      1.572352821 EXIT

3. Using strace

Yes, strace could do the job:

strace -q -f -s 10 -ttt sample-script 2>sample-script-strace.log

But it can produce a lot of output!

wc sample-script-strace.log
    6925  57637 586518 sample-script-strace.log

Using a more restricted command:

strace -f -s 10 -ttt -eopen,access,read,write ./sample-script 2>sample-script-strace.log

Will dump a lighter log:

  4519  36695 374453 sample-script-strace.log

Depending on what you're searching for, you may be more restrictive:

 strace -f -s 10 -ttt -eaccess,open ./sample-script 2>&1 | wc
  189    1451   13682

Reading them will be a little harder:

{
    read -a first
    first=${first//.}
    last=$first
    while read tim line;do
        crt=000000000$((${tim//.}-last))
        ctot=000000000$((${tim//.}-first))
        printf "%9.6f %9.6f %s\n" ${crt:0:${#crt}-6}.${crt:${#crt}-6} \
            ${ctot:0:${#ctot}-6}.${ctot:${#ctot}-6} "$line"
        last=${tim//.}
      done
  } < <(
    sed </tmp/sample-script.strace -e '
        s/^ *//;
        s/^\[[^]]*\] *//;
        /^[0-9]\{4\}/!d
  ')

 0.000110  0.000110 open("/lib/x86_64-linux-gnu/libtinfo.so.5", O_RDONLY) = 4
 0.000132  0.000242 open("/lib/x86_64-linux-gnu/libdl.so.2", O_RDONLY) = 4
 0.000121  0.000363 open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY) = 4
 0.000462  0.000825 open("/dev/tty", O_RDWR|O_NONBLOCK) = 4
 0.000147  0.000972 open("/usr/lib/locale/locale-archive", O_RDONLY) = 4
 ...
 0.000793  1.551331 open("/etc/ld.so.cache", O_RDONLY) = 4
 0.000127  1.551458 open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY) = 4
 0.000545  1.552003 open("/usr/lib/locale/locale-archive", O_RDONLY) = 4
 0.000439  1.552442 --- SIGCHLD (Child exited) @ 0 (0) ---

The original bash script is not so easy to follow in this...

4. Using script, scriptreplay and a timing file

For debugging scripts, I often use:

script -t script.log 2>script.tim -c 'bash -x /path/script.sh'

Part of the BSD utilities, script (and scriptreplay) is a very old tool that can be used to profile Bash, with a very small footprint.

Little demo:

script -t script.log 2>script.tim -c 'bash -x -c "
    for ((i=3;i--;));do sleep .1;done

    for ((i=2;i--;)) ;do
        tar -cf /tmp/test.tar -C / bin
        gzip /tmp/test.tar
        rm /tmp/test.tar.gz
    done
"'

Will produce:

Script started on Fri Mar 25 08:29:37 2016
+ (( i=3 ))
+ (( i-- ))
+ sleep .1
+ (( 1 ))
+ (( i-- ))
+ sleep .1
+ (( 1 ))
+ (( i-- ))
+ sleep .1
+ (( 1 ))
+ (( i-- ))
+ (( i=2 ))
+ (( i-- ))
+ tar -cf /tmp/test.tar -C / bin
+ gzip /tmp/test.tar
+ rm /tmp/test.tar.gz
+ (( 1 ))
+ (( i-- ))
+ tar -cf /tmp/test.tar -C / bin
+ gzip /tmp/test.tar
+ rm /tmp/test.tar.gz
+ (( 1 ))
+ (( i-- ))
Script done on Fri Mar 25 08:29:39 2016

and generate two files:

ls -l script.*
-rw-r--r-- 1 user user 450 Mar 25 08:29 script.log
-rw-r--r-- 1 user user 177 Mar 25 08:29 script.tim

The file script.log contains all the traces, and script.tim is the timing file:

head -n 4 script.*
==> script.log <==
Script started on Fri Mar 25 08:29:37 2016
+ (( i=3 ))
+ (( i-- ))
+ sleep .1

==> script.tim <==
0.435331 11
0.000033 2
0.000024 11
0.000010 2

You can see the total execution time from the first and last lines of the log file, and/or by summing the times in the timing file:

head -n1 script.log ;tail -n1 script.log
Script started on Fri Mar 25 08:29:37 2016
Script done on Fri Mar 25 08:29:39 2016

sed < script.tim  's/ .*$//;H;${x;s/\n/+/g;s/^\+//;p};d' | bc -l
2.249755
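An equivalent, arguably simpler way to sum that first column (a sketch using awk, not from the original answer; the result should match):

awk '{ total += $1 } END { printf "%.6f\n", total }' script.tim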

In the timing file, the second value is the number of bytes that follow in the corresponding log file. This lets you replay the log file, optionally with an acceleration factor:

scriptreplay script.{tim,log}

or

scriptreplay script.{tim,log} 5

or

scriptreplay script.{tim,log} .2

Showing times and commands side by side is a little more complex:

exec 4<script.log
read -u 4 line
echo $line ;while read tim char;do
    read -u 4 -N $char -r -s line
    echo $tim $line
  done < script.tim &&
while read -u 4 line;do
    echo $line
done;exec 4<&-
Script started on Fri Mar 25 08:28:51 2016
0.558012 + (( i=3 ))
0.000053
0.000176 + (( i-- ))
0.000015
0.000059 + sleep .1
0.000015
 + sleep .1) + (( 1 ))
 + sleep .1) + (( 1 ))
 + tar -cf /tmp/test.tar -C / bin
0.035024 + gzip /tmp/test.tar
0.793846 + rm /tmp/test.tar.gz
 + tar -cf /tmp/test.tar -C / bin
0.024971 + gzip /tmp/test.tar
0.729062 + rm /tmp/test.tar.gz
 + (( i-- )) + (( 1 ))
Script done on Fri Mar 25 08:28:53 2016

Further work around scriptreplay: How to manipulate timing and typescript files created by the "script" command?

Tests and conclusion

To run the tests, I downloaded the second sample from Bash complex Hello, World!. This script takes approximately 0.72 seconds to complete on my host.

I added one of the following at the top of the script:

  • by elap.bash function

     #!/bin/bash
    
     source elap.bash-v2 trap2
    
     eval "BUNCHS=(" $(perl <<EOF | gunzip
     ...
    
  • by set -x and PS4

     #!/bin/bash
    
     PS4='+ $(date "+%s.%N")\011 '
     exec 3>&2 2>/tmp/bashstart.$$.log
     set -x
    
     eval "BUNCHS=(" $(perl <<EOF | gunzip
     ...
    
  • by set -x and initial fork to long exec command

     #!/bin/bash
    
     exec 3>&2 2> >(tee /tmp/sample-time.$$.log |
                      sed -u 's/^.*$/now/' |
                      date -f - +%s.%N >/tmp/sample-time.$$.tim)
     set -x
    
     eval "BUNCHS=(" $(perl <<EOF | gunzip
    
  • by script (and set +x)

     script -t helloworld.log 2>helloworld.tim -c '
         bash -x complex_helloworld-2.sh' >/dev/null
    

Times

And compare execution times (on my host):

  • Direct 0.72 sec
  • elap.bash 13.18 sec
  • set + date@PS4 54.61 sec
  • set + 1 fork 1.45 sec
  • script and timing file 2.19 sec
  • strace 4.47 sec

Outputs

  • by elap.bash function
         0.000950277      0.000950277 Starting
         0.007618964      0.008569241 eval "BUNCHS=(" $(perl <<EOF | gunzi
         0.005259953      0.013829194 BUNCHS=("2411 1115 -13 15 33 -3 15 1
         0.010945070      0.024774264 MKey="V922/G/,2:"
         0.001050990      0.025825254 export RotString=""
         0.004724348      0.030549602 initRotString
         0.001322184      0.031871786 for bunch in "${BUNCHS[@]}"
         0.000768893      0.032640679 out=""
         0.001008242      0.033648921 bunchArray=($bunch)
         0.000741095      0.034390016 ((k=0))
  • by set -x and PS4
    ++ 1388598366.536099290  perl
    ++ 1388598366.536169132  gunzip
    + 1388598366.552794757   eval 'BUNCHS=(' '"2411' 1115 -13 15 33 -3 15 1
    ++ 1388598366.555001983  BUNCHS=("2411 1115 -13 15 33 -3 15 13111 -6 1
    + 1388598366.557551018   MKey=V922/G/,2:
    + 1388598366.558316839   export RotString=
    + 1388598366.559083848   RotString=
    + 1388598366.560165147   initRotString
    + 1388598366.560942633   local _i _char
    + 1388598366.561706988   RotString=
  • by set -x and initial fork to long exec command (and my second paste sample script)
     0.000000000  0.000000000    ++ perl
     0.008141159  0.008141159    ++ gunzip
     0.000007822  0.008148981    + eval 'BUNCHS=(' '"2411' 1115 -13 15 33 -3
     0.000006216  0.008155197    ++ BUNCHS=("2411 1115 -13 15 33 -3 15 13111
     0.000006216  0.008161413    + MKey=V922/G/,2:
     0.000006076  0.008167489    + export RotString=
     0.000006007  0.008173496    + RotString=
     0.000006006  0.008179502    + initRotString
     0.000005937  0.008185439    + local _i _char
     0.000006006  0.008191445    + RotString=
  • by strace
     0.000213  0.000213 brk(0)                = 0x17b6000
     0.000044  0.000257 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
     0.000047  0.000304 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7faf1c0dc000
     0.000040  0.000344 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
     0.000040  0.000384 open("/etc/ld.so.cache", O_RDONLY) = 4
     ...
     0.000024  4.425049 close(10)             = 0
     0.000042  4.425091 rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
     0.000028  4.425119 read(255, "", 4409)   = 0
     0.000058  4.425177 exit_group(0)         = ?
  • by script
    Le script a débuté sur ven 25 mar 2016 09:18:35 CET
    0.667160 ++ gunzip
    0.000025
    0.000948 ++ perl
    0.000011
    0.005338 + eval 'BUNCHS=(' '"2411' 1115 -13 15 33 -3 15 13111 -6 1 111 4
    0.000044 1223 15 3311 121121 17 3311 121121 1223 3311 121121 17 3311 121
    0.000175 ++ BUNCHS=("2411 1115 -13 15 33 -3 15 13111 -6 15 1114 15 12211
    0.000029 1 1321 12211 412 21211 33 21211 -2 15 2311 11121 232 121111 122
    0.000023 4 3311 121121 12221 3311 121121 12221 3311 121121 1313 -6 15 33

Conclusion

Well! While my pure Bash function is quicker than forking date on each command, using it still implies some operations for each command.

Dedicating an independent process to logging and storage is clearly more efficient.

strace is an interesting way, more detailed, but hard to read.

script, with scriptreplay and an acceleration factor, is very nice too, but it doesn't offer the same precision, since it is based on console exchanges instead of process execution; still, it is very light and efficient (not the same goal, not the same usage).

Finally, I think the most efficient method, in both readability and performance, is set -x plus one fork, the first in this answer; but in the end, depending on the specific case, I sometimes use strace and/or script too.

Some more tricks

Which part of the script is currently running?

If you are asking about what's currently happening on your machine (a backup server or anything else), you could use the ps -C command to show the currently elapsed time:

ps -C rsync ho lstart,etime
Fri Aug 11 09:57:41 2023    01:38:49

or better:

ps --sid $(ps ho sid -C rsync) fwho pid,lstart,cmd
2778 Fri Aug 11 09:08:00 2023 /bin/sh -c  ps -C backup.sh >/dev/null || /path/to/backup.sh -q
2780 Fri Aug 11 09:08:00 2023  \_ /bin/bash /path/to/backup.sh -q
7688 Fri Aug 11 09:57:41 2023      \_ rsync -ax --delete --exclude '*/cache/*' --excl

This shows the root process with its current children and all start times.

Which files or paths are currently in progress?

On Linux, you can browse the /proc pseudo-filesystem to show the file descriptors currently open by any process:

ls -l /proc/$(($(ps ho pid -C rsync)))/fd |
    sed -ne 's|.* -> /path/to/source||p'

Of course, all this has to be done with root privileges (in a root session or by using sudo).

Love answered 31/12, 2013 at 9:53 Comment(20)
This was published at Unix & Linux - How to adjust time command to measure a program elapsed timeLove
The Times section is pretty informative and drives home that the forks are nothing to sneeze at (indeed completely dominating many kinds of scripts). +1 for a good (if long-drawn) answer. Perhaps in the future you should consider posting separate answersFowle
Many thanks, @Fowle! You will find a full ready-to-run Bash source file there: elap-bash-v3 (with some features like permitting transparent use of STDIN and STDERR)Love
March 2016: Adding script methodLove
@Fowle I agree, this post is somewhat long... I've added another method today, but I won't drop anything, and I don't know how to present the tests and conclusion as separate posts... Sorry!Love
What happened to your website? The links to it are now dead. Maybe update links to point to your GitHub?Duty
@BenjaminW. It's okay now! You could delete your comment, I will try to take care of this! (1st real downtime since 1997)Love
On recent versions of bash (>=4.1), you can do exec {BASH_XTRACEFD}> instead of exec 3>&2 2> which will populate the log file only with the trace logging output and not other stderr output.Supersaturated
The exec to a single date process method is very clever and my preference for sub-second precision. For script.sh, I can just do bash -c "exec {BASH_XTRACEFD}> >(tee trace.log | sed -u 's/^.*$//' | date -f - +%s.%N > timing.log); set -x; . script.sh and get profiling data without modifying script.sh. When sub-second precision is not needed, I like bash -c "exec {BASH_XTRACEFD}>trace.log; set -x; PS4='+\t'; . script.sh which time stamps every trace line with second precision and without forking to date (low overhead).Supersaturated
In my testing, it seemed that the script method logged output of the bash script being profiled to script.log without an entry in script.tim, so after my script had printed some output the timing lines no longer lined up with the commands, so it was difficult to tell how long commands were taking to run.Supersaturated
My comment two up should have 's/^.*$/now/ not 's/^.*$//.Supersaturated
As somebody not intimately familiar with bash and shell, it look me a few days to understand what the hell exec 3>&2 2> >(...) was doing. For those like me, it is using file descriptors, redirection, exec, and process substitution. This article (catonmat.net/blog/bash-one-liners-explained-part-three) was incredibly helpful to me.Galatians
@Galatians Have a look at this recent post: Duplicate output looping through multiple values in while loop bashLove
It would be worth mentioning BASH_XTRACEFD, which helps immensely when the time comes to analyze the log data. Indeed, it is very hard to untangle stderr and set -x output. BASH_XTRACEFD allows one to redirect set -x output to another file descriptor.Antirachitic
Neither date nor sed will work as written on macOS, as it uses BSD versions of both. For sed, -l (line buffering) might serve as a replacement for -u. However, there's no equivalent of what date -f does in the BSD version, nor a format for milliseconds or nanoseconds.Ashok
Jesus, I did the very first suggestion, just to see what's up, and my log was over 200,000 lines?Tuning
@Tuning Just better than nothing! You could use a lot of tools like grep and/or sed to help browsing your log.Love
and the winner is ... set -x and initial fork to long exec command. the overhead of calling date in PS4 is shockingSimultaneous
@MilaNautikus Yes, the long fork is interesting, but no! Depending on what you do, strace is kind of different, and replaying a script session is another way, another approachLove
Based on your answer I created a simple rcfile that can be used to profile the .bashrc: gist.github.com/T3sT3ro/72927cac8a3d1f74715748ffd6c56613 Just copy-paste the thing as instructed and run exec bash --rcfile ~/.bashrc.profiler — the trace with timings will be generated based on single-fork date method (method #3).Ramiform

It often helps to trace the system calls

strace -c -f ./script.sh

From the manual:

-c Count time, calls, and errors for each system call and report a summary on program exit.

-f Trace child processes ...

This is not exactly what you want, nor what a line-oriented profiler would show you, but it usually helps to find hot spots.
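To apply this to the original problem of slow interactive startup, a possible invocation (a sketch; the output file name is just an example) is:

# Summarize the system calls made while an interactive shell starts up and exits.
strace -c -f -o startup-syscalls.txt bash -ic 'exit'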

Hypodermic answered 16/2, 2011 at 10:25 Comment(0)

You may have a look at the trap command with the DEBUG condition. There is a way to set a command (or commands) to be executed along with your commands. See the comments on this answer.
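As an illustration (a sketch, not the code from this answer), a DEBUG trap can timestamp every command; this assumes Bash 5 for $EPOCHREALTIME, and the log path is arbitrary:

#!/bin/bash
# The DEBUG trap fires before every simple command, so each log line records the
# time at which the following command started; differences between successive
# timestamps give per-command durations.
exec 4>>"/tmp/debug-trap.$$.log"
trap 'printf "%s %s\n" "$EPOCHREALTIME" "$BASH_COMMAND" >&4' DEBUG

sleep .1
echo "hello"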

Pappus answered 16/2, 2011 at 11:1 Comment(3)
@Dennis Williamson: I haven't used it for a while, but the help on my system states that "If a SIGNAL_SPEC is DEBUG, ARG is executed after every simple command."Pappus
From Bash 4.0.33 help trap: "If a SIGNAL_SPEC is DEBUG, ARG is executed before every simple command." In Bash 3.2, it says "after". That's a typo. As of Bash 2.05b, it's run before. Reference: "This document details the changes between this version, bash-2.05b-alpha1, and the previous version, bash-2.05a-release. ... 3. New Features in Bash ... w. The DEBUG trap is now run before simple commands, ((...)) commands, [[...]] conditional commands, and for ((...)) loops." Testing in each version confirms that it's before.Usable
@Dennis Williamson: Ok, then that's the version I have. I fixed the answer :)Pappus

Add this at the front of the script:

N=`date +%s%N`; export PS4='+[$(((`date +%s%N`-$N)/1000000))ms][${BASH_SOURCE}:${LINENO}]: ${FUNCNAME[0]:+${FUNCNAME[0]}(): }';
exec 19>$HOME/logfile
BASH_XTRACEFD=19
set -x

The output file lists commands in millisecond scale:

$ tailf ~/logfile
++[389426ms][/home/subtleseeker/.iterm2_shell_integration.bash:96]: __bp_preexec_invoke_exec(): type -t preexec
++[389428ms][/home/subtleseeker/.iterm2_shell_integration.bash:113]: __bp_preexec_invoke_exec(): __bp_set_ret_value 0 /home/subtleseeker/.bashrc
++[389431ms][/home/subtleseeker/.iterm2_shell_integration.bash:1]: __bp_set_ret_value(): return 0
+[389433ms][:69]: tailf /home/subtleseeker/logfile
Ahimsa answered 31/8, 2020 at 4:42 Comment(1)
On what system was this tested (hardware, operating system, incl. version, etc.)? Perhaps add some information to your answer? (But without "Edit:", "Update:", or similar - the answer should appear as if it was written today.)Aggrieved

Here's a very dumb method if the startup is in the seconds range and a single slow command is the culprit. Run bash -x or bash -lx (whichever is slow), then keep hitting Enter as quickly as you can until Bash finishes initializing. Then scroll back and look for the command with the most empty lines after it. That's your slow command.

Manstopper answered 19/2, 2022 at 14:4 Comment(0)

Time, xtrace, bash -x, set -x and set +x (2.3. Debugging Bash scripts) remain the orthodox ways to debug a script.

Nevertheless, to broaden our horizons, or to have more precise control over resource usage, it is possible to look at some of the debugging and profiling systems available under Linux (here is one of the many lists from the internet): for example Valgrind, specifically for memory debugging, or sysprof, to profile the whole system:

For sysprof:

With sysprof, you can profile all the applications that are running on your machine, including a multithreaded or multiprocessed application...

And afterwards, select the branch of sub-processes that you find interesting.


For Valgrind:

With a bit more work, it seems possible to make programs that are usually installed from binaries (e.g., OpenOffice) visible to Valgrind.

The Valgrind FAQ states that Valgrind will profile child processes if explicitly requested.

... Although by default it only traces the top-level process, so if your program is started by a shell script, Perl script, or something similar, Valgrind will trace the shell, or the Perl interpreter, or equivalent...

It will do it with this option enabled:

 --trace-children=yes
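For example, a sketch (the choice of Callgrind and the script name are illustrative, not from the original answer):

# Profile a shell script and every process it spawns with Callgrind;
# one callgrind.out.<pid> file is written per traced process.
valgrind --tool=callgrind --trace-children=yes bash ./myscript.sh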

Unprovided answered 30/6, 2014 at 11:4 Comment(2)
Not the downvoter, but most of these tips, while cool, are not really relevant here. Asking an appropriate question and self-answering it is more welcome here—Google "stackoverflow self answers" for the relevant etiquette.Legere
The sysprof link is broken: "Ooops... Error 404". The GDB link may or may not be half-broken (manual redirect - "This page has moved here.").Aggrieved

This post by Alan Hargreaves describes a method of profiling a Bourne shell script using a DTrace provider. As far as I know, this works with Solaris and OpenSolaris (see: /bin/sh DTrace Provider).

So, given the following DTrace script (sh_flowtime.d on GitHub, based on the original):

#!/usr/sbin/dtrace -Zs
#pragma D option quiet
#pragma D option switchrate=10

dtrace:::BEGIN
{
        depth = 0;
        printf("%s %-20s  %-22s   %s %s\n", "C", "TIME", "FILE", "DELTA(us)", "NAME");
}

sh*:::function-entry
{
        depth++;
        printf("%d %-20Y  %-22s %*s-> %s\n", cpu, walltimestamp,
            basename(copyinstr(arg0)), depth*2, "", copyinstr(arg1));
}

sh*:::function-return
{
        printf("%d %-20Y  %-22s %*s<- %s\n", cpu, walltimestamp,
            basename(copyinstr(arg0)), depth*2, "", copyinstr(arg1));
        depth--;
}

sh*:::builtin-entry
{
        printf("%d %-20Y  %-22s %*s   > %s\n", cpu, walltimestamp,
            basename(copyinstr(arg0)), depth*2, "", copyinstr(arg1));
}

sh*:::command-entry
{
        printf("%d %-20Y  %-22s %*s   | %s\n", cpu, walltimestamp,
            basename(copyinstr(arg0)), depth*2, "", copyinstr(arg1));
}

you can trace the function flow including delta times.

Sample output:

# ./sh_flowtime.d
C TIME                  FILE                 DELTA(us)  -- NAME
0 2007 Aug 10 18:52:51  func_abc.sh                  0   -> func_a
0 2007 Aug 10 18:52:51  func_abc.sh                 54      > echo
0 2007 Aug 10 18:52:52  func_abc.sh            1022880      | sleep
0 2007 Aug 10 18:52:52  func_abc.sh                 34     -> func_b
0 2007 Aug 10 18:52:52  func_abc.sh                 44        > echo
0 2007 Aug 10 18:52:53  func_abc.sh            1029963        | sleep
0 2007 Aug 10 18:52:53  func_abc.sh                 44       -> func_c
0 2007 Aug 10 18:52:53  func_abc.sh                 43          > echo
0 2007 Aug 10 18:52:54  func_abc.sh            1029863          | sleep
0 2007 Aug 10 18:52:54  func_abc.sh                 33       <- func_c
0 2007 Aug 10 18:52:54  func_abc.sh                 14     <- func_b
0 2007 Aug 10 18:52:54  func_abc.sh                  7   <- func_a

Then, using the sort -nrk7 command, you can sort the output to show the most time-consuming calls.
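For example, a small sketch (file names are illustrative):

# Capture the flow trace to a file while the target script runs in another terminal,
# then sort by the DELTA(us) column (field 7) to see the most expensive calls first.
./sh_flowtime.d > flow.out
sort -nrk 7 flow.out | head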

I'm not aware of any provider probes available for other shells, so do some research (a GitHub search?), or, if you want to invest some time, you can write one based on the existing sh example (see How to activate sh DTrace Provider?).

Upholster answered 8/4, 2016 at 13:11 Comment(1)
The first blogs.oracle.com blog post link is (effectively) broken (redirects to the root, https://blogs.oracle.com/). The second blogs.oracle.com blog post link is broken: "Error 404--Not Found"Aggrieved
