Full command line as it was typed

Asked 20/3, 2009 at 19:11 Answered 1/7, 2024 at 18:47

python command-line

I want to get the full command line as it was typed.

This:

" ".join(sys.argv[:])

doesn't work here (deletes double quotes). Also I prefer not to rejoin something that was parsed and split.

Any ideas?

Phipps answered 20/3, 2009 at 19:11 Comment(3)

@S.Lott that is not what -1's are intended for. Quoting the help page on downvotes: "Use your downvotes whenever you encounter an egregiously sloppy, no-effort-expended post, or an answer that is clearly and perhaps dangerously incorrect." Now if you think the OP expended no effort on the question, that's a different matter altogether. But in my understanding, downvotes are not intended to target ignorance. – Pipes 14/6, 2017 at 17:50

Rejoining the argv[:] list isn't so bad if you use the right tool: shlex.join(sys.argv[:]) (docs). It may not be exactly the same string that was typed, but it ought to be equivalent. – Vaginate 21/10, 2022 at 17:56

@Vaginate shlex.join() was added in Python 3.8, which is why other answers do not mention it, since they predate it. Please add your comment as an answer; it is a very valid one. – Hephzipa 2/7, 2024 at 7:56

You're too late. By the time that the typed command gets to Python your shell has already worked its magic. For example, quotes get consumed (as you've noticed), variables get interpolated, etc.
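To make the quote consumption concrete, here is a minimal sketch that uses shlex.split to approximate what a POSIX shell does to a typed line before Python sees it. The typed string is a made-up example, and shlex.split only mimics word splitting and quote removal; it does not expand variables or globs the way a real shell would.

```python
import shlex

# Hypothetical line as a user might type it at a POSIX shell prompt.
typed = 'prog --bar " aha "   --name \'x y\''

# shlex.split approximates the shell's word splitting and quote removal;
# it does not perform variable expansion or globbing.
argv = shlex.split(typed)
print(argv)

# Joining with single spaces loses the quotes and the run of extra spaces,
# so the rejoined string differs from what was typed.
rejoined = " ".join(argv)
print(rejoined == typed)
```

The quotes and the extra spaces between arguments are gone by the time the list exists, which is why no rejoining can recover the exact typed text.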

Stirling answered 20/3, 2009 at 19:16 Comment(1)

Quotes are not consumed by Windows' CMD. – Indiana 10/9, 2018 at 21:5

*nix

Look at the initial stack layout (Linux on i386) that provides access to command line and environment of a program: the process sees only separate arguments.

You can't get the command line as it was typed in the general case. On Unix, the shell parses the command line into separate arguments and eventually calls a function such as execv(path, argv), which invokes the corresponding syscall. sys.argv is derived from the argv parameter passed to execve(). You could get something equivalent using " ".join(map(shlex.quote, sys.argv)), though you shouldn't need to: for example, if you want to restart the script with slightly different command-line parameters, sys.argv is enough in many cases; see Is it possible to set the python -O (optimize) flag within a script?
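A small sketch of the shlex.quote approach, with a round-trip check that the rebuilt string splits back into the same arguments. The function name reconstruct and the example_argv list are illustrative stand-ins for sys.argv, not part of any library.

```python
import shlex

def reconstruct(argv):
    """Return a POSIX-shell-quoted command string equivalent to argv."""
    return " ".join(map(shlex.quote, argv))

# Hypothetical argument list standing in for sys.argv.
example_argv = ["test.py", "--bar", " aha ", "--nar", "--prefix1 --prefix2", "it's"]
command = reconstruct(example_argv)
print(command)

# Round trip: splitting the reconstructed string yields the same arguments.
assert shlex.split(command) == example_argv
```

On Python 3.8 and later, shlex.join(argv) should give the same kind of string in a single call. The quoting style may differ from what was typed (single quotes instead of double quotes, for instance), but the arguments are preserved.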

There are some creative (non-practical) solutions:

attach to the shell using gdb and interrogate it (most shells are capable of repeating the same command twice); you should be able to get almost the same command as it was typed. Alternatively, read its history file directly if it is updated before your process exits
use screen, script utilities to get the terminal session
use a keylogger, to get what was typed.

Windows

On Windows the native CreateProcess() interface takes the command line as a single string, but python.exe still receives arguments as a list. subprocess.list2cmdline(sys.argv) might help to reverse the process. list2cmdline is designed for applications using the same rules as the MS C runtime, and python.exe is one of them. list2cmdline doesn't return the command line as it was typed, but it returns a functional equivalent in this case.
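As a rough illustration, list2cmdline is implemented in pure Python, so it can be tried on any platform. The example_argv list below is a made-up stand-in for sys.argv.

```python
import subprocess

# Hypothetical argument list standing in for sys.argv on Windows.
example_argv = ["script.py", "plain", "two words", 'say "hi"']

# Arguments containing spaces are wrapped in double quotes and embedded
# double quotes are backslash-escaped, following the MS C runtime rules.
command = subprocess.list2cmdline(example_argv)
print(command)
```

The result is a string that a program using the MS C runtime parsing rules should split back into the same arguments; it is not a record of the original typing.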

On Python 2, you might need GetCommandLineW() to get Unicode characters from the command line that can't be represented in the Windows ANSI codepage (such as cp1252).
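One way to reach GetCommandLineW() from Python is through ctypes. The sketch below is an assumption-laden illustration: the helper name raw_windows_command_line is invented here, and it returns None on non-Windows platforms. On Windows it should return the string the process was created with, which includes the interpreter itself.

```python
import sys

def raw_windows_command_line():
    """Return the process command line string on Windows, else None."""
    if sys.platform != "win32":
        return None
    import ctypes
    get_command_line = ctypes.windll.kernel32.GetCommandLineW
    get_command_line.argtypes = []
    get_command_line.restype = ctypes.c_wchar_p  # the API returns an LPWSTR
    return get_command_line()

print(raw_windows_command_line())
```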

Nonoccurrence answered 20/3, 2009 at 20:4 Comment(4)

It is OK for practical purposes. But if you want to get all double spaces or tabs, this will not work (this is just a good "reconstruction" of the command line, not the original). – Cassino 8/6, 2016 at 16:57

Usage is with one parameter. That means subprocess.list2cmdline(sys.argv) should be used. – Cassino 8/6, 2016 at 16:59

@A.Sommerh: 1- It works for double spaces and tabs. It is what the subprocess module uses to create a new process on Windows. Each command on Windows parses its own command line. python.exe uses the same rules as subprocess.list2cmdline(). 2- I've approved your edit. It turns my mention of the function name into a complete code example. I didn't mean to imply that subprocess.list2cmdline() is an actual call; the name suggests that you should pass a list and it returns a command line (as a string). – Nonoccurrence 8/6, 2016 at 19:20

@J-F-Sebastian, I meant: "if you want to get all double spaces or tabs of the original command line, subprocess.list2cmdline() will not work as expected, because some of them are lost when python constructs sys.argv". Good edit of your answer, by the way! – Cassino 8/6, 2016 at 23:42

In a Unix environment, this is not generally possible...the best you can hope for is the command line as passed to your process.

This is because the shell (essentially any shell) may munge the typed command line in several ways before handing it to the OS for execution.

Uhf answered 20/3, 2009 at 19:15 Comment(0)

As mentioned, this probably cannot be done, at least not reliably. In a few cases, you might be able to find a history file for the shell (e.g. - "bash", but not "tcsh") and get the user's typing from that. I don't know how much, if any, control you have over the user's environment.

Moncrief answered 20/3, 2009 at 20:40 Comment(2)

Even in the .history, some processing has been applied (! substitution at a minimum...). +1 for cleverness, though. Nice thought. – Uhf 20/3, 2009 at 22:46

I believe that Bash, by default, doesn't write to the .history file except at the end of a session, so this doesn't work either. – Miseno 21/3, 2009 at 1:27

On Linux there is /proc/<pid>/cmdline, which is in the format of argv[] (i.e. the arguments are separated by 0x00 bytes, and you can't really know how many strings there are since you don't get argc; though you will know it when the file runs out of data ;).

You can be sure that that commandline is already munged too since all escaping/variable filling is done and parameters are nicely packaged (no extra spaces between parameters, etc.).
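A minimal sketch of reading that file from Python, assuming a Linux-style /proc. The helper names parse_cmdline and read_cmdline are invented for illustration; read_cmdline returns None where the file cannot be opened.

```python
import os

def parse_cmdline(raw):
    """Split the NUL-separated bytes of a /proc/<pid>/cmdline file."""
    parts = raw.split(b"\0")
    # The data normally ends with a NUL, which leaves one empty trailing piece.
    if parts and parts[-1] == b"":
        parts.pop()
    return [os.fsdecode(p) for p in parts]

def read_cmdline(pid="self"):
    """Return the argument list for pid, or None where /proc is unavailable."""
    try:
        with open(f"/proc/{pid}/cmdline", "rb") as f:
            return parse_cmdline(f.read())
    except OSError:
        return None

print(read_cmdline())
```

Unlike sys.argv, the list for the current process should start with the interpreter and include any interpreter options, but it is still the already-parsed argument list rather than the typed text.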

Mele answered 21/3, 2009 at 1:9 Comment(3)

cmdline doesn't provide much in addition to sys.argv. – Nonoccurrence 8/6, 2016 at 20:30

Thanks. This saved my bacon when I wanted to inspect the command line in a .pth triggered script which runs before the (-m specified) module has been loaded (or found). – Balkanize 6/5, 2021 at 16:7

Use /proc/self/cmdline so you don't have to get the pid – Ramsey 23/6, 2022 at 18:5

You can use psutil that provides a cross platform solution:

import psutil
import os
my_process = psutil.Process( os.getpid() )
print( my_process.cmdline() )

If that's not what you're after you can go further and get the command line of the parent program(s):

my_parent_process = psutil.Process( my_process.ppid() )
print( my_parent_process.cmdline() )

The command line will still be split into its components, but unlike sys.argv they won't have been modified by the interpreter.
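If installing psutil is not an option, the standard library offers something similar for the current process: sys.orig_argv, assuming Python 3.10 or later. The sketch below falls back to sys.argv on older versions; the helper name interpreter_argv is made up for this example.

```python
import sys

def interpreter_argv():
    """Return the arguments the interpreter itself received, if available."""
    # sys.orig_argv exists on Python 3.10+; fall back to sys.argv otherwise.
    return list(getattr(sys, "orig_argv", sys.argv))

print(interpreter_argv())
```

As with psutil, this is still a list of already-split arguments, not the typed string.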

Modern answered 11/5, 2021 at 13:52 Comment(1)

psutil does not exist here. Doesn't seem like an easy solution. (Windows 10, Python 3) – Chondro 1/9, 2022 at 17:50

If you're on Linux, I'd suggest monkeying with the ~/.bash_history file or the shell history command, although I believe the command must finish executing before it's added to the shell history.

I started playing with:

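import os
import subprocess
# popen2 was removed in Python 3; subprocess.run replaces popen4 here
history = os.path.expanduser("~/.bash_history")
result = subprocess.run(["tail", history], capture_output=True, text=True)
print(result.stdout.splitlines())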

But I'm getting weird behavior where the shell doesn't seem to be completely flushing to the .bash_history file.

Fichtean answered 20/3, 2009 at 21:10 Comment(0)

I needed to replay a complex command line with multi-line arguments and values that look like options but which are not.

Combining an answer from 2009 and various comments, here is a modern Python 3 version that works quite well on Unix.

import sys
import shlex
print(sys.executable, " ".join(map(shlex.quote, sys.argv)))

Let's test:

$ cat << EOT > test.py
import sys
import shlex
print(sys.executable, " ".join(map(shlex.quote, sys.argv)))
EOT

then:

$ python test.py --foo 1 --bar " aha " --tar 'multi \
line arg' --nar '--prefix1 --prefix2'

prints:

/usr/bin/python test.py --foo 1 --bar ' aha ' --tar 'multi \
line arg' --nar '--prefix1 --prefix2'

Note that it got '--prefix1 --prefix2' quoted correctly and the multi-line argument too!

The only difference is the full python path.

That was all I needed.

Thank you for the ideas to make this work.

Update: here is a more advanced version that also replays the desired environment variables and wraps long output with shell line continuations, so that the output can be pasted straight into forums without manually breaking up long lines to avoid horizontal scrolling.

import os
import shlex
import sys
def get_orig_cmd(max_width=80, full_python_path=False):
    """
    Return the original command line string that can be replayed 
    nicely and wrapped for 80 char width
    Args:
        - max_width: the width to wrap for. defaults to 80
        - full_python_path: whether to replicate the full path 
          or just the last part (i.e. `python`). default to `False`
    """

    cmd = []

    # deal with critical env vars
    env_keys = ["CUDA_VISIBLE_DEVICES"]
    for key in env_keys:
        val = os.environ.get(key, None)
        if val is not None:
            cmd.append(f"{key}={val}")

    # python executable (not always needed if the script is executable)
    python = sys.executable if full_python_path else sys.executable.split("/")[-1]
    cmd.append(python)

    # now the normal args
    cmd += list(map(shlex.quote, sys.argv))

    # split up into up to MAX_WIDTH lines with shell multi-line escapes
    lines = []
    current_line = ""
    while len(cmd) > 0:
        current_line += f"{cmd.pop(0)} "
        if len(cmd) == 0 or len(current_line) + len(cmd[0]) + 1 > max_width - 1:
            lines.append(current_line)
            current_line = ""
    return "\\\n".join(lines)

print(get_orig_cmd())

Here is an example that this function produced:

CUDA_VISIBLE_DEVICES=0 python ./scripts/benchmark/trainer-benchmark.py \
--base-cmd \
' examples/pytorch/translation/run_translation.py --model_name_or_path t5-small \
--output_dir output_dir --do_train --label_smoothing 0.1 --logging_strategy no \
--save_strategy no --per_device_train_batch_size 32 --max_source_length 512 \
--max_target_length 512 --num_train_epochs 1 --overwrite_output_dir \
--source_lang en --target_lang ro --dataset_name wmt16 --dataset_config "ro-en" \
--source_prefix "translate English to Romanian: " --warmup_steps 50 \
--max_train_samples 2001 --dataloader_num_workers 2 ' \
--target-metric-key train_samples_per_second --repeat-times 1 --variations \
'|--fp16|--bf16' '|--tf32' --report-metric-keys 'train_loss train_samples' \
--table-format console --repeat-times 2 --base-variation ''

Note that this command is quite complex: one argument carries multiple arguments as its value, and that value spans multiple lines too.

Also note that this particular version doesn't rewrap single arguments - if any are longer than the requested width they remain unwrapped (by design).

Gulley answered 29/12, 2021 at 2:22 Comment(2)

Thank you so much for sharing your two implementations! I tested them both and they both work very well! I have no idea why you got downvoted, as your answer is by far the best one. You acknowledge the issue, but provide a working workaround that should solve 99% of the use cases for this exact problem. Yes, you cannot get the original commandline, but your answer shows we can recreate it very closely. Thank you again, this saved me days of headaches! – Hephzipa 30/6, 2024 at 21:14

Glad to hear you found it useful, @gaborous. Thank you. – Gulley 3/7, 2024 at 3:38

If you want to just add back the quotations, simply do the following:

'"' + " ".join(sys.argv[:]) + '"'

The shell will not change what you typed in the console. It simply parsed and split the string typed in. If you just want to get a string equal to the original typed-in string, you can just add the quotes and rejoin the string.

Kaule answered 1/7, 2024 at 18:47 Comment(1)

This won't be correct for arguments that contain " because the embedded quotes need to be escaped. Also backslashes and $ need to be escaped inside double quotes. – Kristopher 1/7, 2024 at 21:34

-1

Here's how you can do it from within the Python program to get back the full command string. Since the command-line arguments are already handled once before it's sent into sys.argv, this is how you can reconstruct that string.

import sys

commandstring = ''

for arg in sys.argv:
    # wrap arguments that contain a space in double quotes
    if ' ' in arg:
        commandstring += '"{}"  '.format(arg)
    else:
        commandstring += "{}  ".format(arg)

print(commandstring)

Example:

Invoking like this from the terminal,

./saferm.py sdkf lsadkf -r sdf -f sdf -fs -s "flksjfksdkfj sdfsdaflkasdf"

will give the same string in commandstring:

./saferm.py sdkf lsadkf -r sdf -f sdf -fs -s "flksjfksdkfj sdfsdaflkasdf"
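A quick round-trip check, sketched under assumptions, shows where space-only quoting can go wrong compared with shlex.quote. The helper names naive_quote and safe_quote and the argv list are hypothetical, and shlex.split stands in for the shell's parsing; a real shell would additionally expand $(...) inside double quotes.

```python
import shlex

def naive_quote(argv):
    """Wrap only arguments containing a space in double quotes."""
    return " ".join('"{}"'.format(a) if " " in a else a for a in argv)

def safe_quote(argv):
    """Quote every argument for a POSIX shell with shlex.quote."""
    return " ".join(shlex.quote(a) for a in argv)

# Hypothetical arguments: one with embedded double quotes, one with $(...).
argv = ["./something", 'foo "bar" baz', "$(echo hi)"]

# The naive form does not survive a round trip: the inner quotes are lost.
print(shlex.split(naive_quote(argv)) == argv)

# The shlex.quote form splits back into the original arguments.
print(shlex.split(safe_quote(argv)) == argv)

# Single quotes keep a real shell from running the command substitution.
print(safe_quote(["$(echo hi)"]))
```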

Riverine answered 11/5, 2018 at 4:12 Comment(5)

./something '$(rm -rf ~)' is very, very different from ./something "$(rm -rf ~)". – Sentience 19/7, 2018 at 17:0

Similarly, something 'foo "bar" baz' is different from something "foo" "bar" baz" – Sentience 19/7, 2018 at 17:1

' '.join(pipes.quote(x) for x in sys.argv) would be a safer alternative. – Sentience 19/7, 2018 at 17:30

shlex.quote rather than pipes.quote for the Py3 compatible version. – Tamer 20/1, 2021 at 7:17

This only works for strings that contain spaces. If your commandline included strings without spaces, such as paths, then it will fail and on Windows you will get mangled escaped characters (eg, \path\dir will be mangled with \p and \d escaped chars) – Hephzipa 1/7, 2024 at 22:19
