I want to get the full command line as it was typed.
This:
" ".join(sys.argv[:])
doesn't work here (deletes double quotes). Also I prefer not to rejoin something that was parsed and split.
Any ideas?
You're too late. By the time that the typed command gets to Python your shell has already worked its magic. For example, quotes get consumed (as you've noticed), variables get interpolated, etc.
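To see why the quotes are already gone, the shell's splitting step can be approximated with shlex.split, which follows POSIX-style quoting rules. This is only a rough stand-in for a real shell (it does not expand variables or globs), and the command string below is made up for illustration.

```python
import shlex

# Hypothetical command line, as a user might type it at a POSIX shell prompt.
typed = 'script.py --bar " aha " --name \'two words\''

# shlex.split applies POSIX-style quoting rules, roughly as a shell would when
# it builds argv. The quote characters themselves are consumed in the process.
argv = shlex.split(typed)
print(argv)
```

The resulting list is what a program would see in sys.argv; nothing in it records which quoting style was typed.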
Look at the initial stack layout (Linux on i386), which provides access to the command line and environment of a program: the process sees only separate arguments.
You can't get the command line as it was typed in the general case. On Unix, the shell parses the command line into separate arguments, and eventually the execv(path, argv) function, which invokes the corresponding syscall, is called. sys.argv is derived from the argv parameter passed to the execve() function. You could get something equivalent using " ".join(map(shlex.quote, sys.argv)), though you shouldn't need to: for example, if you want to restart the script with slightly different command-line parameters, then sys.argv is enough (in many cases); see "Is it possible to set the python -O (optimize) flag within a script?"
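A minimal sketch of what "equivalent" means here: the rebuilt string need not match what was typed, but splitting it again with POSIX rules should give back the same argument list. The helper name equivalent_command_line and the sample arguments are made up for illustration.

```python
import shlex

def equivalent_command_line(argv):
    """Quote each argument for a POSIX shell and join with spaces."""
    return " ".join(map(shlex.quote, argv))

# Hypothetical argument list, standing in for sys.argv.
sample_argv = ["test.py", "--bar", " aha ", "--nar", "--prefix1 --prefix2"]

rebuilt = equivalent_command_line(sample_argv)
print(rebuilt)

# Round trip: parsing the rebuilt string yields the original arguments.
print(shlex.split(rebuilt) == sample_argv)
```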
There are some creative (non-practical) solutions:
On Windows the native CreateProcess() interface takes the command line as a single string, but python.exe still receives arguments as a list. subprocess.list2cmdline(sys.argv) might help to reverse the process. list2cmdline is designed for applications using the same rules as the MS C runtime; python.exe is one of them. list2cmdline doesn't return the command line as it was typed, but it returns a functional equivalent in this case.
On Python 2, you might need GetCommandLineW() to get Unicode characters from the command line that can't be represented in the Windows ANSI codepage (such as cp1252).
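As a rough sketch of the two Windows options mentioned above: subprocess.list2cmdline is an undocumented helper in the subprocess module, so relying on it is an assumption, and the GetCommandLineW call is reached through ctypes only when running on Windows. The function names functional_equivalent and raw_windows_command_line are made up for illustration.

```python
import subprocess
import sys

def functional_equivalent(argv):
    """Rebuild a command-line string using MS C runtime quoting rules."""
    return subprocess.list2cmdline(argv)

def raw_windows_command_line():
    """Return the string the process was started with on Windows, else None."""
    if sys.platform != "win32":
        return None
    import ctypes
    get_command_line = ctypes.windll.kernel32.GetCommandLineW
    get_command_line.restype = ctypes.c_wchar_p
    return get_command_line()

# Arguments containing spaces are wrapped in double quotes.
print(functional_equivalent(["prog", "a b", "c"]))
print(raw_windows_command_line())
```

Note that the raw string includes the interpreter invocation itself, not just the script's arguments.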
subprocess.list2cmdline(sys.argv) should be used. – Cassino
subprocess module uses to create a new process on Windows. Each command on Windows parses its own command line. python.exe uses the same rules as subprocess.list2cmdline(). 2- I've approved your edit. It turns my mention of the function name into a complete code example. I didn't mean to imply that subprocess.list2cmdline() is an actual call; the name suggests that you pass a list and it returns a command line (as a string). – Nonoccurrence
subprocess.list2cmdline() will not work as expected, because some of them are lost when python constructs sys.argv". Good edit of your answer, by the way! – Cassino
In a Unix environment, this is not generally possible; the best you can hope for is the command line as passed to your process. That is because the shell (essentially any shell) may munge the typed command line in several ways before handing it to the OS for execution.
As mentioned, this probably cannot be done, at least not reliably. In a few cases, you might be able to find a history file for the shell (e.g. - "bash", but not "tcsh") and get the user's typing from that. I don't know how much, if any, control you have over the user's environment.
On Linux there is /proc/<pid>/cmdline, which is in the format of argv[] (i.e. there is a 0x00 byte between all the strings, and you can't really know how many strings there are since you don't get argc; though you will know it when the file runs out of data ;).
You can be sure that this command line is already munged too, since all escaping and variable filling is done and the parameters are nicely packaged (no extra spaces between parameters, etc.).
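Reading that file can be sketched as follows. This assumes a Linux-style /proc filesystem and uses /proc/self/cmdline so that no pid lookup is needed; the function names parse_cmdline and read_own_cmdline are made up for illustration.

```python
import os

def parse_cmdline(raw):
    """Split the NUL-separated bytes of a /proc/<pid>/cmdline file."""
    if not raw:
        return []
    # The kernel terminates the last argument with a NUL byte as well.
    if raw.endswith(b"\0"):
        raw = raw[:-1]
    return [os.fsdecode(part) for part in raw.split(b"\0")]

def read_own_cmdline(path="/proc/self/cmdline"):
    """Return this process's arguments, or None if /proc is unavailable."""
    try:
        with open(path, "rb") as f:
            return parse_cmdline(f.read())
    except OSError:
        return None

print(read_own_cmdline())
```

Unlike sys.argv, the result starts with the interpreter and includes any interpreter options, but it is still a list of already-parsed arguments.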
cmdline doesn't provide much in addition to sys.argv. – Nonoccurrence
.pth triggered script which runs before the (-m specified) module has been loaded (or found). – Balkanize
/proc/self/cmdline so you don't have to get the pid – Ramsey
You can use psutil, which provides a cross-platform solution:
import psutil
import os
my_process = psutil.Process( os.getpid() )
print( my_process.cmdline() )
If that's not what you're after you can go further and get the command line of the parent program(s):
my_parent_process = psutil.Process( my_process.ppid() )
print( my_parent_process.cmdline() )
The arguments will still be split into their components, but unlike sys.argv they won't have been modified by the interpreter.
If you're on Linux, I'd suggest monkeying with the ~/.bash_history file or the shell history command, although I believe the command must finish executing before it's added to the shell history.
I started playing with:
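import os
import subprocess
# popen2 was removed in Python 3; subprocess.run is the replacement
history = os.path.expanduser("~/.bash_history")
result = subprocess.run(["tail", history], capture_output=True, text=True)
print(result.stdout.splitlines())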
But I'm getting weird behavior where the shell doesn't seem to be completely flushing to the .bash_history file.
I needed to replay a complex command line with multi-line arguments and values that look like options but are not.
Combining an answer from 2009 and various comments, here is a modern Python 3 version that works quite well on Unix.
import sys
import shlex
print(sys.executable, " ".join(map(shlex.quote, sys.argv)))
Let's test:
$ cat << EOT > test.py
import sys
import shlex
print(sys.executable, " ".join(map(shlex.quote, sys.argv)))
EOT
then:
$ python test.py --foo 1 --bar " aha " --tar 'multi \
line arg' --nar '--prefix1 --prefix2'
prints:
/usr/bin/python test.py --foo 1 --bar ' aha ' --tar 'multi \
line arg' --nar '--prefix1 --prefix2'
Note that it got '--prefix1 --prefix2' quoted correctly and the multi-line argument too!
The only difference is the full python path.
That was all I needed.
Thank you for the ideas to make this work.
Update: here is a more advanced version that also replays selected env vars and wraps long output with shell line continuations, so that the output can be pasted straight into forums without manually breaking up long lines to avoid horizontal scrolling.
import os
import shlex
import sys
def get_orig_cmd(max_width=80, full_python_path=False):
    """
    Return the original command line string that can be replayed
    nicely and wrapped for 80 char width
    Args:
        - max_width: the width to wrap for. defaults to 80
        - full_python_path: whether to replicate the full path
          or just the last part (i.e. `python`). default to `False`
    """
    cmd = []

    # deal with critical env vars
    env_keys = ["CUDA_VISIBLE_DEVICES"]
    for key in env_keys:
        val = os.environ.get(key, None)
        if val is not None:
            cmd.append(f"{key}={val}")

    # python executable (not always needed if the script is executable)
    python = sys.executable if full_python_path else sys.executable.split("/")[-1]
    cmd.append(python)

    # now the normal args
    cmd += list(map(shlex.quote, sys.argv))

    # split up into up to MAX_WIDTH lines with shell multi-line escapes
    lines = []
    current_line = ""
    while len(cmd) > 0:
        current_line += f"{cmd.pop(0)} "
        if len(cmd) == 0 or len(current_line) + len(cmd[0]) + 1 > max_width - 1:
            lines.append(current_line)
            current_line = ""
    return "\\\n".join(lines)

print(get_orig_cmd())
Here is an example that this function produced:
CUDA_VISIBLE_DEVICES=0 python ./scripts/benchmark/trainer-benchmark.py \
--base-cmd \
' examples/pytorch/translation/run_translation.py --model_name_or_path t5-small \
--output_dir output_dir --do_train --label_smoothing 0.1 --logging_strategy no \
--save_strategy no --per_device_train_batch_size 32 --max_source_length 512 \
--max_target_length 512 --num_train_epochs 1 --overwrite_output_dir \
--source_lang en --target_lang ro --dataset_name wmt16 --dataset_config "ro-en" \
--source_prefix "translate English to Romanian: " --warmup_steps 50 \
--max_train_samples 2001 --dataloader_num_workers 2 ' \
--target-metric-key train_samples_per_second --repeat-times 1 --variations \
'|--fp16|--bf16' '|--tf32' --report-metric-keys 'train_loss train_samples' \
--table-format console --repeat-times 2 --base-variation ''
Note that it's quite complex: one argument has multiple arguments as its value, and it is multi-line too.
Also note that this particular version doesn't rewrap single arguments - if any are longer than the requested width they remain unwrapped (by design).
If you want to just add back the quotations, simply do the following:
'"' + " ".join(sys.argv[:]) + '"'
The shell will not change what you typed in the console. It simply parsed and split the string typed in. If you just want to get a string equal to the original typed-in string, you can just add the quotes and rejoin the string.
"
because the embedded quotes need to be escaped. Also backslashes and $ need to be escaped inside double quotes. – Kristopher
Here's how you can do it from within the Python program to get back the full command string. Since the command-line arguments are already handled once before they are sent into sys.argv, this is how you can reconstruct that string.
import sys

commandstring = ''
for arg in sys.argv:
    # Re-quote any argument that contains a space
    if ' ' in arg:
        commandstring += '"{}" '.format(arg)
    else:
        commandstring += "{} ".format(arg)
print(commandstring)
Example:
Invoking like this from the terminal,
./saferm.py sdkf lsadkf -r sdf -f sdf -fs -s "flksjfksdkfj sdfsdaflkasdf"
will give the same string in commandstring:
./saferm.py sdkf lsadkf -r sdf -f sdf -fs -s "flksjfksdkfj sdfsdaflkasdf"
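One caveat with quoting only on spaces is that arguments containing quote characters do not survive a round trip. The sketch below compares that approach with shlex.quote, using shlex.split as an approximation of POSIX shell parsing; the helper name naive_join and the sample arguments are made up for illustration.

```python
import shlex

def naive_join(argv):
    """Wrap only arguments that contain a space in double quotes."""
    parts = []
    for arg in argv:
        parts.append('"{}"'.format(arg) if " " in arg else arg)
    return " ".join(parts)

# Hypothetical arguments; the second one contains embedded double quotes.
argv = ["./something", 'foo "bar" baz']

naive = naive_join(argv)
safe = " ".join(map(shlex.quote, argv))

# Re-parse both strings and compare against the original argument list.
print(naive, "->", shlex.split(naive) == argv)
print(safe, "->", shlex.split(safe) == argv)
```

With the naive version the embedded quotes are swallowed when the string is parsed again, so the comparison fails; the shlex.quote version round-trips.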
./something '$(rm -rf ~)' is very, very different from ./something "$(rm -rf ~)". – Sentience
something 'foo "bar" baz' is different from something "foo" "bar" baz" – Sentience
' '.join(pipes.quote(x) for x in sys.argv) would be a safer alternative. – Sentience
shlex.quote rather than pipes.quote for the Py3 compatible version. – Tamer
shlex.join(sys.argv[:]) docs. It may not be exactly the same string that was typed, but it ought to be equivalent. – Vaginate
shlex.join() was added in Python 3.8, which is why other answers do not mention it, since they predate Python 3.8. Please add your comment as an answer, it is a very valid one. – Hephzipa
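A short sketch of the shlex.join suggestion, with a fallback for interpreters older than Python 3.8. The wrapper name join_argv is made up for illustration.

```python
import shlex
import sys

def join_argv(argv):
    """Return a shell-escaped string equivalent to the given argument list."""
    if sys.version_info >= (3, 8):
        return shlex.join(argv)
    # Fallback for older interpreters: quote each argument individually.
    return " ".join(shlex.quote(arg) for arg in argv)

print(join_argv(sys.argv))
```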