A command is basically a string. In general it can be split into two parts: the command's name and the command's arguments.
Example: `ls` is used for listing the contents of a directory:
user@computer:~$ ls
Documents Pictures Videos ...
The `ls` above is executed inside the home folder of a user. Here the argument saying which folder to list is implicit: without one, `ls` lists the current directory. We can also pass arguments explicitly:
user@computer:~$ ls Pictures
image1.jpg image2.jpg ...
Here I have explicitly told `ls` which folder's contents I'd like to see. We can add another argument, for example `-l`, for listing the details of each file and folder such as access permissions, size etc.:
user@computer:~$ ls -l Pictures
-rw-r--r-- 1 user user 215867 Oct 12 2014 image1.jpg
-rw-r--r-- 1 user user 268800 Jul 31 2014 image2.jpg
...
Oh, the sizes look really weird (`215867`, `268800`) because they are given in bytes. Let's add the `-h` flag for human-friendly output:
user@computer:~$ ls -l -h Pictures
-rw-r--r-- 1 user user 211K Oct 12 2014 image1.jpg
-rw-r--r-- 1 user user 263K Jul 31 2014 image2.jpg
...
Some commands allow their arguments to be combined (in the above case we might as well write `ls -lh` and get the same output). Arguments can also have short names (usually a single letter, sometimes a longer abbreviation) or long names (in the case of `ls` we have `-a` and `--all` for listing all files including hidden ones, with `--all` being the long name for `-a`). There are commands where the order of the arguments is very important, and others where it is not important at all.
For example, it doesn't matter if I use `ls -lh` or `ls -hl`. However, in the case of `mv` (moving/renaming files) you have less flexibility for the last two arguments, since the form is `mv [OPTIONS] SOURCE DESTINATION`.
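To make the "combined arguments" idea concrete, here is a minimal sketch of how `-lh` could be split back into `-l` and `-h`. The helper name `expandShortFlags` is made up for this answer, and it assumes that no short option takes an attached value (something like `-n5` would be split wrongly):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits a combined short-option token such as "-lh"
// into the separate options "-l" and "-h".
// Long options ("--all") and plain words ("Pictures") are passed through unchanged.
// Assumption: no short option carries an attached value.
std::vector<std::string> expandShortFlags(const std::vector<std::string>& tokens) {
    std::vector<std::string> result;
    for (const std::string& token : tokens) {
        const bool isShortGroup =
            token.size() > 2 && token[0] == '-' && token[1] != '-';
        if (!isShortGroup) {
            result.push_back(token);
            continue;
        }
        // Skip the leading '-' and emit one option per remaining letter.
        for (std::size_t i = 1; i < token.size(); ++i) {
            result.push_back(std::string("-") + token[i]);
        }
    }
    return result;
}
```

After this step `ls -lh` and `ls -hl` both contain the same two options, just in a different order, which is why the order doesn't matter for `ls`.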
In order to get a grip on commands and their arguments you can use `man` (example: `man ls`) or `info` (example: `info ls`).
In many languages, including C/C++, you have a way of parsing the command line arguments that the user has attached to the call of the executable (the command). There are also numerous libraries available for this task, since at its core it's actually not that easy to do it properly and at the same time support a large number of arguments and their varieties:
- `getopt`
- `argp_parse`
- `gflags`
- ...
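Just so you can see what such a library does for you, here is a rough sketch using POSIX `getopt` from `<unistd.h>` (so it assumes a Unix-like system). The `ListOptions` struct and the function `parseWithGetopt` are names I made up for an `ls`-like tool that only knows `-l` and `-h`:

```cpp
#include <unistd.h>  // getopt, optind, opterr (POSIX, not standard C++)

// Hypothetical settings for an ls-like tool that understands -l and -h.
struct ListOptions {
    bool longFormat = false;     // set by -l
    bool humanReadable = false;  // set by -h
    bool valid = true;           // false if an unknown option was seen
    int firstOperand = 0;        // index in argv of the first non-option argument
};

ListOptions parseWithGetopt(int argc, char* argv[]) {
    ListOptions options;
    optind = 1;  // start scanning at argv[1] (matters if getopt was used before)
    opterr = 0;  // keep getopt from printing its own error message
    int c;
    // "lh" declares two options without values: -l and -h.
    while ((c = getopt(argc, argv, "lh")) != -1) {
        switch (c) {
            case 'l': options.longFormat = true; break;
            case 'h': options.humanReadable = true; break;
            default:  options.valid = false; break;  // getopt returned '?'
        }
    }
    options.firstOperand = optind;
    return options;
}
```

You would call it from `main` as `parseWithGetopt(argc, argv)`. Note that `getopt` already handles combined options like `-lh` on its own.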
Every C/C++ application has a so-called entry point, which is basically where your code starts: the `main` function:
int main (int argc, char *argv[]) { // When you launch your application, the first line of code that is run is this one - entry point
    // Some code here
    return 0; // Exit code of the application - exit point
}
No matter if you use a library (like one of those I've mentioned above; but this is clearly not allowed in your case ;)) or do it on your own, your `main` function has two arguments:
- `argc` - the number of arguments, including the command name itself
- `argv` - an array of C strings holding those arguments (you can also see it written as `char** argv`, which means exactly the same thing in a parameter list).
NOTE: On many platforms `main` also accepts a third argument `char *envp[]`, which gives your command access to its environment variables. It is a common extension rather than part of the C/C++ standard, it is a more advanced thing, and I really don't think that it's required in your case.
The processing of command line arguments consists of two parts:
- Tokenizing - this is the part where each argument gets a meaning. It's the process of breaking your argument list into meaningful elements (tokens). In the case of `ls -l`, the `l` is not only a valid character but also a token in itself, since it represents a complete, valid argument.
Here is an example of how to output the number of arguments and the (unchecked for validity) strings that may or may not actually be arguments:
#include <iostream>
using std::cout;
using std::endl;
int main (int argc, char *argv[]) {
    // argc counts the command itself as well
    cout << "Arguments' count=" << argc << endl;
    // By convention the first argument is the command itself
    cout << "Command: " << argv[0] << endl;
    // For additional arguments we start from argv[1] and continue (if any)
    for (int i = 1; i < argc; i++) {
        cout << "arg[" << i << "]: " << argv[i] << endl;
    }
    cout << endl;
    return 0;
}
- Parsing - after acquiring the tokens (arguments and their values) you need to check if your command supports these. For example:
user@computer:~$ ls -y
will return
ls: invalid option -- 'y'
Try 'ls --help' for more information.
This is because the parsing has failed. Why? Because `y` (and `-y` respectively) is an unknown argument for the `ls` command. Note that `-`, `--`, `:` etc. are not required; it's up to your argument parsing whether you want that stuff there or not. On Unix/Linux systems this is a sort of convention, but you are not bound to it.
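A minimal sketch of such a check could look like the following. The function name `findUnknownOption` and the idea of keeping the supported options in a `std::set` are just my choices for illustration, and I assume here that everything starting with `-` is meant to be an option:

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Returns the first token that starts with '-' but is not in `known`,
// or an empty string if every option is recognized.
// Assumption: anything that starts with '-' (and is longer than "-") is an option.
std::string findUnknownOption(const std::vector<std::string>& args,
                              const std::set<std::string>& known) {
    for (std::size_t i = 1; i < args.size(); ++i) {  // args[0] is the command
        const std::string& arg = args[i];
        const bool looksLikeOption = arg.size() > 1 && arg[0] == '-';
        if (looksLikeOption && known.count(arg) == 0) {
            return arg;
        }
    }
    return "";
}
```

If the result is not empty you can print an error in the spirit of the `ls` message above and exit with a non-zero code.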
For each argument (if successfully recognized as such) you trigger some sort of change in your application. You can use an `if-else`, for example, to check whether a certain argument is valid and what it does, and then change whatever that argument is supposed to change in the execution of the rest of your code. You can go the old C-style or the C++-style:
* `if (strcmp(argv[1], "x") == 0) { ... }` - compare the contents of the two C strings (needs `<cstring>`; a plain `argv[1] == "x"` would only compare pointer values)
* `if (std::string(argv[1]) == "x") { ... }` - convert to string and then compare
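Both styles wrapped into tiny functions, so you can see that they give the same answer. The names `isFlagC` and `isFlagCpp` are made up for this sketch:

```cpp
#include <cstring>
#include <string>

// C style: strcmp walks both strings and returns 0 when their contents are equal.
bool isFlagC(const char* arg, const char* expected) {
    return std::strcmp(arg, expected) == 0;
}

// C++ style: build a std::string first, then use its operator==.
bool isFlagCpp(const char* arg, const char* expected) {
    return std::string(arg) == expected;
}
```

In `main` you would only call either of them after checking `argc > 1`, for example `isFlagC(argv[1], "x")`.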
I actually like (when not using a library) to convert `argv` to a `std::vector` of strings like this:
std::vector<std::string> args(argv, argv+argc);
for (size_t i = 1; i < args.size(); ++i) {
    if (args[i] == "x") {
        // Handle x
    }
    else if (args[i] == "y") {
        // Handle y
    }
    // ...
}
The `std::vector<std::string> args(argv, argv+argc);` part is just an easier C++-ish way to handle the array of strings, since `char *` is a C-style string (with `char *argv[]` being an array of such strings) which can easily be converted to a C++ string, that is `std::string`. We fill the vector by giving it the starting address of `argv` and the address one past its last element, namely `argv + argc` (adding `argc` to the base address of `argv` moves past all `argc` strings, so the range covers the whole array).
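The conversion wrapped into a small function (the name `toArgs` is mine), so it's easy to check that the vector ends up with exactly `argc` elements:

```cpp
#include <string>
#include <vector>

// Copies argv[0] .. argv[argc - 1] into a vector of std::string.
// argv + argc only marks the end of the range and is never read.
std::vector<std::string> toArgs(int argc, char* argv[]) {
    return std::vector<std::string>(argv, argv + argc);
}
```

Called from `main` as `toArgs(argc, argv)`, the result has `args.size() == argc` and `args[0]` is the command itself.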
Inside the `for` loop above you can see that I check (using a simple `if-else`) whether a certain argument is present and, if so, handle it accordingly. A word of caution: with such a loop the order of the arguments doesn't matter. As I've mentioned at the beginning, some commands actually have a strict order for some or all of their arguments. You can handle this in a different way by manually accessing each element of `args` (or `argv` if you use the initial `char* argv[]` and not the vector solution):
// No for loop!
if (args[1] == "x") {
    // Handle x
}
// A separate if, so position 2 is checked no matter what was found at position 1
if (args[2] == "y") {
    // Handle y
}
// ...
This makes sure that at position `1` only `x` will be expected etc. The problem with this is that you can shoot yourself in the foot by going out of bounds with the indexing, so you have to make sure that your index stays within the range set by `argc`:
if (argc == 3) {
    if (args[1] == "x") {
        // Handle x
    }
    if (args[2] == "y") {
        // Handle y
    }
}
The check makes sure you have content at exactly index `1` and `2` and nothing beyond, so neither access can go out of bounds.
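As a sketch of strictly positional arguments in the spirit of `mv SOURCE DESTINATION` (without options), here is a bounds-checked version. `MoveRequest` and `parseMove` are hypothetical names, and I assume the command takes exactly two positional arguments:

```cpp
#include <string>
#include <vector>

// Hypothetical result type for a command of the form: mycommand SOURCE DESTINATION
struct MoveRequest {
    std::string source;
    std::string destination;
};

// Fills `request` and returns true only when exactly two positional
// arguments follow the command name; otherwise leaves `request` untouched.
bool parseMove(const std::vector<std::string>& args, MoveRequest& request) {
    if (args.size() != 3) {  // args[0] is the command, then SOURCE and DESTINATION
        return false;
    }
    request.source = args[1];       // position 1 is always the source
    request.destination = args[2];  // position 2 is always the destination
    return true;
}
```

Because the size is checked before any indexing, a call with too few or too many arguments is rejected instead of reading past the end of the vector.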
Last but not least, the handling of each argument is totally up to you. You can use boolean flags that are set when a certain argument is detected (example: `if (args[i] == "x") { xFound = true; }` and later on in your code do something based on the value of the `bool xFound`), numerical types if the argument is a number OR consists of a number along with the argument's name (example: `mycommand -x=4` has an argument `-x=4` which you can additionally parse as `x` and `4`, the latter being the value of `x`) etc. Based on the task at hand you can go crazy and add an insane amount of complexity to your command line arguments.
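Putting the boolean flag and the `-x=4` idea together, a minimal sketch could look like this. The `Settings` struct, the `-v` flag and the function `parseSettings` are invented for the example; only the `-x=<number>` form comes from the text above:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

struct Settings {
    bool verbose = false;  // hypothetical boolean flag, set by "-v"
    int x = 0;             // numeric option, set by "-x=<number>"
    bool ok = true;        // false if something could not be parsed
};

Settings parseSettings(const std::vector<std::string>& args) {
    Settings settings;
    const std::string prefix = "-x=";
    for (std::size_t i = 1; i < args.size(); ++i) {  // args[0] is the command
        const std::string& arg = args[i];
        if (arg == "-v") {
            settings.verbose = true;
        } else if (arg.compare(0, prefix.size(), prefix) == 0) {
            // Everything after "-x=" is the value of x.
            try {
                settings.x = std::stoi(arg.substr(prefix.size()));
            } catch (const std::exception&) {
                settings.ok = false;  // not a number, or out of range for int
            }
        } else {
            settings.ok = false;  // unknown argument
        }
    }
    return settings;
}
```

Later in your code you simply look at `settings.verbose` and `settings.x` instead of touching `argv` again.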
Hope this helps. Let me know if something is unclear or you need more examples.