Simple Shell was a team project that I did with Aurelie Cedia at the end of the first trimester of Software Engineering at Holberton School. The goal of the project was to write a simple UNIX command interpreter. In this article I will describe, step by step, what happens when you type “ls -l” and hit Enter in a shell.
Some important details:
- Our program had to produce exactly the same output as sh (/bin/sh), including exactly the same error output.
- We were limited to a list of allowed functions and system calls. That is why some images show, for example, the function _strdup (which we wrote ourselves) instead of the standard strdup.
Read the command line.
The shell runs in an infinite loop. It is always waiting for the next command to be entered by the user. Once a command is entered, the standard input is read and put into a buffer using getline.
The getline function takes a buffer and fills it with the line it reads from the command line (standard input), allocating or growing the buffer when needed.
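A minimal sketch of this read step, assuming a hypothetical helper called read_command (the name and the newline stripping are ours, not taken from the project):

```c
#define _POSIX_C_SOURCE 200809L /* exposes getline */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * read_command - hypothetical helper, not the project's actual code.
 * Reads one line from stream into a heap buffer that the caller frees.
 * Returns NULL on end of file (for example Ctrl+D) or on a read error.
 */
char *read_command(FILE *stream)
{
    char *buffer = NULL; /* getline allocates the buffer when this is NULL */
    size_t size = 0;
    ssize_t length = getline(&buffer, &size, stream);

    if (length == -1)
    {
        free(buffer);
        return (NULL);
    }
    /* Drop the trailing newline so that "ls -l\n" becomes "ls -l". */
    if (length > 0 && buffer[length - 1] == '\n')
        buffer[length - 1] = '\0';
    return (buffer);
}
```

Inside the shell's infinite loop this would be called as read_command(stdin) right after the prompt is printed, and the loop would end when it returns NULL.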
Split the line into tokens.
Next the buffer gets tokenized, which means it is broken up into pieces. We use the strtok function for this. The result is a set of separate, easily readable strings that can be passed to commands.
The strtok function takes two arguments: the string that you want to split and a string that lists the delimiters to look for. In our case the delimiter is a space. strtok finds the first occurrence of the delimiter and overwrites it with a null byte \0. It then keeps track of the position right after the delimiter, so that the next call continues from there.
Strings are read until the null byte, so each token now behaves like a separate string. To go through all the tokens we use a variable token, which holds the current string until we are done working with it. Every later call parses the same string, so its first argument is NULL. Tokens need to be saved to be used later, so we use strdup to copy each one into newly allocated memory.
The first call to strtok returns a pointer to the first token found in the string, and each later call returns the next one. A null pointer is returned when there are no tokens left to retrieve.
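To make the in-place modification visible, here is a small sketch. The two helper names are hypothetical and exist only to separate the first strtok call from the later ones:

```c
#include <stddef.h>
#include <string.h>

/*
 * first_token - hypothetical helper for illustration only.
 * Makes the first strtok call on line and returns the first token.
 * line must be a writable array, because strtok overwrites the
 * delimiter that ends the token with '\0'.
 */
char *first_token(char *line)
{
    return (strtok(line, " "));
}

/*
 * next_token - hypothetical helper: continues in the same string.
 * Passing NULL tells strtok to resume right after the previous token.
 */
char *next_token(void)
{
    return (strtok(NULL, " "));
}
```

With char line[] = "ls -l", first_token(line) returns a pointer to line itself, and line[2], which held the space, now holds '\0'. The next call returns a pointer to "-l", and the call after that returns NULL.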
When the whole operation is done, both command line arguments, “ls” and “-l”, are saved. The next call to strtok returns NULL, so we know where to stop searching.
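The tokenizing steps described above can be sketched roughly as follows. This is not the project's code: split_line and MAX_TOKENS are hypothetical names, and the standard strdup stands in for the project's own _strdup.

```c
#define _POSIX_C_SOURCE 200809L /* exposes strdup */
#include <stdlib.h>
#include <string.h>

#define MAX_TOKENS 64 /* arbitrary limit chosen for this sketch */

/*
 * split_line - hypothetical helper, not the project's actual code.
 * Splits line on spaces and returns a NULL-terminated array of copies.
 * line is modified by strtok; the caller frees each copy and the array.
 */
char **split_line(char *line)
{
    char **tokens = malloc(sizeof(*tokens) * (MAX_TOKENS + 1));
    char *token;
    size_t count = 0;

    if (tokens == NULL)
        return (NULL);
    token = strtok(line, " ");
    while (token != NULL && count < MAX_TOKENS)
    {
        tokens[count] = strdup(token); /* keep a copy for later use */
        if (tokens[count] == NULL)
            break;
        count++;
        token = strtok(NULL, " ");
    }
    tokens[count] = NULL; /* end marker, the layout execve expects */
    return (tokens);
}
```

For the input “ls -l” the array holds “ls”, “-l” and a terminating NULL, which is the argument layout that execve expects later on.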
Check for alias.
We did not implement aliases in our project, but they are worth mentioning. Aliases are shortcuts to regular commands. They can be implemented with a data structure that pairs two strings, char *alias_name and char *real_name.
The char *alias_name is compared with the user’s command. If they are equal, the char *real_name is returned, so the alias is replaced with the real command. The result is then passed on to the parsing process.
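Since aliases were not part of the project, the following is only a sketch of one possible layout. The field names alias_name and real_name come from the description above; the type name alias_t and the function resolve_alias are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* One alias entry: the shortcut and the command it stands for. */
typedef struct alias_s
{
    char *alias_name;
    char *real_name;
} alias_t;

/*
 * resolve_alias - hypothetical lookup over a table of n entries.
 * Returns real_name when command matches an alias_name, otherwise
 * returns command unchanged.
 */
char *resolve_alias(const alias_t *table, size_t n, char *command)
{
    size_t i;

    for (i = 0; i < n; i++)
    {
        if (strcmp(table[i].alias_name, command) == 0)
            return (table[i].real_name);
    }
    return (command);
}
```

A linear scan over a small array is enough for illustration; a linked list would work the same way if aliases had to be added at run time.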
Check for built-ins.
In our project we implemented the exit built-in and a built-in that prints the environment. To check for them we use our own version of the strcmp function. strcmp takes two arguments: the first and the second string to be compared.
The first string is the user’s command and the second is the name of a built-in. A return value of 0 indicates that str1 is equal to str2.
The function print_env() prints the current environment.
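A rough sketch of the built-in check follows. It assumes the environment built-in is invoked as env, which the text above does not state, and it uses the standard strcmp in place of the project's own version. The enum, the name check_builtin and the signature given to print_env are all assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Codes returned by check_builtin; the names are hypothetical. */
enum builtin_id { NOT_BUILTIN, BUILTIN_EXIT, BUILTIN_ENV };

/*
 * check_builtin - compares the command with each built-in name.
 * Assumes the environment built-in is called "env" (an assumption).
 */
enum builtin_id check_builtin(const char *command)
{
    if (strcmp(command, "exit") == 0)
        return (BUILTIN_EXIT);
    if (strcmp(command, "env") == 0)
        return (BUILTIN_ENV);
    return (NOT_BUILTIN);
}

/*
 * print_env - prints every "NAME=value" string of env to out.
 * Returns the number of variables printed. This signature is an
 * assumption; the project's print_env() may take other arguments.
 */
int print_env(char *const env[], FILE *out)
{
    int count = 0;

    while (env[count] != NULL)
    {
        fprintf(out, "%s\n", env[count]);
        count++;
    }
    return (count);
}
```

If check_builtin returns NOT_BUILTIN, the shell moves on to the PATH search described next.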
Check for binary in PATH (/bin).
Next we look up the environment variable PATH. It lists all the directories in which the shell looks for executable files. We check whether the user’s command exists in one of the directories of PATH. If it does, we turn the command into its full path, which the shell can execute directly.
Our function _getenv returns a pointer to the string that holds the full PATH value. We store it in a variable that gets tokenized the same way as the command line.
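The project's _getenv is not shown here, so the following is only a guess at how such a lookup can work. The name env_lookup is a hypothetical stand-in, and the environment is passed in as an argument to keep the sketch self-contained.

```c
#include <stddef.h>
#include <string.h>

/*
 * env_lookup - hypothetical stand-in for the project's _getenv.
 * Scans env, an array of "NAME=value" strings ending with NULL, and
 * returns a pointer to the value of name, or NULL if name is not set.
 */
char *env_lookup(char *const env[], const char *name)
{
    size_t length = strlen(name);
    size_t i;

    for (i = 0; env[i] != NULL; i++)
    {
        /* Require '=' right after the name to avoid prefix matches. */
        if (strncmp(env[i], name, length) == 0 && env[i][length] == '=')
            return (env[i] + length + 1);
    }
    return (NULL);
}
```

In a real shell the array would typically be the global environ or the third argument of main. The check for '=' keeps a lookup of PATH from matching a longer name that merely starts with PATH.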
After that we split PATH into tokens. The first call splits the string at the first delimiter (a colon) and gives us the first element.
The following calls give us all the other elements of the PATH value.
We use our function _strdup to store all the elements. Then Simple Shell changes into each directory in PATH and searches inside it for the command.
A while loop runs until either the file is found in one of the directories or there are no directories left. After changing into a directory, stat is used to check whether the command is present there. If it is, the directory and the command name are concatenated into a full path.
Before concatenation: “ls”. After concatenation: “/bin/ls”.
When concatenation is done, the shell changes back to the directory it was in and passes the newly created argument to the execution phase.
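The search can be sketched as below. Note that this is a simplified variant, not the approach described above: instead of changing into each directory, it builds the candidate full path first and calls stat on it. The name find_in_path is hypothetical, and standard library functions stand in for the project's own versions.

```c
#define _POSIX_C_SOURCE 200809L /* exposes strdup */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/*
 * find_in_path - hypothetical helper, not the project's actual code.
 * Looks for command in each directory of path_value ("dir1:dir2:...").
 * Returns a heap-allocated full path such as "/bin/ls" that the caller
 * frees, or NULL if no directory contains the command.
 */
char *find_in_path(const char *command, const char *path_value)
{
    char *copy = strdup(path_value); /* strtok must not touch the original */
    char *dir, *full_path = NULL;
    struct stat info;
    size_t size;

    if (copy == NULL)
        return (NULL);
    for (dir = strtok(copy, ":"); dir != NULL; dir = strtok(NULL, ":"))
    {
        size = strlen(dir) + strlen(command) + 2; /* '/' and '\0' */
        full_path = malloc(size);
        if (full_path == NULL)
            break;
        snprintf(full_path, size, "%s/%s", dir, command);
        if (stat(full_path, &info) == 0)
            break; /* found: keep full_path */
        free(full_path);
        full_path = NULL;
    }
    free(copy);
    return (full_path);
}
```

Building the path before calling stat avoids having to change back to the original directory afterwards, at the cost of one allocation per candidate directory.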
Execution phase. Fork.
So we found “ls” in the /bin directory, and “/bin/ls” is passed on to execution. We create a child process using fork.
fork returns a number in both processes. In the parent process, the one that forked the child, it returns the process ID of the child, a number greater than 0. In the child process it returns 0. The parent process then waits for the child process to end using wait.
Why do we need to fork a child process?
When the function execve succeeds, the data of the calling process is lost. It is overwritten by the loaded program (text, data, stack). Using the value returned by fork, the shell can tell when it is running as the child process, so execve is called only in the child. The parent process remains untouched in its infinite loop and waits until the child process is done executing.
We use the function wait to wait for a state change in a child process and to obtain information about the child whose state has changed. When the child’s state changes, the parent process starts executing again. At this point “ls -l” has been executed and the child process has ended.
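The fork, execve and wait sequence can be sketched like this. The function name run_command and the exit code 127 used when execve fails are choices made for this sketch, not details taken from the project.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * run_command - hypothetical helper, not the project's actual code.
 * Forks a child, replaces the child with the program at argv[0], and
 * waits for it in the parent. Returns the child's exit status, or -1
 * if fork or wait fails or the child did not exit normally.
 */
int run_command(char *const argv[], char *const envp[])
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return (-1); /* fork failed, no child was created */
    if (pid == 0)
    {
        /* Child: on success execve never returns. */
        execve(argv[0], argv, envp);
        _exit(127); /* reached only if execve failed */
    }
    /* Parent: pid holds the child's process ID. */
    if (wait(&status) == -1)
        return (-1);
    if (WIFEXITED(status))
        return (WEXITSTATUS(status));
    return (-1);
}
```

For “ls -l” the argument array would be “/bin/ls”, “-l” and a terminating NULL. Only the child is replaced by the new program, so the parent returns from wait and can print the next prompt.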
Image 5. Output. The command “ls -l” was executed.
The parent process goes back to the beginning of the infinite loop. A command prompt will be printed and the shell will wait for another command. The process will start again.
The co-author of the project is Aurelie Cedia.