Welcome to Mish: A Simple Bash Clone Developed for the 42 Berlin School Minishell Project.
Written by migmanu and SebasNadu, in accordance with version 7.1 of the subject. This project does not cover the bonus part. In total, it took us around four weeks to finish.
To test our `minishell`, Mish, follow these steps on both Linux and macOS:
- Clone and build the repository:

      git clone https://github.com/your-username/your-repo.git
      cd your-repo
      make

  Wait until `libft` is cloned and built.
- Run Mish:

      ./minishell

Now you're ready to explore Mish! If you encounter any issues or have questions, feel free to reach out to us.
The `minishell` project requires students to develop a simple clone of Bash. It is also the first group project of the 42 Core Curriculum. From these two statements, we can easily identify the main challenges and goals of the task:
- Acquire a deep understanding of Bash inner workings.
- Learn how lexical and syntactic analysis works, which is indispensable also for compilers, interpreters, and programming languages.
- Coordinate work with your teammate.
We believe that without properly addressing and planning for these challenges from the very beginning, `minishell` will quickly turn into what a lot of students call "minihell."
Released in 1989, Bash has by now acquired almost mythical status. It is a universal tool that every programmer should be at least acquainted with, if not fully comfortable with. As a result, there are plenty of sources at your disposal. If you are a 42 student, we advise you to search Slack for resources provided by fellow students over the years. Be aware that the `minishell` subject has changed over time, so not all advice is still relevant.
Here is a list of some of the most mentioned and useful links we found online:
- Minishell: Building a mini-bash (a @42 project) | tutorial
- List of edge cases to test for
- Another list of edge cases and expected behaviors | some are not expected by the project subject
- High level explanation of the project | has good sources
- 42 Slack chat about how to implement history
- Really in-depth video series explanation of how the shell works | watch at 1.25x
- Shell (computing) - Wikipedia
- Read-eval-print loop | Basic paradigm behind CLI
- Writing your own shell | Extensive explanation of a shell written in C++
- Shell syntax
- Write your own shell | paper on how to write a shell which might apply better for a non-bonus minishell.
- A lot of cool resources in one place.
In our experience, no amount of reading will actually prepare you for some of Bash's more obscure behaviors. And although a lot of them are out of the scope of the project, there's still plenty that needs to be considered, even if you do not go for the bonus. We encountered a lot of these difficult cases while testing our almost finished project. This is the most stressful way of discovering them. Luckily, by then our program was robust enough to withstand most of them.
Nevertheless, there are two online spreadsheets that cover a huge number of cases to test for. You can find them linked above. It's a good idea to go through most, if not all, of them at least once to test your `minishell`. Reading them before you start coding could also prevent some issues down the line. But be aware that a lot of the cases in these spreadsheets are out of scope, apply only to the bonus, or are outright wrong. Do not follow them blindly; test each one in Bash yourself.
There was no great complexity regarding teamwork. Given the small size of the team and our generous time availability, we went through a simple workflow.
For this you must have a basic understanding of Git, including its advantages and dangers. There are many easy-to-find resources online. We can recommend this [game](https://learngitbranching.js.org/) for a fun approach.
Each change or bug fix was done inside a branch of this repository, pushed, reviewed by the other author, and merged. We could make this work because we communicated a lot, either through Slack or coding side by side.
Regardless of how you choose to organize work, you will need to communicate as much as possible and try to understand what's going on even inside the files you have not written.
After you've read the listed sources, you'll learn that Bash is composed of five distinct parts: the scanner (or lexical scanner), the tokenizer, the expander, the parser, and the executor. Tokens are put into a tree, which is then executed in the appropriate order. This should probably be your layout if you are aiming for the bonus.
- The scanner: In charge of reading through the input command. It deals with problems such as syntax.
- The tokenizer: Divides the input command into tokens, the basic units of execution. We classify tokens as operators (like the pipe symbol), redirections, or words, which can be commands and/or arguments.
- The expander: Expands expressions introduced by `$`, such as environment variables.
- The parser: Parses the tokens and creates an AST (abstract syntax tree).
- The executor: Manages forks and calls to commands and built-in commands (like `cd`). It must be well designed for piping to work exactly as in Bash. Pay attention to cases like `cat | cat | ls`.
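
To make the tokenizer step more concrete, here is a minimal sketch in C. It is not Mish's actual code: the type names (`t_token`, `t_tok_type`) and the function `tokenize` are hypothetical, and the sketch assumes that quotes only group characters into a single word (quote removal and expansion would happen in later stages).

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token categories; the names are illustrative only. */
typedef enum e_tok_type { TOK_WORD, TOK_PIPE, TOK_REDIR } t_tok_type;

typedef struct s_token
{
    t_tok_type      type;
    char            *text;
    struct s_token  *next;
}   t_token;

/* Length of the operator starting at s, or 0 if s does not start one. */
static size_t operator_len(const char *s)
{
    if ((s[0] == '<' && s[1] == '<') || (s[0] == '>' && s[1] == '>'))
        return 2;
    if (s[0] == '|' || s[0] == '<' || s[0] == '>')
        return 1;
    return 0;
}

/* Length of the word starting at s; quoted spans stay inside the word. */
static size_t word_len(const char *s)
{
    size_t i = 0;
    char   quote = 0;

    while (s[i] && (quote
            || (!isspace((unsigned char)s[i]) && !operator_len(s + i))))
    {
        if (quote && s[i] == quote)
            quote = 0;
        else if (!quote && (s[i] == '\'' || s[i] == '"'))
            quote = s[i];
        i++;
    }
    return i;
}

/* Allocates a token holding a copy of the len bytes starting at start. */
static t_token *new_token(t_tok_type type, const char *start, size_t len)
{
    t_token *tok = malloc(sizeof(*tok));

    if (!tok)
        return NULL;
    tok->text = malloc(len + 1);
    if (!tok->text)
        return (free(tok), NULL);
    memcpy(tok->text, start, len);
    tok->text[len] = '\0';
    tok->type = type;
    tok->next = NULL;
    return tok;
}

void free_tokens(t_token *tok)
{
    while (tok)
    {
        t_token *next = tok->next;
        free(tok->text);
        free(tok);
        tok = next;
    }
}

/* Splits line into a linked list of tokens; returns NULL if the line
 * holds no tokens or if an allocation fails. */
t_token *tokenize(const char *line)
{
    t_token *head = NULL;
    t_token **tail = &head;

    while (*line)
    {
        t_tok_type type = TOK_WORD;
        size_t     len;

        while (isspace((unsigned char)*line))
            line++;
        if (!*line)
            break;
        len = operator_len(line);
        if (len)
            type = (*line == '|') ? TOK_PIPE : TOK_REDIR;
        else
            len = word_len(line);
        *tail = new_token(type, line, len);
        if (!*tail)
            return (free_tokens(head), NULL);
        tail = &(*tail)->next;
        line += len;
    }
    return head;
}
```

Keeping the quote characters inside the word token is a deliberate choice in this sketch: it lets a later expander decide whether a `$` sits inside single quotes before the quotes are stripped.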
To this you obviously have to add the built-in commands the subject requires. These are:
- cd
- echo
- pwd
- export
- unset
- env
- exit
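
A small sketch of how an executor might recognize these built-ins before deciding whether to fork. The function name `is_builtin` is hypothetical and not taken from Mish's source.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if name is one of the built-ins required by the subject,
 * 0 otherwise (including when name is NULL). */
int is_builtin(const char *name)
{
    static const char *const builtins[] = {
        "cd", "echo", "pwd", "export", "unset", "env", "exit"
    };
    size_t i;

    if (!name)
        return 0;
    for (i = 0; i < sizeof(builtins) / sizeof(builtins[0]); i++)
        if (strcmp(name, builtins[i]) == 0)
            return 1;
    return 0;
}
```

The check matters because built-ins such as `cd`, `export`, `unset`, and `exit` change the state of the shell itself. When they are not part of a pipeline, they have to run in the shell process rather than in a forked child, or their effect is lost when the child exits.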
But things can be simplified if you choose not to do the bonus. Mish does not use a tree structure to create an AST; instead, it employs a list of commands. This structure not only encompasses the entire command created after parsing the tokens but also handles redirections. If you aim to develop a more sophisticated shell and complete the bonus, an AST is the way to go; otherwise, a simple list suffices for the task.
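
As an illustration of the list-of-commands idea, here is a minimal sketch in C. The struct layout and the function `build_cmd_list` are hypothetical, not Mish's real data structures. The sketch assumes the input is already a NULL-terminated array of token strings, that syntax has been validated beforehand, and it leaves out heredocs for brevity.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_ARGS 64

/* One node per command in a pipeline (illustrative layout). */
typedef struct s_cmd
{
    const char      *argv[MAX_ARGS + 1]; /* NULL-terminated arguments */
    int             argc;
    const char      *infile;             /* target of "<", or NULL */
    const char      *outfile;            /* target of ">" or ">>", or NULL */
    int             append;              /* 1 if outfile came from ">>" */
    struct s_cmd    *next;               /* next command after a "|" */
}   t_cmd;

/* Groups tokens into a list of commands, starting a new node at each
 * "|" and storing redirection targets on the current node. The nodes
 * point into the tokens array; they do not copy the strings. */
t_cmd *build_cmd_list(const char **tokens)
{
    t_cmd  *head = calloc(1, sizeof(*head));
    t_cmd  *cur = head;
    size_t i = 0;

    while (cur && tokens[i])
    {
        if (strcmp(tokens[i], "|") == 0)
        {
            cur->next = calloc(1, sizeof(*cur));
            cur = cur->next;
        }
        else if (strcmp(tokens[i], "<") == 0 && tokens[i + 1])
            cur->infile = tokens[++i];
        else if ((strcmp(tokens[i], ">") == 0
                || strcmp(tokens[i], ">>") == 0) && tokens[i + 1])
        {
            cur->append = (tokens[i][1] == '>');
            cur->outfile = tokens[++i];
        }
        else if (cur->argc < MAX_ARGS)
            cur->argv[cur->argc++] = tokens[i];
        i++;
    }
    return head;
}

void free_cmd_list(t_cmd *cmd)
{
    while (cmd)
    {
        t_cmd *next = cmd->next;
        free(cmd);
        cmd = next;
    }
}
```

For the tokens of `cat < in.txt | grep x >> out.txt`, this yields two nodes: the first with `argv = {"cat"}` and `infile = "in.txt"`, the second with `argv = {"grep", "x"}`, `outfile = "out.txt"`, and `append = 1`. The executor can then walk the list from left to right without any tree traversal.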
There are a lot of edge cases the initial parsing has to take into consideration. Those spreadsheets certainly came in handy. It was also challenging to design the correct behavior for pipes. Be especially attentive to blocking commands like `cat` and how they interact with different types of commands like `ls` or `head`. Pipes are a somewhat abstract concept, so a lot of trial and error went into solving this part.
You might also notice that we used a hash map for storing the environment variables. Although this involved some extra work at the beginning, it ended up simplifying a lot of our work. It certainly saved us from bugs in some unanticipated edge cases. Though it is not necessary to develop a hash map, we highly recommend trying it out.
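
For readers who have not built one before, here is a minimal sketch of a hash map for environment variables in C. It is not the hash map used in Mish: the names (`t_env_map`, `env_set`, `env_get`, `env_unset`), the fixed bucket count, and the djb2-style hash function are all assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

#define ENV_BUCKETS 64

typedef struct s_env_entry
{
    char                *key;
    char                *value;
    struct s_env_entry  *next;  /* next entry in the same bucket */
}   t_env_entry;

typedef struct s_env_map
{
    t_env_entry *buckets[ENV_BUCKETS];
}   t_env_map;

static char *dup_str(const char *s)
{
    size_t n = strlen(s) + 1;
    char   *copy = malloc(n);

    if (copy)
        memcpy(copy, s, n);
    return copy;
}

/* djb2-style string hash, reduced to a bucket index. */
static unsigned long hash_key(const char *key)
{
    unsigned long h = 5381;

    while (*key)
        h = h * 33 + (unsigned char)*key++;
    return h % ENV_BUCKETS;
}

/* Returns the stored value for key, or NULL if key is not set. */
const char *env_get(const t_env_map *map, const char *key)
{
    const t_env_entry *e = map->buckets[hash_key(key)];

    while (e && strcmp(e->key, key) != 0)
        e = e->next;
    return e ? e->value : NULL;
}

/* Inserts or overwrites key. Returns 0 on success, -1 on allocation failure. */
int env_set(t_env_map *map, const char *key, const char *value)
{
    unsigned long i = hash_key(key);
    t_env_entry   *e = map->buckets[i];
    char          *copy = dup_str(value);

    if (!copy)
        return -1;
    while (e && strcmp(e->key, key) != 0)
        e = e->next;
    if (e)
    {
        free(e->value);
        e->value = copy;
        return 0;
    }
    e = malloc(sizeof(*e));
    if (!e)
        return (free(copy), -1);
    e->key = dup_str(key);
    if (!e->key)
        return (free(copy), free(e), -1);
    e->value = copy;
    e->next = map->buckets[i];
    map->buckets[i] = e;
    return 0;
}

/* Removes key if present; does nothing otherwise. */
void env_unset(t_env_map *map, const char *key)
{
    t_env_entry **link = &map->buckets[hash_key(key)];
    t_env_entry *e;

    while (*link && strcmp((*link)->key, key) != 0)
        link = &(*link)->next;
    e = *link;
    if (!e)
        return;
    *link = e->next;
    free(e->key);
    free(e->value);
    free(e);
}
```

With this shape, `export`, `unset`, and variable expansion each reduce to a single call (`env_set`, `env_unset`, `env_get`), instead of scanning and rebuilding an array of `KEY=value` strings. A zero-initialized `t_env_map` is a valid empty map.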
In the spirit of not spoiling the project for anyone, we are not going to explain in detail how Mish works. If you have any doubts or need help, you can reach us on Slack at `jmigoya-` and `johnavar`.