9. Introducing the Shell

Questions:

  • “What is a command shell and why would I use one?”
  • “How can I move around on my computer?”
  • “How can I see what files and directories I have?”
  • “How can I specify the location of a file or directory on my computer?”

Objectives:

  • “Describe key reasons for learning the shell.”
  • “Navigate your file system using the command line.”
  • “Access and read help files for bash programs and use help files to identify useful command options.”
  • “Demonstrate the use of tab completion, and explain its advantages.”

Keypoints:

  • “The shell gives you the ability to work more efficiently by using keyboard commands rather than a GUI.”
  • “Useful commands for navigating your file system include: ls, pwd, and cd.”
  • “Most commands take options (flags) which begin with a -.”
  • “Tab completion can reduce errors from mistyping and make work more efficient in the shell.”

9.1. What is a shell and why should I care?

A shell is a computer program that presents a command line interface. It lets you control your computer by typing commands at a keyboard, instead of operating graphical user interfaces (GUIs) with a mouse and keyboard.

There are many reasons to learn about the shell:

  • Many bioinformatics tools can only be used through a command line interface, or have extra capabilities in the command line version that are not available in the GUI.
  • The shell makes your work more reproducible. When you carry out your work on the command line (rather than in a GUI), your computer keeps a record of every step that you’ve carried out, which you can use to redo your work when you need to. It also gives you a way to communicate unambiguously what you’ve done, so that others can check your work or apply your process to new data.
  • Many bioinformatic tasks require large amounts of computing power and can’t realistically be run on your own machine. These tasks are best performed using remote computers or cloud computing, which can only be accessed through a shell.

9.3. Summary

We now know how to move around our file system using the command line. This gives us an advantage over interacting with the file system through a GUI: it lets us work on a remote server, carry out the same set of operations on a large number of files quickly, and use bioinformatics software that is only available in command line versions.

Keypoints:

  • “The /, ~, and .. characters represent important navigational shortcuts.”
  • “Hidden files and directories start with . and can be viewed using ls -a.”
  • “Relative paths specify a location starting from the current location, while absolute paths specify a location from the root of the file system.”

9.4. Moving around the file system

We’ve learned how to use pwd to find our current location within our file system. We’ve also learned how to use cd to change locations and ls to list the contents of a directory. Now we’re going to learn some additional commands for moving around within our file system.

Use the commands we’ve learned so far to navigate to the shell_data/untrimmed_fastq directory, if you’re not already there.

$ cd
$ cd shell_data
$ cd untrimmed_fastq

What if we want to move back up and out of this directory and to our top level directory? Can we type cd shell_data? Try it and see what happens.

$ cd shell_data
-bash: cd: shell_data: No such file or directory

Your computer looked for a directory or file called shell_data within the directory you were already in. It didn’t know you wanted to look at a directory level above the one you were located in.

We can give cd a special directory name, .., to tell the computer to move us back or up one directory level.

$ cd ..

Now we can use pwd to make sure that we are in the directory we intended to navigate to, and ls to check that the contents of the directory are correct.

$ pwd
/home/sateeshp/shell_data
$ ls
sra_metadata  untrimmed_fastq

From this output, we can see that .. did indeed take us back one level in our file system.

You can chain these together like so:

$ ls ../../

This prints the contents of /home, which is two levels up from shell_data and one level up from your home directory.
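
If you would like to experiment with .. without touching your real files, the following is a minimal sketch that rebuilds the lesson's directory names inside a temporary directory. The variable names (scratch, one_up, listing, two_up) are made up for this illustration, and the temporary location will differ on every machine.

```shell
# Build a throwaway copy of the lesson's layout in a temporary directory.
scratch=$(mktemp -d)
mkdir -p "$scratch/shell_data/untrimmed_fastq" "$scratch/shell_data/sra_metadata"

# Start at the bottom of the tree, then move up one level with "..".
cd "$scratch/shell_data/untrimmed_fastq"
cd ..
one_up=$(basename "$PWD")     # name of the directory we landed in
echo "$one_up"

listing=$(ls)                 # contents of that directory
echo "$listing"

# Chain ".." to look two levels up without leaving the current directory.
cd untrimmed_fastq
two_up=$(cd ../.. && pwd)     # the subshell keeps our own location unchanged
echo "$two_up"
```

Each .. in the chain climbs one level, so ../.. from untrimmed_fastq points at the directory that contains shell_data.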

Finding hidden directories

First navigate to the shell_data directory. There is a hidden directory within this directory. Explore the options for ls to find out how to see hidden directories. List the contents of the directory and identify the name of the text file in that directory.

Hint: hidden files and folders in Unix start with ., for example .my_hidden_directory

Solution

First use the man command to look at the options for ls.

$ man ls

The -a option is short for all. The manual says that it causes ls to “not ignore entries starting with .”, so this is the option we want.

$ ls -a
.  ..  .hidden  sra_metadata  untrimmed_fastq

The name of the hidden directory is .hidden. We can navigate to that directory using cd.

$ cd .hidden

And then list the contents of the directory using ls.

$ ls
youfoundit.txt

The name of the text file is youfoundit.txt.
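
The exercise can be replayed in a scratch directory. This is only a sketch: it recreates the names used above inside a temporary location, and the variable names (scratch, plain, all, almost, inside) are hypothetical. It also shows -A, a related ls option that lists hidden entries but leaves out . and ..

```shell
# Recreate the exercise layout in a temporary directory.
scratch=$(mktemp -d)
mkdir -p "$scratch/shell_data/.hidden" \
         "$scratch/shell_data/sra_metadata" \
         "$scratch/shell_data/untrimmed_fastq"
touch "$scratch/shell_data/.hidden/youfoundit.txt"
cd "$scratch/shell_data"

plain=$(ls)          # entries starting with "." are skipped
all=$(ls -a)         # includes ".", "..", and ".hidden"
almost=$(ls -A)      # includes ".hidden" but omits "." and ".."
inside=$(ls .hidden) # hidden directories can be listed by name like any other

echo "$plain"
echo "$all"
echo "$almost"
echo "$inside"
```

When the output of ls is captured or piped rather than printed to a terminal, it lists one entry per line instead of arranging the entries in columns.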

9.5. Examining the contents of other directories

By default, the ls command lists the contents of the working directory (i.e. the directory you are in). You can always find the directory you are in using the pwd command. However, you can also give ls the names of other directories to view. Navigate to your home directory if you are not already there.

$ cd

Then enter the command:

$ ls shell_data
sra_metadata  untrimmed_fastq

This will list the contents of the shell_data directory without you needing to navigate there.

The cd command works in a similar way.

Try entering:

$ cd
$ cd shell_data/untrimmed_fastq

This will take you to the untrimmed_fastq directory without having to go through the intermediate directory.
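
The difference between the two commands is that ls with a path only looks, while cd with a path actually moves you. Here is a small sketch of that difference, run in a temporary directory so it does not depend on your own files. The variable names are made up for the illustration, and the two file names are borrowed from the lesson data.

```shell
# Temporary stand-in for the home directory and the lesson data.
scratch=$(mktemp -d)
mkdir -p "$scratch/shell_data/untrimmed_fastq"
touch "$scratch/shell_data/untrimmed_fastq/SRR097977.fastq" \
      "$scratch/shell_data/untrimmed_fastq/SRR098026.fastq"
cd "$scratch"

before=$PWD
files=$(ls shell_data/untrimmed_fastq)   # look two levels down...
after_ls=$PWD                            # ...without moving

cd shell_data/untrimmed_fastq            # move two levels down in one step
after_cd=$PWD

echo "$files"
echo "$after_ls"
echo "$after_cd"
```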

Navigating practice

Navigate to your home directory. From there, list the contents of the untrimmed_fastq directory.

9.6. Solution

$ cd
$ ls shell_data/untrimmed_fastq/
SRR097977.fastq  SRR098026.fastq

9.7. Full vs. Relative Paths

The cd command takes an argument which is a directory name. Directories can be specified using either a relative path or a full path (also called an absolute path). The directories on the computer are arranged into a hierarchy. The full path tells you where a directory is in that hierarchy. Navigate to the home directory, then enter the pwd command.

$ cd
$ pwd

You will see:

/home/sateeshp

This is the full name of your home directory. This tells you that you are in a directory called sateeshp, which sits inside a directory called home, which sits inside the very top directory in the hierarchy. The very top of the hierarchy is a directory called /, which is usually referred to as the root directory. So, to summarize: sateeshp is a directory in home, which is a directory in /.
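
One way to see this hierarchy is to peel a full path apart one level at a time. The sketch below uses the standard dirname and basename utilities on the path that pwd printed above. They only manipulate the text of the path, so the directory does not need to exist on your machine. The variable names are chosen for this example.

```shell
path=/home/sateeshp

leaf=$(basename "$path")     # last component of the path
parent=$(dirname "$path")    # everything above the last component
top=$(dirname "$parent")     # one more level up

echo "$leaf"      # sateeshp
echo "$parent"    # /home
echo "$top"       # /
```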

Now enter the following command:

$ cd /home/sateeshp/shell_data/.hidden

This jumps forward multiple levels to the .hidden directory. Now go back to the home directory.

$ cd

You can also navigate to the .hidden directory using:

$ cd shell_data/.hidden

These two commands have the same effect: they both take us to the .hidden directory. The first uses the absolute path, giving the full address from the root directory. The second uses a relative path, giving only the address from the working directory. A full path always starts with a /. A relative path does not.

A relative path is like getting directions from someone on the street. They tell you to “go right at the stop sign, and then turn left on Main Street”. That works great if you’re standing there together, but not so well if you’re trying to tell someone how to get there from another country. A full path is like GPS coordinates. It tells you exactly where something is no matter where you are right now. You can usually use either a full path or a relative path depending on what is most convenient. If we are in the home directory, it is more convenient to enter the relative path since it involves less typing.

Over time, it will become easier for you to keep a mental note of the structure of the directories that you are using and how to quickly navigate amongst them.
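
To make the street-directions analogy concrete, here is a sketch that tries the same relative path from two different starting points. It assumes a temporary directory standing in for your home directory, and all of the variable names are hypothetical.

```shell
# Temporary stand-in for the home directory.
scratch=$(mktemp -d)
mkdir -p "$scratch/shell_data/.hidden"

# From the top of the tree, the relative and absolute paths agree.
cd "$scratch"
cd shell_data/.hidden && via_relative=$PWD
cd "$scratch"
cd "$scratch/shell_data/.hidden" && via_absolute=$PWD

# We are now inside .hidden. The same relative path no longer resolves from here...
if cd shell_data/.hidden 2>/dev/null; then
    relative_from_elsewhere=ok
else
    relative_from_elsewhere=failed
fi

# ...but the absolute path works regardless of where we start.
if cd "$scratch/shell_data/.hidden"; then
    absolute_from_elsewhere=ok
else
    absolute_from_elsewhere=failed
fi

echo "relative: $relative_from_elsewhere, absolute: $absolute_from_elsewhere"
```

The relative path is shorter to type, but it only means something in combination with the directory you are standing in.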

Relative path resolution

Using the filesystem diagram below, if pwd displays /Users/thing, what will ls ../backup display?

  1. ../backup: No such file or directory
  2. 2012-12-01 2013-01-08 2013-01-27
  3. 2012-12-01/ 2013-01-08/ 2013-01-27/
  4. original pnas_final pnas_sub

Solution

  1. No: there is a directory backup in /Users.
  2. No: this is the content of /Users/thing/backup, but with .. we asked for the backup directory one level further up.
  3. No: see previous explanation. Also, we did not specify -F to display / at the end of the directory names.
  4. Yes: ../backup refers to /Users/backup.
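
You can check the answer yourself by rebuilding just the parts of the tree that the solution mentions. This is a sketch under assumptions: the tree is created under a temporary directory rather than at the real /Users, every entry is assumed to be a directory, and the variable names are invented for the example.

```shell
# Rebuild the two backup directories described in the solution.
scratch=$(mktemp -d)
mkdir -p "$scratch/Users/backup/original" \
         "$scratch/Users/backup/pnas_final" \
         "$scratch/Users/backup/pnas_sub"
mkdir -p "$scratch/Users/thing/backup/2012-12-01" \
         "$scratch/Users/thing/backup/2013-01-08" \
         "$scratch/Users/thing/backup/2013-01-27"

cd "$scratch/Users/thing"

up=$(ls ../backup)           # the backup directory next to thing (answer 4)
here=$(ls backup)            # the backup directory inside thing (answer 2)
here_marked=$(ls -F backup)  # same, with / appended to directory names (answer 3)

echo "$up"
echo "$here"
echo "$here_marked"
```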