Linux introduction: Using the command line

Introduction

What is a command line and why do you need it?

Many bioinformatics tools don’t have graphical interfaces, so no windows, no buttons, no drop-down menus, etc
You need to learn to install and run these tools
Typically you run these tools using the “Command Line”, where all commands are written as text
The other BIG advantage of the command line (which we will not cover today) is that you can write scripts for repetitive tasks on 100s of files.
The command line concept started on Linux/Unix computers, but is now available on Windows and Mac as well. It is sometimes also called a “shell” because it is the outer layer through which you interact with the computer’s core, its internal systems.

What you will do today:

In this practical you will learn some very simple commands and principles of how to interact with the “command line”
Normally, when you use the command line, you will be interacting either with your own computer (a laptop or desktop) or with a server provided by your university or institution. Today, however, for training purposes, we are running all commands on temporary computers provided by a company called gitpod.io, and you will launch these machines from your web browser.

Starting the tutorial

Click on this link in your browser: https://gitpod.io/#https://github.com/sujaikumar/exrna

You should see a screen like this:
Click “Continue (Enter)”

After a few minutes, you should see a screen like this:
This is a new computer running on a remote website (gitpod.io) and this computer is ready to take commands at this point.

Typically, the “prompt” is the $ sign, but it can have extra information before it. In this case, it says (exrna) gitpod /workspace/exrna (main) $.

There is a vertical box after the $ sign where you can start typing your first command (next section)
If you leave your computer, this workspace will stop to prevent you wasting your free 500 credits / 50 hours. To restart it, go to https://gitpod.io and click on the greyed out workspace to restart it.

My first commands - Where am I?

Type ls and press Enter (or Return on some keyboards).

ls is a command to the computer to list all files in the current folder

It should look like this:

That means there are three things already inside the current folder (they were created when the new machine started)
It can be hard to see the list in horizontal format, so we can add a parameter to ls to show us a vertical list of files and folders with other useful information: type ls -l

It should look like this:

Notice the extra columns - they include additional information about the permissions, owners of the files/folders, the size, the date last modified, etc. We will not be going into this in detail today
Try adding another parameter -a by typing ls -al. -a is another parameter that says list ALL entries, including hidden files and folders

The output looks like this:

Notice the extra entries. They all begin with a dot (.). They were not shown previously because they were hidden.
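If you want to see the effect of -a on files you control, here is a minimal sketch. It assumes a bash-like shell. The scratch folder and the file names visible.txt and .hidden.txt are made up for illustration and are not part of the workshop files. It also uses a few commands that this practical does not cover: mktemp -d (make a temporary folder), touch (make empty files), echo (print text) and rm -r (remove a folder).

```shell
# Work in a throwaway folder so nothing in the workshop folder is touched.
demo_dir=$(mktemp -d)            # hypothetical scratch folder
cd "$demo_dir"
touch visible.txt .hidden.txt    # one normal file, one hidden file (name starts with .)

plain=$(ls)                      # hidden entries are left out
all=$(ls -a)                     # hidden entries are included, along with . and ..

echo "ls shows:"
echo "$plain"
echo "ls -a shows:"
echo "$all"

# Tidy up: leave the scratch folder and delete it.
cd /
rm -r "$demo_dir"
```

The plain ls output should contain only visible.txt, while ls -a also lists .hidden.txt together with the special entries . (this folder) and .. (the folder above).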

Exercise 1: Try putting the parameters -al in these different ways and tell us what each does:

ls -la
ls - al
ls -a -l
ls -l -a
-al ls
ls -l --all

Note: Some parameters also have a long-form version. For example, -a can also be written as --all

To see all parameters for a command, you can type ls --help, i.e. put the parameter --help after the command.

That was the ls command. Now let’s try a new command:

Type pwd and press Enter

pwd is also a command to the computer, and it stands for print working directory. It tells you which folder you are currently in

It should look like this:

This means you are in the folder /workspace/exrna . The starting / indicates the topmost level of the computer and there is a folder at that level there called workspace. The next /exrna means there is a subfolder called exrna inside /workspace, and you are inside that subfolder.

To summarise:
- The prompt is the point at which you can start typing commands. On most systems it looks like a $. There can be additional information before the $
- A command is the first word you type, without spaces, to tell the computer what to do. Examples so far are:
  - ls tells you what is in the current folder
  - pwd tells you where you are
- You can add extra parameters to a command to make it behave differently, like -al. Parameters have a short form starting with - (like -a) and sometimes a long form starting with -- (like --all)
- You can usually get help for any command by typing command --help

Moving around

From now on, we won’t say “Type __ and press Enter”. If you see a command written in this format: $ ls, then you should type the command ls at the $ prompt and press Enter yourself. Do not type the $ itself.

How do you get to a different folder on the computer? You can use the cd command which stands for change directory (a directory is the same as a folder)

$ cd .. Note: Don’t forget the space after cd
$ pwd to see where you are.

.. means one level above the current level, so cd .. took you one level above /workspace/exrna into /workspace
$ ls to see what is in this /workspace folder:

The computer prints exrna as the contents of this folder.
$ cd exrna to change directory back to the exrna folder

If you type a location without a / at the start, then the computer assumes it is relative to the current folder
$ cd ../../ to go two folders up

You should now be in / . Double check with $ pwd
$ ls

These are all the folder locations at the highest level in your computer - at the / level

One of them is workspace
$ cd /workspace/exrna - this will change directory straight to the /workspace/exrna folder in one step

The first / tells the computer to start at the highest level rather than at the current level.

If you type a location that does not exist, the computer will give you an error
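The difference between relative and absolute locations can be sketched on a small scratch tree. This is only an illustration, assuming a bash-like shell. The folder names project, data and no_such_folder are hypothetical, and mktemp -d, mkdir -p, echo, rm -r and the || operator are not covered in this practical.

```shell
# Build a scratch tree: <top>/project/data (hypothetical names, not workshop files).
top=$(mktemp -d)                 # mktemp -d makes a temporary folder
mkdir -p "$top/project/data"     # mkdir -p creates nested folders
cd "$top"
top=$(pwd)                       # remember the full path of the scratch folder

cd project/data                  # relative: no leading /, so it starts from the current folder
in_data=$(pwd)
echo "$in_data"

cd ..                            # relative: one level up, into project
one_up=$(pwd)
echo "$one_up"

cd "$top"                        # absolute: $top begins with /, so this works from anywhere

# This folder does not exist. cd prints an error and you stay where you were.
# The command after || runs only because cd failed.
cd no_such_folder || echo "cd failed, still in $(pwd)"
after_error=$(pwd)

# Tidy up: leave the scratch folder and delete it.
cd /
rm -r "$top"
```

Notice that a failed cd does not move you anywhere, so it is always worth checking with pwd if you are unsure.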

Exercise 2:

Without typing it in, can you figure out which folder you will be in after typing these two commands:
- $ cd /home/gitpod
- $ cd ..
- check your answer by entering the command and then $ pwd
Again, try to figure out what the computer will print after each command, and then check your answer:
- $ cd /var
- $ cd ../home/gitpod
- $ cd ../../workspace/
- $ pwd
- $ ls

Other basic commands everyone should know

If you are ever stuck, press Ctrl + c (i.e. press the Control key first and keep it pressed, then press the c key once, then release both) - this will kill the current command and bring you back to the $ prompt
TAB COMPLETION: rather than type a full filename or foldername, you can just press the TAB key and it will complete it if it can. Try this:

Type cd /wor and then press TAB. It should complete it to cd /workspace after which you can press ENTER
$ rm FILENAME removes FILENAME (be careful, it won’t ask you for confirmation)
$ tar -xvf FILENAME.tar will unpack FILENAME.tar and put its contents in the current folder
$ tar -cf FILENAME.tar FOLDERNAME will take the folder FOLDERNAME and pack it into a single file called FILENAME.tar
$ gzip FILENAME will take FILENAME and compress it into a smaller file called FILENAME.gz
$ gunzip FILENAME.gz will take FILENAME.gz and decompress it into a larger file called FILENAME
$ gzip -d FILENAME.gz will do the same as above. In other words, gunzip is equivalent to gzip -d
$ head FILENAME will show you the first 10 lines of FILENAME
$ tail FILENAME will show you the last 10 lines of FILENAME
$ cat FILENAME will print the whole file out
$ less FILENAME will show you the file contents in a browsable mode one screen at a time. Type h to see options, or q to quit and come back to the command prompt.
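Several of the commands above can be tried together on a scratch file. The following is a sketch under stated assumptions, not part of the workshop data: it assumes a bash-like shell, the names results and numbers.txt are made up, and it uses a few things this practical does not cover (mktemp -d, mkdir, seq to print a run of numbers, > to save output into a file, and rm -r to remove a folder).

```shell
# Work in a throwaway folder.
work=$(mktemp -d)
cd "$work"
mkdir results
seq 1 20 > results/numbers.txt        # a 20-line file: the numbers 1 to 20

first=$(head results/numbers.txt)     # first 10 lines (1 to 10)
last=$(tail results/numbers.txt)      # last 10 lines (11 to 20)

gzip results/numbers.txt              # replaces numbers.txt with numbers.txt.gz
after_gzip=$(ls results)
gunzip results/numbers.txt.gz         # replaces numbers.txt.gz with numbers.txt
after_gunzip=$(ls results)

tar -cf results.tar results           # pack the folder into one file
rm -r results                         # remove the folder (-r is needed for folders)
tar -xvf results.tar                  # unpack it again into the current folder
restored=$(cat results/numbers.txt)

echo "after gzip: $after_gzip"
echo "after gunzip: $after_gunzip"

# Tidy up.
cd /
rm -r "$work"
```

Note that gzip and gunzip replace the file they work on, whereas tar -cf leaves the original folder in place next to the new .tar file.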

Concept of pipes

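The command line shell on Linux, Mac and the Windows Subsystem for Linux is very powerful. You can also chain commands together using the pipe symbol |, which passes the output of one command to the next command as its input.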

For example, do this:

$ cd /workspace/exrna - to go to the /workspace/exrna folder

What if we just wanted to see the first 10 lines of the compressed file Hbakeri_small_UniVec.fa.gz ?

We could do it in two steps:

$ gzip -d Hbakeri_small_UniVec.fa.gz - this will create a decompressed file called Hbakeri_small_UniVec.fa
$ head Hbakeri_small_UniVec.fa - this will show you the first 10 lines

But maybe we didn’t want to decompress the file just to look inside it. So we would have to recompress it using this command:

$ gzip Hbakeri_small_UniVec.fa

This can take a long time and be annoying for very large files. Instead we can use a single-line chained command like this:

$ gzip -dc Hbakeri_small_UniVec.fa.gz | head - the -c parameter in the first command says “don’t make a new decompressed file, just show me the output on the console”. The result is then “piped through” the command head, which shows only the first 10 lines. The .gz file itself is left unchanged. Try it out
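The same idea can be tried on a small scratch file if you want to check that the compressed file really is left alone. This is a minimal sketch assuming a bash-like shell. The file name numbers.txt is made up, and mktemp -d, seq, > and rm -r are helper commands not covered in this practical.

```shell
# Work in a throwaway folder with one compressed file in it.
work=$(mktemp -d)
cd "$work"
seq 1 100 > numbers.txt               # a 100-line file: the numbers 1 to 100
gzip numbers.txt                      # now only numbers.txt.gz exists

# Decompress to the console (-dc) and pass the output to head.
preview=$(gzip -dc numbers.txt.gz | head)
echo "$preview"                       # the first 10 lines only

remaining=$(ls)                       # the folder still holds just the .gz file
echo "files in folder: $remaining"

# Tidy up.
cd /
rm -r "$work"
```

Because nothing was decompressed onto disk, there is no need to recompress anything afterwards.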

Things to remember

“Command line”, “shell” and “prompt” are often used interchangeably. They all refer to the place where you type commands to the computer
When you type a command, always press the Return (or Enter) key afterwards, otherwise the computer does not know that you have finished the command.
Be VERY careful about spaces and typos - not leaving a space or adding an extra space can change the meaning of a command. The command line is case-sensitive, so be careful with UPPERCASE/lowercase: most (but not all) Linux commands use only lowercase letters. These are some of the things that most people find hard about the command line to begin with.
Here is a short cheatsheet of the general commands you need for this workshop:
- ls
- pwd
- cd
- gzip
- less
- rm
- tar
You can pipe the output of one command as the input to another using |