Difference between revisions of "Computer lab introduction"
Revision as of 09:11, 23 July 2012
"On the other side of the screen, it all looks so easy." (TRON, 1982)
- 1 Goals
- 2 WiFi Login
- 3 Remote computer access software
- 4 Plain-text editors
- 5 SSH
- 6 Directory structure
- 7 Command line interface
- 8 Intro-to-Unix tutorial
- 9 Advanced topics (unfinished)
- 10 More about text editors
- 11 Online resources to learn UNIX
Goals
By the end of this introduction, you should be able to:
- Log in to the class cluster (either directly using ssh or via Cyberduck)
- Explore the directory structure
- Edit files on the cluster
- Create and run a basic shell script
Ask for help if any of these things aren't working!
WiFi Login
To log on to the MBL wireless network, choose MBL-REGISTER from the wireless list. Your username is your initials followed by the 5-digit number on the side of your identification card. Your password is the same. E.g., if your name is Joe Bloggs and your card has the number 12345 on the side, then your login details are:
Some people have been having trouble; if it isn't working, find or email Emily Jane firstname.lastname@example.org
Remote computer access software
We will use Secure Shell (SSH) and sFTP to connect to the servers. The servers are powerful computers where we can run programs much faster than would be possible on your laptop. To access them, you need to log in to your assigned server.
Please download the following programs as needed, unless they are installed by default:
- Mac OS X
- Cyberduck (file transfer via sFTP)
- SCP and SSH installed by default
- SCP and SSH installed by default
Plain-text editors
Programs that you might be used to using for manipulating text (e.g., Microsoft Word) do all sorts of complicated things to your files, including inserting lots of weird codes for specific formatting, special software-specific characters, page margin instructions, etc. The programs that we are going to use are frequently controlled by text files (that's where their instructions are stored), and they are easily confused: they need a simple file with just their instructions, and none of that other stuff. So we need a plain-text editor to make and manipulate these files.
Note: different operating systems encode line breaks (end-of-line characters) in different ways, and this can cause problems when files move between systems (Unix, Linux, and Mac OS X: LF '\n'; classic Mac OS before OS X: CR '\r'; Windows: CR+LF '\r\n').
- Linux: gedit (GNOME default), Kate/KWrite (KDE default)
- Mac: TextEdit (default), TextWrangler
- Windows: Notepad (default), Notepad++
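As a minimal sketch of the line-ending note above, the following shell commands create a file with Windows-style CR+LF endings and convert it to Unix LF endings using tr. The file names are hypothetical, and everything happens in a temporary directory so that nothing in your home directory is touched.

```shell
# Work in a throwaway directory (all file names here are hypothetical).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a two-line file with Windows-style CR+LF line endings.
printf 'line one\r\nline two\r\n' > windows.txt

# Delete every carriage return (CR), leaving Unix-style LF endings.
tr -d '\r' < windows.txt > unix.txt

# Compare sizes in bytes: the converted file is one byte shorter per line.
wc -c < windows.txt
wc -c < unix.txt
```

If a script copied from a Windows laptop fails with odd errors on the class servers, stray CR characters are one possible cause worth checking.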
SSH
SSH stands for "Secure Shell." SSH programs let you log on to other computers securely and enter commands in a Unix shell there (i.e., on the servers where we will be doing our analyses).
SSH on Windows
SSH on Linux and Mac OS X
First, open a terminal window:
- Linux: Konsole (KDE), gnome-terminal (GNOME)
- Mac: Terminal (in /Applications/Utilities)
In the following command, replace username and servername with the user and server name found on the back of your name tag:
ssh username@servername
For example, if my username is cmeehan and my server name is class-04, I type:
ssh cmeehan@class-04
It will then ask for your password, which is on the back of your card under your username.
Changing your password
The first thing to do once you have successfully logged on to a server is to change your password. This is done by typing:
passwd
This will prompt you to enter a new password; type it and press the enter key. Next, re-enter the new password and press enter again. It may then ask for your LDAP password, in which case you should type in the original password given on the back of your card.
Once you have done this, you will use the new password every time you ssh in to the server.
Directory structure
The file systems used by Linux, Mac OS X, and Windows are organized in a hierarchical, multifurcating tree structure. That might sound confusing, but you're used to working with this organization scheme through the Mac Finder or the Windows Explorer: folders (directories) are stored inside other folders, and those inside yet other folders. The path through this directory tree can be used to specify the absolute (starting at the root) or relative (to some other directory) location of any given file.
- NOTE: Regardless of the operating system on your laptop, when you log on to the cluster, you will be on the class machines, and they're all running Linux.
The root folder of the directory tree is symbolized by a forward slash "/". Path names through the directory tree are formed by separating directory names with the "/" character. For example, in the path /usr/local, local is a subdirectory of usr, which in turn is a subdirectory of the root folder (/). Further, usr is called the parent directory of local (and the parent of usr is /). Users with an account on the computer have a so-called home directory for their own files and folders, which is also where the operating system keeps user-specific settings. The operating system uses other locations in the directory tree to store system-wide files, like some executables, system-wide settings, etc.
In Windows, each disk partition has its own, separate root (without a common root for all partitions). However, each drive still has a most elemental directory, e.g. denoted C:\ for the partition named C. Absolute paths are therefore based on the partition root. Folder names in paths are separated by the backslash character "\".
Current working directory
The current working directory (or working directory) may be defined as the directory we are working in at the moment. Path names that do not start with / are interpreted as relative to the current working directory.
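The difference between absolute and relative paths can be sketched as follows. The directory names (project, data) are made up for illustration, and the tree is built in a temporary directory rather than on the class server.

```shell
# Build a small directory tree in a throwaway location (hypothetical names).
top=$(mktemp -d)
mkdir -p "$top/project/data"

# Absolute path: starts at the root (/), so it works from anywhere.
cd "$top/project/data" || exit 1
abs_result=$(pwd)

# Relative path: interpreted from the current working directory.
cd "$top" || exit 1
cd project/data || exit 1
rel_result=$(pwd)

# Both routes end in the same directory.
echo "$abs_result"
echo "$rel_result"
```

The relative form is shorter to type, but it only works when the current working directory is the one you think it is; pwd is a quick way to check.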
Command line interface
When you open up your SSH client (a Terminal window on a Mac, or PuTTY for Windows), you'll see a prompt that will look vaguely like: Macintosh-6:user$ There are lots of variations on the theme, but the prompt usually has a little bit of information on where you currently are on the computer (in this case, in a folder called "user"), and then some sort of symbol, and then a space where you can enter commands. Commands are only executed after you press enter. If you are logged onto a class server, the prompt will be something like [username@classServerName ~]$
Unix commands follow the general format of:
command -options target.
Not all commands need options (sometimes called flags, and generally preceded by a single or double hyphen ("-" or "--")) or targets, but others require them.
- For example:
- cd homedirectory uses the command "cd" (change directory) and the target "homedirectory" to move from the current directory into the subdirectory called "homedirectory"
- ls -l homedirectory uses the command "ls" (list), the option "-l" for long-list, and the target "homedirectory" to list the contents of homedirectory in the "long list" format, which provides more thorough descriptions than does the regular "ls".
Notes on syntax for directory structure
- Two dots (..) indicate the parent directory of the present working directory. So, for example, "cd .." will move you up one directory.
- One dot (.) indicates the present working directory. So, for example, "cd ." will keep you where you are. There are times where the single dot can be more useful than this...
- The tilde (~) refers to your home directory. On the class machines your home directory is /class/yourusername. You'll also have a unique home directory on your laptop, etc. The tilde is very helpful if you get lost while using the terminal -- just type "cd ~" and you'll be back in your home directory.
- A forward slash (/) by itself or at the start of a path refers to the root of the file system -- the folder that contains all other folders.
- NOTE: do not make changes on the class server in the root folder or any shared folders. All your work is to be done in your home directory or a subdirectory of this.
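The shortcuts in the list above can be tried out safely with a sketch like this one. The parent and child directory names are hypothetical and live in a temporary directory.

```shell
# Throwaway tree with hypothetical directory names.
base=$(mktemp -d)
mkdir -p "$base/parent/child"
cd "$base/parent/child" || exit 1

cd .            # one dot: stay in the current directory
here=$(pwd)

cd ..           # two dots: move up to the parent directory
up=$(pwd)

echo "$here"
echo "$up"

# The shell replaces ~ with the path of your home directory.
echo ~
```

One place where the single dot earns its keep is as a destination: "copy this file to where I am now" is written with a trailing dot.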
Some suggestions concerning file and folder names
- Avoid spaces in script and file names (use underscores, dots, hyphens, or "CamelBack" notation instead). Spaces are used on the command line to separate commands, options, and targets, so a space in a filename can cause programs to misread it.
- Do not use "weird" characters (#@!*&^, etc., especially ?, *, \, or /)
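To illustrate why spaces in file names cause trouble, here is a small sketch with hypothetical file names, run in a temporary directory. It shows the shell splitting an unquoted name at the space, and two ways around the problem.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Unquoted: the shell splits at the space, so touch receives two targets
# and creates two files named "my" and "notes.txt".
touch my notes.txt

# Quoted: the space is kept and a single file "my notes.txt" is created.
touch "my notes.txt"

# An underscore avoids the problem altogether.
touch my_notes.txt

ls
```

Quoting works, but it has to be remembered every time the file is used, which is why names without spaces are the safer habit.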
When it all goes south, "control-C" is your friend. It breaks whatever processes are running, and gives you your prompt back. Or, failing that, just close the Terminal and start again.
Start by entering
pwd
This will print your working directory (the directory you are currently in). You should be in /class/your_login
ls
This lists the contents of your working directory (which is likely empty).
You can also look at the contents of any other directory by supplying the path. For instance,
ls ..
will list the contents of the parent directory of your current directory, in this case the class directory.
mkdir is the command to make a directory. Type
mkdir myfolder
to make a new folder called myfolder. Type ls and then press enter. The new folder should be listed. We can also use the ls command with flags at this point. Typing
ls -l
will list the contents of the current directory in "long" format, which includes information about permissions and file size.
cd is the command to change directories. We can move into the new folder you made by typing
cd myfolder
You can use pwd to confirm you've moved and are now in a new working directory. You can move back to your home directory by typing
cd ~
And you can move to the root class directory by typing
cd /class
Confirm you are in the class directory with pwd. You can move from here to the myfolder directory in your home directory by typing
cd ~/myfolder
Calling a program
First, let's make a file using the command-line editor "nano." If you use a pre-existing file as nano's target, it will open it for editing, and if you use a non-existent file name, nano will create it for you (a new blank file). Let's do the latter:
nano firstprogram.sh
This command both opens nano and creates a file called "firstprogram.sh". In the nano window, enter the following two lines:
#!/bin/bash
echo "hello mbl"
This file is called a shell script. It is a file that contains a list of Unix commands like echo, ls, etc.
To move around your nano window, use the arrow keys, not your mouse. Control+x will exit, and nano will ask you if you want to save changes. Say yes.
Shell scripts almost always have the suffix ".sh". They are run by typing "sh" before the filename.
We can run this program by typing
sh firstprogram.sh
This should print 'hello mbl' to the screen.
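For reference, the same script can be created and run without opening nano. This is only a sketch: it writes the two lines with a here-document (an alternative not used in this tutorial) into a temporary directory, then runs the file with sh as described above.

```shell
scriptdir=$(mktemp -d)
cd "$scriptdir" || exit 1

# Write the same two-line script with a here-document instead of nano.
cat > firstprogram.sh <<'EOF'
#!/bin/bash
echo "hello mbl"
EOF

# Run it by passing the file name to sh, and capture what it prints.
output=$(sh firstprogram.sh)
echo "$output"
```

Note that the two lines of the script must be on separate lines in the file; the first names the interpreter and the second is the command to run.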
Copying, renaming, and moving files
The copy command (cp) is used to copy files to new places. The basic syntax is cp source_file destination_file
First create a file called 'tmp1.txt' in nano and put whatever you want inside of it.
We will now make a copy of tmp1.txt called tmp2.txt by typing:
cp tmp1.txt tmp2.txt
We can also cp a file from the shared directory using absolute and relative paths.
cp /class/shared/testfile.txt .
will copy a file named "testfile.txt" to your current directory but will not change its name. Use ls to verify.
The move command (mv) can be used to move or rename files. The command syntax is mv source destination
mv testfile.txt example.txt
has the effect of renaming testfile.txt to example.txt. Use ls to check. We can move this file up one directory by using mv as well.
mv example.txt ..
will move the example.txt file to the parent directory of your current directory. Use ls .. to check.
- NOTE: Do not move files that are not in your home directory or a subdirectory of this. All files in shared or root folders are to be copied, never moved.
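The copy, rename, and move steps above can be rehearsed in one go with a sketch like the following. It runs in a temporary directory (the work folder name is hypothetical) and uses echo as a stand-in for the file you would create in nano.

```shell
sandbox=$(mktemp -d)
mkdir "$sandbox/work"
cd "$sandbox/work" || exit 1

echo "some text" > tmp1.txt     # stand-in for the file made in nano
cp tmp1.txt tmp2.txt            # copy: both files now exist
mv tmp2.txt example.txt         # rename: tmp2.txt becomes example.txt
mv example.txt ..               # move: example.txt goes to the parent

ls          # lists tmp1.txt only
ls ..       # lists example.txt and work
```

After cp the original is still in place, whereas after mv it is gone from its old name or location, which is why shared files should only ever be copied.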
Loading pre-installed programs
Many of the programs that you will need for the course are already installed on the class servers. They need to be loaded explicitly before they will work.
For example, on the command line type
blastn
You should get an error message saying: -bash: blastn: command not found
Now load the programs by typing
module load bioware
This should load all the preinstalled programs so that you can access them. Now type blastn again and you should see:
BLAST query/options error: Either a BLAST database or subject sequence(s) must be specified
This shows that blastn is now available for use.
If needed, programs can all be unloaded by typing module unload bioware
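A quick way to check whether a program is currently available, without triggering an error message, is the shell built-in command -v. The helper name is_available below is made up for this sketch; the module load hint simply repeats the command given above.

```shell
# Report whether a program can be found on the current search path.
# (is_available is a hypothetical helper name, not a standard command.)
is_available() {
    command -v "$1" >/dev/null 2>&1
}

if is_available blastn; then
    echo "blastn is available"
else
    echo "blastn not found; try: module load bioware"
fi
```

The same check works for any program name, so it can be reused at the top of a shell script that depends on preinstalled tools.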
Downloading files onto the class servers
There are two places you may need to get files from: your own computer or an online source.
To get files from your computer to the server, open a terminal window on your computer and navigate to the folder containing the file using commands like cd. Once in the folder containing the file you want to upload, type
scp filename username@classServername:./
This will upload the file to your home directory on the cluster
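scp also works in the other direction, which is useful for bringing results back to your laptop. The sketch below only prints the commands rather than running them, so they can be checked first. The names username and classServername are the same placeholders used above, and the helper function names are made up for illustration.

```shell
# Placeholders: substitute the values from your name tag.
user="username"
server="classServername"

# Print the scp command for uploading a local file to your home
# directory on the server (the part after the colon is the remote path).
upload_cmd() {
    printf 'scp %s %s@%s:./\n' "$1" "$user" "$server"
}

# Print the scp command for the reverse direction: copying a file from
# your home directory on the server into the current local directory.
download_cmd() {
    printf 'scp %s@%s:%s .\n' "$user" "$server" "$1"
}

upload_cmd tmp1.txt
download_cmd firstprogram.sh
```

In both directions the command is typed in a terminal on your own computer, not in the ssh session on the server.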
In order to get a file from an online source you can use wget. Type:
wget url
where url is the web address of the file you wish to download. For example
Will download the html file that makes the main page of the molecular evolution website to the directory you are in.
The following table contains a list of commands that will allow us to navigate through the directory structure. The entries are linked to their Wikipedia pages, which contain very useful examples.
|Linux/Mac||MS-DOS||Description||Example (Linux/Mac)|
|pwd||chdir||print working directory||pwd|
|ls||dir||list directory contents||ls|
|history||doskey /history||display command history||history|
|cd||cd||change directory||cd directory_name|
|mkdir||mkdir||make directory||mkdir directory_name|
|cp||copy||copy files||cp original_filename copied_filename|
|mv||move||move files (the same as rename files)||mv original_filename moved_filename|
|rm||del||remove file(s)||rm filename|
|clear||cls||clear the screen||clear|
|exit||exit||quit command line||exit|
- UNIX/Linux command line cheat-sheet: http://fosswire.com/post/2007/08/unixlinux-command-cheat-sheet/
- MS-DOS full command list: MS-DOS commands
Command line editing
The following features of most command line interpreters often come in handy:
- Up and down arrow keys: cycle through previously issued commands
- Tab completion
- 'CTRL+a' moves cursor to beginning of the line
- 'CTRL+e' moves cursor to end of the line
Advanced topics (unfinished)
Adding a directory to the path
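This topic is not yet written up here. As a rough sketch only: the shell looks for programs in the directories listed in the PATH variable, separated by colons, and a directory can be added for the current session as shown below. The directory $HOME/bin is a hypothetical example, not something set up on the class servers.

```shell
# Hypothetical directory holding your own scripts.
mybin="$HOME/bin"

# Put it at the front of the search path for the current session only.
PATH="$mybin:$PATH"
export PATH

# Show the search path, one directory per line.
echo "$PATH" | tr ':' '\n'
```

The change lasts until you log out; making it permanent would mean adding the same lines to a shell startup file, which is beyond this sketch.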
sed, grep, and awk
- sed (UNIX stream editor)
- Regular expressions cheat-sheet: http://www.addedbytes.com/cheat-sheets/regular-expressions-cheat-sheet/
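Until this section is filled in, here is a minimal sketch of what each of the three tools does. The sample file and its contents are invented for illustration and are created in a temporary directory.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

# A small space-separated sample file (hypothetical contents).
printf 'gene1 ATGC 4\ngene2 ATGCATGC 8\ngene3 AT 2\n' > genes.txt

# grep: print only the lines that match a pattern.
grep 'gene2' genes.txt

# sed: substitute text as the lines stream past (the file is not changed).
sed 's/gene/locus/' genes.txt

# awk: split each line into fields and print selected columns.
awk '{ print $1, $3 }' genes.txt
```

All three read a file line by line and print to the screen, so their output can be redirected into a new file with > when the result should be kept.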
More about text editors
Looking inside text files
There are different commands that can be used to look inside files from within the command line interface. Two of these are cat in Linux/Mac, type in MS-DOS, and less in Linux/Mac, more in MS-DOS. The command more is also available in Linux/Mac, and in fact more is less, but less has more features than more.
- cat: cat stands for concatenate. This command is useful for peeking at short files; using cat to view a long file results in the top lines scrolling off before one can even read them. The simplest use of cat is with
cat filename
- less: This command is useful for reading long text files because the text is displayed one page (i.e., one command line window filled from top to bottom) at a time. With less it is easy to move forward and backward by lines or pages, and even between two or more files. less is a program in itself, so when it is invoked, a prompt appears at the bottom of the page waiting for a new less command. The prompt in less is a colon (:). The simplest use of less is with
less filename
The following table contains a list of useful less commands
|spacebar||display next page|
|return||display next line|
|n f||move forward n lines|
|b||move backward one page|
|n b||move backward n lines|
|/ word||search forward for word|
|? word||search backward for word|
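Returning to cat, its name makes more sense once it is given more than one file. The following sketch uses two made-up files in a temporary directory to show both the single-file and the concatenating use.

```shell
viewdir=$(mktemp -d)
cd "$viewdir" || exit 1

printf 'first file\n' > a.txt
printf 'second file\n' > b.txt

# cat with one target prints that file to the screen.
cat a.txt

# cat with several targets prints them one after another (concatenation);
# redirecting with > saves the joined text into a new file.
cat a.txt b.txt > both.txt
cat both.txt
```

less is interactive, so it is not shown in the sketch; opening both.txt with less and trying the keys in the table above is the easiest way to get a feel for it.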