Welcome to our first tutorial!

Topics covered

  1. Short introduction to computational biophysics.
  2. Installing VMD
  3. Quick detour: amino-acids.
  4. Visualizing a protein with VMD and performing simple analyses.

This tutorial is adapted from Using VMD

A high-level overview of simulations

What do we gain from simulations?

Scheme adapted from Allen & Tildesley
Image adapt. from Allen & Tildesley Computer simulations of liquids, Oxford Science Publ.

Exact Mathematical models of complex systems are unattainable due to the very high number of degrees of freedom. One can hope to build an approximate physical theory. This poses some practical downsides though.

  1. The time required to build and validate a proper physical theory might be well beyond what is available to a project.
  2. Validating the theory requires a large amount of data from experiments -> more time, more money. It is often more convenient to simulate starting from first principles (e.g. Newton’s laws, Q.M., Statistical Mechanics)

First proper computer simulations (~1940-1950)

  1. Manhattan project.
  2. Meteorology/climate.
  3. equations of state of liquids.
  4. Evolutionary problems (cellular automata).

Nowadays

  1. Insight into otherwise intractable problems.
  2. In-silico experiments.
  3. Industrial workhorses.

evolution of computers
Adapted from P. Carloni

The machinery for simulating a biomolecule

Practically, the machinery we are going to use can be exemplified in the following way.

In a nutshell, the ingredients we need are:

  • the coordinates of the system;
  • the structure of the protein, related to the force field we use;
  • the force field parameters for the interactions;
  • a configuration file with the appropriate settings of the used algorithms;
  • a software to integrate the Newton’s equations;
  • the trajectory: a collection of coordinates that follow the evolution of the system.

We will rapidly see some of the files in this lesson, but keep in mind this scheme, since it will be useful later!

Software

Let’s start from the basic. For this tutorial we will use VMD, a visualisation program that will allow us to render pretty images and analyse our simulations as well. The computers provided to you already have VMD installed. If you want the software on your own laptop, you can download it from download VMD with a simple registration. Choose the version 1.9.3 suitable for your operating system.

Bash

The Bourne Again SHell (bash) is a command-line interpreter that allows us to execute commands via text instructions. Let’s have a quick overview of the basic of bash in the meanwhile.

On Ubuntu type ctrl-alt-T in order to open a terminal. For macOS, you should find the Terminal in the Applications/ folder. For Windows users, use the lab computers.

To check what bash you are using, type which bash. You should see something like

/bin/bash
/usr/bin/bash

Let’s have a walk into the computer filesystem. First of all, type pwd (print working directory). The output should be something like:

/home/your_username

What is inside your folder? Type ls (list) and read the output.

What changes if you type “ls -lh”? (without “”)

Let’s create now a new directory for the course.

$ mkdir QCB_course

Bash does not like white spaces, so use the underscore _.

With pwd we can see that we are still in the /home directory. Try!

Let’s go inside QCB_course.

$ cd QCB_course

To create an empty file, you can use touch dummy_file.sh. We will see in a minute what the extension .sh means.

We have no interest in the dummy_file.sh, so we can remove it with:

$ rm dummy_file.sh

Of course, we are all smart, but in order to avoid unwanted deletion, let’s add a small command to your ~/.bashrc (the . at the beginning of a filename makes it “invisible”). First, let’s make a backup copy of the file:

$ cd #to go into the home folder
$ cp .bashrc .OLD_bashrc

Guess what cp does.

Finally we can add our line:

$ echo “alias rm=’rm -i’” » .bashrc

If we go into ~/QCB_course folder, create a new file and remove it, we should be asked for confirm. Right? (Do the same on a new terminal or type source .bashrc)

With bash we can perform some basic arithmetics. Create a new file called my_script.sh and open it with a text editor.

Try to use an editor like ViM, Emacs or nano (without a GUI)

Write the following text into the file and try to understand what it does:

#!/bin/bash
i=5
for j in {1..4}
do
  echo "Adding 1 to $j: $((j + 1))"
done

echo "Math (should) work! 5*4 = $((i*j))"

To launch this script:

$ bash my_script.sh

You can also launch the commands with ./my_script.sh, if your file is an executable. List the content of your folder and then do:

$ chmod 700 my_script.sh

If you “list” again, did somethin change?

bash is not the only shell you have on your computer!

To further learn about bash, feel free to search it on Google.

Installing VMD (optional)

Extract the folder, run the configuration, go into the src/ folder and install VMD.

cd Downloads/
tar xvzf VMD_something.tar.gz
cd VMD_something
./config
cd src
make install

Now you should be ready to go.

Getting started

Download the source file. Move the .tar.gz archive into QCB_course and untar it with

$ tar xvzf tutorial1.tar.gz
$ rm -f tutorial1.tar.gz

Now launch VMD:

$ vmd

and you should have the following:

VMD graphical interface.

If everything works, close it and let’s define a common playground.

Quick and Easy recap

There are a lot of amino acids out there.

everywhere

But only 20 of them are coded into our DNA. We will deal mainly with these residues, but it is not uncommon to have acetylated residues especially in the termini.

“What is a terminus or a residue or an amino acid?” I hear you ask.

Amino acids are the building block of proteins. If isolated, an amino acid has an amine (-NH2)1 and a carboxyl (-COOH) group. If linked with others to form a polypeptide, they lose a hydrogen from the N-terminal (amine group) and an OH from the C-terminal (carboxyl group) and we have a residue. We will usually use residue without this distinction. Ça va sans dire, the first and the last residue have to be capped with the proper terminal end since they are not linked from both sides.

All amino acids have a Carbon, called carbon-alpha, where a functional group is attached. This group is specific to each amino acid and it is called sidechain, while the other (heavy) atoms, present in each residue, constitute the backbone.

Below we have an Alanine with different segments highlighted that exemplifies the nomenclature just introduced.

amino_recap

a) Alanine amino acid in Licorice with the C-alpha shown. b) Residue of the same amino acid.
c) Backbone (bottom) and sidechain (top).

Further information will be provided as we dig deeper in the course.

In media re with VMD

VMD is a software for visualising and analysing molecular systems like proteins. We will see a small subset of the available functionality of the program. In this lesson we will focus on the visualisation of a protein and on the basics of a scripting language called Tcl (read: tickle).

Let’s go back to the shell. In your working folder you should have the following files:

1ubq.pdb equilibration.dcd pulling.dcd ubiquitin.psf

Open VMD, typing vmd. We will have three windows:

  1. VMD Main
  2. VMD Display
  3. the Terminal

In the VMD Display our system will be draw. VMD Main is a graphical interface for all the functionality we need.

File: to load molecules, render them. Molecule: to perform actions on the loaded molecules. Graphics: to modify representation of the loaded structures. Display: to set option for the VMD Display. Mouse: to perform actions using the mouse. Extension: contains collection of modelling and analysis tools. Help: (really?).

Let’s load a file containing the coordinate of a protein with the extension .pdb. We will load a structure of the ubiquitin, a small protein that labels proteins that have to be degraded. So File -> New Molecule and browse 1ubq.pdb and load it. You will have something like this:

ubq

Now rotate the protein. Click on the VMD Display and press r and move the mouse with the left-button pressed.

What if you press the right-button?

You can change the center of rotation by pressing c and selecting an atom as pivot. With t you can translate the molecule in order to better place it in the display, while with s you can zoom in and out (or use the scroll button). Keep on play with it!

If you lose your molecule, press = or go to Display -> Reset View.

For now we saw the protein as a bunch of lines with some red dots overthere.

What are these dots?

Using the nomenclature defined above, we can modify what is displayed. Go to Graphics -> Representation and a new window appears.

The representation style should be highlighted; if not, click on the line Lines Name all. Then go to the Selected Atoms and write protein and backbone. You should have this.

change_sel

Modify the Drawing Method from Lines to Licorice. More keyword for the selections can be found if click on the Selection tab.

Most of the keywords will be clearer in the next lessons, so don’t be scared!

As you can see, you can combine keywords with and, or, not.

What happens if you select not protein?

The red dots are water molecules whose hydrogen are not resolved by the X-rays crystallography. This is why you do not have hydrogen in your protein (the colour code for H is white). If you change your selection into water, nothing should happen.

Visualise the protein backbone and the water.

Does your selection work? If not, why?

You can also use more complex selections! Let’s make an example.

Click on the VMD Display, press 1 and click on a oxygen atom of water. In the Terminal few lines should appear. Among them look for index: XXX (where XXX is an integer). Go back to the Graphical Representation window and use index XXX as Selected Atoms. Create a new representation (Create Rep button) and use the following selection: same residue as within 20 of (index XXX).

What happens if you don’t write same residue as?

Usually proteins have a secondary structure: -helices, -sheets etc… It’s hard to find them if we use only sticks. Delete all the representations you have and create a new one with the selection all. Now go to Drawing Method and select New Cartoon. In addition, change also the Coloring Method from Name into Secondary Structure (we are bored by the cyan colour!). The result:

The secondary structure is inferred by the 3D arrangement of the residues and the presence of h-bonds, a non covalent bond between a donor (a heavy electronegative atom with a Hydrogen, usually O-H, N-H, F-H) and an acceptor (same kind of atoms not covalently bonded to the Hydrogen) and it is established upon geometric criteria.

Here the colouring scheme for the secondary structure.

  • T = hydrogen bonded turn (3, 4 or 5 turn)
  • E = extended strand in parallel and/or anti-parallel β-sheet conformation.
  • B = residue in isolated β-bridge (single pair β-sheet hydrogen bond formation)
  • H = 4-turn helix (α helix).
  • G = 3-turn helix (310 helix)
  • I = 5-turn helix (π helix)
  • C = coil (residues which are not in any of the above conformations).

Now keep on playing with the colouring methods and representations. But before go to Extension -> Visualization -> ViewMaster. Click on Create New, so you have a representation to go back if you change too many things and you are not pleased with it.

The choice of the settings depends on what you want to convey with your image. Once you are happy we can (and will) do two things:

  1. save our work, so we can reload the representation for future use;
  2. render and save a picture.

To save the representation, click on File -> Save Visualization State, write a name for the file (usually with a .vmd extension) and save it.

Now quit VMD and reload the representation with File -> Load Visualization State.

Let’s now render. First of all, we need to change the backgound colour since black is not suitable for printing. So go to Graphics -> Colors.. and change the Display Backgound in white.

Then remove the axis: Display -> Axes -> off.

Probably your image is a bit pale. This is due to the Depth Cueing. Go to Display -> Depth Cueing -> off. If now it is too dark, turn the cuing on again and modify the parameter in Display -> Display Settings… and change the Cueing Mode in Linear. You can also play with the related parameters.

Now if you go to File -> Render -> Snapshot and Start Render, you create an image. Depending on your system, something will happen.

But, in general, the image has a pretty low quality. Let’s improve it!

First, in Graphics -> Representation you can increase the resolution of each representation you have. Then, change the Material in AOChalky. Finally go back to the rendering window and change the rendering engine in Tachyon.

Tcl scripting

A feature of VMD is the presence of a Tcl/Tk console. Every thing that you can do with the graphical interface can be done with the command line.

To open the console go to Extension -> Tcl/tk console and a new window pops up.

Just to see if my previous statement was correct, let’s type:

% color Display Background black

Did something happen?

Let’s move to the tcl basics:

  • to define a variable set a 4;
  • to print the variable value puts $a.

Note that to access the variable value you have to use the dollar sign $.

You can also print strings:

% puts “Hello, World! >.<”

Of course you can do simple arithmetics as below:

% expr 3 * 10
% expr $a + 5

To evaluate an instruction and assign its output to a variable you need to enclose the command into square brackets [ ].

% set a [expr $a * 13.4]

Common statements you will use are:

  • for loop:

% for {set i 0} {$i < 10} {incr i} {
puts “$i plus 1:\t[expr $i + 1]”
}

  • if clause:

% if { 10 > 4 } {
puts –nonewline “True “
puts “statement!”
}

Let’s check some VMD-specific commands. First let’s resize the display.

% display resize 600 600

This is not the only We saw already how to resize the display. For further info the display command, type:

% display

The help will be prompted out in red. You don’t need to memorise all the commands!

Read the output/errors! They are really useful to understand that happens.

Check what molinfo and atomselect do.

Suppose we want to compute the number of residue in our protein.

First, we need to make a selection of the protein.

Can you think about one atom present in all the residues?

% set selection [atomselect top “ Selection-here “]
% $selection num

Let’s finally clear the workspace.

% mol delete all

Trajectories

The output of an MD simulation is a trajectory. Let’s see what VMD can do with it.

Equilibration

Load the ubiquitin.psf (a Protein Structure File), then right-click on the brand new molecule and select Load Data Into Molecule. Here you have to select the equilibration.dcd (a trajectory file in binary) and load all at once.

In the VMD Main you can play with the trajectory.

The ubiquitin wobbles a bit around. So let align all the frames in order to have the protein almost in the same position. How many frames do we have?

molinfo top get numframes

Let’s comment the following script.

% set nf [molinfo top get numframes]
% set all [atomselect top “all”]
% set ref_0 [atomselect top “protein and name CA” frame 0]
% set ref [atomselect top “protein and name CA”]
% for {set f 0 } {$f < $nf} {incr f} {
$ref frame $f
$all frame $f
set M [measure fit $ref $ref_0]
$all move $M
}

Now we should have the protein almost in the same position for the whole trajectory. But still, the movie is not smooth.

In Graphics -> Representation there is a tab we previously ignored: Trajectory.

Here we can play with the Trajectory Smoothing Window Size. VMD will perform a rolling average of the positions of the atoms (therefore we would see something weird if we use licorice, such as wrong methyl group arrangement.) The smoothing is applied to the representation is set. So if you have multiple reprs, check this option carefully.

Measuring quantities

We used the command measure few lines ago. To have an idea of what you can actually measure, just type measure in the Tcl/tk console.

As an example, let’s compute the radius of gyration of our protein:

It is a quantity that tells you how compact your protein is. In VMD there are primitives you can use to measure properties of your system and combine them to do deeper and fancier analysis.

The command we look for is measure rgyr. Type:

% set calpha [atomselect top “alpha”]
% measure rgyr $calpha

Now compute with a script the radius of gyration for all the frames and save it into a file my_first_script.tcl

Solution not available ;)

Let’s improve our script by printing the results into a file. We will need to add a command to open a writeable file:

set output_file [open ‘eq_rgyr.dat’ w]

and modify the puts command with:

puts $output_file ‘$f\t$rgyr’

and, finally, we have to close the file:

close $output_file

We can plot the data using gnuplot (type it in a new shell), and then:

p “eq_rgyr.dat” w l

To quit from gnuplot, type q and Enter.

Bonus: What does pressing 2, 3, 4 in the VMD Display do?

Pulling

Let’s clear the workspace again mol delete all and load the pulling.dcd by using:

mol new ubiquitin.psf
mol addfile pulling.dcd waitfor all

In this simulation, one end of the protein is pulled, while another atom is kept fixed. It can be a way to unfold a protein.

Play the trajectory to see what happens.

You can visualise multiple frames at ones. To do so, switch the representation to New Cartoon and in the Trajectory tab modify the Draw Multiple Frames from now to 0:10:99. Modify now the colouring method in Trajectory -> Timestep.

Now, go to frame 0 and create two representation to visualise water molecules around the protein. For the protein: colour the secondary structure and use New Cartoon For the water: select water residues within 3 of protein with a VDW Draw Style. Play the trajectory now.

Do your water molecule have three atoms?

Is it the result you expect?

Go in the Trajectory tab and select Update Selection Every Frame and replay the trajectory.

Did something change?

Finally, we can modify my_first_script.tcl in order to apply it to this new trajectory.

  1. Change the output file name;
  2. launch the script with source my_first_script.tcl

Now plot the two radius of gyration.

Exercise

Downlod a structure from the Protein Data Bank and download a molecule you like (or ask for hints).

Create your custom representation of the protein, highlighting a particular residue in Licorice, ligands in Licorice or VDW.

Bonus

If you manage to complete the exercise, ask for a really small task.

Further readings

  1. VMD User’s Guide

Notes

  1. Under biological conditions, the N-group is protonated (-NH3+)