
Molecular Modeling Lab #2: Using Molecular Dynamics Simulations To Compare the Stability of Polylysine in Different Protonation States

For this lab report, answer the following questions. Be sure to support your answers using results from GROMACS analyses and/or comparisons of structures from the simulations. In some cases, including multiple analyses may be appropriate for fully supporting a particular answer.

1) We used a (somewhat) arbitrary box size for our polylysine simulations. Was the size of the box sufficiently large for your polylysine simulations? If so, could you have used an even smaller box?
2) How long did it take the polylysine simulations to equilibrate?
3) Based on your simulations, is polylysine more likely to assume a stable helical structure at physiological pH or in a basic solution?
4) Your research advisor tells you to follow this protocol for a MD simulation:
The 15 ns MD simulation used a time step of 1 fs along with the LINCS routine to constrain bond lengths. The NVT ensemble with periodic boundary conditions was employed for the simulation. The system temperature of 25 C was maintained using a Berendsen thermostat with the temperature coupled separately for protein, lipid, solvent, and ions with a time constant (t) of 0.1 ps. Non-bonded interactions were cutoff at 10 for Lennard-Jones interactions and 1.8 for short-range Coulombic interactions. Coordinates and energies were saved from the simulation every 1 ps.

Produce an .mdp file that would perform a simulation with this protocol, and submit this file as part of your assignment. You can use the .mdp file for the polylysine simulations as a starting point for this new .mdp file. Remember that Chapter 7 of the GROMACS manual includes additional information on the options for the .mdp file.

Description of lab activities: In this lab, you will learn how to perform and analyze MD simulations in GROMACS. These simulations will allow you to consider how protonation state affects the structural stability of polylysine. As a reminder, the chemical structure of a three-residue long polylysine with its sidechains protonated and deprotonated is shown below:

You'll perform this lab over the next two weeks in class. During the first week, you'll set up simulations of polylysine at physiological pH (e.g. pH 7) and a basic pH (e.g. pH 11). These simulations should be completed before the next class meeting. Then, during the computational lab period for the following week, you'll analyze the results of these simulations. (Note: If you want, you can certainly start on the analyses on your own before next week, but next week's computer lab time will be devoted to working on those.)

Week 1: Setting up the Simulations

The overall process to set up and run a simulation is given in the flowchart below (the GROMACS program for each step is named). As you can see, some of this process is very similar to what we did in Lab #1 to run minimizations. However, there are some differences here. First, we are performing these simulations in a box of water with salt (NaCl) ions. Second, we're doing both a minimization and a MD simulation. However, here we aren't interested in using the minimization to get to a fully minimized structure. Instead, we are using the minimization as a tool to reduce any close atomic contacts that would cause the system to "blow up" in the initial steps of MD.

First, we'll perform a simulation of polylysine in the protonation state it would be in at pH 7. (This is the same protonation state that you used for minimizations during Lab #1.) After logging into the workstation, create a new directory (with mkdir) for this lab. I'd also create a subdirectory within the one for the lab for each of the simulations we'll perform (physiological and basic conditions). Once you're in the directory for the pH 7 simulation, you can copy the old polylysine .pdb file from last week (either from your directory or from /Users/instructor/modelingclass/lab1/polylys.pdb).

Flowchart for setting up an MD simulation in GROMACS

Start with structure in .pdb file

1) pdb2gmx
   Get a coordinate (.gro) and a topology (.top) file

2) editconf
   Get a coordinate (.gro) file that has a box around it

3) genbox
   Get coordinate (.gro) and topology (.top) files with box filled with water

4) grompp, genion
   Additional input: .mdp file with parameters for ion addition
   Get coordinate (.gro) and topology (.top) files with ions and water

5) grompp
   Additional input: .mdp file with parameters for minimization protocol
   Get a .tpr file with coordinates, topology, and parameters for minimization

6) mdrun
   Get a .gro file with minimized structure

7) grompp
   Additional input: .mdp file with parameters for MD; .top file (still the same as before)
   Get a .tpr file with coordinates, topology, and parameters for MD simulation

8) mdrun
   After completion get the following files:
   .log with run information
   .gro with final structure
   .xtc with trajectory of structures during the simulation
   .trr with higher precision trajectory of structures and velocities during the simulation
   .edr with energies and box size during the simulation

You can then convert this .pdb file into the corresponding coordinate (.gro) and topology (.top) files using pdb2gmx:

> pdb2gmx -f polylys.pdb -o polylys.gro -p polylys.top -inter

As before, you should use the Gromos96 43a1 force field here. Since this initial simulation is for a pH of 7, you should choose the protonation state of the termini and sidechains that would be appropriate at that condition. For Lys, that is a protonated (positively charged) sidechain, and you should also have charges at both termini.

Now, we want to solvate this structure in a box of water. We can do this as a two-step process. First, we can create a box around the peptide. This box needs to be large enough that the protein won't "see" itself in the neighboring boxes under periodic conditions during the simulation. We'll be using cutoffs for the nonbonded (van der Waals and electrostatic) interactions of 10 Å (1.0 nm). Thus, the box needs to extend at least 5 Å (0.5 nm) from each side of the protein, so that the peptide and its nearest periodic image are separated by at least the cutoff distance. But we'll also want to include some "slack" in case the box expands or contracts during the simulation, or if the peptide moves around some. We can create the box using the editconf program (which we used for a different purpose in the last lab; it's a general tool for editing structure files):

> editconf -f polylys.gro -o polylys_box.gro -bt cubic -d 0.8

So, what do these commands mean?
-f: This is the structure file to which the box will be added.
-o: This is the structure file (with the box defined) that will be the output of the program.
-bt cubic: This tells the program to make a cubic box. If this was omitted, the sides of the box could have different lengths. Alternatively, there are other types of periodic box shapes supported by GROMACS, but we won't work with those this semester.
-d: This is the minimum distance (in nm) between the closest atom of the protein (in polylys.gro) and a side of the box. Since the box is being forced to be cubic, the distance to the box side will be greater than 0.8 nm in some directions.
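To make the box-size reasoning above concrete, here is a minimal Python sketch of the arithmetic. It assumes the cubic box edge is simply the longest dimension of the peptide plus the -d padding on both sides; editconf's exact rule may differ. The peptide extent of 2.5 nm is a made-up value for illustration, not a measurement of polylysine.

```python
def cubic_box_edge(solute_extent_nm, d_nm):
    """Edge of a cubic box leaving at least d_nm on each side of the solute.

    Assumption: edge = longest solute dimension + 2 * d (a simplification).
    """
    return solute_extent_nm + 2.0 * d_nm


def image_separation(box_edge_nm, solute_extent_nm):
    """Gap between the solute and its nearest periodic copy along its longest axis."""
    return box_edge_nm - solute_extent_nm


cutoff_nm = 1.0   # nonbonded cutoff used in this lab
d_nm = 0.8        # value passed to editconf -d
extent_nm = 2.5   # hypothetical longest dimension of the peptide

edge = cubic_box_edge(extent_nm, d_nm)
gap = image_separation(edge, extent_nm)
print(f"box edge = {edge:.2f} nm, gap to nearest image = {gap:.2f} nm")
print("gap exceeds cutoff:", gap > cutoff_nm)
```

With -d 0.8 the starting gap is twice the padding, which leaves some slack above the 1.0 nm cutoff. Whether that slack is enough once the peptide starts moving is what you will check in the Week 2 analyses.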

If you look at the polylys_box.gro file you can see its box size in the final line; this is different than the arbitrary information given for this line in pdb2gmx. Note that we didn't alter the topology file here. This was because the topology wasn't changed at all here: we didn't add any molecules, and all of the atoms and bonds are still the same type.

Now that you have a box, you need to fill it with water. You can do this using a program called genbox. This program basically fills the box with water molecules and then removes any water that overlaps with protein atoms. (The filling is actually done using little boxes of 216 pre-equilibrated water molecules. Using pre-equilibrated water helps speed the equilibration of the overall system since the water molecules already have reasonable interactions with one another.) One quirk about genbox is that it will use the same topology file as the input and the output. Since this will add water to the topology file you use as input, I usually make a copy of the original .top file before running genbox. To do this, you can use a command like:

> cp polylys.top polylys_water.top

Once you have this copy of the topology file, you can run genbox using the command:

> genbox -cp polylys_box.gro -cs spc216.gro -o polylys_water.gro -p polylys_water.top

So, what do these commands mean?
-cp: This is the input structure file to which the solvent will be added.
-cs: This is the file that includes the solvent to be added. spc216.gro is a file that comes with GROMACS that is a box of 216 equilibrated water molecules. If you want to solvate with something other than water, a box of that solvent could be included here.
-o: This is an output structure file that will include all the solvent molecules added by the program.
-p: This is the topology file for the structure that is being solvated by the program. As mentioned above, this file will serve as both an input and output file; it will be edited by the program to include the topology for the added solvent molecules.

Now we have the peptide in a box of water. However, before running any simulations we should add ions to the system. We do this for two reasons. The first (and most important) is that it is best to perform simulations on systems that have a net neutral charge. This is essential for some protocols that calculate electrostatic interactions. Although we're not using one of those this week (we will be in the future, though!), it's still "good form" to keep the overall system charge neutral. When you think about periodic conditions with the cells repeated to infinity, it makes some intuitive sense that you wouldn't necessarily want one of the periodic cells to have a large net charge on it. Second, in addition to neutralizing the system, salt ions can be added to give some ionic strength to a system.

Ions can be added for both of these purposes using the genion program. This program will replace some of the water molecules in your system with either cations (e.g. Na+) or anions (e.g. Cl-). Based on previous commands we've used, you might expect genion to use a .gro file and a .top file as the input. However, it instead is designed to use a .tpr file, the combined coordinate/topology/simulation parameter file that we used to start the minimization last week. In order to get this .tpr file, you need to use the grompp program to combine the three files (.gro, .top, and .mdp) together. Since we're not actually doing a simulation here, we can use a fairly generic .mdp file for parameters. You can copy one saved to my directory as:

/Users/instructor/modelingclass/lab2/genion.mdp

Then you can process the files using the grompp command:

> grompp -f genion.mdp -o genion.tpr -c polylys_water.gro -p polylys_water.top

Once you've done this, you are almost ready to run genion. First, however, you will want to create another copy of the .top file with the water (polylys_water.top). This is because, like genbox, the genion program will use this file as the output of the system with the ions included in it. You can create this copy with the command:

> cp polylys_water.top polylys_ions.top

Now, we're ready to actually run genion to add ions, which you can do with the following command:

> genion -s genion.tpr -o polylys_ions.gro -p polylys_ions.top -g genion.log -neutral -conc 0.1 -pname NA+ -nname CL- -random

(Note that the above command should be typed on "one line"; in other words, without hitting return in the middle!)

So, what do these commands mean?
-s: This is the input file created above in grompp with the structure and topology of the solvated peptide.
-o: This is the output structure file that will have the water molecules replaced with ions.
-p: This is the output topology file that has the ions added to it (and the correct number of waters removed).
-g: This is a log file with some output for the run; I admittedly almost never look at this file!
-neutral: This tells genion to add enough ions (either positive or negative) to neutralize the system.
-conc #: This tells genion to add additional salt ions (both positive and negative) to give a total salt concentration of # (in M, so here 0.1 M or 100 mM) after adding ions for neutralization.
-pname XX: This tells genion what positive ion to use (NA+ here for sodium ions).
-nname XX: This tells genion what negative ion to use (CL- here for chloride ions).
-random: This tells genion to place the ions randomly versus using a potential energy function that doesn't work correctly in GROMACS. Thus, this is the default option, but I often include it for good measure.

When it's running, genion will ask you which group is the solvent to replace with ions. You should choose the group of SOL molecules (which are the water). The program will then print how many NA+ and CL- are added to the system. Since the peptide originally had a positive charge, you'll see that it added an excess of CL- to neutralize the charge. If you look at the bottom of the .top file, you can see that the NA+ and CL- have been added, and the appropriate number of SOL (water) molecules have been removed.
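If you want a rough idea of how many ions to expect from genion, the bookkeeping can be sketched in Python as below. This is only an estimate under stated assumptions: the C-terminus carries a -1 charge, the salt count is the requested concentration times the box volume, and the peptide length (10 residues) and box edge (4.0 nm) are made-up values. The numbers genion actually reports for your system may differ.

```python
AVOGADRO = 6.02214076e23   # particles per mole
LITERS_PER_NM3 = 1.0e-24   # 1 nm^3 expressed in liters


def peptide_net_charge(n_lys, lys_protonated, nterm_protonated):
    """Net charge of polylysine, assuming a deprotonated (-1) C-terminus."""
    charge = -1                              # C-terminal carboxylate
    charge += 1 if nterm_protonated else 0   # N-terminal amine
    charge += n_lys if lys_protonated else 0 # one +1 per protonated Lys sidechain
    return charge


def salt_pairs(conc_molar, box_edge_nm):
    """Approximate number of Na+/Cl- pairs for conc_molar in a cubic box."""
    volume_liters = box_edge_nm ** 3 * LITERS_PER_NM3
    return round(conc_molar * AVOGADRO * volume_liters)


n_lys = 10          # hypothetical peptide length
box_edge_nm = 4.0   # hypothetical cubic box edge

charge = peptide_net_charge(n_lys, lys_protonated=True, nterm_protonated=True)
pairs = salt_pairs(0.1, box_edge_nm)
n_na = pairs + max(-charge, 0)   # extra Na+ only if the peptide is negative
n_cl = pairs + max(charge, 0)    # extra Cl- only if the peptide is positive
print(f"net charge {charge:+d}: expect about {n_na} NA+ and {n_cl} CL-")
```

At pH 7 the two terminal charges cancel, so the net charge equals the number of Lys residues and the excess of CL- over NA+ should match it.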

Now, we finally have a system with our protein, water, and ions that we can use for a simulation. However, before running an MD simulation we want to perform a quick minimization. As discussed earlier, this minimization isn't performed to reach a potential energy minimum. Instead, it is being done to reduce any close contacts that might have been induced between atoms. These close contacts would lead to very strong forces in the first steps of the simulation, which could cause the system to "blow up" (which is a pretty literal description of what would happen!). Thus, we can just use a brief steepest descents minimization for this. You can copy the .mdp file for this from my directory, where it is saved as:

/Users/instructor/modelingclass/lab2/min.mdp

If you look at this .mdp file, it is very similar to the ones you were using last week for the minimization of the peptide. As before, we can set up and run this minimization by using the following commands in grompp and mdrun:

> grompp -f min.mdp -o min.tpr -c polylys_ions.gro -p polylys_ions.top
> mdrun -v -s min.tpr -o min.trr -x min.xtc -c min.gro -g min.log -e min.edr

This minimization should complete fairly quickly (it is only set to run for 100 steps). Although it won't converge in this number of steps, you can see by looking at the output to the screen that the energy is more favorable by the end of the minimization.

Once the minimization is completed, you can set up and run the MD simulations. Just like for a minimization, you'll need three files to start the simulation: a .gro file, a .top file, and a .mdp file with the MD parameters in it. The .gro file is the minimized structure output from the minimization (min.gro). The .top file is the same as the one you used to start the minimization; just moving the atoms around in the minimization didn't change the topology. And the .mdp file you can copy from my directory:

/Users/instructor/modelingclass/lab2/md.mdp

It's worth taking a look at this .mdp file, and we'll do that in a moment. But, if you just want to get the simulation started first, you can do so with the grompp and mdrun commands:

> grompp -f md.mdp -o md.tpr -c min.gro -p polylys_ions.top
> mdrun -s md.tpr -o md.trr -x md.xtc -c md.gro -g md.log -e md.edr &

(Note: The & here tells the workstation to run this program in the background. This means that you can keep doing other things when it's running. You can also log out of the workstation and the simulation will keep running. This is important since it will be running for about a day!)

Once you've done this, the simulation should get started. This is a 20 ns simulation of the peptide, which will take about a day or so to complete. Now, we should go back and look at the .mdp file, since it gives the parameters for the MD run (you can open it in vi). Below are descriptions of what each line means. It is significantly longer than the minimization .mdp file we looked at last week, but I've marked the most important MD parameters we've talked about in bold. (Note that there are more detailed descriptions of these and many other .mdp parameters in Chapter 7 of the GROMACS manual.)
title = md ; Same as in minimization; gives title to run.
cpp = /usr/bin/cpp ; Same as in minimization; gives name of C-preprocessor.
integrator = md ; This tells GROMACS to perform an MD simulation.
dt = 0.002 ; Gives length of timestep (in ps, so this is 2 fs) for each step of the MD simulation.
tinit = 0.0 ; This gives the initial time of the simulation; if this was a continuation of a run it might start greater than 0.
nsteps = 10000000 ; This gives the number of steps in the simulation. The length of the simulation in time is the number of steps multiplied by the timestep. Here the length is 10000000 * 0.002 ps = 20000 ps = 20 ns.
nstxout = 5000 ; Saves coordinates to the .trr file once every 5000 steps (10 ps).
nstvout = 5000 ; Saves velocities for atoms to the .trr file once every 5000 steps (10 ps).
nstlog = 2500 ; Saves information to the .log file once every 2500 steps (5 ps).
nstenergy = 500 ; Saves energies to the .edr file once every 500 steps (1 ps).
nstxtcout = 500 ; Saves coordinates to the .xtc file once every 500 steps (1 ps).
xtc_grps = Protein SOL NA+ CL- ; Tells which groups of atoms to save to the .xtc file. Here, these are protein, water (SOL), and the two ions.
energygrps = Protein SOL NA+ CL- ; Tells which groups of atoms to save to the .edr file; here these are the same as the previous line.
nstlist = 10 ; Tells GROMACS to determine which pairs of atoms should be on the "neighbor list" of atoms for which to calculate non-bonded interactions every 10 steps.
ns_type = grid ; This defines the protocol that GROMACS uses to determine the neighbor list.
rlist = 1.0 ; Same as minimization; defines the cut-off distance (in nm) for the neighbor list.
coulombtype = cut-off ; Tells GROMACS to use distance cut-offs to determine maximum length for calculating van der Waals and Coulombic interactions. There is a more sophisticated way to handle this that we'll see next week.
rcoulomb = 1.0 ; Gives cutoff distance (in nm) for Coulombic interactions.
rvdw = 1.0 ; Gives cutoff distance (in nm) for van der Waals interactions.
pbc = xyz ; Says to use periodic conditions in the x, y, and z directions.
tcoupl = berendsen ; Use the Berendsen temperature coupling protocol.
tc-grps = Protein SOL NA+ CL- ; Do temperature coupling for each of these groups separately. We'll see for later simulations how to combine the water and ions together into a single group, but this is ok for this lab.
tau_t = 0.1 0.1 0.1 0.1 ; This gives the time constant (in ps) for the temperature coupling in each group.
ref_t = 310 310 310 310 ; This gives the set temperature for the temperature coupling in each group (in K, so this simulation is at 37 °C).
Pcoupl = berendsen ; Use the Berendsen pressure coupling protocol.
pcoupltype = isotropic ; This says to couple to the same pressure in each direction, thus isotropic coupling. This is what you'll essentially always use for aqueous simulations. It is possible to couple to different pressures in each direction (anisotropic coupling).
tau_p = 1.0 ; This is the time constant (in ps) for the pressure coupling.
compressibility = 4.5e-5 ; This is the compressibility of the system for pressure coupling. Here, this is just the compressibility of water, which is typical for aqueous simulations.
ref_p = 1.0 ; This is the pressure (in bar, which are essentially just atm) that the system is coupled to.
gen_vel = yes ; This will generate initial velocities for each atom in the system that are consistent with a particular temperature at the start of the simulation.
gen_temp = 310 ; This is the temperature that is used for the initial velocity assignment called for in the last line.
gen_seed = 173529 ; This gives a random number "seed" for the assignment of initial velocities. If you want to know what this means for the computer, let me know.
constraints = all-bonds ; This will constrain all bonds to keep essentially the same length during the simulation.
constraint_algorithm = lincs ; This tells GROMACS to use the LINCS algorithm for constraining the bond lengths. This is usually better than the other common option (SHAKE) for bond constraints.
unconstrained_start = no ; This will allow constraints to be applied to the initial structure in the simulation. This is sometimes set to "yes" when restarting a simulation.
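The relationship between the step counts and the actual times in the .mdp file is worth checking by hand whenever you change dt or nsteps. The short Python sketch below redoes that arithmetic for the values in md.mdp. The function name is just for illustration, and the frame counts assume that the starting structure (step 0) is saved along with every nth step after it.

```python
def run_summary(nsteps, dt_ps, nstxtcout, nstxout):
    """Convert .mdp step counts into simulation times and saved-frame counts."""
    total_ps = nsteps * dt_ps
    return {
        "length_ns": total_ps / 1000.0,
        "xtc_interval_ps": nstxtcout * dt_ps,
        "trr_interval_ps": nstxout * dt_ps,
        # +1 counts the frame at step 0 (an assumption about what is written)
        "xtc_frames": nsteps // nstxtcout + 1,
        "trr_frames": nsteps // nstxout + 1,
    }


summary = run_summary(nsteps=10_000_000, dt_ps=0.002, nstxtcout=500, nstxout=5000)
for name, value in summary.items():
    print(f"{name}: {value}")
```

If you halve dt, remember that the step-based output intervals (nstxout, nstxtcout, and so on) then correspond to half as much simulation time unless you change them too.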

Once you've run the simulation of polylysine at pH 7, you can set up and run a simulation of polylysine at a very basic pH. (You might want to do this in a different subdirectory to make it easier to keep the output files separated.) At this basic pH, the Lys sidechains and the N-terminal amine would be deprotonated. You'll make these changes when you initially process the polylysine in pdb2gmx by choosing the appropriate protonation options for the groups when asked. The rest of the simulation setup process will be essentially identical. You can do all of the setup through the minimization while the other simulation is still running. However, you shouldn't start the actual MD simulation until your first MD simulation is finished, in order to allow processor time for other students' simulations.

You might be wondering how you can tell whether you still have a simulation running. One way is to open up the .log file for your MD run to see if it is completed yet; the end of the file will show the most recently updated time. Another way is to list your currently running processes on the workstation. You can do this with the ps command:

> ps -U Username

where Username is just your own username (e.g. delmore). This will list any programs (e.g. mdrun) you have running on the workstation and how long each has been running. If you ever need to end a simulation you have running, you can do so using the kill command:

> kill ##

where ## is the ID number for the program you want to end. This is the number at the beginning of each line you get from the ps command.

Week 2: Analyzing the MD trajectories of polylysine

Now that you have your simulations of polylysine at pH 7 and basic conditions completed, you can analyze them to see how the structure of the peptide was different under the two conditions. You should be able to use these analyses to answer the questions for this lab report. This handout will walk you through performing some of the types of analyses that were used in the papers we discussed this week. All of these examples assume that you are logged in to the workstation and working in the same directory used for a MD simulation. In this case, the simulation files are all called md.*** (where .*** is whatever extension the file has, such as .xtc or .edr).

First of all, you might want to compare the structures from the simulations. The final structures were saved as .gro files, just as in the minimization. So, you can convert these to .pdb files using editconf in the same way that you did for the minimization. However, you may want to omit the water and ions from the .pdb file, since they will make the protein harder to see. To do this, you can use the command:

> editconf -f md.gro -o md.pdb -ndef

So, what do these commands mean?
-f: This is the input coordinate file for editconf.
-o: This tells editconf what output file to save. Using the .pdb extension tells the program to save the final file in .pdb format.
-ndef: This will cause editconf to ask you which portion of the original file to include in the output file.

The program will ask you to choose which group to include in the output file. If you only want to include the polylysine molecule without water or ions, just choose the protein group.

However, what if you want to look at a frame of the simulation other than the final frame? There are programs that you can use to view the whole trajectory as a movie, and we will look at one of those in the coming weeks. But, you can also extract individual frames of the trajectory as .pdb files using the trjconv (trajectory converter) program. One way to use this is to save a single frame, such as the 10 ns frame. You can do this with the command:

> trjconv -f md.xtc -s md.tpr -o 10ns.pdb -b 10000 -e 10000

So, what do these commands mean?
-f: This is the trajectory file with the coordinates of each frame from the simulation.
-s: This is the .tpr file, which would include the topology information necessary to decide what the types of atoms were for the coordinates saved in the .xtc file.
-o: This is the output file (you could give it a different name). Here, we're saving the output frame in .pdb format.
-b: This is the first frame to export from the .xtc file; here the 10000 ps (10 ns) frame.
-e: This is the final frame to export from the .xtc file; here the 10000 ps (10 ns) frame.

Once trjconv is running it will ask you which group of atoms to include in the output file, and you can choose whether you want the protein or some other group from the simulation.

You can also export a series of frames from a trajectory using trjconv, such as in the command:

> trjconv -f md.xtc -s md.tpr -o 10ns.pdb -b 10000 -e 20000 -dt 1000 -sep

Here, the command is very similar to that above, except that the range given by -b and -e is longer than just a single frame. The new flags are:
-dt: This says to save a frame every 1000 ps within the range exported.
-sep: This says to save each exported frame as a separate file. If you omit this, all the frames will be saved in a single .pdb file, which is hard for most visualization programs to open.

One type of analysis we looked at in some papers is the RMS deviation throughout a simulation. In this analysis, the structure in every saved frame of the simulation is fit onto another structure (usually the initial structure), and the RMS deviation between these structures is calculated. Often this fitting and RMS deviation calculation is done using the Cα atoms in the structure, although other atoms can be used. You can run this program with the command:

> g_rms -f md.xtc -s min.gro -o rms.xvg -ng 1

So, what do these commands mean?
-f: This is the coordinate trajectory file from a simulation that will be analyzed by the program. Either a .xtc or .trr file can be used, but it is more typical to use the smaller .xtc file (it should be quicker).
-s: This is the reference structure to which each frame in the trajectory will be compared. Here, this command is using the initial structure, which was the final minimized structure.
-o: This is the output file that will contain the RMS deviation information (you could give it a different name).
-ng: This tells the program how many groups you want to calculate the RMS deviation for after fitting. Here we've just chosen one, but you may want to also measure the deviation separately for different parts of a larger protein.
-dt: If you add this flag (for example, -dt 10), it tells the program to only output data every 10 ps. Skipping frames just makes a smaller output file, which can be easier to download and work with in Excel. However, omitting this and getting the full file would still work.

After the program is run, it will interactively ask a few questions. First, it will ask which atoms should be used to fit each frame of the trajectory to the reference structure. Select a group of atoms by typing the appropriate number and hitting enter. Often you'll choose the Cα atoms (and I'd do that here for polylysine), but it can be a different group. Then you will be asked which groups you want to measure the RMS deviation for; since you used -ng 1, it will just ask for one group. One of these groups is usually the group of atoms used for the fit (so, Cα here), but it doesn't have to be. Select the ones you want by typing their numbers followed by enter. After entering these groups, the program will produce a file with the time in one column and the RMS deviation (in nm) in a second column. Like the .xvg file output from g_energy, this file can be imported into Excel and graphed.
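If you'd rather not use Excel, the two-column .xvg output described above is also easy to read with a few lines of Python. The sketch below assumes the usual .xvg layout, where lines starting with # or @ are comments and plot directives and the remaining lines hold whitespace-separated numbers. The data are made up to show the idea and are not results from a polylysine run. Comparing the average RMS deviation before and after a chosen time is one crude way to judge whether the curve has leveled off.

```python
def read_xvg(text):
    """Parse two-column .xvg text into (time, value) pairs.

    Lines starting with '#' (comments) or '@' (plot directives) are skipped.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#@":
            continue
        fields = line.split()
        rows.append((float(fields[0]), float(fields[1])))
    return rows


def mean(values):
    return sum(values) / len(values)


# Hypothetical RMSD data (time in ps, RMS deviation in nm)
sample = """\
# made-up example, not real simulation output
@    title "RMSD"
0 0.00
2000 0.15
4000 0.24
6000 0.30
8000 0.31
10000 0.29
12000 0.30
14000 0.31
16000 0.30
18000 0.29
20000 0.30
"""

data = read_xvg(sample)
early = mean([rmsd for t, rmsd in data if t < 10000])
late = mean([rmsd for t, rmsd in data if t >= 10000])
print(f"mean RMSD before 10 ns: {early:.3f} nm, after 10 ns: {late:.3f} nm")
```

For a real file you would replace the sample string with the contents of rms.xvg, for example open("rms.xvg").read().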
Another type of analysis that we looked at in papers was the RMS fluctuation analysis. Although it is somewhat similar to the RMS deviation, it gives different information. While the RMS deviation shows how much a molecule has changed from its initial structure over the course of a simulation, the RMS fluctuation shows how much an atom (or group of atoms) is moving around (fluctuating) during a given time period in the simulation. This gives a measure of how stable that atom or group of atoms is during this time period; increased fluctuation typically implies decreased structural stability. The most common form of this just calculates the RMS fluctuation for all the Cα atoms, giving a fluctuation for the backbone part of each amino acid. But other atoms could be used to measure the fluctuation of other parts of the structure.

Typically, an RMS fluctuation analysis is performed over a portion of the simulation that has an equilibrated structure. This means that the structure is not changing in any global way at this point; in other words, its RMS deviation is sort of moving around a number and isn't still increasing (or decreasing) overall. (There are other measures one could consider for equilibration, too.) Thus, you would typically want to look at analyses (such as RMS deviation) to make sure that a system is equilibrated over the portion of the trajectory that you will use for an RMS fluctuation analysis. The RMS fluctuation is computed over this equilibrated portion of the simulation by comparing the position of the atom in each frame to its position in an equilibrated structure; this could actually be any structure in that equilibrated section, and GROMACS just uses the first frame of this section as a default. (Some protocols compare to the average structural position during the equilibrated part of the trajectory, which leads to essentially the same comparisons.)

As a practical note, if you are comparing RMS fluctuation analyses between different simulations, it's common to average over the same total length of time for those different simulations, just making sure that the system is equilibrated over the full period of time in both simulations. Assuming that the polylysine system is equilibrated after the first 10000 ps (10 ns), you could calculate an RMS fluctuation of the Cα atoms using the following command:

> g_rmsf -f md.xtc -s md.tpr -o rmsf.xvg -b 10000 -e 20000

So, what do these commands mean?
-f: This is the coordinate trajectory file from a simulation that will be analyzed by the program. Either a .xtc or .trr file can be used, but it is more typical to use the smaller .xtc file (it should be quicker).
-s: This is the .tpr file that has the topology information for the structures in the trajectory.
-o: This is the output file that will contain the RMS fluctuation information (you could give it a different name).
-b: This is the first frame to use for the analysis; here the 10000 ps (10 ns) frame.
-e: This is the last frame to use for this analysis; here the 20000 ps (20 ns) frame.

You will then be asked to choose a group of atoms to use for the analysis. As mentioned above, all Cα atoms is a common group to use, and you can choose that for polylysine. After the program completes, you'll have an output file that has two columns. The first column has the residue number in it, and the second column has the RMS fluctuation value (in nm). You can then use this to plot the RMS fluctuation for each residue in Excel. However, one can also display this data in different ways, as we saw in the papers we discussed this week.

You can also look at the energies of the system using the g_energy command. The use of this is very similar to that in minimizations, and can be done with a command like:

> g_energy -f md.edr -o mdenergy.xvg -skip 10

Here, the -skip # tells the program to only put output into the .xvg file for every #th frame (here, every 10th frame). As in the minimization, the potential energy is something you can look at to see how it changes over the simulations. However, you can also use this tool to look at how the box size changes over the course of the simulation by choosing the Box-X, Box-Y, or Box-Z options. As before, you can import the .xvg file output into Excel for graphing.
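To see what an RMS fluctuation calculation is actually doing, here is a small Python sketch of the "compare to the average position" variant mentioned earlier. It is an illustration under simplifying assumptions, not a reimplementation of g_rmsf: the frames are assumed to be already fit onto a common reference, and the coordinates are made up (two atoms, three frames, in nm).

```python
import math


def rmsf_per_atom(frames):
    """RMS fluctuation of each atom about its average position.

    frames: list of frames; each frame is a list of (x, y, z) tuples in nm,
    one per atom, assumed to be already superimposed on a common reference.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        # Average position of atom i over all frames
        avg = [sum(frame[i][k] for frame in frames) / n_frames for k in range(3)]
        # Mean squared displacement of atom i from that average
        msd = sum(
            sum((frame[i][k] - avg[k]) ** 2 for k in range(3)) for frame in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Made-up coordinates: atom 0 never moves, atom 1 wobbles along x
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.8, 0.0, 0.0)],
]
values = rmsf_per_atom(frames)
for index, value in enumerate(values):
    print(f"atom {index}: RMSF = {value:.3f} nm")
```

The atom that stays put has zero fluctuation, while the one that moves back and forth has a nonzero value. This is the same comparison you'll make between rigid and flexible residues in the rmsf.xvg output.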

A final analysis you can do is to determine whether the protein was able to "see" itself in an adjacent periodic image during the simulation. The g_mindist command can be used to do this (it also can be used to measure the minimum distance between any pair of atoms in two groups). This periodic image analysis can be performed with this command:

> g_mindist -f md.xtc -s md.tpr -o periodic_mindist.xvg -pi -dt 10

So, what do these commands mean?
-f: This is the coordinate trajectory file from a simulation that will be analyzed by the program. Either a .xtc or .trr file can be used, but it is more typical to use the smaller .xtc file (it should be quicker).
-s: This is the .tpr file that has the topology information for the structures in the trajectory.
-o: This is the output file that will contain the minimum distance information (you could call it whatever you want).
-pi: This tells the program to compute the minimum distance between the molecule and its image in the nearest periodic cell.
-dt: This tells the program to only output data every 10 ps.

Once this program starts, you should choose the protein group (that's the molecule you're worried about "seeing" itself). After g_mindist completes, the output file shows the minimum distance between the polylysine in the system and its nearest neighbor in a periodic cell. This could be plotted in Excel if that makes it easier to view, or you could use Excel to find the closest the structure gets to its neighbor. If this distance stays greater than the cutoff used to calculate the nonbonded (van der Waals and electrostatic) interactions throughout the simulation, then the peptide was never able to see itself in the trajectory. The shortest distance is also directly printed to the terminal screen when the program completes.
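The comparison described above (closest approach versus nonbonded cutoff) can also be done in a couple of lines of Python once you have the time and distance columns. The values below are made up for illustration; with real data you would fill the list from periodic_mindist.xvg.

```python
def closest_approach(rows):
    """Return the (time_ps, distance_nm) pair with the smallest distance."""
    return min(rows, key=lambda row: row[1])


cutoff_nm = 1.0   # nonbonded cutoff used in this lab

# Hypothetical (time in ps, minimum image distance in nm) values
rows = [(0, 1.62), (10, 1.58), (20, 1.41), (30, 1.12), (40, 1.35)]

t_min, d_min = closest_approach(rows)
saw_image = d_min <= cutoff_nm
print(f"closest approach: {d_min:.2f} nm at {t_min} ps")
print("peptide could see its periodic image:", saw_image)
```

How far the closest approach sits above the cutoff also tells you something about whether a smaller box could have been used.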
