diff --git a/tools/msi2lmp/README b/tools/msi2lmp/README index 93179b579..9c9dcdb1a 100644 --- a/tools/msi2lmp/README +++ b/tools/msi2lmp/README @@ -1,208 +1,214 @@ Axel Kohlmeyer is the current maintainer of the msi2lmp tool. Please send any inquiries about msi2lmp to the lammps-users mailing list. +11 Sep 2014 Axel Kohlmeyer + +Refactored ReadMdfFile.c so it more consistently honors +the MAX_NAME and MAX_STRING string length defines and +potentially handles inputs with long names better. + 27 May 2014 Axel Kohlmeyer Added TopoTools style type hints as comments to all Mass, PairCoeff, BondCoeff, AngleCoeff, DihedralCoeff, ImproperCoeff entries. This should make it easier to identify force field entries with the structure and force field map in the data file later. 06 Mar 2014 Axel Kohlmeyer Fixed a bug in handling of triclinic cells, where the matrices to convert to and from fractional coordinates were incorrectly built. 26 Oct 2013 Axel Kohlmeyer Implemented writing out force field style hints in generated data files for improved consistency checking when reading those files. Also added writing out CGCMM style comments to identify atom types. 08 Oct 2013 Axel Kohlmeyer Fixed a memory access violation with Class 2 force fields. Free all allocated memory to better detection of memory errors. Print out version number and data with all print levels > 0. Added valgrind checks to the regression tests 08 Oct 2013 Axel Kohlmeyer Fixed a memory access violation with Class 2 force fields. Free all allocated memory to better detection of memory errors. Print out version number and data with all print levels > 0. Added valgrind checks to the regression tests 02 Aug 2013 Axel Kohlmeyer Added rudimentary support for OPLS-AA based on input provided by jeff greathouse. 18 Jul 2013 Axel Kohlmeyer Added support for writing out image flags Improved accuracy of atom masses Added flag for shifting the entire system Fixed some minor logic bugs and prepared for supporting other force fields and morse style bonds. 12 Jul 2013 Axel Kohlmeyer Fixed the bug that caused improper coefficients to be wrong Cleaned up the handling of box parameters and center the box by default around the system/molecule. Added a flag to make this step optional and center the box around the origin instead. Added a regression test script with examples. 1 Jul 2013 Axel Kohlmeyer Cleanup and improved port to windows. Removed some more static string limits. Added print level 3 for additional output. Make code stop at missing force field parameters and added -i flag to override this. Safer argument checking. Provide short versions for all flags. 23 Sep 2011 added support for triclinic boxes see msi2lmp/TriclinicModification.pdf doc for details ----------------------------- msi2lmp V3.6 4/10/2005 This program uses the .car and .mdf files from MSI/Biosyms's INSIGHT program to produce a LAMMPS data file. 1. Building msi2lmp Use the Makefile in the src directory. It is currently set up for gcc. You will have to modify it to use a different compiler. 2. Testing the program There are several pairs of input test files in the format generated by materials studio or compatible programs (one .car and one .mdf file each) in the test directory. There is also a LAMMPS input to run a minimization for each and write out the resulting system as a data file. With the runtests.sh script all of those inputs are converted via msi2lmp, then the minimization with LAMMPS is run and the generated data files are compared with the corresponding files in the reference folder. This script assumes you are on a unix/linux system and that you have compile a serial LAMMPS executable called lmp_serial with make serial. The tests are groups by the force fields they use. 3. To run the program The program is started by supplying information at the command prompt according to the usage described below. USAGE: msi2lmp.exe {-print #} {-class #} {-frc FRC_FILE} {-ignore} {-nocenter} {-shift # # #} -- msi2lmp.exe is the name of the executable -- is the base name of the .car and .mdf files -- -2001 Output lammps files for LAMMPS version 2001 (F90 version) Default is to write output for the C++ version of LAMMPS -- -print (or -p) # is the print level 0 - silent except for error messages 1 - minimal (default) 2 - verbose (usual for developing and checking new data files for consistency) 3 - even more verbose (additional debug info) -- -ignore (or -i) ignore errors about missing force field parameters and treat them as warnings instead. -- -nocenter (or -n) do not recenter the simulation box around the geometrical center of the provided geometry but rather around the origin -- -oldstyle (or -o) write out a data file without style hints (to be compatible with older LAMMPS versions) -- -shift (or -s) translate the entire system (box and coordinates) by a vector (default: 0.0 0.0 0.0) -- -class (or -c) # is the class of forcefield to use (I or 1 = Class I e.g., CVFF) (O or 0 = OPLS-AA) (II or 2 = Class II e.g., CFFx) default is -class I -- -frc (or -f) specifies name of the forcefield file (e.g., cff91) If the file name includes a directory component (or drive letter on Windows), then the name is used as is. Otherwise, the program looks for the forcefield file in $MSI2LMP_LIBRARY (or %MSI2LMP_LIBRARY% on Windows). If $MSI2LMP_LIBRARY is not set, ../frc_files is used (for testing). If the file name does not end in .frc, then .frc is appended to the name. For example, -frc cvff (assumes cvff.frc is in $MSI2LMP_LIBRARY or ../frc_files) -frc cff/cff91 (assumes cff91.frc is in cff) -frc /usr/local/forcefields/cff95 (assumes cff95.frc is in /usr/local/forcefields/) By default, the program uses $MSI2LMP_LIBRARY/cvff.frc or ../frc_files/cvff.frc depending on whether MSI2LMP_LIBRARY is set. -- the LAMMPS data file is written to .data protocol and error information is written to the screen. **************************************************************** * * msi2lmp * * This is the third version of a program that generates a LAMMPS * data file based on the information in MSI .car (atom * coordinates), .mdf (molecular topology) and .frc (forcefield) * files. The .car and .mdf files are specific to a molecular * system while the .frc file is specific to a forcefield version. * The only coherency needed between .frc and .car/.mdf files are * the atom types. * * The first version was written by Steve Lustig at Dupont, but * required using Discover to derive internal coordinates and * forcefield parameters * * The second version was written by Michael Peachey while an * intern in the Cray Chemistry Applications Group managed * by John Carpenter. This version derived internal coordinates * from the mdf file and looked up parameters in the frc file * thus eliminating the need for Discover. * * The third version was written by John Carpenter to optimize * the performance of the program for large molecular systems * (the original code for deriving atom numbers was quadratic in time) * and to make the program fully dynamic. The second version used * fixed dimension arrays for the internal coordinates. * * The current maintainer is only reluctantly doing so because John Mayo no longer * needs this code. * * V3.2 corresponds to adding code to MakeLists.c to gracefully deal with * systems that may only be molecules of 1 to 3 atoms. In V3.1, the values * for number_of_dihedrals, etc. could be unpredictable in these systems. * * V3.3 was generated in response to a strange error reading a MDF file generated by * Accelys' Materials Studio GUI. Simply rewriting the input part of ReadMdfFile.c * seems to have fixed the problem. * * V3.4 and V3.5 are minor upgrades to fix bugs associated mostly with .car and .mdf files * written by Accelys' Materials Studio GUI. * * V3.6 outputs to LAMMPS 2005 (C++ version). * * Contact: Kelly L. Anderson, kelly.anderson@cantab.net * * April 2005 diff --git a/tools/msi2lmp/src/ReadMdfFile.c b/tools/msi2lmp/src/ReadMdfFile.c index d3bffc175..9bc820f37 100644 --- a/tools/msi2lmp/src/ReadMdfFile.c +++ b/tools/msi2lmp/src/ReadMdfFile.c @@ -1,415 +1,422 @@ /****************************** * * This function opens the .mdf file and extracts connectivity information * into the atoms Atom structure. It also updates the charge from the .car * file because the charge in the .mdf file has more significant figures. * */ #include "msi2lmp.h" #include #include #include /* Prototype for function to process a single atom Returns int that flags end of data file */ static int get_molecule(char line[], int connect_col_no, int q_col_no, int *counter); /* Prototype for function that takes connectivty record as stated in .mdf file and fills in any default values */ static void MakeConnectFullForm(int *counter); /* prototype for function to clean strange characters out of strings */ static void clean_string(char *); static int blank_line(char *line) { while (*line != '\0') { if (isalnum((int) *line)) return 0; ++line; } return 1; } void ReadMdfFile(void) { char line[MAX_LINE_LENGTH]; /* Temporary storage for reading lines */ char *col_no; /* Pointer to column number stored as char */ char *col_name; /* Pointer to column name */ int connect_col_no = 0; /* Column number where connection info begins */ int q_col_no = 0; /* Column number containg charge information */ int atom_counter=0; /* Keeps track of current atom number */ int i,j,k,kk,l,n,match,match2,status; char *temp_string; char *temp_residue; char *temp_atom_name; char *sptr; unsigned char at_end = 0; /* Open .mdf file for reading */ sprintf(line,"%s.mdf",rootname); if (pflag > 0) printf(" Reading mdf file: %s\n",line); if ((MdfF = fopen(line,"r")) == NULL ) { printf("Cannot open %s\n",line); exit(41); } while (!at_end) { sptr = fgets(line,MAX_LINE_LENGTH,MdfF); if (sptr != NULL) { clean_string(line); if (strncmp(line,"#end",4) == 0) { at_end = 1; } else if (strncmp(line,"@column",7) == 0) { temp_string = strtok(line," "); col_no = strtok(NULL," "); col_name = strtok(NULL," "); if (strncmp(col_name,"charge",6) == 0) { if (strlen(col_name) < 8) { q_col_no = atoi(col_no); } } else if (strncmp(col_name,"connect",7) == 0) { connect_col_no = atoi(col_no); } } else if (strncmp(line,"@molecule",9) == 0) { if ((q_col_no == 0) | (connect_col_no == 0)) { printf("Unable to process molecule without knowing charge\n"); printf("and connections columns\n"); exit(42); } sptr = fgets(line,MAX_LINE_LENGTH,MdfF); status = get_molecule(line,connect_col_no,q_col_no,&atom_counter); if (status == 0) { printf("Trouble reading molecule - exiting\n"); exit(43); } } } else { printf("End of File found or error reading line\n"); at_end = 1; } } /* Next build list of residues for each molecule This will facilitate assigning connections numbers as well as figuring out bonds, angles, etc. This first loop just figures out the number of residues in each molecule and allocates memory to store information for each residue. The second loop fills in starting and ending atom positions for each residue */ - temp_string = (char *)calloc(16,sizeof(char)); + temp_string = (char *)calloc(MAX_STRING,sizeof(char)); for (n=0; n < no_molecules; n++) { molecule[n].no_residues = 1; - strncpy(temp_string,atoms[molecule[n].start].residue_string,16); + strncpy(temp_string,atoms[molecule[n].start].residue_string,MAX_NAME); for (i=molecule[n].start+1; i < molecule[n].end; i++) { - if (strncmp(temp_string,atoms[i].residue_string,16) != 0) { + if (strncmp(temp_string,atoms[i].residue_string,MAX_NAME) != 0) { molecule[n].no_residues++; - strncpy(temp_string,atoms[i].residue_string,16); + strncpy(temp_string,atoms[i].residue_string,MAX_NAME); } } molecule[n].residue = (struct ResidueList *) calloc(molecule[n].no_residues, sizeof(struct ResidueList)); if (molecule[n].residue == NULL) { printf("Unable to allocate memory for residue list - molecule %d\n",n); exit(44); } } for (n=0; n < no_molecules; n++) { j = 0; strncpy(molecule[n].residue[j].name, - atoms[molecule[n].start].residue_string,16); + atoms[molecule[n].start].residue_string,MAX_NAME); molecule[n].residue[j].start = molecule[n].start; for (i=molecule[n].start+1; i < molecule[n].end; i++) { if (strncmp(molecule[n].residue[j].name, - atoms[i].residue_string,16) != 0) { + atoms[i].residue_string,MAX_NAME) != 0) { molecule[n].residue[j].end = i; molecule[n].residue[++j].start = i; - strncpy(molecule[n].residue[j].name,atoms[i].residue_string,16); + strncpy(molecule[n].residue[j].name,atoms[i].residue_string,MAX_NAME); } } molecule[n].residue[j].end = molecule[n].end; /* printf("Molecule %d has %d residues",n,molecule[n].no_residues); for (i=0; i < molecule[n].no_residues; i++) { printf(" %s",molecule[n].residue[i].name); } printf("\n"); for (i=molecule[n].start; i < molecule[n].end; i++) { printf(" atom %d residue %s\n",i,atoms[i].residue_string); } printf(" residue %s start %d end %d\n",molecule[n].residue[i].name, molecule[n].residue[i].start,molecule[n].residue[i].end); } */ } /* Assign atom names in connections[] to corresponding atom numbers */ for (n=0; n < no_molecules; n++) { for (j=0; j < molecule[n].no_residues; j++) { for (i=molecule[n].residue[j].start; i < molecule[n].residue[j].end; i++) { for (l=0; l < atoms[i].no_connect; l++) { - strncpy(temp_string,atoms[i].connections[l],16); + strncpy(temp_string,atoms[i].connections[l],MAX_STRING); temp_residue = strtok(temp_string,":"); temp_atom_name = strtok(NULL,"%"); if (strcmp(temp_residue,molecule[n].residue[j].name) == 0) { /* atom and connection are part of same residue Search names on just that residue */ k = molecule[n].residue[j].start; match = 0; while (!match && (k < molecule[n].residue[j].end)) { if (strcmp(atoms[k].name,temp_atom_name) == 0) { atoms[i].conn_no[l] = k; match = 1; } else k++; } if (match == 0) { printf("Unable to resolve atom number of atom %d conn %d string %s:%s\n" " Something is wrong in the MDF file\n", i,l,temp_residue,temp_atom_name); exit(45); } } else { /* atom and connection are on different residues First find the residue that the connection is on then loop over its atoms */ k=0; match = 0; while (!match && (k < molecule[n].no_residues)) { if (strcmp(temp_residue,molecule[n].residue[k].name) == 0) { kk = molecule[n].residue[k].start; match2 = 0; while (!match2 && (kk < molecule[n].residue[k].end)) { if (strcmp(atoms[kk].name,temp_atom_name) == 0) { atoms[i].conn_no[l] = kk; match2 = 1; } else kk++; } if (match2 == 0) { printf("Unable to resolve atom number of atom %d conn %d string %s\n" " Something is wrong in the MDF file\n", i,l,atoms[i].connections[l]); exit(46); } match = 1; } else k++; } if (match == 0) { printf("Unable to find residue associated with conn %d %s on atom %d\n" " Something is wrong in the MDF file\n", l,atoms[i].connections[l],i); exit(47); } } /* end if */ } /* l - loop over connections on atom i */ } /* i - loop on atoms in residue j molecule n */ } /* j - loop on residues in molecule n */ } /* n - loop over molecules */ free(temp_string); /* for (n=0; n < no_molecules; n++) { printf("Molecule %d has %d residues\n",n,molecule[n].no_residues); for (j=0; j < molecule[n].no_residues; j++) { printf(" Residue %d named %s\n",j,molecule[n].residue[j].name); for (i=molecule[n].residue[j].start; i < molecule[n].residue[j].end; i++) { printf(" Atom %d type %s connected to ",i,atoms[i].potential); for (l=0; l < atoms[i].no_connect; l++) printf(" %d ", atoms[i].conn_no[l]); printf("\n"); } } } */ /* Close .mdf file */ if (fclose(MdfF) !=0) { printf("Error closing %s.car\n", rootname); exit(1); } } /* End ReadMdfFile function */ /*--------------------- get_molecule Function-----------------------*/ int get_molecule(char *line, int connect_col_no, int q_col_no, int *counter) { char *cur_field; /* For storing current string token */ int i; /* Used in loop counters */ int connect_no; /* Connection number within atom */ int r_val = 1; /* Return value. 1 = successful 0 = EOF encountered */ /* Loop over atoms */ /* blank line signals end of molecule*/ while(!blank_line(fgets(line,MAX_LINE_LENGTH,MdfF))) { /* while(strlen(fgets(line,MAX_LINE_LENGTH,MdfF)) > 2) { */ clean_string(line); /* Get atom name */ cur_field = strtok(line,":"); sscanf(cur_field, "%s", atoms[*counter].residue_string); cur_field = strtok(NULL," "); /* Compare atom name with that in .car file */ if (strcmp(atoms[*counter].name, cur_field)) { printf("Names %s from .car file and %s from .mdf file do not match\n", atoms[*counter].name, cur_field); printf("counter = %d\n",*counter); printf("Program Terminating\n"); exit(4); } /* Skip unwanted fields until charge column, then update charge */ for (i=1; i < q_col_no; i++) strtok(NULL," "); cur_field = strtok(NULL, " "); atoms[*counter].q = atof(cur_field); /* Continue skipping unwanted fields until connectivity records begin */ for ( i = (q_col_no + 1); i < connect_col_no; i++) strtok(NULL," "); /* Process connections */ connect_no = 0; /* reset connections counter */ while ((cur_field = strtok(NULL," ")) && (connect_no < MAX_CONNECTIONS)) { sscanf(cur_field, "%s", atoms[*counter].connections[connect_no++]); } atoms[*counter].no_connect = connect_no; MakeConnectFullForm(counter); (*counter)++; } /* End atom processing loop */ return r_val; } /* End get_molecule function */ /*------------------------MakeConnectFullForm Function--------------------*/ void MakeConnectFullForm(int *counter) { /* This function processes the connection names after all connections for an atom have been read in. It replaces any short forms that use implied default values with the full form connectivity record */ int i; /* Counter for character array */ int j; /* loop counter */ - char tempname[MAX_STRING]; /* name of connection */ + char tempname[2*MAX_STRING]; /* name of connection */ char tempcell[10]; /* Values from connectivity record */ char tempsym[5]; /* " " */ char tempbo[6]; /* " " */ char *charptr; for ( j = 0; j < atoms[*counter].no_connect; j++) { /* If not full name, make name full */ if (strchr(atoms[*counter].connections[j],':') == NULL) { strcpy(tempname,atoms[*counter].residue_string); strcat(tempname,":"); strcat(tempname, atoms[*counter].connections[j]); + /* truncate at capacity of target storage */ + tempname[MAX_STRING-1] = '\0'; sscanf(tempname, "%s", atoms[*counter].connections[j]); } else sscanf(atoms[*counter].connections[j], "%s", tempname); /* Set cell variables */ i=0; charptr = (strchr(tempname,'%')); if (charptr != NULL) { while ( *charptr!='#' && *charptr!='/' && *charptr!='\000') tempcell[i++] = *(charptr++); tempcell[i] = '\000'; } else strcpy(tempcell, "%000"); /* Set symmetry variables -- If not 1, cannot handle at this time */ i = 0; charptr = (strchr(tempname,'#')); if (charptr != NULL) { while (*charptr != '/' && *charptr !='\000') { tempsym[i++] = *(charptr++); if ((i==2) && (tempsym[1] != '1')) { printf("Msi2LMP is not equipped to handle symmetry operations\n"); exit(5); } } tempsym[i] = '\000'; } else strcpy(tempsym, "#1"); /* Set bond order and record in data structure */ i = 0; charptr = strchr(tempname,'/'); if (charptr != NULL) { charptr++; while (*charptr != '\000') tempbo[i++] = *(charptr++); tempbo[i] = '\000'; } else strcpy(tempbo, "1.0"); atoms[*counter].bond_order[j] = atof(tempbo); /* Build connection name and store in atoms data structure */ strtok( tempname, "%#/"); strcat( tempname, tempcell); strcat( tempname, tempsym); strcat( tempname, "/"); strcat( tempname, tempbo); - if (strlen(tempname) > 25) printf("tempname overrun %s\n",tempname); - sscanf( tempname, "%s", atoms[*counter].connections[j]); + if (strlen(tempname) >= MAX_STRING) { + printf("tempname overrun %s. truncating to ",tempname); + /* truncate at capacity of target storage */ + tempname[MAX_STRING-1] = '\0'; + puts(tempname); + } + sscanf(tempname, "%s", atoms[*counter].connections[j]); }/*End for loop*/ }/* End function MakeNameLong */ void clean_string(char *string) { int i,n; short k; n = strlen(string); for (i=0; i < n; i++) { k = (short)string[i]; if ((k<32) | (k>127)) string[i] = '\0'; } } diff --git a/tools/msi2lmp/src/msi2lmp.c b/tools/msi2lmp/src/msi2lmp.c index 20f30e223..44eec8f6c 100644 --- a/tools/msi2lmp/src/msi2lmp.c +++ b/tools/msi2lmp/src/msi2lmp.c @@ -1,432 +1,435 @@ /* * * msi2lmp.exe * +* v3.9.6 AK- Refactoring of MDF file parser with more consistent +* handling of compile time constants MAX_NAME and MAX_STRING +* * v3.9.5 AK- Add TopoTools style force field parameter type hints * * v3.9.4 AK- Make force field style hints optional with a flag * * v3.9.3 AK- Bugfix for triclinic cells. * * v3.9.2 AK- Support for writing out force field style hints * * v3.9.1 AK- Bugfix for Class2. Free allocated memory. Print version number. * * v3.9 AK - Rudimentary support for OPLS-AA * * v3.8 AK - Some refactoring and cleanup of global variables * - Bugfixes for argument parsing and improper definitions * - improved handling of box dimensions and image flags * - port to compiling on windows using MinGW * - more consistent print level handling * - more consistent handling of missing parameters * - Added a regression test script with examples. * * V3.7 STM - Added support for triclinic cells * * v3.6 KLA - Changes to output to either lammps 2001 (F90 version) or to * lammps 2005 (C++ version) * * v3.4 JEC - a number of minor changes due to way newline and EOF are generated * on Materials Studio generated .car and .mdf files as well as odd * behavior out of newer Linux IO libraries. ReadMdfFile was restructured * in the process. * * v3.1 JEC - changed IO interface to standard in/out, forcefield file * location can be indicated by environmental variable; added * printing options, consistency checks and forcefield * parameter versions sensitivity (highest one used) * * v3.0 JEC - program substantially rewritten to reduce execution time * and be 98 % dynamic in memory use (still fixed limits on * number of parameter types for different internal coordinate * sets) * * v2.0 MDP - got internal coordinate information from mdf file and * forcefield parameters from frc file thus eliminating * need for Discover * * V1.0 SL - original version. Used .car file and internal coordinate * information from Discover to produce LAMMPS data file. * * This program uses the .car and .mdf files from MSI/Biosyms's INSIGHT * program to produce a LAMMPS data file. * * The program is started by supplying information at the command prompt * according to the usage described below. * * USAGE: msi2lmp3 ROOTNAME {-print #} {-class #} {-frc FRC_FILE} {-ignore} {-nocenter} {-oldstyle} * * -- msi2lmp3 is the name of the executable * -- ROOTNAME is the base name of the .car and .mdf files * -- all opther flags are optional and can be abbreviated (e.g. -p instead of -print) * * -- -print * # is the print level: 0 - silent except for errors * 1 - minimal (default) * 2 - more verbose * 3 - even more verbose * -- -class * # is the class of forcefield to use (I or 1 = Class I e.g., CVFF, clayff) * (II or 2 = Class II e.g., CFFx, COMPASS) * (O or 0 = OPLS-AA) * default is -class I * * -- -ignore - tells msi2lmp to ignore warnings and errors and keep going * * -- -nocenter - tells msi2lmp to not center the box around the (geometrical) * center of the atoms, but around the origin * * -- -oldstyle - tells msi2lmp to write out a data file without style hints * (to be compatible with older LAMMPS versions) * * -- -shift - tells msi2lmp to shift the entire system (box and coordinates) * by a vector (default: 0.0 0.0 0.0) * * -- -frc - specifies name of the forcefield file (e.g., cff91) * * If the name includes a hard wired directory (i.e., if the name * starts with . or /), then the name is used alone. Otherwise, * the program looks for the forcefield file in $MSI2LMP_LIBRARY. * If $MSI2LMP_LIBRARY is not set, then the current directory is * used. * * If the file name does not include a dot after the first * character, then .frc is appended to the name. * * For example, -frc cvff (assumes cvff.frc is in $MSI2LMP_LIBRARY * or .) * * -frc cff/cff91 (assumes cff91.frc is in * $MSI2LMP_LIBRARY/cff or ./cff) * * -frc /usr/local/forcefields/cff95 (absolute * location) * * By default, the program uses $MSI2LMP_LIBRARY/cvff.frc * * -- output is written to a file called ROOTNAME.data * * **************************************************************** * * msi2lmp * * This is the third version of a program that generates a LAMMPS * data file based on the information in a MSI car file (atom * coordinates) and mdf file (molecular topology). A key part of * the program looks up forcefield parameters from an MSI frc file. * * The first version was written by Steve Lustig at Dupont, but * required using Discover to derive internal coordinates and * forcefield parameters * * The second version was written by Michael Peachey while an * in intern in the Cray Chemistry Applications Group managed * by John Carpenter. This version derived internal coordinates * from the mdf file and looked up parameters in the frc file * thus eliminating the need for Discover. * * The third version was written by John Carpenter to optimize * the performance of the program for large molecular systems * (the original code for deriving atom numbers was quadratic in time) * and to make the program fully dynamic. The second version used * fixed dimension arrays for the internal coordinates. * * John Carpenter can be contacted by sending email to * jec374@earthlink.net * * November 2000 */ #include "msi2lmp.h" #include #include #include /* global variables */ char *rootname; double pbc[6]; double box[3][3]; double shift[3]; int periodic = 1; int TriclinicFlag = 0; int forcefield = 0; int centerflag = 1; int hintflag = 1; int pflag; int iflag; int *no_atoms; int no_molecules; int replicate[3]; int total_no_atoms = 0; int total_no_bonds = 0; int total_no_angles = 0; int total_no_dihedrals = 0; int total_no_angle_angles = 0; int total_no_oops = 0; int no_atom_types = 0; int no_bond_types = 0; int no_angle_types = 0; int no_dihedral_types = 0; int no_oop_types = 0; int no_angleangle_types = 0; char *FrcFileName = NULL; FILE *CarF = NULL; FILE *FrcF = NULL; FILE *PrmF = NULL; FILE *MdfF = NULL; FILE *RptF = NULL; struct Atom *atoms = NULL; struct MoleculeList *molecule = NULL; struct BondList *bonds = NULL; struct AngleList *angles = NULL; struct DihedralList *dihedrals = NULL; struct OOPList *oops = NULL; struct AngleAngleList *angleangles = NULL; struct AtomTypeList *atomtypes = NULL; struct BondTypeList *bondtypes = NULL; struct AngleTypeList *angletypes = NULL; struct DihedralTypeList *dihedraltypes = NULL; struct OOPTypeList *ooptypes = NULL; struct AngleAngleTypeList *angleangletypes = NULL; void condexit(int val) { if (iflag == 0) exit(val); } static int check_arg(char **arg, const char *flag, int num, int argc) { if (num >= argc) { printf("Missing argument to \"%s\" flag\n",flag); return 1; } if (arg[num][0] == '-') { printf("Incorrect argument to \"%s\" flag: %s\n",flag,arg[num]); return 1; } return 0; } int main (int argc, char *argv[]) { int n,i,found_sep; const char *frc_dir_name = NULL; const char *frc_file_name = NULL; pflag = 1; iflag = 0; forcefield = FF_TYPE_CLASS1 | FF_TYPE_COMMON; shift[0] = shift[1] = shift[2] = 0.0; frc_dir_name = getenv("MSI2LMP_LIBRARY"); if (argc < 2) { printf("usage: %s [-class ] [-frc ] [-print #] [-ignore] [-nocenter] [-oldstyle]\n",argv[0]); return 1; } else { /* rootname was supplied as first argument, copy to rootname */ int len = strlen(argv[1]) + 1; rootname = (char *)malloc(len); strcpy(rootname,argv[1]); } n = 2; while (n < argc) { if (strncmp(argv[n],"-c",2) == 0) { n++; if (check_arg(argv,"-class",n,argc)) return 2; if ((strcmp(argv[n],"I") == 0) || (strcmp(argv[n],"1") == 0)) { forcefield = FF_TYPE_CLASS1 | FF_TYPE_COMMON; } else if ((strcmp(argv[n],"II") == 0) || (strcmp(argv[n],"2") == 0)) { forcefield = FF_TYPE_CLASS2 | FF_TYPE_COMMON; } else if ((strcmp(argv[n],"O") == 0) || (strcmp(argv[n],"0") == 0)) { forcefield = FF_TYPE_OPLSAA | FF_TYPE_COMMON; } else { printf("Unrecognized Forcefield class: %s\n",argv[n]); return 3; } } else if (strncmp(argv[n],"-f",2) == 0) { n++; if (check_arg(argv,"-frc",n,argc)) return 4; frc_file_name = argv[n]; } else if (strncmp(argv[n],"-s",2) == 0) { if (n+3 > argc) { printf("Missing argument(s) to \"-shift\" flag\n"); return 1; } shift[0] = atof(argv[++n]); shift[1] = atof(argv[++n]); shift[2] = atof(argv[++n]); } else if (strncmp(argv[n],"-i",2) == 0 ) { iflag = 1; } else if (strncmp(argv[n],"-n",4) == 0 ) { centerflag = 0; } else if (strncmp(argv[n],"-o",2) == 0 ) { hintflag = 0; } else if (strncmp(argv[n],"-p",2) == 0) { n++; if (check_arg(argv,"-print",n,argc)) return 5; pflag = atoi(argv[n]); } else { printf("Unrecognized option: %s\n",argv[n]); return 6; } n++; } /* set defaults, if nothing else was given */ if (frc_dir_name == NULL) #if (_WIN32) frc_dir_name = "..\\frc_files"; #else frc_dir_name = "../frc_files"; #endif if (frc_file_name == NULL) frc_file_name = "cvff.frc"; found_sep=0; #ifdef _WIN32 if (isalpha(frc_file_name[0]) && (frc_file_name[1] == ':')) found_sep=1; /* windows drive letter => full path. */ #endif n = strlen(frc_file_name); for (i=0; i < n; ++i) { #ifdef _WIN32 if ((frc_file_name[i] == '/') || (frc_file_name[i] == '\\')) found_sep=1+i; #else if (frc_file_name[i] == '/') found_sep=1+i; #endif } /* full pathname given */ if (found_sep) { i = 0; /* need to append extension? */ if ((n < 5) || (strcmp(frc_file_name+n-4,".frc") !=0)) i=1; FrcFileName = (char *)malloc(n+1+i*4); strcpy(FrcFileName,frc_file_name); if (i) strcat(FrcFileName,".frc"); } else { i = 0; /* need to append extension? */ if ((n < 5) || (strcmp(frc_file_name+n-4,".frc") !=0)) i=1; FrcFileName = (char *)malloc(n+2+i*4+strlen(frc_dir_name)); strcpy(FrcFileName,frc_dir_name); #ifdef _WIN32 strcat(FrcFileName,"\\"); #else strcat(FrcFileName,"/"); #endif strcat(FrcFileName,frc_file_name); if (i) strcat(FrcFileName,".frc"); } if (pflag > 0) { puts("\nRunning msi2lmp " MSI2LMP_VERSION "\n"); if (forcefield & FF_TYPE_CLASS1) puts(" Forcefield: Class I"); if (forcefield & FF_TYPE_CLASS2) puts(" Forcefield: Class II"); if (forcefield & FF_TYPE_OPLSAA) puts(" Forcefield: OPLS-AA"); printf(" Forcefield file name: %s\n",FrcFileName); if (centerflag) puts(" Output is recentered around geometrical center"); if (hintflag) puts(" Output contains style flag hints"); else puts(" Style flag hints disabled"); printf(" System translated by: %g %g %g\n",shift[0],shift[1],shift[2]); } n = 0; if (forcefield & FF_TYPE_CLASS1) { if (strstr(FrcFileName,"cvff") != NULL) ++n; if (strstr(FrcFileName,"clayff") != NULL) ++n; } else if (forcefield & FF_TYPE_OPLSAA) { if (strstr(FrcFileName,"oplsaa") != NULL) ++n; } else if (forcefield & FF_TYPE_CLASS2) { if (strstr(FrcFileName,"pcff") != NULL) ++n; if (strstr(FrcFileName,"cff91") != NULL) ++n; if (strstr(FrcFileName,"compass") != NULL) ++n; } if (n == 0) { if (iflag > 0) fputs(" WARNING",stderr); else fputs(" Error ",stderr); fputs("- forcefield name and class appear to be inconsistent\n\n",stderr); if (iflag == 0) return 7; } /* Read in .car file */ ReadCarFile(); /*Read in .mdf file */ ReadMdfFile(); /* Define bonds, angles, etc...*/ if (pflag > 0) printf("\n Building internal coordinate lists \n"); MakeLists(); /* Read .frc file into memory */ if (pflag > 0) printf("\n Reading forcefield file \n"); ReadFrcFile(); /* Get forcefield parameters */ if (pflag > 0) printf("\n Get force field parameters for this system\n"); GetParameters(); /* Do internal check of internal coordinate lists */ if (pflag > 0) printf("\n Check parameters for internal consistency\n"); CheckLists(); /* Write out the final data */ WriteDataFile(rootname); /* free up memory to detect possible memory corruption */ free(rootname); free(FrcFileName); ClearFrcData(); for (n=0; n < no_molecules; n++) { free(molecule[n].residue); } free(no_atoms); free(molecule); free(atoms); free(atomtypes); if (bonds) free(bonds); if (bondtypes) free(bondtypes); if (angles) free(angles); if (angletypes) free(angletypes); if (dihedrals) free(dihedrals); if (dihedraltypes) free(dihedraltypes); if (oops) free(oops); if (ooptypes) free(ooptypes); if (angleangles) free(angleangles); if (angleangletypes) free(angleangletypes); if (pflag > 0) printf("\nNormal program termination\n"); return 0; } diff --git a/tools/msi2lmp/src/msi2lmp.h b/tools/msi2lmp/src/msi2lmp.h index 1db78e068..009bf7426 100644 --- a/tools/msi2lmp/src/msi2lmp.h +++ b/tools/msi2lmp/src/msi2lmp.h @@ -1,225 +1,225 @@ /******************************** * * Header file for msi2lmp conversion program. * * This is the header file for the third version of a program * that generates a LAMMPS data file based on the information * in an MSI car file (atom coordinates) and mdf file (molecular * topology). A key part of the program looks up forcefield parameters * from an MSI frc file. * * The first version was written by Steve Lustig at Dupont, but * required using Discover to derive internal coordinates and * forcefield parameters * * The second version was written by Michael Peachey while an * intern in the Cray Chemistry Applications Group managed * by John Carpenter. This version derived internal coordinates * from the mdf file and looked up parameters in the frc file * thus eliminating the need for Discover. * * The third version was written by John Carpenter to optimize * the performance of the program for large molecular systems * (the original code for derving atom numbers was quadratic in time) * and to make the program fully dynamic. The second version used * fixed dimension arrays for the internal coordinates. * * The thrid version was revised in Fall 2011 by * Stephanie Teich-McGoldrick to add support non-orthogonal cells. * * The next revision was started in Summer/Fall 2013 by * Axel Kohlmeyer to improve portability to Windows compilers, * clean up command line parsing and improve compatibility with * the then current LAMMPS versions. This revision removes * compatibility with the obsolete LAMMPS version written in Fortran 90. */ # include -#define MSI2LMP_VERSION "v3.9.5 / 27 May 2014" +#define MSI2LMP_VERSION "v3.9.6 / 11 Sep 2014" #define PI_180 0.01745329251994329576 #define MAX_LINE_LENGTH 256 #define MAX_CONNECTIONS 8 #define MAX_STRING 64 #define MAX_NAME 16 #define MAX_ATOM_TYPES 100 #define MAX_BOND_TYPES 200 #define MAX_ANGLE_TYPES 300 #define MAX_DIHEDRAL_TYPES 400 #define MAX_OOP_TYPES 400 #define MAX_ANGLEANGLE_TYPES 400 #define MAX_TYPES 12000 #define FF_TYPE_COMMON 1<<0 #define FF_TYPE_CLASS1 1<<1 #define FF_TYPE_CLASS2 1<<2 #define FF_TYPE_OPLSAA 1<<3 struct ResidueList { int start; int end; char name[MAX_NAME]; }; struct MoleculeList { int start; int end; int no_residues; struct ResidueList *residue; }; /* Internal coodinate Lists */ struct BondList { int type; int members[2]; }; struct AngleList { int type; int members[3]; }; struct DihedralList { int type; int members[4]; }; struct OOPList { int type; int members[4]; }; struct AngleAngleList { int type; int members[4]; }; /* Internal coodinate Types Lists */ struct AtomTypeList { char potential[5]; double mass; double params[2]; int no_connect; }; struct BondTypeList { int types[2]; double params[4]; }; struct AngleTypeList { int types[3]; double params[4]; double bondangle_cross_term[4]; double bondbond_cross_term[3]; }; struct DihedralTypeList { int types[4]; double params[6]; double endbonddihedral_cross_term[8]; double midbonddihedral_cross_term[4]; double angledihedral_cross_term[8]; double angleangledihedral_cross_term[3]; double bond13_cross_term[3]; }; struct OOPTypeList { int types[4]; double params[3]; double angleangle_params[6]; }; struct AngleAngleTypeList { int types[4]; double params[6]; }; /* ---------------------------------------------- */ struct Atom { int molecule; /* molecule id */ int no; /* atom id */ char name[MAX_NAME]; /* atom name */ double x[3]; /* position vector */ int image[3]; /* image flag */ char potential[6]; /* atom potential type */ char element[4]; /* atom element */ double q; /* charge */ char residue_string[MAX_NAME]; /* residue string */ int no_connect; /* number of connections to atom */ char connections[MAX_CONNECTIONS][MAX_STRING]; /* long form, connection name*/ double bond_order[MAX_CONNECTIONS]; int conn_no[MAX_CONNECTIONS]; /* Atom number to which atom is connected */ int type; }; extern char *rootname; extern char *FrcFileName; extern double pbc[6]; /* A, B, C, alpha, beta, gamma */ extern double box[3][3]; /* hi/lo for x/y/z and xy, xz, yz for triclinic */ extern double shift[3]; /* shift vector for all coordinates and box positions */ extern int periodic; /* 0= nonperiodic 1= 3-D periodic */ extern int TriclinicFlag; /* 0= Orthogonal 1= Triclinic */ extern int forcefield; /* BitMask: the value FF_TYPE_COMMON is set for common components of the options below, * FF_TYPE_CLASS1 = ClassI, FF_TYPE_CLASS2 = ClassII, FF_TYPE_OPLSAA = OPLS-AA*/ extern int centerflag; /* 1= center box 0= keep box */ extern int hintflag; /* 1= print style hint comments 0= no hints */ extern int pflag; /* print level: 0, 1, 2, 3 */ extern int iflag; /* 0 stop at errors 1 = ignore errors */ extern int *no_atoms; extern int no_molecules; extern int replicate[3]; extern int total_no_atoms; extern int total_no_bonds; extern int total_no_angles; extern int total_no_dihedrals; extern int total_no_angle_angles; extern int total_no_oops; extern int no_atom_types; extern int no_bond_types; extern int no_angle_types; extern int no_dihedral_types; extern int no_oop_types; extern int no_angleangle_types; extern FILE *CarF; extern FILE *FrcF; extern FILE *PrmF; extern FILE *MdfF; extern FILE *RptF; extern struct Atom *atoms; extern struct MoleculeList *molecule; extern struct BondList *bonds; extern struct AngleList *angles; extern struct DihedralList *dihedrals; extern struct OOPList *oops; extern struct AngleAngleList *angleangles; extern struct AtomTypeList *atomtypes; extern struct BondTypeList *bondtypes; extern struct AngleTypeList *angletypes; extern struct DihedralTypeList *dihedraltypes; extern struct OOPTypeList *ooptypes; extern struct AngleAngleTypeList *angleangletypes; extern void FrcMenu(); extern void ReadCarFile(); extern void ReadMdfFile(); extern void ReadFrcFile(); extern void ClearFrcData(); extern void MakeLists(); extern void GetParameters(); extern void CheckLists(); extern void WriteDataFile(char *); extern void set_box(double box[3][3], double *h, double *h_inv); extern void lamda2x(double *lamda, double *x, double *h, double *boxlo); extern void x2lamda(double *x, double *lamda, double *h_inv, double *boxlo); extern void condexit(int);