nbody_by_type.py
File Metadata

Created: Wed, May 15, 01:38


	#!/usr/bin/env python

	# Author: Andrew Jewett (jewett.aij at g mail)
	# http://www.chem.ucsb.edu/~sheagroup
	# License: 3-clause BSD License (See LICENSE.TXT)
	# Copyright (c) 2011, Regents of the University of California
	# All rights reserved.

	man_page_text = """

	nbody_by_type.py reads a LAMMPS data file (or an excerpt of a LAMMPS
	data file) containing bonded many-body interactions by atom type
	(and bond type), and generates a list of additional interactions
	in LAMMPS format consistent with those types (to the standard out).

	Typical Usage:

	nbody_by_type.py X < old.data > new.data

	--or--

	nbody_by_type.py X \\
	-atoms atoms.data \\
	-bonds bonds.data \\
	-nbody X.data \\
	-nbodybytype X_by_type.data \\
	> new_X.data

	In both cases "X" denotes the interaction type, which
	is either "Angles", "Dihedrals", or "Impropers".
	(Support for other interaction types can be added by the user. See below.)

	-------- Example 1 -------

	nbody_by_type.py X < old.data > new.data

	In this example, nbody_by_type.py reads a LAMMPS data file
	"old.data", and extracts the relevant section ("Angles",
	"Dihedrals", or "Impropers"). It also looks for a section named "X By Type",
	(eg. "Angles By Type", "Dihedrals By Type", "Impropers By Type")
	which contains a list of criteria for automatically defining additional
	interactions of that type. For example, this file might contain:

	Angles By Type

	7 1 2 1 * *
	8 2 2 * * *
	9 3 4 3 * *

	The first column is an interaction type ID.
	The next 3 columns are atom type identifiers.
	The final 2 columns are bond type identifiers.
	The * is a wildcard symbol indicating there is no preference for bond types
	in this example. (Optionally, regular expressions can also be used to
	define a type match, by enclosing the atom or bond type in / slashes.)

	The first line tells us that there should be a 3-body "Angle"
	interaction of type "7" whenever an atom of type "1" is bonded to an atom
	of type "2", which is bonded to another atom of type "1" again.
	The second line tells us that an angle is defined whenever three atoms
	are bonded together and the first two are of type "2".
	(Redundant angle interactions are filtered.)

	New interactions are created for every group of bonded
	atoms which match these criteria if they are bonded together
	in the relevant way for that interaction type (as determined by
	nbody_X.py), and printed to the standard output. For example,
	suppose you are automatically generating 3-body "Angle" interactions using:

	nbody_by_type.py Angles < old.data > new.data

	The file "new.data" will be identical to "old.data", however the
	"Angles By Type" section will be deleted, and the following lines of
	text will be added to the "Angles" section:

	394 7 5983 5894 5895
	395 7 5984 5895 5896
	396 7 5985 5896 5897
	: : : : :
	847 9 14827 14848 14849

	The numbers in the first column are counters which assign an ID to
	every interaction of that type, and start where the original "Angles"
	data left off (new angle ID numbers do not overlap with old ID numbers).
	The text in the second column ("7", "9", ...) matches the text from the
	first column of the "Angles By Type" section of the input file.

	-------- Example 2 -------

	nbody_by_type.py X \\
	-atoms atoms.data \\
	-bonds bonds.data \\
	-nbody X.data \\
	-nbodybytype X_by_type.data \\
	-prefix "SOMESTRING" -suffix "ANOTHERSTRING" \\
	> new_X.data

	In particular, for Angle interactions:

	nbody_by_type.py Angles \\
	-atoms atoms.data \\
	-bonds bonds.data \\
	-nbody angles.data \\
	-nbodybytype angles_by_type.data \\
	> new_Angles.data

	When run this way, nbody_by_type.py behaves exactly the same way
	as in Example 1, however only the lines of text corresponding to
	the new generated interactions are printed, (not the entire data file).
	Also note, that when run this way, nbody_by_type.py does not read the
	LAMMPS data from the standard input. Instead, it reads each section of
	the data file from a different file indicated by the arguments following
	the "-atoms", "-bonds", "-nbody", and "-nbodybytype" flags.

	"Angles" is a 3-body interaction style. So when run this way,
	nbody_by_type.py will create a 5 (=3+2) column file (new_Angles.data).

	Note: the atom, bond and other IDs/types need not be integers.

	Note: This program must be distributed with several python modules, including:
	nbody_Angles.py, nbody_Dihedrals.py, and nbody_Impropers.py. These
	contain bond definitions for angular, dihedral, and improper interactions.
	(If new interaction types are ever added to LAMMPS, it is easy to
	define new bonded interaction types by supplying a new "nbody_X.py"
	python module, which takes only a few lines of python.
	Refer to the modules listed above for examples.)

	Note: Optional "-prefix" and "-suffix" arguments can be included to decorate
	the interaction IDs (first column). For example, -prefix "auto_" and
	-suffix "_angle", causes "new_Angles.data" to contain the following text:

	auto_394_angle 7 5983 5894 5895
	auto_395_angle 7 5984 5895 5896
	auto_396_angle 7 5985 5896 5897
	: : : : :
	auto_847_angle 9 14827 14848 14849

	"""


	import re    # used by re.compile() below (imported explicitly)
	import sys
	from extract_lammps_data import *
	from nbody_by_type_lib import GenInteractions_str
	from ttree_lex import *
	from lttree_styles import AtomStyle2ColNames, ColNames2AidAtypeMolid


	# Compare version tuples (comparing version strings is fragile).
	if sys.version_info < (2, 6):
	    raise InputError('Error: Using python '+sys.version+'\n'
	                     ' Alas, you must upgrade to a newer version of python (2.6 or later).')
	elif sys.version_info < (2, 7):
	    sys.stderr.write('--------------------------------------------------------\n'
	                     '----------------- WARNING: OLD PYTHON VERSION ----------\n'
	                     ' This program is untested on your python version ('+sys.version+').\n'
	                     ' PLEASE LET ME KNOW IF THIS PROGRAM CRASHES (and upgrade python).\n'
	                     ' -Andrew 2013-10-25\n'
	                     '--------------------------------------------------------\n'
	                     '--------------------------------------------------------\n')
	    from ordereddict import OrderedDict
	else:
	    from collections import OrderedDict



	def GenInteractions_lines(lines_atoms,
	                          lines_bonds,
	                          lines_nbody,
	                          lines_nbodybytype,
	                          atom_style,
	                          g_bond_pattern,
	                          canonical_order, #function to sort atoms and bonds
	                          prefix='',
	                          suffix='',
	                          report_progress = False):

	    column_names = AtomStyle2ColNames(atom_style)
	    i_atomid, i_atomtype, i_molid = ColNames2AidAtypeMolid(column_names)

	    # Read the atom IDs and atom types from the "Atoms" section
	    atomids_str = []
	    atomtypes_str = []

	    for iv in range(0, len(lines_atoms)):
	        line = lines_atoms[iv].strip()
	        if '#' in line:
	            icomment = line.find('#')
	            line = (line[:icomment]).strip()
	        if len(line) > 0:
	            tokens = SplitQuotedString(line)
	            if ((len(tokens) <= i_atomid) or (len(tokens) <= i_atomtype)):
	                raise(InputError('Error: not enough columns on line '+str(iv+1)+' of \"Atoms\" section.'))
	            atomids_str.append(EscCharStrToChar(tokens[i_atomid]))
	            atomtypes_str.append(EscCharStrToChar(tokens[i_atomtype]))

	    # Read the bond IDs, bond types, and bonded atom pairs from "Bonds"
	    bondids_str = []
	    bondtypes_str = []
	    bond_pairs = []
	    for ie in range(0, len(lines_bonds)):
	        line = lines_bonds[ie].strip()
	        if '#' in line:
	            icomment = line.find('#')
	            line = (line[:icomment]).strip()
	        if len(line) > 0:
	            tokens = SplitQuotedString(line)
	            if len(tokens) < 4:
	                raise(InputError('Error: not enough columns on line '+str(ie+1)+' of \"Bonds\" section.'))
	            bondids_str.append(EscCharStrToChar(tokens[0]))
	            bondtypes_str.append(EscCharStrToChar(tokens[1]))
	            bond_pairs.append( (EscCharStrToChar(tokens[2]),
	                                EscCharStrToChar(tokens[3])) )

	    # Read the list of type patterns from the "By Type" section
	    typepattern_to_coefftypes = []

	    for i in range(0, len(lines_nbodybytype)):
	        line = lines_nbodybytype[i].strip()
	        if '#' in line:
	            icomment = line.find('#')
	            line = (line[:icomment]).strip()
	        if len(line) > 0:
	            tokens = SplitQuotedString(line)

	            if ((len(tokens) != 1 + g_bond_pattern.GetNumVerts()) and
	                (len(tokens) != 1 + g_bond_pattern.GetNumVerts()
	                                  + g_bond_pattern.GetNumEdges())):
	                raise(InputError('Error: Wrong number of columns in \"By Type\" section of data file.\n'
	                                 'Offending line:\n'+
	                                 '\"'+line+'\"\n'
	                                 'Expected either '+
	                                 str(1 + g_bond_pattern.GetNumVerts()) + ' or ' +
	                                 str(1 + g_bond_pattern.GetNumVerts() +
	                                     g_bond_pattern.GetNumEdges())
	                                 + ' columns.'))

	            coefftype = EscCharStrToChar(tokens[0])
	            typepattern = []

	            for typestr in tokens[1:]:
	                # Types enclosed in / slashes are regular expressions
	                if ((len(typestr) >= 2) and
	                    (typestr[0] == '/') and (typestr[-1] == '/')):
	                    regex_str = typestr[1:-1]
	                    typepattern.append( re.compile(regex_str) )
	                else:
	                    typepattern.append(EscCharStrToChar(typestr))

	            # If the user neglected to specify the bond types, assume '*'
	            if len(tokens) == 1 + g_bond_pattern.GetNumVerts():
	                typepattern += ['*'] * g_bond_pattern.GetNumEdges()

	            typepattern_to_coefftypes.append([typepattern, coefftype])

	    coefftype_to_atomids_str = GenInteractions_str(bond_pairs,
	                                                   g_bond_pattern,
	                                                   typepattern_to_coefftypes,
	                                                   canonical_order,
	                                                   atomids_str,
	                                                   atomtypes_str,
	                                                   bondids_str,
	                                                   bondtypes_str,
	                                                   report_progress)

	    # Number the new interactions, starting where the existing ones end
	    lines_nbody_new = []
	    for coefftype, atomids_list in coefftype_to_atomids_str.items():
	        for atomids_found in atomids_list:
	            n = len(lines_nbody) + len(lines_nbody_new) + 1
	            line = prefix+str(n)+suffix+' '+ \
	                   coefftype+' '+(' '.join(atomids_found))+'\n'
	            lines_nbody_new.append(line)

	    return lines_nbody_new



	def GenInteractions_files(lines_data,
	                          src_bond_pattern,
	                          fname_atoms,
	                          fname_bonds,
	                          fname_nbody,
	                          fname_nbodybytype,
	                          section_name,
	                          section_name_bytype,
	                          atom_style,
	                          prefix='',
	                          suffix='',
	                          report_progress = False):

	    # Each section is read either from the data file (lines_data),
	    # or from a separate file if a file name was supplied.
	    if fname_atoms is None:
	        lines_atoms = [line for line in ExtractDataSection(lines_data, 'Atoms')]
	    else:
	        try:
	            f = open(fname_atoms, 'r')
	        except IOError:
	            sys.stderr.write('Error: Unable to open file \"'+fname_atoms+'\" for reading.\n')
	            sys.exit(-1)
	        lines_atoms = [line for line in f.readlines()
	                       if ((len(line.strip())>0) and (line.strip()[0] != '#'))]
	        f.close()


	    if fname_bonds is None:
	        lines_bonds = [line for line in ExtractDataSection(lines_data, 'Bonds')]
	    else:
	        try:
	            f = open(fname_bonds, 'r')
	        except IOError:
	            sys.stderr.write('Error: Unable to open file \"'+fname_bonds+'\" for reading.\n')
	            sys.exit(-1)
	        lines_bonds = [line for line in f.readlines()
	                       if ((len(line.strip())>0) and (line.strip()[0] != '#'))]
	        f.close()


	    if fname_nbody is None:
	        lines_nbody = [line for line in ExtractDataSection(lines_data, section_name)]
	    else:
	        # This file is optional. If it cannot be opened, assume that
	        # there are no pre-existing interactions of this type.
	        try:
	            f = open(fname_nbody, 'r')
	            lines_nbody = [line for line in f.readlines()
	                           if ((len(line.strip())>0) and (line.strip()[0] != '#'))]
	            f.close()
	        except IOError:
	            #sys.stderr.write(' (omitting optional file \"'+fname_nbody+'\")\n')
	            lines_nbody = []


	    if fname_nbodybytype is None:
	        lines_nbodybytype=[line for
	                           line in ExtractDataSection(lines_data,
	                                                      section_name_bytype)]

	    else:
	        try:
	            f = open(fname_nbodybytype, 'r')
	        except IOError:
	            sys.stderr.write('Error: Unable to open file \"'+fname_nbodybytype+'\" for reading.\n')
	            sys.exit(-1)
	        lines_nbodybytype = [line for line in f.readlines()
	                             if((len(line.strip())>0)and(line.strip()[0]!='#'))]
	        f.close()


	    g = __import__(src_bond_pattern) #defines g.bond_pattern, g.canonical_order


	    return GenInteractions_lines(lines_atoms,
	                                 lines_bonds,
	                                 lines_nbody,
	                                 lines_nbodybytype,
	                                 atom_style,
	                                 g.bond_pattern,
	                                 g.canonical_order,
	                                 prefix,
	                                 suffix,
	                                 report_progress)






	if __name__ == "__main__":

	    g_program_name = __file__.split('/')[-1] # = 'nbody_by_type.py'
	    g_date_str = '2014-4-07'
	    g_version_str = '0.16'

	    ####### Main Code Below: #######
	    sys.stderr.write(g_program_name+' v'+g_version_str+' '+g_date_str+' ')
	    if sys.version < '3':
	        sys.stderr.write(' (python version < 3)\n')
	    else:
	        sys.stderr.write('\n')

	    try:

	        fname_atoms = None
	        fname_bonds = None
	        fname_nbody = None
	        fname_nbodybytype = None
	        atom_style = 'full'
	        prefix=''
	        suffix=''

	        argv = [arg for arg in sys.argv]

	        if len(argv) == 1:
	            raise InputError('Error: Missing argument required.\n'
	                             ' The \"'+g_program_name+'\" program requires an argument containing the\n'
	                             ' name of a section from a LAMMPS data file storing bonded interactions.\n'
	                             ' (For example: "Angles", "Dihedrals", or "Impropers".)\n'
	                             #' Note: The first letter of each section is usually capitalized.)\n'
	                             '\n'
	                             '--------------- general documentation -------------\n'
	                             '\n' + man_page_text + '\n')

	        section_name = '' # (This will be replaced later.)
	        section_name_bytype = '' # (This will be replaced later.)
	        # Initialized here so that the check after argument parsing
	        # reports a syntax error instead of raising a NameError:
	        bond_pattern_module_name = '' # (This will be replaced later.)

	        # Loop over the remaining arguments not processed yet.
	        # These arguments are specific to the lttree.py program
	        # and are not understood by ttree.py:
	        i = 1
	        while i < len(argv):
	            #sys.stderr.write('argv['+str(i)+'] = \"'+argv[i]+'\"\n')
	            if ((argv[i].lower() == '-?') or
	                (argv[i].lower() == '--?') or
	                (argv[i].lower() == '-help') or
	                (argv[i].lower() == '--help')):
	                # Print the documentation and exit. (Exiting unconditionally
	                # avoids looping forever when this is not the last argument.)
	                sys.stdout.write(man_page_text+'\n')
	                sys.exit(0)

	            elif argv[i].lower() == '-atoms':
	                if i+1 >= len(argv):
	                    raise InputError('Error: '+argv[i]+' flag should be followed by a file name containing lines of\n'
	                                     ' text which might appear in the "Atoms" section of a LAMMPS data file.\n')
	                fname_atoms = argv[i+1]
	                del(argv[i:i+2])

	            elif argv[i].lower() == '-bonds':
	                if i+1 >= len(argv):
	                    raise InputError('Error: '+argv[i]+' flag should be followed by a file name containing lines of\n'
	                                     ' text which might appear in the "Bonds" section of a LAMMPS data file.\n')
	                fname_bonds = argv[i+1]
	                del(argv[i:i+2])

	            elif argv[i].lower() == '-nbody':
	                if i+1 >= len(argv):
	                    raise InputError('Error: '+argv[i]+' flag should be followed by a file name\n')

	                    #raise InputError('Error: '+argv[i]+' flag should be followed by a file name containing lines of\n'
	                    # ' text which might appear in the "'+section_name+' section of a LAMMPS data file.\n')
	                fname_nbody = argv[i+1]
	                del(argv[i:i+2])

	            elif argv[i].lower() == '-nbodybytype':
	                if i+1 >= len(argv):
	                    raise InputError('Error: '+argv[i]+' flag should be followed by a file name\n')

	                    #raise InputError('Error: '+argv[i]+' flag should be followed by a file name containing\n'
	                    # ' text which might appear in the "'+section_name+' By Type" section\n'
	                    # ' of a LAMMPS data file.\n')
	                fname_nbodybytype = argv[i+1]
	                del(argv[i:i+2])

	            elif ((argv[i].lower() == '-atom-style') or
	                  (argv[i].lower() == '-atom_style')):
	                if i+1 >= len(argv):
	                    raise InputError('Error: '+argv[i]+' flag should be followed by an atom_style name.\n'
	                                     ' (Or single quoted string which includes a space-separated\n'
	                                     ' list of column names.)\n')
	                atom_style = argv[i+1]
	                del(argv[i:i+2])

	            elif argv[i].lower() == '-prefix':
	                if i+1 >= len(argv):
	                    raise InputError('Error: '+argv[i]+' flag should be followed by a prefix string\n'
	                                     ' (a string you want to appear to the left of the integer\n'
	                                     ' which counts the bonded interactions you have generated.)\n')
	                prefix = argv[i+1]
	                del(argv[i:i+2])

	            elif argv[i].lower() == '-suffix':
	                if i+1 >= len(argv):
	                    raise InputError('Error: '+argv[i]+' flag should be followed by a suffix string\n'
	                                     ' (a string you want to appear to the right of the integer\n'
	                                     ' which counts the bonded interactions you have generated.)\n')
	                suffix = argv[i+1]
	                del(argv[i:i+2])

	            elif argv[i].lower() == '-subgraph':
	                if i+1 >= len(argv):
	                    raise InputError('Error: '+argv[i]+' flag should be followed by the name of a python file\n'
	                                     ' containing the definition of the subgraph you are searching for\n'
	                                     ' and its symmetry properties.\n'
	                                     ' (See nbody_Dihedrals.py for example.)\n')
	                bond_pattern_module_name = argv[i+1]
	                # If the file name ends in ".py", then strip off this suffix.
	                # (rstrip('.py') would remove any trailing '.', 'p', 'y' characters.)
	                if bond_pattern_module_name.endswith('.py'):
	                    bond_pattern_module_name = bond_pattern_module_name[:-3]
	                del(argv[i:i+2])

	            elif argv[i].lower() == '-section':
	                if i+1 >= len(argv):
	                    raise InputError('Error: '+argv[i]+' flag should be followed by the name of the LAMMPS\n'
	                                     ' Data section describing the type of interaction being generated.\n'
	                                     ' (For example: \"Angles\", \"Dihedrals\", \"Impropers\", etc...)\n')
	                section_name = argv[i+1]
	                del(argv[i:i+2])


	            elif argv[i].lower() == '-sectionbytype':
	                if i+1 >= len(argv):
	                    raise InputError('Error: '+argv[i]+' flag should be followed by the name of the\n'
	                                     ' write_once(\"???? By Type\") section describing how to create the\n'
	                                     ' interactions. (For example: \"Angles By Type\", \"Dihedrals By Type\",\n'
	                                     ' \"Impropers By Type\", etc... Note that this argument\n'
	                                     ' will contain spaces, so surround it with quotes.)\n')

	                section_name_bytype = argv[i+1]
	                del(argv[i:i+2])

	            elif argv[i][0] == '-':
	                raise InputError('Error('+g_program_name+'):\n'
	                                 'Unrecognized command line argument \"'+argv[i]+'\"\n')
	            else:
	                i += 1

	        #if len(argv) == 1:
	        #    raise InputError('Error: Missing argument required.\n'
	        #    ' The \"'+g_program_name+'\" program requires an argument containing the\n'
	        #    ' name of a section from a LAMMPS data file storing bonded interactions.\n'
	        #    ' (For example: "Angles", "Dihedrals", or "Impropers".)\n')
	        #    #' Note: The first letter of each section is usually capitalized.)\n'

	        if len(argv) == 1:
	            pass
	        elif len(argv) == 2:
	            section_name = argv[1]
	            section_name_bytype = section_name + ' By Type'
	            bond_pattern_module_name = 'nbody_'+section_name
	            del(argv[1:2])
	        else:
	            # if there are more than 2 remaining arguments,
	            problem_args = ['\"'+arg+'\"' for arg in argv[1:]]
	            raise InputError('Syntax Error('+g_program_name+'):\n\n'
	                             ' Problem with argument list.\n'
	                             ' The remaining arguments are:\n\n'
	                             ' '+(' '.join(problem_args))+'\n\n'
	                             ' (The actual problem may be earlier in the argument list.)\n')

	        if ((section_name == '') or
	            (section_name_bytype == '') or
	            (bond_pattern_module_name == '')):
	            raise InputError('Syntax Error('+g_program_name+'):\n\n'
	                             ' You have not defined the following arguments:\n'
	                             ' -section name\n'
	                             ' -sectionbytype namebytype\n'
	                             ' -subgraph pythonfile.py\n')

	        # ------------ Done parsing argument list ----------

	        if (fname_atoms or fname_bonds or fname_nbody or fname_nbodybytype):
	            output_full_DATA_file = False
	            lines_data = []
	        else:
	            output_full_DATA_file = True
	            lines_data = sys.stdin.readlines()

	        # Calculate the interactions and generate a list of lines of text

	        lines_new_interactions = \
	            GenInteractions_files(lines_data,
	                                  bond_pattern_module_name,
	                                  fname_atoms,
	                                  fname_bonds,
	                                  fname_nbody,
	                                  fname_nbodybytype,
	                                  section_name,
	                                  section_name_bytype,
	                                  atom_style,
	                                  prefix,
	                                  suffix,
	                                  report_progress=True)

	        # Print this text to the standard out.

	        # Question: Do we write out the entire DATA file,
	        # or just the portion that was generated by this program?

	        if not output_full_DATA_file:
	            # ...then only print out the interactions which were generated
	            # by this program, omitting any lines from the original data file:

	            # (This is the way I usually run this program.)
	            for line in lines_new_interactions:
	                sys.stdout.write(line)

	        else:

	            # ...then print out the entire data file, deleting the "By Type"
	            # section, and adding the generated lines of text to the
	            # corresponding section.

	            # If present, update the interaction counter at the beginning
	            # of the LAMMPS data file. (For example, if 100 new "Angles"
	            # interactions were generated, replace "2 Angles" with "102 Angles")
	            #
	            for i in range(0, len(lines_data)):
	                line = lines_data[i].strip()
	                tokens = SplitQuotedString(line)

	                # updating the interaction counter
	                if ((len(tokens) == 2) and (tokens[1] == (section_name).lower())):
	                    tokens[0] = str(int(tokens[0]) + len(lines_new_interactions))
	                    lines_data[i] = ' '.join(tokens) + '\n'

	                # stop when you come to a section header
	                elif line in lammps_data_sections:
	                    #"lammps_data_sections" is defined in "extract_lammps_data.py"
	                    break

	            # locate the appropriate section of the data file
	            # (storing the type of interactions we just created)
	            i_nbody_a, i_nbody_b = \
	                FindDataSection(lines_data, section_name)

	            if i_nbody_a == -1:
	                if len(lines_new_interactions) > 0:
	                    # If not found, create a new section at the end of the file,
	                    # containing a section name followed by the list of lines
	                    lines_data += ['\n', section_name+'\n', '\n'] + \
	                                  lines_new_interactions + ['\n']
	            else:
	                # Insert the new lines into the existing section
	                lines_data[i_nbody_b:i_nbody_b] = lines_new_interactions

	            # Figure out where the "By Type" section is located
	            # (so we skip over it)
	            i_bytype_a, i_bytype_b = \
	                FindDataSection(lines_data, section_name_bytype)

	            in_bytype_section = False
	            for i in range(0, len(lines_data)):
	                line = lines_data[i].strip()
	                # Omit all lines of text in the 'By Type' section (including the
	                # header and comments or blank lines which immediately follow it.)
	                if line == section_name_bytype:
	                    in_bytype_section = True
	                elif i == i_bytype_b:
	                    in_bytype_section = False

	                if not in_bytype_section:
	                    sys.stdout.write(lines_data[i])

	    except (ValueError, InputError) as err:
	        sys.stderr.write('\n'+str(err)+'\n')
	        sys.exit(-1)
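
The "By Type" rules described in the documentation string (an interaction type followed by atom-type and bond-type patterns, with `*` wildcards and `/regex/` patterns) can be illustrated with a small standalone sketch. This is not the implementation used by this program: the real matching happens in `GenInteractions_str` (in nbody_by_type_lib), which is not shown here and also handles bond topology and symmetry. The helper names `parse_by_type_line`, `type_matches`, and `find_coeff_type` are hypothetical, and the choices of first-match-wins and full-string regex matching are assumptions made for illustration only.

```python
import re


def parse_by_type_line(line, num_verts=3, num_edges=2):
    """Parse one "By Type" line into (coeff_type, patterns), or None if empty.

    Hypothetical helper: num_verts/num_edges default to the 3 atoms and
    2 bonds of an "Angles" interaction.
    """
    tokens = line.split('#', 1)[0].split()   # drop comments, split on spaces
    if not tokens:
        return None
    patterns = []
    for tok in tokens[1:]:
        if len(tok) >= 2 and tok[0] == '/' and tok[-1] == '/':
            patterns.append(re.compile(tok[1:-1]))   # /regex/ pattern
        else:
            patterns.append(tok)                     # literal type or '*'
    # If the bond types were omitted, assume '*' for each bond.
    if len(patterns) == num_verts:
        patterns += ['*'] * num_edges
    return tokens[0], patterns


def type_matches(pattern, type_str):
    """Return True if a single pattern accepts a single type string."""
    if pattern == '*':
        return True
    if hasattr(pattern, 'fullmatch'):    # compiled regular expression
        # Assumption: the regex must match the entire type string.
        return pattern.fullmatch(type_str) is not None
    return pattern == type_str


def find_coeff_type(rules, types):
    """Return the interaction type of the first rule matching `types`.

    `types` lists the atom types followed by the bond types, in order.
    Assumption: the first matching rule wins; reversed atom orderings
    are not considered in this sketch.
    """
    for coeff_type, patterns in rules:
        if len(patterns) == len(types) and all(
                type_matches(p, t) for p, t in zip(patterns, types)):
            return coeff_type
    return None


# The three example rules from the documentation string above.
example_rules = [parse_by_type_line(s) for s in
                 ["7 1 2 1 * *", "8 2 2 * * *", "9 3 4 3 * *"]]
```

With these rules, three bonded atoms of types 1, 2, 1 would be assigned interaction type "7" regardless of their bond types, while atoms of types 1, 1, 1 would match no rule and generate no interaction.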
