New MOPAC files cannot be parsed #177

Closed
rwest opened this issue Jan 31, 2014 · 2 comments

Comments

@rwest
Member

rwest commented Jan 31, 2014

Traceback (most recent call last):
  File "/home/r.west/Code/RMG-Py/rmg.py", line 139, in <module>
    cProfile.runctx(command, global_vars, local_vars, stats_file)
  File "/shared/apps/python/Python-2.7.5/INSTALL/lib/python2.7/cProfile.py", line 49, in runctx
    prof = prof.runctx(statement, globals, locals)
  File "/shared/apps/python/Python-2.7.5/INSTALL/lib/python2.7/cProfile.py", line 140, in runctx
    exec cmd in globals, locals
  File "<string>", line 1, in <module>
  File "/home/r.west/Code/RMG-Py/rmgpy/rmg/main.py", line 364, in execute
    self.initialize(args)
  File "/home/r.west/Code/RMG-Py/rmgpy/rmg/main.py", line 352, in initialize
    self.reactionModel.enlarge([spec for spec in self.initialSpecies if spec.reactive])
  File "/home/r.west/Code/RMG-Py/rmgpy/rmg/model.py", line 720, in enlarge
    spec.generateThermoData(database, quantumMechanics=self.quantumMechanics)
  File "/home/r.west/Code/RMG-Py/rmgpy/rmg/model.py", line 149, in generateThermoData
    thermo0 = quantumMechanics.getThermoData(molecule) # returns None if it fails
  File "/home/r.west/Code/RMG-Py/rmgpy/qm/main.py", line 147, in getThermoData
    thermo0 = qm_molecule_calculator.generateThermoData()
  File "/home/r.west/Code/RMG-Py/rmgpy/qm/molecule.py", line 221, in generateThermoData
    self.qmData = self.generateQMData()
  File "/home/r.west/Code/RMG-Py/rmgpy/qm/mopac.py", line 220, in generateQMData
    result = self.parse() # parsed in cclib
  File "/home/r.west/Code/RMG-Py/rmgpy/qm/mopac.py", line 144, in parse
    cclibData = parser.parse()
  File "/home/r.west/Code/RMG-Py/external/cclib/parser/logfileparser.py", line 221, in parse
    self.extract(inputfile, line)
  File "/home/r.west/Code/RMG-Py/external/cclib/parser/mopacparser.py", line 142, in extract
    self.inputatoms.append(symbol2int(broken[1]))
  File "/home/r.west/Code/RMG-Py/external/cclib/parser/mopacparser.py", line 20, in symbol2int
    return t.number[symbol]
KeyError: 'COORDINATES'

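The problem is that MOPAC (Version 14.019L 64BITS) has started to append a CARTESIAN COORDINATES block immediately after the, um, cartesian coordinates block, with no blank line separating them. The parser therefore treats the new heading as another atom line and tries to look up 'COORDINATES' as an element symbol.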

   ATOM   CHEMICAL          X               Y               Z
  NUMBER   SYMBOL      (ANGSTROMS)     (ANGSTROMS)     (ANGSTROMS)

     1       C          1.04340794  *  -0.13423839  *  -0.33664008  *
     2       C         -1.29111814  *  -0.01621280  *   0.02192042  *
     3       O         -0.03088913  *   0.59322532  *   0.15602464  *
     4       O          2.09221467  *   0.07599275  *   0.54169424  *
     5       O          1.90690945  *   0.78462213  *  -0.90724500  *
     6       H          0.86602442  *  -1.12941882  *  -0.80058568  *
     7       H         -1.96883912  *   0.71715667  *   0.46568720  *
     8       H         -1.34818356  *  -0.96274023  *   0.57133225  *
     9       H         -1.55414040  *  -0.18501547  *  -1.02849078  *
                             CARTESIAN COORDINATES

   1    C        1.043407944    -0.134238385    -0.336640077
   2    C       -1.291118142    -0.016212799     0.021920417
   3    O       -0.030889132     0.593225325     0.156024640
   4    O        2.092214669     0.075992752     0.541694237
   5    O        1.906909447     0.784622131    -0.907245003
   6    H        0.866024417    -1.129418821    -0.800585685
   7    H       -1.968839125     0.717156668     0.465687195
   8    H       -1.348183561    -0.962740231     0.571332254
   9    H       -1.554140399    -0.185015471    -1.028490785
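
A minimal sketch of the failure mode, for illustration only. The traceback shows `symbol2int(broken[1])` raising the `KeyError`; the loop shape (read atom lines until a blank line) and the small `ATOMIC_NUMBER` table are assumptions standing in for the real cclib code, which is not reproduced here.

```python
# Illustrative stand-in for the periodic-table lookup behind symbol2int.
ATOMIC_NUMBER = {"H": 1, "C": 6, "O": 8}


def symbol2int(symbol):
    return ATOMIC_NUMBER[symbol]


def read_atoms_until_blank(lines):
    """Assumed strategy: consume atom lines until the first blank line."""
    atoms = []
    for line in lines:
        if not line.strip():
            break
        broken = line.split()
        # The second whitespace-separated token is taken to be the element symbol.
        atoms.append(symbol2int(broken[1]))
    return atoms


# Tail of the first block, followed directly by the new heading (no blank line).
new_style_output = [
    "     8       H         -1.34818356  *  -0.96274023  *   0.57133225  *",
    "     9       H         -1.55414040  *  -0.18501547  *  -1.02849078  *",
    "                             CARTESIAN COORDINATES",
    "",
]

try:
    read_atoms_until_blank(new_style_output)
    failure = None
except KeyError as err:
    failure = err.args[0]

print(failure)
```

Under these assumptions the heading line splits into `['CARTESIAN', 'COORDINATES']`, so the second token is what ends up in the dictionary lookup.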

Quick fix on its way, but two thoughts:

  1. Why isn't this using regular expressions or something more robust?
  2. Why are we maintaining our own fork of cclib instead of contributing to and benefiting from other cclib developers/users? A further reason to avoid this is that the copy of cclib in RMG-Java is presumably broken too.
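
To make the first question concrete, here is a hedged sketch of what a regex-based reader for the atom lines might look like. It is not the quick fix referred to above and not cclib's API; `ATOM_LINE` and `parse_first_atom_block` are hypothetical names, and the pattern is only checked against the two line layouts pasted in this issue.

```python
import re

# Atom line: index, element symbol, then three floats, each optionally
# followed by the "*" flag that appears in the first coordinates block.
ATOM_LINE = re.compile(
    r"^\s*(\d+)\s+([A-Z][a-z]?)"
    r"\s+(-?\d+\.\d+)\s*\*?"
    r"\s+(-?\d+\.\d+)\s*\*?"
    r"\s+(-?\d+\.\d+)\s*\*?\s*$"
)


def parse_first_atom_block(lines):
    """Return (index, symbol, x, y, z) tuples for the first run of atom lines."""
    atoms = []
    for line in lines:
        match = ATOM_LINE.match(line)
        if match is None:
            if atoms:
                break  # any non-atom line ends the block, blank or not
            continue  # still skipping headers before the block
        index, symbol, x, y, z = match.groups()
        atoms.append((int(index), symbol, float(x), float(y), float(z)))
    return atoms


sample = [
    "   ATOM   CHEMICAL          X               Y               Z",
    "  NUMBER   SYMBOL      (ANGSTROMS)     (ANGSTROMS)     (ANGSTROMS)",
    "",
    "     8       H         -1.34818356  *  -0.96274023  *   0.57133225  *",
    "     9       H         -1.55414040  *  -0.18501547  *  -1.02849078  *",
    "                             CARTESIAN COORDINATES",
    "",
    "   8    H       -1.348183561    -0.962740231     0.571332254",
    "   9    H       -1.554140399    -0.185015471    -1.028490785",
]

atoms = parse_first_atom_block(sample)
print(atoms)
```

Because the block ends at the first line that does not look like an atom, the reader no longer depends on a blank line being present, so an unexpected heading stops the block instead of raising an error.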
@pierrelb
Contributor

We'd have to make a decision to answer those questions: stick with cclib, or do it on our own with regex. I (hastily) built a parser for MOPAC intermediate files using regex, so that could be generalized if we choose the latter.

rwest added a commit that referenced this issue Jan 31, 2014
MOPAC has changed its output slightly.
This is a quick fix, but still not terribly robust.
Part of me feels we shouldn't be keeping a separate fork of cclib.
rwest added a commit to rwest/RMG-Py that referenced this issue May 12, 2014
…ing.

MOPAC has changed its output slightly.
This is a quick fix, but still not terribly robust.
Part of me feels we shouldn't be keeping a separate fork of cclib.

Cherry-picked from 9c321dd on master branch
pierrelb pushed a commit to pierrelb/RMG-Py that referenced this issue Jun 15, 2015
…ing.

MOPAC has changed its output slightly.
This is a quick fix, but still not terribly robust.
Part of me feels we shouldn't be keeping a separate fork of cclib.

Cherry-picked from 9c321dd on master branch
@connie
Member

connie commented Jul 20, 2016

Closed via 9c321dd
