Wednesday, February 27, 2008

Proton Prediction

A nice article on proton prediction can be found in the latest issue of Spectroscopy Europe:

http://www.spectroscopyeurope.com/TD_20_1.pdf


A few more links summarizing where this new development has already been integrated can be found at:

http://nmrpredict.orc.univie.ac.at/

4 comments:

ChemSpiderMan said...

I agree that there are a lot of differing views on how prediction should be done and on the different measures of success. I think it's time to get a neutral party (or parties) with enough knowledge of the different packages to compare their performance on a diverse and appropriate dataset of, ideally, internal and unpublished structures and shifts. Doing this well would take a lot of time and effort, and for whose gain? Maybe someone would do it for the sake of a publication?

Wolfgang Robien said...

You are absolutely right! I know we have different opinions on the C-NMR comparison done on the NMRShiftDB set. I respect your view; hopefully you respect mine.

Any serious comparison of the approaches available would be welcome. There are some constraints which should be respected - they are independent of the nucleus under investigation.

Any HOSE-code-based approach directly reproduces the content of the underlying database, and any NN approach does the same in principle (through the selection of the training set from the available data). The same is true for any increment-based approach. All the programs have different underlying databases, so any comparison will measure a mix of a) the performance of the algorithm itself and b) how well the database fits the queries. IMHO the definition of a 'diverse' test dataset also sounds very 'flexible' to me. According to my personal opinion any investigation will be incomplete with respect to the central question any user of such a prediction tool has: 'How does it perform on MY query structure' - any statistical approach is unable to answer exactly THIS question! It's easy to find examples where the program with the worst overall performance produces the best result in a specific case (and vice versa!) - the reason is easy to understand: The largest spectroscopic databases hold roughly 1% of the known chemical structures - therefore we all do some type of 'extrapolation' when dealing with 'new chemistry'.

Ryan Sasaki said...

I agree with your comments, Wolfgang.

Especially:

According to my personal opinion any investigation will be incomplete with respect to the central question any user of such a prediction tool has: 'How does it perform on MY query structure' - any statistical approach is unable to answer exactly THIS question!

The largest spectroscopic databases hold roughly 1% of the known chemical structures - therefore we all do some type of 'extrapolation' when dealing with 'new chemistry'.

And in the end, if several different methods produce errors within 0.1 ppm of each other, what does it really matter?

Ryan Sasaki said...

Sorry... I had a typo.

I meant to say:

if several different methods produce errors within 0.01 ppm of each other, what does it matter!