Sunday, January 6, 2008

CSEARCH Data and InChIKeys

InChIKeys are an excellent tool for identical-structure searches on the web using the usual search engines such as Google, Yahoo, MSN, etc. The architecture of the InChIKey allows searching for two-dimensional topologies when only the first 14 characters are used, whereas the full InChIKey also covers all additional features such as stereochemistry.
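To make the two search levels concrete, here is a minimal sketch in Python. The function names are my own choice for illustration and are not part of any InChI software; the keys in the usage note are placeholders, not keys of real molecules.

```python
def topology_block(inchikey):
    """Return the first 14 characters (the part before the first hyphen)."""
    block = inchikey.strip().split("-")[0]
    if len(block) != 14:
        raise ValueError("expected 14 characters before the first hyphen")
    return block


def same_topology(key_a, key_b):
    """True if both keys share the same two-dimensional topology block."""
    return topology_block(key_a) == topology_block(key_b)


def identical_structure(key_a, key_b):
    """True only if the full keys agree, including the part after the hyphen."""
    return key_a.strip() == key_b.strip()
```

With the placeholder keys "ABCDEFGHIJKLMN-ABCDEFGHIJ" and "ABCDEFGHIJKLMN-ZYXWVUTSRQ", same_topology reports a match while identical_structure does not, which is the difference between the 14-character search and the full-key search.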

Within the CSEARCH environment all structures have been converted into InChIKeys, and together with my data-exchange protocols dating back to the late '80s a collection of links has been built. Each page summarizes, for a specific two-dimensional molecular topology, all systems in which the corresponding C13/O17/N15/F19/P31/B11/Si29 spectrum is available.

The following systems have been indexed:

CSEARCH including upcoming data
SPECINFO
CHEMGATE
NMRPredict
NMRPredict ONLINE
KnowItAll
KnowItAll Anywhere
NMRShiftDB
University of Mainz, In-house database

A total of nearly half a million spectra from approx. 350,000 different structures have been indexed in a systematic way. The pages have already been crawled by the most important search engines.

How to make use of this 'portal of existing NMR-spectral information'?

Generate the InChIKey for your query structure, e.g. "ABCDEFGHIJKLMN-ABCDEFGHIJ"; now take the first 14 characters (before the hyphen!) and construct the following URL:

http://nmrpredict.orc.univie.ac.at/inchikey/ABCDEFGHIJKLMN.html

Request the corresponding page. If this particular 'structure family' (= two-dimensional molecular topology) has C/O/N/F/P/B/Si NMR data available in one of the systems listed above, you will get a list of links to them. If no NMR-spectral information is available, your HTTP request will be answered with a 'Page not found (404)' error.
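The lookup described above can be sketched in a few lines of Python using only the standard library. This is an illustration under assumptions, not an official client: the function names and the opener parameter are mine, the pages are assumed to decode as text, and nothing here guarantees that the server is reachable.

```python
import urllib.error
import urllib.request

# Base address taken from the URL pattern given in this post.
BASE_URL = "http://nmrpredict.orc.univie.ac.at/inchikey/"


def topology_url(inchikey):
    """Build the page URL from the first 14 characters of an InChIKey."""
    block = inchikey.strip().split("-")[0]
    if len(block) != 14:
        raise ValueError("expected 14 characters before the first hyphen")
    return BASE_URL + block + ".html"


def fetch_links_page(inchikey, opener=urllib.request.urlopen):
    """Return the page text, or None if the server answers with 404.

    The opener argument is a hypothetical hook so the function can be
    tried out without network access; by default urlopen is used.
    """
    try:
        with opener(topology_url(inchikey)) as response:
            return response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None  # no NMR data indexed for this topology
        raise  # any other HTTP error is passed on to the caller
```

Calling fetch_links_page("ABCDEFGHIJKLMN-ABCDEFGHIJ") requests the URL shown above. A return value of None corresponds to the 'Page not found (404)' case, meaning no spectrum is indexed for that structure family.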

2 comments:

ChemSpiderMan said...

Wolfgang, I tried the InChIKey search today and it worked perfectly. Took me right to the appropriate page. Am I supposed to be able to see spectra? It appears to just provide links to commercial sites but no actual spectrum. So, it tells you that there is a spectrum for the structure but you can't see it. Did I miss something?

Wolfgang Robien said...

No - my intention with these pages is to collect information about electronically available NMR data and to summarize it in a structure-oriented way. I also link to non-commercial collections - e.g. NMRShiftDB has been treated in the same way as my own spectra and some other collections.

see for example:

http://nmrpredict.orc.univie.ac.at/inchikey/AMQJEAYHLZJPGS.html

where NMRShiftDB data for this molecule (amyl alcohol) are also available.

I DO NOT intend to compete against any other (commercial or non-commercial) system with this installation - it is just a (hopefully growing) summary of electronically available NMR data. The systematic choice of filenames allows automatic retrieval, which might be interesting in future applications (e.g. reaction databases - please see my post on PMR's blog, etc.).