Showing posts with label data. Show all posts

Sunday, January 6, 2008

Could someone explain to me

I have found the article "Open Data in Science" by Peter Murray-Rust (it can be downloaded from http://www.dspace.cam.ac.uk/handle/1810/194890 ), which mentions 'OSCAR-3' (a tool for extracting data from the chemical literature) in the context of C-NMR spectroscopy.

I am now surprised that the NMRShiftDB collection ( http://nmrshiftdb.org/ ) grew by only 8 structures within 7 weeks (from Nov 18th, 2007 to Jan 6th, 2008) while OSCAR-3, which allows automatic extraction of NMR data from articles, is around?! For legal reasons, automatic extraction seems to be possible only from OA journals, which reduces the amount of available data. Therefore I simply want to see ONE, SINGLE FULLY ASSIGNED C-NMR spectrum which has been AUTOMATICALLY EXTRACTED by OSCAR-3 from the chemical literature.

A corresponding question has been posted on Peter Murray-Rust's weblog - I hope to get an answer. Check back, I'll keep you up to date.

My questions can be found at
http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=916#comments

Just for your comparison:
The increase of spectra within CSEARCH can be found here - without OSCAR-3 support ;-))

CSEARCH Data and InChIKeys

InChIKeys are an excellent tool for identical-structure searches on the web using the usual search engines like Google, Yahoo, MSN, etc. The architecture of the InChIKey allows searching for two-dimensional topologies when only the first 14 characters are used, whereas the full InChIKey also covers all additional features like stereochemistry.
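The idea of matching on the first 14 characters can be sketched in a few lines of Python. This is only an illustration: the function names `skeleton_block` and `group_by_skeleton` are made up for this example, and the keys are placeholders in the same style as the example further down, not keys of real compounds.

```python
from collections import defaultdict


def skeleton_block(inchikey: str) -> str:
    """Return the 14-character block in front of the first hyphen."""
    block = inchikey.strip().split("-", 1)[0]
    if len(block) != 14 or not block.isalpha() or not block.isupper():
        raise ValueError(f"unexpected first block in InChIKey: {inchikey!r}")
    return block


def group_by_skeleton(inchikeys):
    """Group full InChIKeys that share the same first block."""
    groups = defaultdict(list)
    for key in inchikeys:
        groups[skeleton_block(key)].append(key)
    return dict(groups)


# Placeholder keys (hypothetical, not real compounds): the first two differ
# only in the part after the hyphen, the third has a different first block.
keys = [
    "ABCDEFGHIJKLMN-ABCDEFGHIJ",
    "ABCDEFGHIJKLMN-KLMNOPQRST",
    "OPQRSTUVWXYZAB-ABCDEFGHIJ",
]
groups = group_by_skeleton(keys)
```

Here the first two keys end up in the same group, which is the behaviour a search on the first 14 characters relies on; comparing the full strings instead would keep them apart.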

Within the CSEARCH environment all structures have been converted into InChIKeys, and together with my data-exchange protocols dating back to the late '80s a collection of links has been built. For a specific two-dimensional molecular topology, each page lists all systems in which the corresponding C13/O17/N15/F19/P31/B11/Si29 spectrum is available.

The following systems have been indexed:

CSEARCH including upcoming data
SPECINFO
CHEMGATE
NMRPredict
NMRPredict ONLINE
KnowItAll
KnowItAll Anywhere
NMRShiftDB
University of Mainz, In-house database

A total of nearly half a million spectra from approx. 350,000 different structures has been indexed systematically. The pages have already been crawled by the most important search engines.

How to make use of this 'portal of existing NMR-spectral information' ?

Generate the InChIKey for your query structure, e.g. "ABCDEFGHIJKLMN-ABCDEFGHIJ"; now take the first 14 characters (before the hyphen!) and construct the following URL:

http://nmrpredict.orc.univie.ac.at/inchikey/ABCDEFGHIJKLMN.html

Request the corresponding page. If this particular 'structure family' (= two-dimensional molecular topology) has C/O/N/F/P/B/Si NMR data available in one of the systems listed above, you will get a list of links to them. If no NMR-spectral information is available, your HTTP request will be answered with a 'Page not found (404)' error.
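For those who prefer to script the lookup, the recipe above might look roughly like this in Python. It is a minimal sketch using only the standard library: the names `portal_url` and `has_nmr_data` are invented for this example, the URL pattern is taken from the text above, and I assume that every missing page is reported with a plain 404 status as described.

```python
import urllib.error
import urllib.request

# URL pattern as given in the post; the 14-character block is appended.
BASE_URL = "http://nmrpredict.orc.univie.ac.at/inchikey/"


def portal_url(inchikey: str) -> str:
    """Build the portal URL from the first 14 characters of an InChIKey."""
    first_block = inchikey.strip().split("-", 1)[0]
    if len(first_block) != 14:
        raise ValueError(f"expected 14 characters before the hyphen: {inchikey!r}")
    return f"{BASE_URL}{first_block}.html"


def has_nmr_data(inchikey: str, timeout: float = 10.0) -> bool:
    """Return True if the page exists, False on 'Page not found (404)'."""
    try:
        with urllib.request.urlopen(portal_url(inchikey), timeout=timeout):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False  # no spectra indexed for this structure family
        raise  # any other HTTP error is not an answer to the question
```

Calling `portal_url("ABCDEFGHIJKLMN-ABCDEFGHIJ")` gives exactly the example URL shown above. `has_nmr_data` only tells you whether a page exists; the list of links to the individual systems still has to be read from the page itself.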