The massive use of simulation techniques in chemical research generates huge amounts of information, which is increasingly recognized as a Big Data problem. The main obstacle to managing large volumes of information is storing it in a way that facilitates data mining, a strategy for optimizing the processes that enable scientists to face the challenges of a new sustainable society based on knowledge and the rational use of existing resources.
The present project aims to create a platform of cloud services for managing computational chemistry data. Like other related projects, our platform relies on well-defined standards, and it implements processing, hierarchical storage, and data retrieval tools to facilitate data mining of the Big Data of theoretical and computational chemistry. Its main goal is to create new methodological strategies that promote optimal reuse of results and accumulated knowledge and enhance researchers' daily productivity.
The platform automates the extraction of relevant data and transforms numerical data into labelled data in a database. It provides tools that let researchers validate, enrich, publish, and share information, as well as cloud tools to access and visualize data. Other tools allow, for instance, the creation of reaction energy profile plots by combining data from a set of molecular entities, or the automatic creation of Supporting Information files. The final goal is to build a new reference tool for computational chemistry research, bibliography management, and services to third parties. Potential users include computational chemistry research groups worldwide, university libraries and related services, and high-performance supercomputing centers.
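As an illustration of how labelled data from several molecular entities might be combined into a reaction energy profile, the following minimal Python sketch converts absolute energies into relative energies along a reaction path. It is not the platform's implementation: the `Species` record, the `relative_profile` function, and the energy values are hypothetical and chosen only for illustration. The only external fact assumed is the standard conversion factor of about 627.5095 kcal/mol per hartree.

```python
from dataclasses import dataclass

# Standard conversion factor from hartree to kcal/mol.
HARTREE_TO_KCAL_MOL = 627.5095


@dataclass
class Species:
    """Hypothetical labelled record for one molecular entity."""
    label: str
    energy_hartree: float


def relative_profile(species):
    """Return (label, relative energy in kcal/mol) pairs.

    Energies are taken relative to the first species in the list,
    which is treated as the reference point of the profile.
    """
    if not species:
        return []
    reference = species[0].energy_hartree
    return [
        (s.label, (s.energy_hartree - reference) * HARTREE_TO_KCAL_MOL)
        for s in species
    ]


# Made-up energies, used only to show the shape of the data.
path = [
    Species("reactant", -100.000),
    Species("transition state", -99.970),
    Species("product", -100.020),
]

profile = relative_profile(path)
for label, rel_energy in profile:
    print(f"{label:>16s} {rel_energy:8.2f} kcal/mol")
```

The resulting list of label and energy pairs could then be passed to any plotting library to draw the profile. Keeping the records labelled, rather than as bare numbers, is what makes this kind of combination straightforward.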