The bioinformatics community has been working on this problem for years. A few milestones: 1) Recognizing the importance of metadata (data about data, i.e., the running conditions under which the scientific data were acquired). 2) Using XML and ontologies to communicate.
However, it is still a great challenge. So, what did Google come up with?
In summary, here is the Google paradigm for large scientific data:
- Premises:
b) Consumption of the data (in terms of user-comprehensible results) is highly asymmetric in size compared to the raw data: the results are usually much smaller.
- Solution:
b) ANALYZE: Data will be co-located with the computational engine (at the Google empire??)
c) DELIVER: The analyzed results or query results (usually much smaller) will be delivered to the consumer via the Internet.
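To make the ANALYZE and DELIVER steps concrete, here is a minimal toy sketch in Python. It is only an illustration of the idea, not Google's actual system: the function names `analyze_near_data` and `deliver`, the random stand-in data, and the choice of summary statistics are all my own assumptions.

```python
import json
import random


def analyze_near_data(raw_readings):
    """Hypothetical ANALYZE step: runs where the raw data lives
    and reduces it to a small summary."""
    n = len(raw_readings)
    return {
        "count": n,
        "mean": sum(raw_readings) / n,
        "min": min(raw_readings),
        "max": max(raw_readings),
    }


def deliver(summary):
    """Hypothetical DELIVER step: serialize only the small result
    for transfer to the consumer."""
    return json.dumps(summary).encode("utf-8")


random.seed(0)
# Stand-in for a large raw scientific data set (assumption: plain floats).
raw_readings = [random.random() for _ in range(100_000)]

# Size of the raw data if it had to be shipped as JSON.
raw_size = len(json.dumps(raw_readings).encode("utf-8"))

# Only the summary travels over the network.
payload = deliver(analyze_near_data(raw_readings))

print(f"raw: {raw_size} bytes, delivered: {len(payload)} bytes")
```

The point of the sketch is the size asymmetry: the raw readings never leave the place where they are stored, and only a few hundred bytes of results are sent to the consumer.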
Will it work? I think so.