Fossil data are an essential source of information for understanding Earth's history and the evolution of life in deep time. Over several centuries, paleontologists have published a vast body of paleontological data. During the past 30 to 40 years, with the rapid development of computer, database and internet technologies, a large number of paleontological databases have emerged worldwide. These databases typically differ markedly in their goals, system architecture, data organization and target users. In this study, we systematically review the major paleontological databases, covering their development history, data table structure, data characteristics and data volume. We also compare and evaluate their data compilation methods, core online functions, data sharing mechanisms and quality control measures for taxonomic data. In addition, drawing on recent examples of data-driven paleontological research, we propose the concept of a one-stop paleontological big data platform that integrates data compilation, standardization, sharing, analysis and application. This concept may serve as a reference for the Deep-time Digital Earth (DDE) Big Science Program in building a multidisciplinary, open and shared geoscience big data platform.