Bulletin of Chinese Academy of Sciences (Chinese Version)


scientific big data; intelligent analysis; data-intensive scientific discovery; software system

Artificial intelligence has achieved major breakthroughs in recent years. How to use it to advance scientific discovery in the natural sciences, especially in Earth science with its massive, multi-source data, has become a focus for both scientists and industry. In multidisciplinary and cross-domain settings, scientific data mining and knowledge discovery depend on an efficient, easy-to-use, and extensible software system for scientific data analysis, one that provides learning models, algorithms, and development tools for complex data processing, analysis, pattern extraction, and knowledge discovery. This study surveys representative intelligent analysis software systems from typical scientific fields, compares their commonalities and differences, and discusses development trends. On this basis, it proposes an integrated, customizable intelligent analysis framework for scientific big data that supports the interactive construction of intelligent analysis models and provides systems and tools to support the rapid development of scientific discovery research.
