Bulletin of Chinese Academy of Sciences (Chinese Version)
Keywords
big data; intelligent algorithms; intelligent algorithm safety; ethics and safety of artificial intelligence; TRC paradigm
Document Type
Artificial Intelligence and Public Security
Abstract
Intelligent algorithms refer to the methods embodied in the computational processes that realize intelligence. These methods are often characterized as data-driven, involving uncertain computations, and producing unexplainable model inferences. These characteristics introduce potential safety risks into the application of intelligent algorithms and AI. This study first explores the concept of intelligent algorithm safety. Specifically, as the degree of human-machine integration deepens, intelligent algorithm safety extends from the univariate safety of the algorithm itself, to the bivariate applicational safety that arises when the algorithm serves humans, and finally to the multivariate systemic safety that emerges within complex socio-technical systems of human-machine symbiosis. Accordingly, this study proposes a hierarchical paradigm of intelligent algorithm safety, the “TRC paradigm”, covering the univariate safety objective of trustworthiness in an algorithm’s internal decision-making, the bivariate safety objective of regulatability in application services, and the multivariate safety objective of controllability for system-wide risk management. Furthermore, based on the current technical challenges in achieving the TRC paradigm and in line with the goals of trustworthiness, regulatability, and controllability, the study identifies three major scientific questions that need to be answered: determining the trust regions of uncertain algorithms, transparently monitoring black-box models, and sensing the critical point of human-machine symbiotic intelligent systems. Finally, this study outlines seven research directions and four recommendations for intelligent algorithm safety under the “measurement-evaluation-enhancement” technical framework of the TRC paradigm, and envisions how these will help achieve a future of human-machine co-governance.
First Page
419
Last Page
428
Language
Chinese
Publisher
Bulletin of Chinese Academy of Sciences
Recommended Citation
CHENG, Xueqi; CHEN, Wei; SHEN, Huawei; SHAN, Shiguang; CHEN, Xilin; and LI, Guojie (2024) "Intelligent algorithm safety: Concepts, scientific problems and prospects," Bulletin of Chinese Academy of Sciences (Chinese Version): Vol. 40: Iss. 3, Article 3.
DOI: https://doi.org/10.16418/j.issn.1000-3045.20240720004
Available at: https://bulletinofcas.researchcommons.org/journal/vol40/iss3/3
Included in
Artificial Intelligence and Robotics Commons, Defense and Security Studies Commons, Information Security Commons, Science and Technology Policy Commons, Theory and Algorithms Commons