了不起的甲骨文 ③ | 甲骨文研究搭上大数据快车
Amazing Oracle Bone Scripts | Study of Oracle Bones Meets Big Data
编者按
一片甲骨惊天下。安阳殷墟,甲骨文的出土把中国信史向上推进了约1000年。镌刻在龟甲兽骨上的甲骨文,是迄今为止发现最早、最为完备、最为系统且能表达完整思想的文字,是汉字的源头。习近平总书记在考察殷墟时指出,“中国的汉文字非常了不起,中华民族的形成和发展离不开汉文字的维系”。甲骨文,了不起,却凑不齐甲骨文“了不起”三个字。目前发现的甲骨文字4500字左右,能释读的只有三分之一,看似不可思议,却正是汉字的魅力。
Editor's note:
A piece of oracle bone stuns the world. The unearthing of oracle bones at the Yin Ruins in Anyang pushed China's recorded history back by about 1,000 years. The oracle bone inscriptions, engraved on tortoise shells and animal bones, are the earliest, most complete, and most systematic form of writing discovered to date that can express complete thoughts, and they are the origin of Chinese characters. During his inspection tour of the Yin Ruins, Chinese President Xi Jinping pointed out that "Chinese characters are remarkable, and the formation and development of the Chinese nation cannot be separated from the preservation of them." The oracle bone scripts are remarkable, yet the three characters of "了不起" (remarkable) cannot all be found in oracle bone script. Of the roughly 4,500 oracle bone characters discovered so far, only about one-third can be interpreted. That may seem incredible, but it is precisely the charm of Chinese characters.
古来新学问,大都由于新发现。为推动中华文明创新发展,即日起,河南日报社重磅推出“了不起的甲骨文”系列报道,将以融媒的形式报道一个新时代。精心设计的logo以白文印章的形式、仿造甲骨卜辞的体例,从右至左、从上至下。这七个字当中,根据“殷契文渊”甲骨文大数据平台显示,“不”“起”“甲”“骨”字是甲骨文,其他字用的是小篆,即秦始皇统一六国后在大篆的基础上简化,创制的统一文字的汉字书写形式。字体虽不统一,风格却浑然一体,内含文化的传承、基因的延续之意。敬请垂注。
Throughout history, most new fields of learning have emerged from new discoveries. To promote the innovation and development of Chinese civilization, Henan Daily is launching, starting today, a major series of multimedia reports titled "Amazing Oracle Bone Scripts." The series logo is carefully designed as a white-character seal that imitates the layout of oracle bone divination inscriptions, reading from right to left and from top to bottom. According to "Yin Qi Wen Yuan," a big data platform for oracle bone inscriptions, four of the seven characters on the seal, "不" (bù), "起" (qǐ), "甲" (jiǎ), and "骨" (gǔ), are written in oracle bone script, while the remaining characters are in small seal script (小篆). Small seal script is the standardized form of Chinese writing created by simplifying large seal script (大篆) after Emperor Qin Shi Huang unified the six states. Although the scripts differ, the style is harmonious, conveying the inheritance of culture and the continuity of its genes. Please stay tuned.
“一片甲骨惊天下”,代代学人焚膏继晷。1899年,沉睡地下3000多年的甲骨被发现,隐藏的中华智慧和文明密码被一一破译。
A piece of oracle bone stuns the world, and generations of scholars have worked day and night to study it. In 1899, oracle bones that had lain underground for more than 3,000 years were discovered, and the wisdom and codes of Chinese civilization hidden in them have since been deciphered one by one.
12月26日,记者来到安阳师范学院甲骨文信息处理教育部重点实验室。工作人员登录“殷契文渊”网站,在字形库中选择甲骨字“人”字形,所有包含该字形的402个甲骨片信息就全部显示出来。
On December 26, a Henan Daily reporter visited the Key Laboratory of Oracle Information Processing at Anyang Normal University. A staff member logged into the "Yin Qi Wen Yuan" website and selected the oracle bone glyph for "人" (person) in the glyph library. Information on all 402 oracle bone pieces containing this glyph was then displayed.
安阳师范学院计算机与信息工程学院院长、甲骨文信息处理教育部重点实验室主任刘永革介绍,“殷契文渊”是目前最大的甲骨文数据库,从开放的第一天起,平台就为全世界用户提供免费服务。它不仅服务全球的甲骨文、考古、历史、文字方面的研究专家,还吸引了古文字爱好者、中小学语文教师、书法爱好者等使用。
"'Yin Qi Wen Yuan' is currently the largest oracle bone inscription database. Since the day it opened, the platform has provided free services to users worldwide. It serves not only experts in oracle bone inscriptions, archaeology, history, and philology, but also enthusiasts of ancient writing, Chinese-language teachers in primary and secondary schools, and calligraphy lovers," said Liu Yongge, dean of the School of Computer and Information Engineering at Anyang Normal University and director of the Key Laboratory of Oracle Information Processing.
为什么要专门建立甲骨文数据库?甲骨文“撞上”现代科技,又会发生什么?
Why build a database dedicated to oracle bone inscriptions? And what happens when oracle bone scripts "meet" modern technology?
Liu (right) and a researcher at the Key Laboratory of Oracle Information Processing [Photo/Henan Daily]
2000年,计算机软件硕士刘永革到安阳师范学院任教,两位研究甲骨文的老师建议他开展甲骨文信息化研究。“一头是中国古老文明中的甲骨文,另一头是先进的计算机技术、人工智能。”刘永革坦言,当时觉得这个课题简直是两个极端。
In 2000, Liu Yongge, who holds a master's degree in computer software, began teaching at Anyang Normal University, and two colleagues who studied oracle bone scripts suggested that he research their digitization. "On one side are the oracle bone scripts of ancient Chinese civilization; on the other are advanced computer technology and artificial intelligence," Liu said, admitting that at the time the topic seemed to span two extremes.
“释读甲骨文本就是世界难题,此外,还有一个难题摆在面前——甲骨文没法输入到电脑里。”刘永革说,这是他最初想要解决的问题。
Liu said that interpreting oracle bone inscriptions is already a worldwide challenge, and there was a further difficulty: oracle bone characters could not be typed into a computer. This was the first problem he set out to solve.
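“甲骨文和汉字不一样,好多字我们不认识,使用拼音输入法不行,而且还有一些字像画一样,不能用部首输入,也不能拆分。此前甲骨文输入法采用编码输入方式,记忆负担较重,学习成本较高。”刘永革说。
"Oracle bone inscriptions are different from modern Chinese characters. There are many characters we still cannot read, so the Pinyin input method does not work. Some characters are like pictures, so they cannot be entered by radical or broken into components. Earlier oracle bone input methods relied on codes, which were a heavy burden to memorize and costly to learn," Liu said.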
因此,研发甲骨文输入法,建立一个基本字库,让任何人都可以轻松输入,至关重要。
Therefore, it is crucial to develop an input method for oracle bone inscriptions and build a basic font library so that anyone can enter the characters easily.
“用计算机技术研究甲骨文的编码、字库、输入法、数据库建设,为专家提供一些工具,辅助甲骨文研究,这是我们建立甲骨文数据库的初衷。”刘永革说。
"Our original intention in establishing the oracle bone script database was to use computer technology to study encoding, font libraries, input methods, and database construction for oracle bone scripts, providing experts with tools to assist their research," said Liu.
可只有技术优势显然不够。为了学习最新的甲骨文研究成果,每当有甲骨文或古文字研究专家到安阳,刘永革和团队成员一定想方设法当面请教;当地、外地召开的甲骨文学术研究会议,时间再紧他们也要参加……随着了解越来越深入,刘永革发现,甲骨文作为中华民族最古老的文字,因晦涩难懂和研究资料较少,与现代技术不兼容,研究效率低下。
However, technological strengths alone are clearly not enough. To keep up with the latest research findings, Liu and his team members made every effort to consult experts in oracle bones and ancient scripts in person whenever they came to Anyang, and they attended academic conferences on oracle bones, in Anyang and elsewhere, no matter how tight their schedules were. As his understanding deepened, Liu found that research on oracle bone script, the oldest writing of the Chinese nation, was inefficient because the script is obscure, research materials are scarce, and the field was not compatible with modern technology.
如何共享甲骨文的研究信息,让全人类资源互通、群力群策一起研究呢?刘永革他们决定利用自己的专业,让甲骨文这一“冷门”绝学“热”起来。
How could research information on oracle bone inscriptions be shared so that people everywhere could pool their resources and study them together? Liu and his team decided to use their own expertise to bring this "unpopular" field of learning into the spotlight.
于是,一个汇集甲骨文信息的“殷契文渊”项目悄然启动。刘永革带领团队30多名老师和50多名学生,对甲骨研究的权威资料逐条、逐页进行扫描、裁切、编号,寻找释文。
As a result, the "Yin Qi Wen Yuan" project, a platform gathering information on oracle bone scripts, was quietly launched. Liu led a team of more than 30 teachers and more than 50 students to scan, crop, and number authoritative reference materials on oracle bones, entry by entry and page by page, and to locate the corresponding interpretations.
2019年,团队用8年时间精心打造的全球首个免费甲骨文数据库“殷契文渊”惊艳亮相。该平台建设的甲骨文字库包含单字5086个、字形6234个,甲骨文研究文献3万多篇,支持多种甲骨文输入检索方式。
In 2019, after eight years of work, the team unveiled "Yin Qi Wen Yuan," the world's first free oracle bone script database, to widespread acclaim. Its font library contains 5,086 single characters and 6,234 glyphs, and the platform holds more than 30,000 research documents on oracle bones and supports a variety of input and retrieval methods for oracle bone characters.
“此前甲骨文资料很难查到,即使甲骨文专家也不可能拥有全部资料,‘殷契文渊’项目建成后解决了这个问题。”甲骨文信息处理教育部重点实验室副主任高峰说。
"Previously, materials on oracle bones were hard to find, and even oracle bone experts could not possess all of them. The completion of the 'Yin Qi Wen Yuan' project solved this problem," said Gao Feng, deputy director of the Key Laboratory of Oracle Information Processing.
The character "人" (person) in oracle bone script
据介绍,平台接下来将继续补充基础数据,同时采用人工智能技术进行数据分析,包括甲骨文识别与字形分析、甲骨文语言计算、甲骨文知识图谱、计算机自动缀合甲骨文系统等,有可能成为海内外最详赡、最完备的甲骨文大数据平台。
According to the team, the platform will continue to add basic data and apply artificial intelligence to data analysis, including oracle bone character recognition and glyph analysis, computational analysis of the oracle bone language, oracle bone knowledge graphs, and a system for automatically rejoining oracle bone fragments by computer. It may become the most detailed and complete big data platform for oracle bone inscriptions at home and abroad.
与此同时,安阳师范学院还积极整合校内资源,集合文学、历史、计算机、体育等专业优势,形成了一支跨专业、多学科联合攻关的学术团队——甲骨文信息化处理团队。凭借已经建成的甲骨文数据库优势,该团队开始尝试利用语言学、数学、计算机科学、信息技术对甲骨文进行语义、语法处理和知识挖掘。
Meanwhile, Anyang Normal University has pooled its internal resources, drawing on strengths in literature, history, computer science, physical education, and other disciplines to form the Oracle Information Processing Team, a cross-disciplinary academic team. Building on the database it has already established, the team has begun applying linguistics, mathematics, computer science, and information technology to the semantic and syntactic processing of oracle bone scripts and to knowledge mining.
“利用人工智能技术破译甲骨文,让科技赋能甲骨文研究,揭开一片片甲骨背后的文明密码,讲好甲骨文的故事。”刘永革说,他们要将甲骨文研究带入新的智能化时代。
"We will use artificial intelligence to decipher oracle bone scripts, let technology empower oracle bone research, unveil the codes of civilization behind each piece of oracle bone, and tell the story of oracle bone scripts well," said Liu. He added that the team aims to bring oracle bone research into a new era of intelligent technology. (中文来源/河南日报 记者/谢建晓 杨之甜 编译/李云娇 童林 审校/温晓梅)