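Abstract (translated from the Chinese version):
Here we present a toolbox for natural language processing tasks related to SARS-CoV-2. It includes English dictionaries of synonyms for SARS-CoV-2 and COVID-19, a silver standard corpus generated with the dictionaries, and a gold standard corpus of 10 PubMed abstracts manually annotated for disease, virus, symptom, and protein/gene terms. The toolbox is freely available on GitHub (https://github.com/aitslab/corona) and can be used for text analytics in a variety of settings related to the COVID-19 crisis. It will be expanded and applied to NLP tasks over the coming weeks, and the community is invited to contribute.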
---
Title:
English dictionaries, gold and silver standard corpora for biomedical natural language processing related to SARS-CoV-2 and COVID-19
---
Authors:
Salma Kazemi Rashed, Johan Frid, Sonja Aits
---
Year of latest submission:
2020
---
Classification:
Primary category: Quantitative Biology
Secondary category: Other Quantitative Biology
Category description: Work in quantitative biology that does not fit into the other q-bio classifications
---
English abstract:
Here we present a toolbox for natural language processing tasks related to SARS-CoV-2. It comprises English dictionaries of synonyms for SARS-CoV-2 and COVID-19, a silver standard corpus generated with the dictionaries, and a gold standard corpus of 10 PubMed abstracts manually annotated for disease, virus, symptom, and protein/gene terms. This toolbox is freely available on GitHub (https://github.com/Aitslab/corona) and can be used for text analytics in a variety of settings related to the COVID-19 crisis. It will be expanded and applied in NLP tasks over the coming weeks, and the community is invited to contribute.
---
PDF link:
https://arxiv.org/pdf/2003.09865