2022-03-05
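Abstract (translated):
For applications involving sequential data, the evaluation of machine learning algorithms in biomedical fields lacks standardization. Depending on the requirements of the application, common quantitative scalar evaluation metrics such as sensitivity and specificity are often misleading. Evaluation metrics must ultimately reflect the needs of users, yet be sensitive enough to guide algorithm development. Feedback from critical care clinicians who use automated event detection software in clinical applications overwhelmingly emphasizes that a low false alarm rate, usually measured as the number of errors per 24 hours, is the single most important criterion for user acceptance. Although a single metric is usually less insightful than examining performance over a range of operating conditions, a single scalar figure of merit is still needed. In this paper, we discuss the shortcomings of existing metrics for the seizure detection task and propose several new metrics that give a more balanced view of performance. We demonstrate these metrics on a seizure detection task based on the TUH EEG Corpus. We show that two promising metrics are Actual Term-Weighted Value (ATWV), a measure based on a concept borrowed from the spoken term detection literature, and Time-Aligned Event Scoring (TAES), a new metric that accounts for the temporal alignment of the hypothesis with the reference annotation. We also show that state-of-the-art technology based on deep learning, despite its impressive performance, still needs significant improvement before it meets very strict user acceptance criteria.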
---
English title:
Objective evaluation metrics for automatic classification of EEG events
---
Authors:
Saeedeh Ziyabari, Vinit Shah, Meysam Golmohammadi, Iyad Obeid, and Joseph Picone
---
Year of latest submission:
2019
---
Classification:

Primary category: Computer Science
Secondary category: Machine Learning
Category description: Papers on all aspects of machine learning research (supervised, unsupervised, reinforcement learning, bandit problems, and so on) including also robustness, explanation, fairness, and methodology. cs.LG is also an appropriate primary category for applications of machine learning methods.
--
Primary category: Electrical Engineering and Systems Science
Secondary category: Signal Processing
Category description: Theory, algorithms, performance analysis and applications of signal and data analysis, including physical modeling, processing, detection and parameter estimation, learning, mining, retrieval, and information extraction. The term "signal" includes speech, audio, sonar, radar, geophysical, physiological, (bio-) medical, image, video, and multimodal natural and man-made signals, including communication signals and data. Topics of interest include: statistical signal processing, spectral estimation and system identification; filter design, adaptive filtering / stochastic learning; (compressive) sampling, sensing, and transform-domain methods including fast algorithms; signal processing for machine learning and machine learning for signal processing applications; in-network and graph signal processing; convex and nonconvex optimization methods for signal processing applications; radar, sonar, and sensor array beamforming and direction finding; communications signal processing; low power, multi-core and system-on-chip signal processing; sensing, communication, analysis and optimization for cyber-physical systems such as power grids and the Internet of Things.
--
Primary category: Statistics
Secondary category: Machine Learning
Category description: Covers machine learning papers (supervised, unsupervised, semi-supervised learning, graphical models, reinforcement learning, bandits, high dimensional inference, etc.) with a statistical or theoretical grounding.
--

---
English abstract:
  The evaluation of machine learning algorithms in biomedical fields for applications involving sequential data lacks standardization. Common quantitative scalar evaluation metrics such as sensitivity and specificity can often be misleading depending on the requirements of the application. Evaluation metrics must ultimately reflect the needs of users yet be sufficiently sensitive to guide algorithm development. Feedback from critical care clinicians who use automated event detection software in clinical applications has been overwhelmingly emphatic that a low false alarm rate, typically measured in units of the number of errors per 24 hours, is the single most important criterion for user acceptance. Though using a single metric is not often as insightful as examining performance over a range of operating conditions, there is a need for a single scalar figure of merit. In this paper, we discuss the deficiencies of existing metrics for a seizure detection task and propose several new metrics that offer a more balanced view of performance. We demonstrate these metrics on a seizure detection task based on the TUH EEG Corpus. We show that two promising metrics are a measure based on a concept borrowed from the spoken term detection literature, Actual Term-Weighted Value (ATWV), and a new metric, Time-Aligned Event Scoring (TAES), that accounts for the temporal alignment of the hypothesis to the reference annotation. We also demonstrate that state of the art technology based on deep learning, though impressive in its performance, still needs significant improvement before it meets very strict user acceptance criteria.
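To make the abstract's point about misleading scalar metrics concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the scoring software or the exact metric definitions used by the authors: the function names, the any-overlap hit rule, and the simplified duration-based credit are all hypothetical choices made here. In particular, `time_aligned_credit` only mimics the general idea of giving partial credit for temporal alignment; it is not the paper's TAES definition.

```python
from typing import List, Tuple

# An event is a (start_sec, stop_sec) interval. This representation is an
# assumption of the sketch, not a format taken from the paper.
Event = Tuple[float, float]


def overlap(a: Event, b: Event) -> float:
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def event_counts(refs: List[Event], hyps: List[Event]) -> Tuple[int, int, int]:
    """Any-overlap scoring: returns (hits, misses, false_alarms).

    A reference event is a hit if any hypothesis overlaps it; a hypothesis
    is a false alarm if it overlaps no reference event.
    """
    hits = sum(1 for r in refs if any(overlap(r, h) > 0 for h in hyps))
    misses = len(refs) - hits
    false_alarms = sum(
        1 for h in hyps if not any(overlap(r, h) > 0 for r in refs)
    )
    return hits, misses, false_alarms


def sensitivity(hits: int, misses: int) -> float:
    """Fraction of reference events detected."""
    total = hits + misses
    return hits / total if total else 0.0


def false_alarms_per_24h(false_alarms: int, record_seconds: float) -> float:
    """Scale a false alarm count to a 24-hour rate."""
    return false_alarms * 86400.0 / record_seconds


def time_aligned_credit(refs: List[Event], hyps: List[Event]) -> float:
    """Simplified partial credit: for each reference event, the fraction of
    its duration covered by hypotheses (capped at 1), summed over events.
    Assumes the hypothesis intervals do not overlap each other."""
    total = 0.0
    for r in refs:
        covered = sum(overlap(r, h) for h in hyps)
        total += min(1.0, covered / (r[1] - r[0]))
    return total


# Made-up example: a one-hour recording with two reference seizures.
refs = [(100.0, 200.0), (1000.0, 1100.0)]
hyps = [(150.0, 160.0), (500.0, 510.0), (1000.0, 1100.0)]

hits, misses, fas = event_counts(refs, hyps)
print("sensitivity:", sensitivity(hits, misses))
print("false alarms / 24 h:", false_alarms_per_24h(fas, 3600.0))
print("time-aligned credit:", time_aligned_credit(refs, hyps), "of", len(refs))
```

On this made-up data, any-overlap scoring reports perfect sensitivity even though the first seizure is only 10% covered, while the single false alarm in one hour scales to 24 false alarms per 24 hours. The duration-based credit (1.1 of a possible 2) exposes the poor alignment that the event-level sensitivity hides.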
---
PDF link:
https://arxiv.org/pdf/1712.10107