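Translated abstract:
Pre-season forecasts of crop production outcomes, such as grain yield and nitrogen (N) loss, can give stakeholders insight when making decisions. Simulation models can support scenario planning, but their data requirements and long run times limit their use, so computationally cheaper methods are needed to scale up prediction. We evaluated five machine learning (ML) algorithms as meta-models of a cropping systems simulator (APSIM) to guide the development of future decision-support tools. We asked: 1) How well do ML meta-models predict maize yield and N loss from pre-season information? 2) How much data is needed to train the ML algorithms to acceptable accuracy? 3) Which input variables matter most for accurate prediction? 4) Do ensembles of ML meta-models improve prediction? The simulated dataset contained more than 3 million genotype, environment and management scenarios. Random forests predicted maize yield and N loss at planting time most accurately, with RRMSE of 14% and 55%, respectively. The ML meta-models reproduced simulated maize yield reasonably well, but not N loss. They also differed in sensitivity to training dataset size: across all ML models, yield prediction error fell by 10-40% as the training set grew from 0.5 million to 1.8 million data points, whereas N loss prediction error showed no consistent pattern. The ML models also differed in sensitivity to input variables. Averaged across all ML models, weather conditions, soil properties, management information and initial conditions were roughly equally important for predicting yield. ML ensembles brought modest prediction improvements. These results can help accelerate progress in coupling simulation models with ML to develop dynamic decision-support tools for pre-season management.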
---
English title:
Maize Yield and Nitrate Loss Prediction with Machine Learning Algorithms
---
Authors:
Mohsen Shahhosseini, Rafael A. Martinez-Feria, Guiping Hu, Sotirios V. Archontoulis
---
Latest submission year:
2020
---
Classification:
Primary category: Quantitative Biology
Subcategory: Other Quantitative Biology
Category description: Work in quantitative biology that does not fit into the other q-bio classifications
--
Primary category: Computer Science
Subcategory: Machine Learning
Category description: Papers on all aspects of machine learning research (supervised, unsupervised, reinforcement learning, bandit problems, and so on) including also robustness, explanation, fairness, and methodology. cs.LG is also an appropriate primary category for applications of machine learning methods.
--
Primary category: Statistics
Subcategory: Applications
Category description: Biology, Education, Epidemiology, Engineering, Environmental Sciences, Medical, Physical Sciences, Quality Control, Social Sciences
--
Primary category: Statistics
Subcategory: Machine Learning
Category description: Covers machine learning papers (supervised, unsupervised, semi-supervised learning, graphical models, reinforcement learning, bandits, high dimensional inference, etc.) with a statistical or theoretical grounding
--
---
English abstract:
Pre-season prediction of crop production outcomes such as grain yields and N losses can provide insights to stakeholders when making decisions. Simulation models can assist in scenario planning, but their use is limited because of data requirements and long run times. Thus, there is a need for more computationally expedient approaches to scale up predictions. We evaluated the potential of five machine learning (ML) algorithms as meta-models for a cropping systems simulator (APSIM) to inform future decision-support tool development. We asked: 1) How well do ML meta-models predict maize yield and N losses using pre-season information? 2) How many data are needed to train ML algorithms to achieve acceptable predictions?; 3) Which input data variables are most important for accurate prediction?; and 4) Do ensembles of ML meta-models improve prediction? The simulated dataset included more than 3 million genotype, environment and management scenarios. Random forests most accurately predicted maize yield and N loss at planting time, with a RRMSE of 14% and 55%, respectively. ML meta-models reasonably reproduced simulated maize yields but not N loss. They also differed in their sensitivities to the size of the training dataset. Across all ML models, yield prediction error decreased by 10-40% as the training dataset increased from 0.5 to 1.8 million data points, whereas N loss prediction error showed no consistent pattern. ML models also differed in their sensitivities to input variables. Averaged across all ML models, weather conditions, soil properties, management information and initial conditions were roughly equally important when predicting yields. Modest prediction improvements resulted from ML ensembles. These results can help accelerate progress in coupling simulation models and ML toward developing dynamic decision support tools for pre-season management.
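The abstract reports accuracy as RRMSE and mentions ensembles of ML meta-models without defining either. The sketch below assumes the common definition of RRMSE (RMSE divided by the mean of the reference values, expressed in percent) and uses an unweighted average as a stand-in ensemble; the paper's actual metric details and ensembling method may differ. The function names and all numbers are hypothetical and chosen only for illustration.

```python
import math

def rrmse(reference, predicted):
    """Relative RMSE in percent: RMSE divided by the mean of the reference values.

    Assumed definition; not confirmed by the abstract.
    """
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - p) ** 2 for r, p in zip(reference, predicted)) / len(reference)
    return 100.0 * math.sqrt(mse) / (sum(reference) / len(reference))

def average_ensemble(predictions_by_model):
    """Unweighted mean of several models' predictions (one list per model)."""
    return [sum(vals) / len(vals) for vals in zip(*predictions_by_model)]

# Made-up values for illustration only (not taken from the paper).
simulated = [10.0, 12.0, 8.0, 10.0]   # e.g. simulator output used as reference
model_a = [11.0, 11.0, 9.0, 9.0]      # hypothetical meta-model A
model_b = [9.0, 12.0, 8.0, 12.0]      # hypothetical meta-model B

ensemble = average_ensemble([model_a, model_b])
print("model A RRMSE (%):", rrmse(simulated, model_a))
print("ensemble RRMSE (%):", rrmse(simulated, ensemble))
```

Because RRMSE is normalized by the mean of the reference values, it allows errors on targets with different units and magnitudes, such as yield and N loss, to be compared on one percentage scale.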
---
PDF link:
https://arxiv.org/pdf/1908.06746