Accurately predicting materials properties from crystal structure plays a critical role in materials science. Once a candidate material has been identified, one must carry out either a series of experiments or a large number of density functional theory (DFT) calculations, which can take hours, days, or even months depending on the complexity of the system. The ability to accurately predict the properties of interest before synthesis is therefore extremely useful for prioritizing simulation and experimental resources.
Composition-only predictive models can help screen and identify potential candidate materials without requiring structural input, but they cannot distinguish between structural polymorphs of a given composition. Moreover, because different structures of the same composition can have drastically different properties, composition-only models may show substantial errors between predicted and true values. These shortcomings can be mitigated by including structure-based inputs in the training dataset, as illustrated by the sketch below. Structure-based models therefore offer greater potential than composition-based models for advancing the discovery process in materials science.
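To make "structure-based inputs" concrete, the sketch below shows one common way of turning a crystal structure into a graph: atoms become nodes, and edges connect atom pairs within a distance cutoff, including periodic images. This is an illustrative example only and is not code from the paper; the `crystal_graph` helper, its inputs (lattice matrix, fractional coordinates, atomic numbers, cutoff), and the NaCl example values are assumptions for demonstration.

```python
# Illustrative sketch (not from the paper): build a distance-cutoff crystal graph
# from a structure described by its lattice matrix, fractional coordinates, and
# atomic numbers. Nodes are atoms; edges link pairs closer than `cutoff` angstroms,
# counting periodic images of the unit cell.
import itertools
import numpy as np

def crystal_graph(lattice, frac_coords, atomic_numbers, cutoff=4.0):
    """Return (node_features, edges, edge_distances) for a simple crystal graph."""
    lattice = np.asarray(lattice, dtype=float)   # 3x3 lattice matrix (rows are a, b, c)
    frac = np.asarray(frac_coords, dtype=float)  # (N, 3) fractional coordinates
    cart = frac @ lattice                        # Cartesian coordinates of each atom
    edges, dists = [], []
    # Consider neighboring unit cells so bonds across the cell boundary are found.
    for shift in itertools.product((-1, 0, 1), repeat=3):
        offset = np.asarray(shift, dtype=float) @ lattice
        for i in range(len(cart)):
            for j in range(len(cart)):
                if shift == (0, 0, 0) and i == j:
                    continue  # skip self-loops within the same cell
                d = np.linalg.norm(cart[j] + offset - cart[i])
                if d <= cutoff:
                    edges.append((i, j))
                    dists.append(d)
    node_features = np.asarray(atomic_numbers).reshape(-1, 1)  # simplest node feature: Z
    return node_features, np.asarray(edges), np.asarray(dists)

# Example: rock-salt NaCl described by its two-atom primitive cell
lattice = 2.82 * np.array([[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]])
nodes, edges, dists = crystal_graph(lattice, [[0, 0, 0], [0.5, 0.5, 0.5]], [11, 17])
print(nodes.shape, edges.shape, round(float(dists.min()), 2))  # nearest Na-Cl ≈ 2.82 Å
```

Two structures with identical composition but different atomic arrangements yield different graphs here, which is exactly the information a composition-only model never sees.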
Vishu Gupta and colleagues from the Department of Electrical and Computer Engineering at Northwestern University proposed a framework for materials property prediction tasks. The framework combines advanced data mining techniques with a structure-aware graph neural network (GNN) to improve predictive performance on materials properties with sparse data. The researchers first used a structure-aware GNN-based deep learning architecture to capture the underlying chemistry from existing large datasets containing crystal structure information. The learned knowledge was then transferred to sparse datasets to develop reliable and accurate target models. The authors evaluated the proposed framework in cross-property and cross-materials-class scenarios using 115 datasets and found that transfer learning models outperform models trained from scratch in 104 cases (≈90%), with additional performance benefits for extrapolation problems.
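The paper's exact architecture and training procedure are not reproduced here; the following is a minimal PyTorch sketch of the general pretrain-then-fine-tune pattern described above. The `CrystalGNN` module, the checkpoint path, and the data loader are hypothetical placeholders: the idea is simply to load weights learned on a large source dataset, re-initialize the property-specific output head, and fine-tune on the sparse target data.

```python
# Minimal sketch of transfer learning for property regression (assumptions: a
# hypothetical CrystalGNN whose `encoder` produces fixed-size structure embeddings,
# a checkpoint saved as a state dict, and a prepared DataLoader for the target data).
import torch
import torch.nn as nn

class CrystalGNN(nn.Module):
    def __init__(self, embed_dim=128):
        super().__init__()
        # Placeholder encoder; a real structure-aware GNN would use message passing
        # over the crystal graph built from atoms and their bonded neighbors.
        self.encoder = nn.Sequential(nn.Linear(64, embed_dim), nn.SiLU(),
                                     nn.Linear(embed_dim, embed_dim))
        self.head = nn.Linear(embed_dim, 1)  # property-specific regression head

    def forward(self, x):
        return self.head(self.encoder(x))

def fine_tune(pretrained_path, target_loader, epochs=50, lr=1e-4):
    model = CrystalGNN()
    # Transfer: start from weights learned on the large source dataset ...
    model.load_state_dict(torch.load(pretrained_path))
    # ... and re-initialize the output head for the new (sparse) target property.
    model.head = nn.Linear(model.head.in_features, 1)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # small lr for fine-tuning
    loss_fn = nn.L1Loss()  # MAE, matching the error metric reported for the models
    for _ in range(epochs):
        for features, targets in target_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(features), targets.view(-1, 1))
            loss.backward()
            optimizer.step()
    return model
```

Whether to freeze the encoder or fine-tune it end to end is a separate design choice; with very small target datasets, freezing most encoder layers is a common way to limit overfitting.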
Fig. 3 Training curves for predicting formation energy in the JARVIS dataset for different training data sizes on a fixed test set.
Fig. 4 Prediction error analysis with mean absolute error (MAE) as the error metric for predicting formation energy in the JARVIS dataset, using the best scratch (SC) and best transfer learning (TL) models.
Editorial Summary
Accurate materials property prediction using crystal structure occupies a primary and often critical role in materials science. Upon identification of a candidate material, one has to go through either a series of hands-on experiments or intensive density functional theory calculations, which can take hours, days, or even months depending on the complexity of the system. Hence, the ability to accurately predict the properties of interest of the material prior to synthesis can be extremely useful to prioritize available resources for simulations and experiments. Although composition-only based predictive models can be helpful for screening and identifying potential material candidates without the need for structure as an input, they are by design not capable of distinguishing between structure polymorphs of a given composition. Further, composition-only based models could potentially have substantial errors in the predicted values as compared to ground truth, as different structure polymorphs of a given composition can have drastically different properties. These shortcomings can be mitigated by incorporating structure-based inputs, and hence structure-based modeling presents bigger opportunities than composition-based modeling to advance the discovery process in the field of materials science.
Vishu Gupta et al. from the Department of Electrical and Computer Engineering, Northwestern University, presented a framework for materials property prediction tasks that combines advanced data mining techniques with a structure-aware graph neural network (GNN) to improve the predictive performance of the model for materials properties with sparse data. They first applied a structure-aware GNN-based deep learning architecture to capture the underlying chemistry associated with the existing large data containing crystal structure information. The resulting knowledge learned was then transferred and used during training on the sparse dataset to develop reliable and accurate target models. The researchers evaluated the proposed framework in cross-property and cross-materials class scenarios using 115 datasets to find that transfer learning models outperform the models trained from scratch in 104 cases, i.e., ≈90%, with additional benefits in performance for extrapolation problems. The significant improvements gained by using the proposed framework are expected to be useful for materials science researchers to more gainfully utilize data mining techniques to help screen and identify potential material candidates more reliably and accurately for accelerating materials discovery. This article was recently published in npj Computational Materials 10, 1 (2024).
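To illustrate how the reported scratch-versus-transfer comparison is scored (MAE on a fixed test set, as in Fig. 4), here is a short sketch with placeholder predictions; the numbers are invented for demonstration and are not data from the study.

```python
# Illustrative scoring of a scratch (SC) vs. transfer-learning (TL) model on a fixed
# test set, using mean absolute error (MAE); predictions and targets are placeholders.
import numpy as np

def mae(y_true, y_pred):
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

y_test = [0.12, -1.05, 0.43, -0.76]          # e.g., formation energies (eV/atom)
pred_scratch = [0.30, -0.80, 0.10, -0.50]    # model trained from scratch
pred_transfer = [0.18, -0.98, 0.35, -0.70]   # model fine-tuned from pretrained weights

mae_sc, mae_tl = mae(y_test, pred_scratch), mae(y_test, pred_transfer)
print(f"MAE scratch:  {mae_sc:.3f}")
print(f"MAE transfer: {mae_tl:.3f}")
print("Transfer learning wins" if mae_tl < mae_sc else "Scratch wins")
```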
Original Abstract
Structure-aware graph neural network based deep transfer learning framework for enhanced predictive analytics on diverse materials datasets
Vishu Gupta, Kamal Choudhary, Brian DeCost, Francesca Tavazza, Carelyn Campbell, Wei-keng Liao, Alok Choudhary & Ankit Agrawal
Abstract Modern data mining methods have demonstrated effectiveness in comprehending and predicting materials properties. An essential component in the process of materials discovery is to know which material(s) will possess desirable properties. For many materials properties, performing experiments and density functional theory computations are costly and time-consuming. Hence, it is challenging to build accurate predictive models for such properties using conventional data mining methods due to the small amount of available data. Here we present a framework for materials property prediction tasks using structure information that leverages graph neural network-based architecture along with deep-transfer-learning techniques to drastically improve the model’s predictive ability on diverse materials (3D/2D, inorganic/organic, computational/experimental) data. We evaluated the proposed framework in cross-property and cross-materials class scenarios using 115 datasets to find that transfer learning models outperform the models trained from scratch in 104 cases, i.e., ≈90%, with additional benefits in performance for extrapolation problems. We believe the proposed framework can be widely useful in accelerating materials discovery in materials science.