Friday, October 24, 2025

FerroAI: Predicting Ferroelectric Material Phase Diagrams with Deep Learning

Construction of Phase Transformation Training Set by Text-Mining

Importance of a Comprehensive Dataset for Ferroelectrics

In materials science, and in the study of ferroelectrics in particular, a robust dataset is a prerequisite for building artificial intelligence models. Ferroelectrics, known for their switchable electric polarization and their phase transformations, demand extensive data for predictive modeling. Comprehensive datasets covering a broad range of these materials remain scarce, largely because the understanding of complex crystal structures and their symmetries is fragmented.

To bridge this gap, researchers compiled a large-scale dataset by systematically mining the existing literature. Using natural language processing (NLP) techniques, they extracted chemical compositions, crystal structures, and transition temperatures from tens of thousands of published articles.

Text-Mining Methodology

Figure 1 outlines the dataset construction process: a data-mining workflow for ferroelectric materials and symmetry-breaking phase transformations. A total of 41,597 research articles were examined using the Elsevier API. The details extracted for each phase transformation include the chemical composition (given as a chemical formula), the crystal structures involved, and the transformation temperatures associated with each symmetry sequence.
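As a rough illustration of what extracting a phase transformation from text involves, the sketch below pulls a source phase, a target phase, and a temperature out of a single sentence. It is a plain regular expression standing in for the NLP pipeline, which the article does not describe in detail; the pattern, the function name `extract_transitions`, and the example sentences are all assumptions made for this example.

```python
import re

# Hypothetical pattern: "<phase A> to <phase B> ... <number> K or °C".
PHASES = "cubic|tetragonal|orthorhombic|rhombohedral|monoclinic|triclinic|hexagonal"
PATTERN = re.compile(
    rf"(?P<src>{PHASES})[\s-]+to[\s-]+(?P<dst>{PHASES}).*?"
    rf"(?P<value>-?\d+(?:\.\d+)?)\s*(?P<unit>K|°C)\b",
    re.IGNORECASE,
)


def extract_transitions(sentence):
    """Return (source phase, target phase, temperature in kelvin) tuples."""
    records = []
    for match in PATTERN.finditer(sentence):
        value = float(match.group("value"))
        if match.group("unit").upper() != "K":
            value += 273.15  # convert degrees Celsius to kelvin
        records.append(
            (match.group("src").lower(), match.group("dst").lower(), round(value, 2))
        )
    return records


# Made-up sentences, used only to exercise the pattern.
sentence_k = "The sample shows a cubic-to-tetragonal transition at 393 K."
sentence_c = "A tetragonal to orthorhombic transition occurs near 5 °C."
print(extract_transitions(sentence_k))
print(extract_transitions(sentence_c))
```

A pattern like this misses most real phrasings, which is one reason the extracted records are verified before they enter the dataset.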

Following the automated extraction and subsequent verification of the data, a total of 2,838 phase transformations were compiled across approximately 800 ferroelectric materials. Figure 2a showcases the diversity of materials included in the dataset, highlighting that potassium sodium niobate, barium titanate, lead zirconate titanate, and lead magnesium niobate are among the most extensively studied in research publications.

Data Visualization and Clustering Analysis

Clustering analysis of the dataset visualizes the relationships among the seven crystal systems present. The chord diagram in Figure 2b shows that cubic-to-tetragonal and tetragonal-to-rhombohedral phase transformations are the most frequently observed transitions in ferroelectrics. Figure 2c–g illustrates the temperature distribution for specific symmetry-breaking transformations, most of which fall between 100 K and 700 K.

Beyond improving the understanding of phase transformations, the dataset serves as the basis for a structured crystal dataset, built through data augmentation so that the training data are consistent. Because phase transformation sequences vary from material to material, the augmentation process divides each labeled temperature range into smaller intervals, ensuring that the data are represented accurately.
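One way to picture this augmentation step is the sketch below, which splits each labeled temperature range into fixed-width intervals and emits one training sample per interval. The interval width, the choice of the interval midpoint as the sample temperature, and the example ranges are assumptions for illustration; the article does not state how the intervals are chosen.

```python
def augment(formula, phase_ranges, step=10.0):
    """Expand labeled temperature ranges into (formula, T, phase) samples.

    phase_ranges: list of (t_low, t_high, phase) tuples, temperatures in kelvin.
    step: hypothetical interval width in kelvin.
    """
    samples = []
    for t_low, t_high, phase in phase_ranges:
        t = t_low
        while t < t_high:
            upper = min(t + step, t_high)
            # Use the interval midpoint so no sample sits on a phase boundary.
            samples.append((formula, (t + upper) / 2, phase))
            t = upper
    return samples


# Made-up ranges for a placeholder material, not values from the dataset.
ranges = [(100.0, 130.0, "rhombohedral"), (130.0, 150.0, "orthorhombic")]
samples = augment("ABO3", ranges, step=10.0)
for sample in samples:
    print(sample)
```

Sampling every material on the same temperature grid gives the network inputs of one fixed form, regardless of how many transformations each material undergoes.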

Developing the FerroAI Model

With the augmented crystal dataset in hand, the next step is to develop the FerroAI model for predicting phase diagrams. The model is a deep neural network trained on the structured dataset, with materials described by their chemical formulas and atomic compositions as inputs. Figure 3 illustrates the overall workflow, in which a six-layer deep neural network maps the chemical vector and temperature inputs to a predicted crystal symmetry.
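A minimal sketch of such a network's forward pass is given below, in plain Python so that it runs without a deep learning framework. Only the six-layer depth, the 118-dimensional chemical vector, and the temperature input come from the article. The layer widths, the ReLU and softmax choices, the temperature scaling, and the seven output classes (one per crystal system) are assumptions, and the weights are random, so the output is not a real prediction.

```python
import math
import random

N_ELEMENTS = 118               # length of the chemical vector
N_CLASSES = 7                  # assumption: one output per crystal system
HIDDEN = [64, 64, 64, 64, 64]  # hypothetical widths: 5 hidden + 1 output = 6 layers


def init_layers(sizes, seed=0):
    """Create (weights, biases) pairs with random weights for each layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = math.sqrt(2.0 / n_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, x):
    """Apply each layer in turn; ReLU on hidden layers, softmax on the output."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    peak = max(x)
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


layers = init_layers([N_ELEMENTS + 1] + HIDDEN + [N_CLASSES])

chem = [0.0] * N_ELEMENTS
chem[55], chem[21], chem[7] = 0.2, 0.2, 0.6       # Ba, Ti, O fractions in BaTiO3
probs = forward(layers, chem + [300.0 / 1000.0])  # temperature scaled by an assumed 1000 K
print(probs)
```

The predicted symmetry would be the class with the highest probability once the weights have been trained.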

A key component of the model architecture is the chemical vector, which represents the material system as a 118-dimensional vector. Each dimension corresponds to one chemical element, and its value is that element’s ratio in the compound, which lets the neural network learn from the data effectively. During training, hyperparameters are systematically optimized to improve model performance. The final configuration has 811,015 parameters, enough to capture the influence of chemical composition on phase transformations across various ferroelectric families.
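The chemical vector can be sketched as follows. The example assumes the 118 dimensions are ordered by atomic number and that each ratio is an atomic fraction normalized to sum to one; the article specifies neither. The parser, named `chemical_vector` here, handles only flat formulas without parentheses.

```python
import re

# Element symbols in order of atomic number (H = 1 ... Og = 118).
ELEMENTS = (
    "H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn Fe Co Ni "
    "Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe "
    "Cs Ba La Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu Hf Ta W Re Os Ir Pt Au Hg "
    "Tl Pb Bi Po At Rn Fr Ra Ac Th Pa U Np Pu Am Cm Bk Cf Es Fm Md No Lr Rf Db Sg "
    "Bh Hs Mt Ds Rg Cn Nh Fl Mc Lv Ts Og"
).split()
INDEX = {symbol: i for i, symbol in enumerate(ELEMENTS)}


def chemical_vector(formula):
    """Map a flat formula such as 'Ba0.9Sr0.1TiO3' to 118 atomic fractions."""
    counts = {}
    for symbol, amount in re.findall(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", formula):
        # A missing amount means one atom of that element.
        counts[symbol] = counts.get(symbol, 0.0) + (float(amount) if amount else 1.0)
    total = sum(counts.values())
    vector = [0.0] * len(ELEMENTS)
    for symbol, count in counts.items():
        vector[INDEX[symbol]] = count / total  # KeyError for an unknown symbol
    return vector


vec = chemical_vector("BaTiO3")
doped = chemical_vector("Ba0.9Sr0.1TiO3")
print(len(vec), vec[INDEX["Ba"]], vec[INDEX["Ti"]], vec[INDEX["O"]])
```

A fixed-length vector like this lets doped and undoped compositions share one input format, since a dopant simply moves weight from one dimension to another.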

Assessment of FerroAI Model’s Performance

To evaluate FerroAI, cross-entropy loss is used to quantify the effectiveness of training, and the loss drops markedly as the number of training epochs increases (Figure 5a). Accuracy is assessed on a test dataset, and the confusion matrix (Figure 5b) shows the prediction success rate for each crystal structure. FerroAI achieves over 80% accuracy and performs particularly well on cubic and rhombohedral phases.
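For readers unfamiliar with these metrics, the sketch below computes a mean cross-entropy loss, a confusion matrix, and the overall accuracy for a handful of predictions. The probabilities and labels are made-up toy values with three classes, not FerroAI outputs.

```python
import math


def cross_entropy(probs, labels):
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def confusion_matrix(true, pred, n_classes):
    """matrix[t][p] counts samples of true class t predicted as class p."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true, pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


# Toy predicted probabilities for four samples and three classes.
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.4, 0.3], [0.2, 0.2, 0.6]]
labels = [0, 1, 2, 2]
pred = [max(range(len(p)), key=p.__getitem__) for p in probs]

loss = cross_entropy(probs, labels)
matrix = confusion_matrix(labels, pred, n_classes=3)
print(loss, matrix, accuracy(matrix))
```

Off-diagonal entries show which phases are mistaken for which, something a single accuracy figure cannot convey.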

FerroAI is used to generate high-resolution phase diagrams that visualize the impact of doping elements on phase transformations, revealing compositional contours at varying temperatures (Figure 6). Such a diagram takes under 20 seconds to generate on a standard personal laptop, a significant improvement over traditional simulation methods. The predictions align closely with experimental data, showing that the model captures complex phase boundary behavior.
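Generating such a diagram amounts to querying the model on a grid of compositions and temperatures, which the sketch below illustrates. The `toy_predict` function is a placeholder with an invented linear phase boundary, standing in for a trained model; the grid spacing and the doping fraction `x` are likewise assumptions.

```python
def phase_map(predict, x_values, temperatures):
    """Return one row per temperature, each a list of predicted phase labels."""
    return [[predict(x, t) for x in x_values] for t in temperatures]


def toy_predict(x, temperature):
    # Placeholder for a trained model: an invented linear phase boundary.
    return "cubic" if temperature > 400.0 - 300.0 * x else "tetragonal"


x_values = [i / 10 for i in range(11)]                # doping fraction 0.0 to 1.0
temperatures = [100.0 + 50.0 * j for j in range(13)]  # 100 K to 700 K
grid = phase_map(toy_predict, x_values, temperatures)

# Print the map with the highest temperature on top.
for t, row in zip(reversed(temperatures), reversed(grid)):
    print(f"{t:6.0f} K  " + "".join(label[0].upper() for label in row))
```

Because each grid point is a single forward pass, refining the grid raises the cost only in proportion to the number of points.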

Understanding Doping Effects with SHAP Analysis

To understand how different chemical elements influence phase symmetry predictions within perovskite structures, SHapley Additive exPlanations (SHAP) analysis is applied. It quantifies the contribution of A-site and B-site elements to the formation of cubic and tetragonal phases (Figure 7). Among B-site elements, Ti is particularly influential in determining symmetry, while A-site elements contribute more uniformly, with Ba contributing slightly more than the others.
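SHAP builds on Shapley values from cooperative game theory, which split a model's output among its input features. The sketch below computes exact Shapley values by enumerating every feature subset for a tiny invented scoring function. It does not use the SHAP library, and the scores assigned to Ba, Ti, and Zr are made up for illustration, not taken from the FerroAI analysis.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, features):
    """Exact Shapley values; value(frozenset) is the output with those features present."""
    n = len(features)
    result = {}
    for feature in features:
        others = [g for g in features if g != feature]
        phi = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain the feature.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                coalition = frozenset(subset)
                phi += weight * (value(coalition | {feature}) - value(coalition))
        result[feature] = phi
    return result


def toy_score(present):
    # Invented score for some phase, with a Ba-Ti interaction term.
    score = 0.1
    if "Ti" in present:
        score += 0.5
    if "Ba" in present:
        score += 0.2
    if "Ti" in present and "Ba" in present:
        score += 0.1
    return score


phi = shapley_values(toy_score, ["Ba", "Ti", "Zr"])
print(phi)
```

The values sum to the difference between the score with all elements present and the score with none, and the interaction term is split equally between Ba and Ti. Exact enumeration grows exponentially with the number of features, which is why practical SHAP analyses rely on approximations.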

This approach offers insight into how the crystal structure responds to doping and into possible routes for tailoring materials toward desired ferroelectric properties. The combination of deep learning, text mining, and data analysis opens new ways to understand and design complex ferroelectric materials.
