Spotlight in Workshop: Machine Learning Multiscale Processes

A Property-prompted Multi-scale Data Augmentation Approach for Crystal Representation

Zhongyi Deng · Shuzhou LI · Tong Zhang · C. L. Philip Chen

Keywords: [ Generative Language Model ] [ Crystal Representation ] [ SLICES ] [ Data Enhancement ] [ Prompt Learning ]


Abstract:

The inverse design of crystals with multiple objectives is a significant challenge in materials science. The interplay among the desired properties often results in unbalanced crystal structure generation. In schemes based on generative language models, this issue stems primarily from the models' limited capability to learn continuous property values, compounded by the scarcity of high-quality material data for training. To address these challenges, a property-prompt-based scheme is proposed that achieves multi-scale data augmentation for crystal representation. The scheme constructs learnable prompt templates for a single property and extends them to multiple properties. These templates map continuous property values into a discrete prompt space, which makes the property values easier for generative language models to learn. Multi-scale data augmentation disentangles the interactions between material properties and, through end-to-end pre-training, turns them into mutual promotion, thereby alleviating the shortage of high-quality material data. The scheme is validated on key properties that affect the crystal structure composition, including the formation energy and the band gap, as well as their combinations. Experimental results show that the proposed model achieves significant performance improvements across multiple target property combinations, indicating robust representation and generalization capabilities in the inverse design of crystals with multiple objectives.
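
To make the idea of mapping continuous property values to a discrete prompt space concrete, here is a minimal sketch. It is not the authors' method: the abstract describes learnable templates, whereas this sketch substitutes fixed, equal-width binning. The function names, token format, value ranges, and bin counts are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def make_bin_edges(lo, hi, n_bins):
    """Return the n_bins - 1 interior edges of equal-width bins over [lo, hi]."""
    step = (hi - lo) / n_bins
    return [lo + i * step for i in range(1, n_bins)]


def property_token(name, value, edges):
    """Map a continuous value to a discrete token such as '<band_gap_1>'.

    Values below the first edge fall in bin 0; values at or above the
    last edge fall in the final bin.
    """
    idx = bisect_right(edges, value)
    return f"<{name}_{idx}>"


def build_prompt(properties, edges_by_name):
    """Concatenate one token per property to form a multi-property prompt."""
    return " ".join(
        property_token(name, value, edges_by_name[name])
        for name, value in properties.items()
    )


# Hypothetical ranges (assumptions, not taken from the paper):
# formation energy in [-4, 0] eV/atom, band gap in [0, 4] eV, 4 bins each.
EDGES = {
    "formation_energy": make_bin_edges(-4.0, 0.0, 4),
    "band_gap": make_bin_edges(0.0, 4.0, 4),
}

prompt = build_prompt({"formation_energy": -2.5, "band_gap": 1.5}, EDGES)
print(prompt)
```

In a generative language model setting, a prompt of this kind would be prepended to the crystal string (for example a SLICES string) so that generation is conditioned on discrete tokens instead of raw floating-point numbers. A learnable variant would replace the fixed tokens with trainable embeddings.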
