Research highlights
The Li-ion conductivity of a material determines its applicability in battery technologies. We aggregate reported experimental data on Li-ion conductors into a new database and present a machine learning (ML) model that predicts ionic conductivity for novel candidate materials.
I have been working on automating the workflows for the discovery of novel materials and on developing machine learning tools that assist human experts in decision-making within these workflows. Here, I highlight selected publications and software, which are provided under permissive licenses. Please feel free to use these tools in your research; appropriate citations would be greatly appreciated.
The first decision that chemists have to make in the materials discovery workflow is which chemical elements to combine to obtain a new functional material. The number of possible combinations is huge: an exhaustive exploration would take the lifetimes of several generations of scientists.
I have developed a Variational Autoencoder-based model that ranks the candidates by their synthetic accessibility. Focusing synthetic effort on the top-ranked candidates has led to the discovery of a number of new materials.
The selection of chemical elements can depend on the desired functional properties of the resulting materials. I have developed the PhaseSelect model, which learns the contribution of different chemical elements to the functional properties of materials (e.g., superconductivity, magnetism or energy band gap) and uses this knowledge to represent combinations of elements and predict their property values. PhaseSelect can assess the functional performance of candidate materials at an early stage of materials discovery and reduce the search space by several orders of magnitude.
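The idea of weighting elements by their contribution can be illustrated with a small attention-style sketch. This is not the PhaseSelect implementation: the element vectors, the query vector and the function name embed_phase_field are hypothetical placeholders for quantities that a trained model would learn from data.

```python
import math

# Hypothetical 3-dimensional element vectors; a trained model would learn these.
ELEMENT_VECTORS = {
    "Li": [0.9, 0.1, 0.0],
    "La": [0.2, 0.7, 0.3],
    "Zr": [0.1, 0.6, 0.5],
    "O": [0.0, 0.2, 0.9],
}
# Hypothetical query vector standing in for a learned, property-specific one.
QUERY = [1.0, 0.5, -0.5]


def softmax(scores):
    """Turn raw scores into positive weights that sum to one."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def embed_phase_field(elements):
    """Return per-element weights and the weighted-sum embedding of the set."""
    vectors = [ELEMENT_VECTORS[e] for e in elements]
    # Score each element by its dot product with the query vector.
    scores = [sum(q * v for q, v in zip(QUERY, vec)) for vec in vectors]
    weights = softmax(scores)
    # The embedding is the weighted average of the element vectors.
    embedding = [
        sum(w * vec[i] for w, vec in zip(weights, vectors))
        for i in range(len(QUERY))
    ]
    return dict(zip(elements, weights)), embedding


weights, embedding = embed_phase_field(["Li", "La", "Zr", "O"])
for element, weight in weights.items():
    print(f"{element}: {weight:.2f}")
```

Because the embedding is a weighted sum, it does not depend on the order in which the elements are listed, and the weights give a direct reading of which elements the toy model treats as most relevant.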
Element selection for materials discovery
Elements contribution and embedding for function
Any selected set of chemical elements spans a vast space of possible compositions, i.e. chemical formulae. The objective of discovery is a formula that corresponds to a synthetically accessible material. Because an exhaustive investigation is impossible, I have proposed an algorithm that employs Bayesian optimisation and discovers stable materials up to 100% faster than random sampling.
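A generic Bayesian optimisation loop can be sketched as follows. It is an illustration under stated assumptions rather than the published algorithm: the search space is a hypothetical binary composition A_x B_(1-x), toy_energy is an invented stand-in for an energy-above-hull calculation, and the surrogate is a plain Gaussian process with a lower-confidence-bound acquisition rule.

```python
import math
import random


def rbf(a, b, length=0.15):
    """Squared-exponential kernel between two scalar compositions."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def gp_posterior(xs, ys, x_new, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    alpha = solve(K, ys)    # K^-1 y
    v = solve(K, k_star)    # K^-1 k*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = max(1.0 - sum(k * w for k, w in zip(k_star, v)), 0.0)
    return mean, math.sqrt(var)


def toy_energy(x):
    """Hypothetical energy above the convex hull for composition A_x B_(1-x)."""
    return (x - 0.62) ** 2


def bayes_opt(objective, candidates, n_init=3, n_iter=10, kappa=2.0, seed=0):
    """Minimise objective over candidates with a GP and a lower confidence bound."""
    rng = random.Random(seed)
    tried = rng.sample(range(len(candidates)), n_init)  # indices already evaluated
    values = [objective(candidates[i]) for i in tried]
    for _ in range(n_iter):
        mean_y = sum(values) / len(values)
        std_y = math.sqrt(sum((v - mean_y) ** 2 for v in values) / len(values)) or 1.0
        ys = [(v - mean_y) / std_y for v in values]     # standardise observations
        xs = [candidates[i] for i in tried]
        best_i, best_acq = None, math.inf
        for i, x in enumerate(candidates):
            if i in tried:
                continue
            mu, sd = gp_posterior(xs, ys, x)
            acq = mu - kappa * sd   # favours low predicted energy or high uncertainty
            if acq < best_acq:
                best_i, best_acq = i, acq
        tried.append(best_i)
        values.append(objective(candidates[best_i]))
    j = min(range(len(values)), key=values.__getitem__)
    return candidates[tried[j]], values[j], [candidates[i] for i in tried]


candidates = [i / 50 for i in range(51)]
best_x, best_e, history = bayes_opt(toy_energy, candidates)
print(f"best composition x = {best_x:.2f}, energy = {best_e:.4f}")
```

The parameter kappa sets the balance between exploiting compositions predicted to be stable and exploring compositions where the surrogate is uncertain. The sketch refits the surrogate for every candidate for the sake of brevity; a practical implementation would factorise the kernel matrix once per iteration.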
Bayesian optimisation of the search for stable materials
Machine learning model for ionic conductivity
Scalable outlier detection
Outliers may be present in any dataset: these are data points that deviate significantly from the majority. They can arise for various reasons, such as measurement errors, experimental anomalies, or genuine rare events. Outlier detection is important because these data points can distort statistical analysis and lead to inaccurate results. I have implemented a Variational Autoencoder (VAE) to tackle this problem by learning a probabilistic representation of the data. By modeling the underlying distribution of the data, the VAE identifies outliers as data points with low probability under the learned distribution, which helps to detect and handle them effectively. This method is part of a library for outlier detection.
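The scoring principle, flagging points that have low probability under a learned distribution, can be shown with a much simpler density model. The sketch below deliberately swaps the VAE for a diagonal Gaussian fitted to the data, so it illustrates only the scoring and thresholding step; the function names and the contamination parameter are assumptions made for this example and are not the library's API.

```python
import math
import random


def fit_gaussian(rows):
    """Estimate per-feature mean and variance (a diagonal Gaussian)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    variances = [
        sum((r[j] - means[j]) ** 2 for r in rows) / n + 1e-9
        for j in range(d)
    ]
    return means, variances


def neg_log_likelihood(row, means, variances):
    """Outlier score: higher means less probable under the fitted density."""
    return sum(
        0.5 * math.log(2 * math.pi * v) + (x - m) ** 2 / (2 * v)
        for x, m, v in zip(row, means, variances)
    )


def flag_outliers(rows, contamination=0.05):
    """Flag the least probable fraction of points as outliers."""
    means, variances = fit_gaussian(rows)
    scores = [neg_log_likelihood(r, means, variances) for r in rows]
    n_flag = max(1, round(contamination * len(rows)))
    threshold = sorted(scores, reverse=True)[n_flag - 1]
    return [s >= threshold for s in scores], scores


# Synthetic data: 100 inliers around the origin plus two injected anomalies.
rng = random.Random(0)
data = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
data += [[8.0, 8.0], [-9.0, 7.5]]

flags, scores = flag_outliers(data, contamination=0.02)
print([row for row, flagged in zip(data, flags) if flagged])
```

A VAE plays the same role as fit_gaussian here, but it can capture non-linear structure in the data, with the reconstruction error or the estimated likelihood serving as the outlier score.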
Yao, Zain Nasrullah, Zheng Li