Accelerating Material Discovery for CdTe Solar Cells Using Knowledge Intense Word Embeddings
Xiaolei Liu1, Kurt Barth1, David Windridge2, Kai Xu3
1Loughborough University, Loughborough, United Kingdom
2Middlesex University, London, United Kingdom
3University of Nottingham, Nottingham, United Kingdom

Thin-film CdTe is the most successful second-generation solar photovoltaic technology, and its further development will contribute significantly to net-zero emission targets. In this work, natural language processing is applied to accelerate research on CdTe solar cells towards new material discoveries. Several language models are used to extract the most frequently used words from the CdTe literature, and their performance is tested and compared using a customised evaluation dataset. The optimised GloVe language model is then used to construct a knowledge diagram in vector space and to track the timeline of material applications. This data-driven approach provides useful insights for future research and will accelerate material discoveries in CdTe solar cells.
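As a rough illustration of how relationships between terms can be read from a word-embedding vector space, the minimal sketch below ranks words by cosine similarity to a query term. The three-dimensional vectors, the vocabulary, and the helper names `cosine_similarity` and `most_similar` are hypothetical and chosen only for illustration; they are not taken from the trained GloVe model or the evaluation dataset described above.

```python
import math

# Hypothetical toy embeddings (assumption): real GloVe vectors are learned
# from corpus co-occurrence statistics and usually have 50-300 dimensions.
toy_vectors = {
    "cdte": [0.9, 0.1, 0.2],
    "absorber": [0.8, 0.2, 0.3],
    "cds": [0.7, 0.3, 0.1],
    "inverter": [0.1, 0.9, 0.4],
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(query, vectors, top_n=3):
    """Rank all other words by cosine similarity to the query word."""
    scores = [
        (word, cosine_similarity(vectors[query], vec))
        for word, vec in vectors.items()
        if word != query
    ]
    # Highest similarity first.
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


ranked = most_similar("cdte", toy_vectors)
for word, score in ranked:
    print(f"{word}: {score:.3f}")
```

With a trained model, the same ranking step could be applied to a material or process term to list its nearest neighbours in the embedding space.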