Learning interpretable word embeddings via bidirectional alignment of dimensions with semantic concepts
Institution: 1. Key Laboratory of Knowledge Engineering with Big Data, Hefei University of Technology, Ministry of Education, China; 2. School of Computer Science and Information Engineering, Hefei University of Technology, China; 1. Cryptography and Cognitive Informatics Laboratory, AGH University of Science and Technology, 30 Mickiewicza Ave, Krakow 30-059, Poland; 2. School of Computing, Engineering and Mathematical Sciences, La Trobe University, Melbourne, Australia; 3. Department of Computer Science, Ryerson University, Canada
Abstract: We propose bidirectional imparting (BiImp), a generalized method for aligning embedding dimensions with concepts during the embedding learning phase. BiImp makes dimensions interpretable while preserving the semantic structure of the embedding space, which is critical for deciphering the black-box behavior of word embeddings. BiImp uses the two directions of a vector space dimension separately: each direction can be assigned to a different concept, which increases the number of concepts that can be represented in the embedding space. Our experimental results demonstrate that BiImp embeddings are interpretable without compromising performance on semantic tasks. We also use BiImp to reduce gender bias in word embeddings by encoding gender-opposite concepts (e.g., male–female) in a single embedding dimension. These results highlight the potential of BiImp for reducing biases and stereotypes present in word embeddings. Furthermore, task- or domain-specific interpretable word embeddings can be obtained by adjusting the word groups assigned to embedding dimensions according to the task or domain. As a result, BiImp offers considerable flexibility in studying word embeddings without further effort.
Keywords: Word embeddings; Interpretability; Word semantics
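
The abstract describes assigning a different concept to each direction of an embedding dimension but does not state the training objective. The following is only a minimal sketch of that idea, not the authors' method: a hinge-style penalty (an assumption made here for illustration) that is zero when words from one concept group sit on the positive side of a dimension and words from the opposite group sit on the negative side. The function name `bidirectional_penalty`, the `margin` parameter, the toy vocabulary, and the toy values are all hypothetical.

```python
import numpy as np


def bidirectional_penalty(E, pos_groups, neg_groups, margin=1.0):
    """Illustrative hinge penalty (hypothetical, not the paper's objective).

    E          : (vocab_size, dims) embedding matrix
    pos_groups : {dimension: [row indices]} for words whose concept is
                 assigned to the positive direction of that dimension
    neg_groups : {dimension: [row indices]} for words whose concept is
                 assigned to the negative direction of that dimension
    """
    penalty = 0.0
    for dim, rows in pos_groups.items():
        # Penalize concept words whose value falls below +margin.
        penalty += np.maximum(0.0, margin - E[rows, dim]).sum()
    for dim, rows in neg_groups.items():
        # Penalize concept words whose value rises above -margin.
        penalty += np.maximum(0.0, margin + E[rows, dim]).sum()
    return float(penalty)


# Toy 2-dimensional embeddings; values are made up for illustration.
vocab = ["king", "queen", "man", "woman", "apple"]
E = np.array([
    [1.5, 0.2],    # king
    [-1.2, 0.1],   # queen
    [0.4, -0.3],   # man
    [-2.0, 0.0],   # woman
    [0.0, 0.9],    # apple
])

# Dimension 0 carries two opposite concepts, one per direction.
pos_groups = {0: [0, 2]}  # "male" words -> positive direction
neg_groups = {0: [1, 3]}  # "female" words -> negative direction

penalty = bidirectional_penalty(E, pos_groups, neg_groups)
print(penalty)
```

In this toy setup only "man" (0.4 on dimension 0) falls short of the margin, so it is the only word contributing to the penalty. In an actual training setup a term of this kind would be added to the embedding model's loss, and words outside the concept groups (such as "apple") would be left unconstrained.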
This article is indexed in ScienceDirect and other databases.