Myanmar Word Vectors


We provide our two best Myanmar word embeddings, one trained with the Continuous Bag-of-Words (CBOW) model and one with the Skip-gram model, both with 200 dimensions, for use in various Myanmar NLP tasks.

To train the models, we collected a large monolingual Myanmar corpus with the aim of building high-quality word vectors with wide coverage. The corpus contains 480,000 sentences from different domains, including:

  • news
  • business
  • health
  • politics
  • tourism
  • education
  • arts
  • technology
  • sport
  • religion

Please cite the following paper if you use the word vectors provided here:

Aye Mya Hlaing, Win Pa Pa, "Word Representations for Neural Network Based Myanmar Text-to-Speech System", International Journal of Intelligent Engineering and Systems (INASS), Vol. 13, No. 2, pp. 239-249, April 2020


myword2vec CBOW Model

myword2vec Skip-gram Model

Download Paper
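
The following is a minimal sketch of how the downloaded vectors might be loaded and queried. It assumes the files use the plain-text word2vec layout (an optional "count dimension" header line, then one word followed by its 200 values per line); this page does not state the file format, so check the downloaded file first. The function names and the file path in the comment are hypothetical and only for illustration.

```python
import math
from typing import Dict, Iterable, List, Tuple


def load_text_vectors(lines: Iterable[str], expected_dim: int = 200) -> Dict[str, List[float]]:
    """Parse lines in the (assumed) plain-text word2vec layout.

    Any line that does not have exactly expected_dim + 1 tokens is skipped,
    which also drops the optional "count dimension" header line.
    """
    vectors: Dict[str, List[float]] = {}
    for line in lines:
        parts = line.split()
        if len(parts) != expected_dim + 1:
            continue
        word, values = parts[0], parts[1:]
        vectors[word] = [float(v) for v in values]
    return vectors


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(vectors: Dict[str, List[float]], word: str, topn: int = 5) -> List[Tuple[str, float]]:
    """Return the topn words closest to `word` by cosine similarity."""
    query = vectors[word]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Hypothetical usage; replace the path with the file you actually downloaded:
# with open("path/to/downloaded_vectors.txt", encoding="utf-8") as f:
#     vectors = load_text_vectors(f, expected_dim=200)
# print(most_similar(vectors, "<a Myanmar word>", topn=5))
```

If the files turn out to be in a binary format instead, this parser will not apply and a word2vec-compatible library would be needed to read them.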

1 Sep., 2022