Translation Quality Assessment: Assessing Thematic Similarity between Original Text and its Translation using LDA Topic Model

Translation criticism lacks methods for efficiently measuring the thematic similarity between a source text and its translation. To address this problem, we focused on a method for automatically assessing such thematic similarity. We proposed a bag-of-words-based latent Dirichlet allocation (LDA) topic model to uncover latent topics in a document and used the cosine similarity measure to compute the similarity between topics. We hypothesized that the similarity between the topics of an original text and those of its translation reflects the level of thematic similarity between the two texts. To test the hypothesis, we applied the proposed LDA-cosine method to F. M. Dostoevsky's novel “The Brothers Karamazov” (1880) and its English translation by Constance Garnett (1912). The two texts were assumed to be thematically similar to a high degree, and we used the results of the method's application to test this assumption.
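
The cosine step of the LDA-cosine method can be illustrated with a minimal sketch. It assumes that the LDA step has already produced topic-word weight vectors for both texts and that these vectors are expressed over a shared vocabulary; the abstract does not state how source and target vocabularies are aligned, so this is an assumption. The function names, the best-match aggregation, and the example weights are hypothetical and are not taken from the study.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length weight vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def best_match_similarity(source_topics, target_topics):
    """For each source topic, the highest cosine similarity to any target topic.

    Best-match pairing is an assumption of this sketch, not the paper's stated procedure.
    """
    return [
        max(cosine_similarity(s, t) for t in target_topics)
        for s in source_topics
    ]


# Hypothetical topic-word weights over a shared 4-term vocabulary (illustrative only).
source_topics = [[0.7, 0.2, 0.1, 0.0], [0.0, 0.1, 0.3, 0.6]]
target_topics = [[0.0, 0.2, 0.2, 0.6], [0.6, 0.3, 0.1, 0.0]]

scores = best_match_similarity(source_topics, target_topics)
mean_score = sum(scores) / len(scores)  # one possible document-level summary
print(scores, mean_score)
```

Because topic-word weights are non-negative, each score lies between 0 and 1, with values near 1 indicating that a source topic has a close counterpart among the translation's topics. Averaging the best-match scores is only one way to summarize them at the document level.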