Do Your Social Profiles Reveal What Languages You Speak? Language Inference from Social Media Profiles

  • Conference paper
Advances in Information Retrieval (ECIR 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9626)

Abstract

In the multilingual World Wide Web, it is critical for Web applications, such as multilingual search engines and targeted international advertisements, to know which languages a user understands. However, online users are often unwilling to make the effort to provide this information explicitly. Language identification techniques also fall short when users do not interact with an application in every language they know. This work proposes a method of inferring the language(s) that online users comprehend by analyzing their social profiles, based mainly on the intuition that a user’s experiences can imply which languages they know. The task is nontrivial: social profiles are usually incomplete, and languages that are regionally related or similar in vocabulary may share common features, so the signals that help to infer language are scarce and noisy. To address this problem, this work proposes a language and social relation-based factor graph model. The model draws on external resources to bring in more evidential signals, and it exploits both the dependency relations between languages and the social relations between profiles. Experiments on a large-scale dataset show that the proposed approach infers languages successfully and outperforms several alternative methods.
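
The abstract does not give the model's factor definitions, features, or learning procedure, so the following is only a toy sketch of the general idea of a factor graph over (user, language) variables. Every name, weight, and evidence score here is a hypothetical placeholder, and the brute-force enumeration stands in for whatever inference method the paper actually uses. Each binary variable says whether a user comprehends a language; a local score encodes profile evidence, and pairwise factors reward agreement between related languages of one user or between the same language of socially connected users.

```python
from itertools import product
from math import exp


def marginals(variables, evidence, pair_factors):
    """Exact marginals P(y_v = 1) by enumerating every joint assignment.

    variables:    list of hashable ids, e.g. (user, language) tuples
    evidence:     dict var -> local score (positive favours y = 1)
    pair_factors: list of (var_a, var_b, weight); the weight is added
                  to the score whenever the two variables agree
    Only practical for a handful of variables (2**n assignments).
    """
    index = {v: i for i, v in enumerate(variables)}
    totals = [0.0] * len(variables)
    z = 0.0  # partition function
    for assignment in product((0, 1), repeat=len(variables)):
        score = sum(evidence.get(v, 0.0) * assignment[index[v]]
                    for v in variables)
        score += sum(w for a, b, w in pair_factors
                     if assignment[index[a]] == assignment[index[b]])
        weight = exp(score)
        z += weight
        for i, y in enumerate(assignment):
            if y:
                totals[i] += weight
    return {v: totals[index[v]] / z for v in variables}


# Hypothetical example: only one variable has direct profile evidence.
variables = [("alice", "es"), ("alice", "pt"), ("bob", "es")]
evidence = {("alice", "es"): 2.0}  # assumed signal, e.g. worked in Spain
pair_factors = [
    (("alice", "es"), ("alice", "pt"), 0.5),  # assumed language relatedness
    (("alice", "es"), ("bob", "es"), 1.0),    # assumed social tie
]

for var, p in marginals(variables, evidence, pair_factors).items():
    print(var, round(p, 3))
```

In this toy setup the evidence attached to one variable raises the marginals of its neighbours through the pairwise factors, which is the intuition behind combining language dependencies with social relations when profile signals are sparse.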

Notes

  1. In this paper, comprehend means the user is able to grasp information in that language to a good extent.

  2. http://www.theeuropeanlibrary.org

  3. https://en.wikipedia.org/wiki/List_of_multilingual_countries_and_regions

  4. https://en.wikipedia.org/wiki/Lexical_similarity

  5. http://en.wikipedia.org/wiki/List_of_official_languages

  6. http://svmlight.joachims.org/

Acknowledgements

This research is supported by Science Foundation Ireland through the CNGL Programme (Grant 12/CE/I2267) in the ADAPT Centre (www.adaptcentre.ie) at Trinity College Dublin. The work is also supported by the National Natural Science Foundation of China under Project No. 61300129, and by a project sponsored by the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry, China, under grant number [2013] 1792.

Author information

Corresponding author

Correspondence to Yu Xu.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Xu, Y., Rami Ghorab, M., Wang, Z., Zhou, D., Lawless, S. (2016). Do Your Social Profiles Reveal What Languages You Speak? Language Inference from Social Media Profiles. In: Ferro, N., et al. Advances in Information Retrieval. ECIR 2016. Lecture Notes in Computer Science, vol 9626. Springer, Cham. https://doi.org/10.1007/978-3-319-30671-1_41

  • DOI: https://doi.org/10.1007/978-3-319-30671-1_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-30670-4

  • Online ISBN: 978-3-319-30671-1

  • eBook Packages: Computer Science, Computer Science (R0)
