Travel Package Recommendation System Using Topic Modeling Approach
Marianne P. Vitug
Discipline: Artificial Intelligence
Abstract:
The goal of the study is to build a recommender system that
incorporates two important factors in designing a travel package:
the inherent attributes of tourist destinations and their
distances from one another. The primary data were Tripadvisor
reviews collected with the R package rvest. Thirty-seven current
tourist destinations were included in the data gathering, with
50 reviews gathered for each location.
Two distance matrices were built for this project, one from geodesic
distance calculations and one from driving distances obtained via the
Google API. These distances served as a penalizing factor in ranking
the final set of recommendations. The geodesic distances between the
tourist destinations were computed using the R libraries ggmap and
Imap. For the Google driving distances, registration for Google Cloud
Platform’s Distance Matrix API was necessary to obtain the key used by
the R code through the gmapsdistance library. The Distance Matrix API
returns travel distance based on the recommended route for a supplied
matrix of origins (start points) and destinations (end points).
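The geodesic distance matrix described above can be illustrated with a minimal sketch. The study itself used R (ggmap and a geodesic distance library); the Python version below is only an approximation that assumes a spherical Earth and uses the haversine formula rather than a true ellipsoidal geodesic. The destination names and coordinates are hypothetical placeholders, not the study's data.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; assumes a spherical Earth


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def distance_matrix(coords):
    """Pairwise distance matrix as nested dicts keyed by destination name."""
    names = list(coords)
    return {o: {d: haversine_km(coords[o], coords[d]) for d in names}
            for o in names}


# Hypothetical destinations with approximate coordinates (illustration only).
coords = {
    "site_a": (14.5995, 120.9842),
    "site_b": (16.4023, 120.5960),
    "site_c": (10.3157, 123.8854),
}
matrix = distance_matrix(coords)
```

Each entry `matrix[origin][destination]` holds a straight-line distance in kilometres. Driving distances from a routing service would generally be larger and would populate a matrix of the same shape.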
For the modeling technique, since the data had characteristics that were
unsuitable for the algorithms commonly used in recommender systems,
topic modeling was used as an alternative method of extracting the
intrinsic features of both the tourists and the locations. Latent Dirichlet
Allocation (LDA) and Bidirectional Encoder Representations from
Transformers (BERT) were used for topic modeling. The results were
evaluated through a combination of visual inspection of the top N words
per topic and intrinsic evaluation metrics of topic interpretability. In
addition, the final list of recommendations was sent to Ark Travel’s
President and the head of local tour operations for evaluation, and it
met their requirements. The model thus produced recommendations that
were deemed acceptable on both criteria. The solution built with this
recommender system can help not only the main stakeholders (the
travelers and the travel agency) but also business owners in less
popular or newly emerging tourist destinations, since those destinations
can also be recommended as long as they are part of the dataset.
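To make the idea of distance as a penalizing factor concrete, the following sketch ranks destinations by topic similarity minus a scaled distance term. The abstract does not state the exact scoring formula, so everything here is an assumption for illustration: the use of cosine similarity, the linear penalty, the `penalty` weight, and the topic vectors and distances, all of which are made up.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length topic-weight vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def rank_destinations(traveler_topics, dest_topics, dist_km, penalty=0.5):
    """Order destinations by similarity minus a distance penalty.

    The penalty is the distance scaled to [0, 1] by the largest distance,
    multiplied by `penalty` (a hypothetical weight, not from the study).
    """
    max_km = max(dist_km.values()) or 1.0
    scores = {
        name: cosine(traveler_topics, topics) - penalty * dist_km[name] / max_km
        for name, topics in dest_topics.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical topic distributions and distances from a starting point.
traveler = [0.7, 0.2, 0.1]
dest_topics = {
    "beach": [0.8, 0.1, 0.1],
    "hike": [0.1, 0.8, 0.1],
    "museum": [0.6, 0.3, 0.1],
}
dist_km = {"beach": 400.0, "hike": 50.0, "museum": 60.0}

ranked = rank_destinations(traveler, dest_topics, dist_km)
unpenalized = rank_destinations(traveler, dest_topics, dist_km, penalty=0.0)
```

With these made-up numbers, "beach" is the closest topical match when distance is ignored, but the penalty moves the much nearer "museum" ahead of it. This is the kind of reordering a distance penalty is meant to produce.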