
Movie Success-Rate Prediction System through Optimal Sentiment Analysis

Dewan Muhammad Qaseem1, Nashit Ali1, Waseem Akram1, Aman Ullah2, Kemal Polat3, *

Corresponding Author:

Kemal Polat

Affiliation(s):

1Department of Computer Science, COMSATS University Islamabad, Vehari Campus, Vehari 61100, Pakistan

2School of Computer Science and Engineering, Central South University, Changsha, 410083, China

3Department of Electrical and Electronics Engineering, Bolu Abant Izzet Baysal University, Bolu 14280, Turkey

*Corresponding Author: Kemal Polat, Email: [email protected]

Abstract:

With the rapid growth of social media, it has become easy for people to express their feelings and opinions about anything. These opinions are useful for developing business plans, tracking marketing trends, and gauging the popularity of political parties. Different social media platforms are used for this purpose, e.g., Facebook, Instagram, YouTube, and Twitter. The rapid growth of text data on social media calls for algorithms and techniques that recognize people’s opinions about a specific subject. Twitter has become a widely used social media application where people freely share their feelings and opinions. The film industry is one of the revenue-generating industries that contribute to a country’s economic growth. People express their views on social media about an upcoming movie after watching its trailer. Effective sentiment analysis of opinions on social media such as Twitter can help predict movie ratings. This research focuses on developing a technique to predict movie success rates based on viewers’ tweets about movie trailers. The results give the movie rating in the form of stars (1-5). We collected tweets about different movies after their trailers were released, using the hashtag method (#Hash). We trained and tested four key algorithms (Naïve Bayes, SVM, decision tree, and KNN) on the NLTK movie review corpus. Because machine learning training data sets for movie ratings were not readily available, we then shifted to a lexicon-based approach using three sentiment dictionaries. These three dictionaries have different word counts, and each word in them has its own polarity in the form of a score. Finally, we compared our results with ratings from other movie rating sites such as IMDb and found the results satisfactory.
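As a rough illustration of the lexicon-based idea described in the abstract, the following minimal Python sketch scores tweets against a polarity dictionary and maps the average polarity onto a 1-5 star scale. It is not the authors' implementation: the toy lexicon, the function names `tweet_polarity` and `star_rating`, and the linear mapping from polarity to stars are all assumptions made for this example, and the abstract does not name the three dictionaries actually used.

```python
import re

# Hypothetical toy lexicon (an assumption for this sketch):
# word -> polarity score in the range [-1, 1].
TOY_LEXICON = {
    "amazing": 1.0, "great": 0.8, "good": 0.5,
    "boring": -0.6, "bad": -0.7, "terrible": -1.0,
}


def tweet_polarity(tweet, lexicon=TOY_LEXICON):
    """Return the mean polarity of lexicon words in the tweet (0.0 if none match)."""
    # Lowercase and keep alphabetic tokens; '#' and punctuation are dropped.
    tokens = re.findall(r"[a-z']+", tweet.lower())
    scores = [lexicon[t] for t in tokens if t in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


def star_rating(tweets, lexicon=TOY_LEXICON):
    """Map the mean tweet polarity in [-1, 1] linearly onto a 1-5 star scale."""
    if not tweets:
        raise ValueError("need at least one tweet")
    mean = sum(tweet_polarity(t, lexicon) for t in tweets) / len(tweets)
    # Assumed mapping: -1 -> 1 star, 0 -> 3 stars, +1 -> 5 stars.
    return round(1 + 2 * (mean + 1), 1)


if __name__ == "__main__":
    sample = ["Amazing trailer! #Movie", "Looks boring to me", "good cast"]
    print(star_rating(sample))
```

A real system would replace the toy lexicon with a published sentiment dictionary and would likely need negation handling and tweet-specific preprocessing, which this sketch omits.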

Keywords:

Optimal Sentiment Analysis, Machine Learning, Analysis, Decision Making

Cite This Paper:

Dewan Muhammad Qaseem, Nashit Ali, Waseem Akram, Aman Ullah, Kemal Polat (2022). Movie Success-Rate Prediction System through Optimal Sentiment Analysis. Journal of the Institute of Electronics and Computer, 4, 15-33. https://doi.org/10.33969/JIEC.2022.41002.
