TY - GEN
T1 - A Semi-discriminative Approach for Sub-sentence Level Topic Classification on a Small Dataset
AU - Ferner, C.
AU - Wegenkittl, S.
N1 - Conference code: 239479
Correspondence Address: Ferner, C.; Salzburg University of Applied Sciences, Urstein Sued 1, Austria; email: [email protected]
PY - 2020/4
Y1 - 2020/4
AB - This paper aims at identifying sequences of words related to specific product components in online product reviews. A reliable baseline for this topic classification problem is given by a Max Entropy classifier, which assumes independence between subsequent topics. However, the reviews exhibit an inherent structure on the document level that allows the task to be framed as a sequence classification problem. Since more flexible models from the class of Conditional Random Fields were not competitive given the limited amount of training data available, we propose using a Hidden Markov Model instead and decouple the training of transition and emission probabilities. The discriminating power of the Max Entropy approach is used for the latter. Besides outperforming both standalone methods and more generic models such as linear-chain Conditional Random Fields, the combined classifier is able to assign topics at the sub-sentence level, although labels in the training data are only available at the sentence level. © Springer Nature Switzerland AG 2020.
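N1 - Method sketch: a minimal, hedged Python illustration of the approach described in the abstract. It assumes scikit-learn's LogisticRegression as the Max Entropy emission model, add-one-smoothed transition counts, and Viterbi decoding; the toy data, variable names, and modelling choices are illustrative assumptions, not the authors' implementation.
      # Hedged sketch: HMM-style decoding with MaxEnt (logistic regression) emissions.
      # All data and names below are illustrative, not the authors' code.
      import numpy as np
      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.linear_model import LogisticRegression

      # Toy review "documents": sequences of sentences with topic labels.
      docs = [
          (["battery lasts long", "screen is dim", "display colors are washed out"],
           ["battery", "screen", "screen"]),
          (["keyboard feels solid", "battery drains fast"],
           ["keyboard", "battery"]),
      ]
      topics = sorted({t for _, labels in docs for t in labels})
      idx = {t: i for i, t in enumerate(topics)}

      # 1) Discriminative emission model: MaxEnt classifier over sentence features.
      X = [s for sents, _ in docs for s in sents]
      y = [idx[t] for _, labels in docs for t in labels]
      vec = TfidfVectorizer()
      maxent = LogisticRegression(max_iter=1000).fit(vec.fit_transform(X), y)

      # 2) Transition model trained separately: add-one-smoothed counts over
      #    adjacent topic labels within each document.
      K = len(topics)
      trans = np.ones((K, K))
      start = np.ones(K)
      for _, labels in docs:
          start[idx[labels[0]]] += 1
          for a, b in zip(labels, labels[1:]):
              trans[idx[a], idx[b]] += 1
      trans /= trans.sum(axis=1, keepdims=True)
      start /= start.sum()

      # 3) Viterbi decoding: combine MaxEnt log-posteriors (used as emission
      #    scores, a scaled-likelihood-style approximation) with log-transitions.
      def decode(sentences):
          logp = np.log(maxent.predict_proba(vec.transform(sentences)))  # T x K
          T = len(sentences)
          score = np.full((T, K), -np.inf)
          back = np.zeros((T, K), dtype=int)
          score[0] = np.log(start) + logp[0]
          for t in range(1, T):
              cand = score[t - 1][:, None] + np.log(trans) + logp[t][None, :]
              back[t] = cand.argmax(axis=0)
              score[t] = cand.max(axis=0)
          path = [int(score[-1].argmax())]
          for t in range(T - 1, 0, -1):
              path.append(int(back[t][path[-1]]))
          return [topics[i] for i in reversed(path)]

      print(decode(["battery is weak", "screen looks great"]))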
KW - Hidden Markov Model
KW - Small data
KW - Topic classification
KW - Entropy
KW - Hidden Markov models
KW - Information retrieval systems
KW - Machine learning
KW - Baseline performance
KW - Combined classifiers
KW - Conditional random field
KW - Discriminating power
KW - Discriminative approach
KW - Emission probabilities
KW - Online product reviews
KW - Sequence classification
KW - Classification (of information)
U2 - 10.1007/978-3-030-46147-8_42
DO - 10.1007/978-3-030-46147-8_42
M3 - Conference contribution
SN - 978-3-030-46146-1
VL - 11907 LNAI
BT - Machine Learning and Knowledge Discovery in Databases
PB - Springer
T2 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2019
Y2 - 16 September 2019 through 20 September 2019
ER -