A 3-phase approach based on sequential mining and dependency parsing for enhancing hypernym patterns performance

Ahmad Issa Alaa Aldine; Mounira Harzallah; Giuseppe Berio; Nicolas Béchet; Ahmad Faour

doi:10.1017/S0269888921000126

2021 Volume 36

RESEARCH ARTICLE Open Access

¹University Bretagne Sud, IRISA Lab, France – Vannes Email: ahmad.issa-alaa-eddine@univ-ubs.fr, giuseppe.berio@univ-ubs.fr, nicolas.bechet@irisa.fr
²LINA - University of Nantes, France Email: mounira.harzallah@univ-nantes.fr
³Lebanese University, Lebanon Email: ahmad.faour@ul.edu.lb

Received: 18 July 2020
Revised: 03 August 2021
Accepted: 20 August 2021
Published online: 22 September 2021
The Knowledge Engineering Review 36, Article number: e13 (2021)

Abstract

Patterns have been extensively used to extract hypernym relations from texts. The most popular patterns are Hearst’s patterns, formulated as regular expressions mainly based on lexical information. Experiments have reported good precision and low recall for such patterns. Thus, several approaches have been developed for improving recall. While these approaches perform better in terms of recall, it remains quite difficult to further increase recall without degrading precision. In this paper, we propose a novel 3-phase approach based on sequential pattern mining to improve pattern-based approaches in terms of both precision and recall by (i) using a rich pattern representation based on grammatical dependencies, (ii) discovering new hypernym patterns, and (iii) extending hypernym patterns with anti-hypernym patterns to prune wrongly extracted hypernym relations. The results obtained by performing experiments on three corpora confirm that, using our approach, we are able to learn sequential patterns and combine them to outperform existing hypernym patterns in terms of precision and recall. The comparison to unsupervised distributional baselines for hypernym detection shows that, as expected, our approach yields much better performance. When compared to supervised distributional baselines for hypernym detection, our approach can be shown to be complementary and much more loosely coupled with training datasets and corpora.
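
As a rough illustration of the lexical baseline the abstract refers to, the sketch below renders one Hearst-style "such as" pattern as a regular expression. It is not the authors' method: the 3-phase approach relies on dependency parsing and sequential pattern mining, which are not shown here. The names `SUCH_AS` and `extract_hypernym_pairs` are hypothetical, and noun phrases are approximated by single words, which is an assumption made only to keep the example short.

```python
import re

# Hypothetical, simplified rendering of one Hearst-style pattern:
#   NP_hyper such as NP_hypo (, NP_hypo)* ((and|or) NP_hypo)?
# Assumption: each noun phrase is a single word.
SUCH_AS = re.compile(
    r"(?P<hyper>\w+),? such as "
    r"(?P<hypos>\w+(?:, (?!(?:and|or)\b)\w+)*(?:,? (?:and|or) \w+)?)"
)


def extract_hypernym_pairs(sentence):
    """Return (hyponym, hypernym) pairs matched by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence.lower()):
        # Split the enumeration on commas and on the final "and"/"or".
        hypos = re.split(r",? (?:and|or) |, ", match.group("hypos"))
        pairs.extend((hypo, match.group("hyper")) for hypo in hypos if hypo)
    return pairs


# The pattern fires on an explicit enumeration.
matched = extract_hypernym_pairs(
    "He plays instruments such as guitar, piano and drums."
)

# A purely lexical pattern misses a paraphrase of the same relation,
# which is one way low recall shows up.
missed = extract_hypernym_pairs("A guitar is a kind of instrument.")
```

Here `matched` holds one pair per listed hyponym, while `missed` is empty because the second sentence expresses the relation without the "such as" cue.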
Rights and permissions
© The Author(s), 2021. Published by Cambridge University Press on behalf of Asian Journal of Law and Society

About this article

Cite this article

Ahmad Issa Alaa Aldine, Mounira Harzallah, Giuseppe Berio, Nicolas Béchet, Ahmad Faour. 2021. A 3-phase approach based on sequential mining and dependency parsing for enhancing hypernym patterns performance. The Knowledge Engineering Review. 36:126 doi: 10.1017/S0269888921000126
