The Role of Named Entity Recognition (NER): Survey

Girma Yohannis Bade; Olga Kolesnikova; Jose Luis Oropeza

doi:https://doi.org/10.14445/22492593/IJCOT-V14I3P301

Research Article | Open Access

Volume 14 | Issue 3 | Year 2024 | Article Id. IJCOT-V14I3P301 | DOI : https://doi.org/10.14445/22492593/IJCOT-V14I3P301

Received	Revised	Accepted	Published
16 Jun 2024	23 Jul 2024	14 Aug 2024	31 Aug 2024

Citation :

Girma Yohannis Bade, Olga Kolesnikova, Jose Luis Oropeza, "The Role of Named Entity Recognition (NER): Survey," International Journal of Computer & Organization Trends (IJCOT), vol. 14, no. 3, pp. 1-7, 2024. Crossref, https://doi.org/10.14445/22492593/IJCOT-V14I3P301

Abstract

Named Entity Recognition (NER) is a building block of Information Extraction (IE). Although the information extraction process has been automated using various techniques to find and extract relevant information from unstructured documents, the discovery of targeted knowledge still poses many research difficulties because of the variability and lack of structure of Web data. NER, a subtask of IE, emerged to ease this difficulty. It finds the proper names (named entities) in a document, such as persons, countries, locations, organizations, dates, and events, and categorizes them under predetermined labels, which is an initial step in IE tasks. This survey presents the roles and importance of NER to IE from the perspective of different algorithms and application domains. Additionally, it summarizes how researchers have implemented NER in particular application areas such as finance, medicine, defense, business, food science, and archeology. It also outlines the three types of NER sequence labeling algorithms: feature-based, neural network-based, and rule-based. Finally, the state of the art and the evaluation metrics of NER are presented.
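As a rough illustration of the rule-based family mentioned above, the following minimal Python sketch finds entity mentions in a sentence and assigns each a predetermined label. It is not the method of any surveyed paper: the gazetteer entries, the label names, the date pattern, and the function name `find_entities` are all assumptions made for this example.

```python
import re

# Hypothetical gazetteer: entries and labels are illustrative assumptions,
# not taken from the survey.
GAZETTEER = {
    "ada lovelace": "PERSON",
    "acme corp": "ORGANIZATION",
    "london": "LOCATION",
}

# Hand-written rule for dates written like "16 Jun 2024".
DATE_PATTERN = re.compile(
    r"\b\d{1,2} (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d{4}\b"
)


def find_entities(text):
    """Return (surface_text, label) pairs in order of appearance."""
    found = []
    lowered = text.lower()
    # Gazetteer lookup: case-insensitive whole-word match on known names.
    for name, label in GAZETTEER.items():
        for match in re.finditer(r"\b" + re.escape(name) + r"\b", lowered):
            surface = text[match.start():match.end()]
            found.append((match.start(), surface, label))
    # Pattern rule: anything shaped like a date gets the DATE label.
    for match in DATE_PATTERN.finditer(text):
        found.append((match.start(), match.group(), "DATE"))
    found.sort()  # order mentions by their position in the text
    return [(surface, label) for _, surface, label in found]


sentence = "Ada Lovelace joined Acme Corp in London on 16 Jun 2024."
for surface, label in find_entities(sentence):
    print(f"{surface}\t{label}")
```

A sketch like this only recognizes names that appear in its lists or match its patterns, which is one reason feature-based and neural network-based sequence labelers are used when unseen names must be handled.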

Keywords

NER, Information Extraction (IE), Sequence labeling algorithms, Application area.
