International Journal of Computer & Organization Trends

Research Article | Open Access

Volume 14 | Issue 3 | Year 2024 | Article Id. IJCOT-V14I3P301 | DOI : https://doi.org/10.14445/22492593/IJCOT-V14I3P301

The Role of Named Entity Recognition (NER): Survey


Girma Yohannis Bade, Olga Kolesnikova, Jose Luis Oropeza

Received: 16 Jun 2024 | Revised: 23 Jul 2024 | Accepted: 14 Aug 2024 | Published: 31 Aug 2024

Citation :

Girma Yohannis Bade, Olga Kolesnikova, Jose Luis Oropeza, "The Role of Named Entity Recognition (NER): Survey," International Journal of Computer & Organization Trends (IJCOT), vol. 14, no. 3, pp. 1-7, 2024. Crossref, https://doi.org/10.14445/22492593/IJCOT-V14I3P301

Abstract

Named Entity Recognition (NER) is a building block of Information Extraction (IE). Although various techniques have automated the process of finding and extracting relevant information from unstructured documents, discovering targeted knowledge still poses many research difficulties because Web data is variable and lacks structure. NER, a subtask of IE, emerged to ease these difficulties. It finds the proper names (named entities) in a document, such as persons, countries, locations, organizations, dates, and events, and assigns them to predetermined categories, which is an initial step in IE tasks. This survey presents the roles and importance of NER to IE from the perspective of different algorithms and application domains. It also summarizes how researchers have implemented NER in particular application areas such as finance, medicine, defense, business, food science, and archeology. It then outlines the three types of NER sequence labeling algorithms: feature-based, neural network-based, and rule-based. Finally, it presents the state of the art and the evaluation metrics of NER.
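
To make the task concrete, the following is a minimal sketch of the rule-based family of approaches, not the method of any surveyed system. It assumes a tiny hand-made gazetteer and a single date pattern (both hypothetical, as are all function names) and scores the output with entity-level precision, recall, and F1, which are commonly used for NER; the abstract does not state which metrics the survey covers, so that choice is also an assumption.

```python
import re

# Hypothetical gazetteer: a tiny hand-made lookup table, for illustration only.
GAZETTEER = {
    "Olga Kolesnikova": "PERSON",
    "Ethiopia": "LOCATION",
}

# Hypothetical date rule: matches strings such as "31 Aug 2024".
DATE_RULE = re.compile(
    r"\b\d{1,2} (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d{4}\b"
)


def rule_based_ner(text):
    """Return a sorted list of (start, end, label) character spans."""
    spans = []
    # Gazetteer lookup: every literal occurrence of a known name is an entity.
    for name, label in GAZETTEER.items():
        for match in re.finditer(re.escape(name), text):
            spans.append((match.start(), match.end(), label))
    # Pattern rule: every string matching the date pattern is a DATE entity.
    for match in DATE_RULE.finditer(text):
        spans.append((match.start(), match.end(), "DATE"))
    return sorted(spans)


def precision_recall_f1(predicted, gold):
    """Entity-level scores: a prediction counts only if span and label both match."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


text = "Olga Kolesnikova visited Ethiopia on 31 Aug 2024."
predicted = rule_based_ner(text)
for start, end, label in predicted:
    print(text[start:end], label)

# Hypothetical gold annotation in which the annotator did not mark the date.
gold = [(0, 16, "PERSON"), (25, 33, "LOCATION")]
print(precision_recall_f1(predicted, gold))
```

Feature-based and neural network-based taggers replace the hand-written rules with a trained model, but they can be scored against gold spans in the same way.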

Keywords

NER, Information Extraction (IE), Sequence labeling algorithms, Application area.
