Conferences in Research and Practice in Information Technology
  

Online Version - Last Updated - 20 Jan 2012

 

 
 
    

Extracting Crime Information from Online Newspaper Articles

Arulanandam, R., Savarimuthu, B.T.R. and Purvis, M.A.

    Information extraction is the task of extracting relevant information from unstructured data. This paper aims to 'mine' (or extract) crime information from online newspaper articles and make this information available to the public. Barring a few, many countries that possess this information do not make it available to their citizens. So, this paper focuses on the automatic extraction of public yet 'hidden' information available in newspaper articles and on making it available to the general public. In order to demonstrate the feasibility of such an approach, this paper focuses on one type of crime, theft. This work demonstrates how theft-related information can be extracted from newspaper articles from three different countries. The system employs Named Entity Recognition (NER) algorithms to identify locations in sentences. However, not all the locations reported in an article are crime locations. So, it employs a Conditional Random Field (CRF), a machine learning approach, to classify whether a sentence in an article is a crime location sentence or not. This work compares the performance of four different NERs in the context of identifying locations and their subsequent impact on classifying a sentence as a 'crime location' sentence. It investigates whether a CRF-based classifier model that is trained to identify crime locations from a set of articles can be used to identify crime locations in articles from another newspaper in the same country (New Zealand). Also, it compares the accuracy of identifying crime location sentences using the developed model in newspapers from two other countries (Australia and India).
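
The two-stage idea in the abstract (first find sentences that mention a location, then decide which of those are crime location sentences) can be illustrated with a toy sketch. This is not the authors' system. A small hypothetical gazetteer (`KNOWN_LOCATIONS`) stands in for a real NER model, and a hypothetical keyword list (`THEFT_CUES`) stands in for the trained CRF classifier. All names and the sample article are assumptions made for illustration.

```python
import re

# Hypothetical gazetteer standing in for a real NER model (assumption).
KNOWN_LOCATIONS = {"Dunedin", "Auckland", "George Street", "Wellington"}
# Hypothetical cue words standing in for a trained CRF classifier (assumption).
THEFT_CUES = {"stolen", "stole", "theft", "burgled", "burglary", "robbed"}


def split_sentences(text):
    """Naive splitter: break on whitespace that follows '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_locations(sentence):
    """Stage 1 stand-in for NER: return gazetteer entries found in the sentence."""
    return sorted(loc for loc in KNOWN_LOCATIONS if loc in sentence)


def is_crime_location_sentence(sentence):
    """Stage 2 stand-in for the classifier: a location plus a theft cue word."""
    words = {w.lower() for w in re.findall(r"[A-Za-z]+", sentence)}
    return bool(find_locations(sentence)) and bool(words & THEFT_CUES)


def extract_crime_locations(article):
    """Return (sentence, locations) pairs for sentences judged crime location sentences."""
    results = []
    for sentence in split_sentences(article):
        if is_crime_location_sentence(sentence):
            results.append((sentence, find_locations(sentence)))
    return results


# Made-up sample text, not taken from any newspaper.
article = (
    "A bicycle was stolen from a house on George Street on Monday. "
    "Police in Wellington issued a statement. "
    "The owner said the theft was upsetting."
)

for sentence, locations in extract_crime_locations(article):
    print(locations, "<-", sentence)
```

In this sketch the second sentence mentions a location but is not about the theft itself, so it is filtered out. That is the distinction the abstract draws between locations in general and crime locations. A keyword rule is far cruder than a learned sequence model and is used here only to keep the example self-contained.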
Cite as: Arulanandam, R., Savarimuthu, B.T.R. and Purvis, M.A. (2014). Extracting Crime Information from Online Newspaper Articles. In Proc. Australasian Web Conference (AWC 2014) Auckland, New Zealand. CRPIT, 155. Trotman, A., Cranefield, S. and Yang, J. Eds., ACS. 31-38