
An Enumerative Framework for Extraction of Bag-of-Words from Legal Documents

Basaveswar Rao. B, B.V. Rama Krishna, Gangadhara Rao. K, Chandan. K



In this paper an enumerative framework is developed for extracting a Bag-of-Words from legal documents. For this purpose, 100 judgments of the Supreme Court of India related to dowry cases are considered. The case notes of each judgment are taken as text input, and a set of Bag-of-Words is extracted from them. A novel algorithm is presented and implemented for this purpose. To filter insignificant words out of the Bag-of-Words, a threshold value is applied to the word frequencies. The resulting Bag-of-Words may be used in data mining applications for knowledge discovery from judgments.
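
The frequency-threshold step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not detail; the tokenizer, the stop-word list, the sample case notes, and the rule of keeping words whose count is at or above the threshold are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the abstract does not specify one.
STOP_WORDS = {"the", "of", "and", "a", "an", "in", "to",
              "is", "was", "for", "by", "on"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(case_notes, threshold=2):
    """Count words across all case notes and keep those whose total
    frequency is at least `threshold` (assumed filtering rule)."""
    counts = Counter()
    for note in case_notes:
        counts.update(t for t in tokenize(note) if t not in STOP_WORDS)
    return {word: n for word, n in counts.items() if n >= threshold}


# Invented sample case notes, not taken from the 100 judgments.
notes = [
    "The accused demanded dowry from the family of the deceased.",
    "Demand for dowry was proved; the accused was convicted.",
]
bag = bag_of_words(notes, threshold=2)
```

Raising `threshold` shrinks the bag to the most frequent terms. No stemming is applied here, so "demanded" and "demand" are counted as separate words.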






