Social Communication Database

Published: 9 December 2019 | Version 1 | DOI: 10.17632/wf5d5b2j52.1
Contributor:
Kailas PATIL

Description

The dataset was generated from real-world text communication in group conversations. To preserve anonymity, the usernames and mobile numbers of the participants have been anonymized. This dataset may be used for education and research purposes only.

The main contribution of this work to text mining is a standard dataset for research on mining chat conversations. We have observed that the dataset is rich enough to support a range of research uses: it can be applied to semantic search, sentiment analysis, semantic clustering of conversations, topic extraction, spam detection, etc. We offer this dataset so that others can collaborate and explore further possibilities.

We used our own algorithms to extract the textual information from WhatsApp logs and stored it in an SQLite database file named "social conversation.db". The dataset contains 16225 text messages and 839 distinct users, drawn from 17 WhatsApp groups.

Paper: Analysis of foul language usage in social media text conversation
Authors: Sumit Kawate and Kailas Patil
In the Proceedings of the Int. J. Social Media and Interactive Learning Environments (IJSMILE), Vol: 05, Issue: 03, Pages: 227-251, Inderscience, 201
DOI: https://doi.org/10.1504/IJSMILE.2017.087976

The data is stored in a .zip compressed archive. The uncompressed archive is 6,020 KB (5.87 MB). Extract it with any standard decompression software.

## The archive contains the following items: ##

DATABASE/
| + Executable/                             Directory containing executable files.
| | + Social Conversation.db                This file contains records of the database in .db format.
| + Source Code/                            Directory containing source code files.
| | + Social Conversation (csv).csv         This file contains records of the database in .csv format.
| | + Social Conversation (db).db           This file contains records of the database in .db format.
| | + Social Conversation (html).html       This file contains records of the database in .html format.
| + Read Me/                                Directory containing the read me file.
| | + read me.txt                           This file contains detailed information about the dataset.

## The data format of the dataset is: ##

=Table Name= -> CONVERSATION

=Attributes=      =Meaning=
USER_ID           User ID of the sender of the text message
TEXT_MSG          Actual text message
CONTACT_NUMBER    Contact number of the user (a few digits of the contact number are masked)
DATE              Date of the text message
TIME              Time of the text message

=Attributes=      =Format=
USER_ID           User Id
TEXT_MSG          (text message in any format)
CONTACT_NUMBER    +contactnumber
DATE              dd/mm/yy
TIME              hh:mm AM or hh:mm PM

=Attributes=      =Sample Example=
USER_ID           User 514
TEXT_MSG          Any deal on formal shoes with prime
CONTACT_NUMBER    +919xxxxx927
DATE              04/12/17
TIME              1:35 AM
--------------------------------------------------
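Because DATE and TIME are stored as separate text columns, time-based analysis usually needs them combined into one timestamp. The following is a minimal sketch of how that could be done with the standard java.time API, assuming every row follows the documented formats (dd/mm/yy, and hh:mm AM or hh:mm PM) and that two-digit years belong to the 2000s. The class name ChatTimestamp and its parse method are hypothetical helpers and are not part of the dataset.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.util.Locale;

// Hypothetical helper (not shipped with the dataset): combines the DATE and
// TIME columns of one CONVERSATION row into a single LocalDateTime.
final class ChatTimestamp {
    // DATE is documented as dd/mm/yy. Assumption: "yy" means 20yy,
    // which is how DateTimeFormatter reads a two-digit year by default.
    private static final DateTimeFormatter DATE_FORMAT =
            DateTimeFormatter.ofPattern("dd/MM/yy", Locale.ENGLISH);

    // TIME is documented as hh:mm AM or hh:mm PM, but the sample value
    // "1:35 AM" has a one-digit hour, so "h" is used: when parsing it
    // accepts one or two digits. AM/PM matching is made case-insensitive.
    private static final DateTimeFormatter TIME_FORMAT =
            new DateTimeFormatterBuilder()
                    .parseCaseInsensitive()
                    .appendPattern("h:mm a")
                    .toFormatter(Locale.ENGLISH);

    // Throws DateTimeParseException if either value does not match.
    static LocalDateTime parse(String date, String time) {
        LocalDate d = LocalDate.parse(date.trim(), DATE_FORMAT);
        LocalTime t = LocalTime.parse(time.trim(), TIME_FORMAT);
        return LocalDateTime.of(d, t);
    }

    public static void main(String[] args) {
        // Sample values taken from the format table above.
        System.out.println(parse("04/12/17", "1:35 AM"));
    }
}
```

Rows that deviate from the documented formats would raise a DateTimeParseException, so it may be worth catching that exception and logging the offending row rather than stopping the whole run.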

Files

Steps to reproduce

HOW TO USE THE DATABASE
--------------------------------------------------
Note: You can use any standard database manager or programming language to parse the records of the dataset.

## How to parse the dataset in Java ##
***********************************
import java.sql.*;

public class ParseConversation {
    public static void main(String[] args) throws Exception {
        // Load the SQLite JDBC driver (its JAR must be on the classpath).
        Class.forName("org.sqlite.JDBC");
        // Give the path to the database (e.g. in NetBeans IDE);
        // replace test.db with the location of the extracted .db file.
        try (Connection c = DriverManager.getConnection("jdbc:sqlite:test.db")) {
            c.setAutoCommit(false);
            System.out.println("Opened database successfully");
            try (Statement stmt = c.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT * FROM CONVERSATION;")) {
                // Print every record of the CONVERSATION table.
                while (rs.next()) {
                    String userid = rs.getString("USER_ID");
                    String textmsg = rs.getString("TEXT_MSG");
                    String contact = rs.getString("CONTACT_NUMBER");
                    String date = rs.getString("DATE");
                    String time = rs.getString("TIME");
                    System.out.println("ID = " + userid);
                    System.out.println("msg = " + textmsg);
                    System.out.println("contact = " + contact);
                    System.out.println("date = " + date);
                    System.out.println("time = " + time);
                    System.out.println();
                }
            }
        }
    }
}
***********************************

## How to parse the dataset in SQLiteStudio Manager ##
[1] Install SQLiteStudio.
[2] Click on Database Menu -> Add a database.
[3] Enter Database Type -> SQLite 3.
[4] Give the path where you extracted the dataset.
[5] The data will be displayed in the Data tab.
[6] Use "Open SQL Editor" or press Alt+E to open the SQL editor.
[7] Sample query -> "select * from conversation"
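Returning to the Java example: once the rows can be read, a natural first check is to compare the totals against the figures stated in the description (16225 text messages, 839 distinct users). The sketch below shows one possible way to tally messages per user with the standard library only. It assumes the USER_ID values have been collected into a list inside the while loop of the Java example; the class name UserMessageCounts and the IDs used in main are hypothetical and serve only as illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not shipped with the dataset): tallies how many
// messages each USER_ID contributed.
final class UserMessageCounts {
    // Returns a map from USER_ID to message count, in first-seen order.
    static Map<String, Integer> countByUser(Iterable<String> userIds) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String id : userIds) {
            // Start at 1 for a new user, otherwise add 1 to the running total.
            counts.merge(id, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up IDs for illustration; real values would come from
        // rs.getString("USER_ID") in the JDBC loop.
        List<String> ids = Arrays.asList("User 514", "User 7", "User 514");
        Map<String, Integer> counts = countByUser(ids);
        System.out.println("messages = " + ids.size());
        System.out.println("distinct users = " + counts.size());
    }
}
```

The same totals could also be obtained directly in SQL with COUNT(*) and COUNT(DISTINCT USER_ID) on the CONVERSATION table; the in-memory version is shown here because the per-user map can be reused for later analysis.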

Categories

Conversation Analysis, Text Mining, Sentiment Analysis

Licence