Posted: August 29th, 2009 | Author: Anuradha Uduwage | Filed under: Data Mining | Tags: Data Mining | No Comments »
I haven’t had time to do any blogging since I am extremely busy finding relevant publications for my proposal and reading them. On top of that I am reading Fractals Everywhere (1983 edition). It’s all about fractal geometry, and this is the first math book I have read in a very long time.
Posted: May 1st, 2009 | Author: Anuradha Uduwage | Filed under: Facebook, Java Ruled | Tags: Data Mining, Facebook | No Comments »
So far our Data Mining implementation for Facebook data is going really well. I am currently working on an algorithm to identify 1-item and n-item sets in our raw data set using the extraction I built earlier. I also just finished writing a sweet little piece of code to pivot 3 million user records.
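The post does not say which algorithm the item set work uses, so the following is only a rough illustration of what counting 1-item and 2-item sets can look like, assuming Apriori-style support counting where each user's set of group types is one transaction. The class name, method names, and the `minSupport` parameter are all hypothetical and do not come from the project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: none of these names come from the project's code.
final class ItemSetSketch {

    // Count how many transactions contain each single item (1-item sets).
    static Map<String, Integer> countSingles(List<Set<String>> transactions) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Set<String> transaction : transactions) {
            for (String item : transaction) {
                counts.merge(item, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Count 2-item sets, pairing only items that are frequent on their own,
    // and keep the pairs that appear in at least minSupport transactions.
    static Map<List<String>, Integer> countPairs(List<Set<String>> transactions,
                                                 int minSupport) {
        Map<String, Integer> singles = countSingles(transactions);
        Map<List<String>, Integer> counts = new HashMap<>();
        for (Set<String> transaction : transactions) {
            // Sort the items so that each pair has one canonical order.
            List<String> frequent = new ArrayList<>();
            for (String item : new TreeSet<>(transaction)) {
                if (singles.get(item) >= minSupport) {
                    frequent.add(item);
                }
            }
            for (int i = 0; i < frequent.size(); i++) {
                for (int j = i + 1; j < frequent.size(); j++) {
                    List<String> pair = Arrays.asList(frequent.get(i), frequent.get(j));
                    counts.merge(pair, 1, Integer::sum);
                }
            }
        }
        counts.values().removeIf(count -> count < minSupport);
        return counts;
    }
}
```

Larger n-item sets would be built the same way, by extending only the sets that survived the previous round.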
Our implementation has a DB.java file that handles all our DB calls and direct SQL. I know what you are thinking: we could have used something fancy like Hibernate, but this is fast-paced development, so we don't have time to work with Hibernate.
    // Connect to the database. Any database connection changes should take
    // place in DB.java.
    DB db = new DB();
    DB db2 = new DB();
    db.init();
    db2.init();
Call DB.java to get the distinct type and sub type pairs of the Facebook groups, and dump them into a Vector of String Vectors.
    // get the type and sub type pairs
    Vector<Vector<String>> grpTypeSubType = new Vector<Vector<String>>(
            db.getTypeSubType());
    // print all the type/sub type pairs as column headers
    // (ps1 is the output file stream, created outside this snippet)
    //System.out.print("UserId, ");
    ps1.print("UserId, ");
    for (Vector<String> pair : grpTypeSubType) {
        // compare the reference itself: equals(null) on a non-null
        // object is never true, and a null entry would throw
        if (pair != null) {
            System.out.print(pair.toString() + ",");
            ps1.print(pair.toString() + ",");
        }
    }
Here the fun begins: this loop ran for 38 hours and finished pivoting the 3 million records.
    // get user id vector and find type and subtype for each userid.
    Vector<Long> users = new Vector<Long>(db.getDistinctUID());
    ResultSet fbUsers = db.getUsers();
    System.out.println();
    ps1.println();
    System.out.println("Start Pivot..");
    while (fbUsers.next()) {
        Long userId = fbUsers.getLong(1);
        //System.out.print(userId + ",");
        ps1.print(userId + ",");
        String groupType = null;
        String typeSubtype = null;
        int count = 0;
        for (Vector<String> vs : grpTypeSubType) {
            groupType = vs.get(0);
            typeSubtype = vs.get(1);
            if (groupType != null && typeSubtype != null) {
                count = db2.getTypeSubTypeCount(userId, groupType, typeSubtype);
                if (count > 0) {
                    //System.out.print("Y, ");
                    ps1.print("Y, ");
                } else {
                    //System.out.print("N, ");
                    ps1.print("N, ");
                }
            }
        }
        //System.out.println();
        ps1.println();
    }
    fbUsers.close();
    ps1.close();
    System.out.println("Done pivot, check the file");
    } // closes the enclosing method (not shown)
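A likely reason for the long run time is that the loop calls getTypeSubTypeCount once for every user and every column, which means one database round trip per cell. As a hedged alternative, here is a minimal sketch that assumes the memberships can be read in a single query as (user id, type, sub type) rows and that the grouped result fits in memory. The `Membership` class, the `key` helper, and the `type/subtype` column naming are hypothetical, not part of DB.java.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a one-pass, in-memory pivot.
final class PivotSketch {

    // One membership row, e.g. one row of a single query over the raw data.
    static final class Membership {
        final long userId;
        final String type;
        final String subType;

        Membership(long userId, String type, String subType) {
            this.userId = userId;
            this.type = type;
            this.subType = subType;
        }
    }

    // Column name for a type/sub type pair.
    static String key(String type, String subType) {
        return type + "/" + subType;
    }

    // Returns the header line followed by one Y/N line per user.
    // Users without any membership row do not appear in the output.
    static List<String> pivot(List<Membership> rows, List<String> columns) {
        // Group the pairs by user in one pass over the rows.
        Map<Long, Set<String>> byUser = new LinkedHashMap<>();
        for (Membership m : rows) {
            byUser.computeIfAbsent(m.userId, id -> new HashSet<>())
                  .add(key(m.type, m.subType));
        }
        List<String> lines = new ArrayList<>();
        lines.add("UserId," + String.join(",", columns));
        for (Map.Entry<Long, Set<String>> entry : byUser.entrySet()) {
            StringBuilder line = new StringBuilder().append(entry.getKey());
            for (String column : columns) {
                // A set lookup replaces the per-cell database query.
                line.append(',').append(entry.getValue().contains(column) ? 'Y' : 'N');
            }
            lines.add(line.toString());
        }
        return lines;
    }
}
```

If the data does not fit in memory, the same idea still works when the rows arrive ordered by user id, since only one user's set has to be held at a time.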
You can get more information on our Data Extraction and FB Data Mining implementation at Google Code, where it is released under the GNU General Public License v3. You are also more than welcome to use the multiple preprocessed data sets that I formatted to fit applications like Weka.
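For readers who want to load a pivoted Y/N file into Weka, one option is Weka's ARFF format. The sketch below only illustrates the shape of such a file; it is not the code that produced the published data sets. It assumes each data line is a user id followed by one Y or N per column with no trailing separator, and that column names contain no single quotes. The class and method names are hypothetical.

```java
import java.util.List;

// Hypothetical sketch: wraps pivoted Y/N rows in an ARFF header.
final class ArffSketch {

    static String toArff(String relation, List<String> columns, List<String> dataLines) {
        StringBuilder out = new StringBuilder();
        out.append("@relation ").append(relation).append('\n');
        out.append("@attribute UserId numeric\n");
        for (String column : columns) {
            // Quote the name so that spaces in it are accepted.
            out.append("@attribute '").append(column).append("' {Y,N}\n");
        }
        out.append("@data\n");
        for (String line : dataLines) {
            out.append(line).append('\n');
        }
        return out.toString();
    }
}
```

Declaring each column as a nominal `{Y,N}` attribute lets Weka treat it as a category rather than as free text.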
Posted: March 14th, 2009 | Author: Anuradha Uduwage | Filed under: Data Mining, Facebook, Information Retrieval, Research Work | Tags: Data Mining, Facebook | No Comments »
I have created a Google Code project to host our Facebook-based Data Mining development. We will be checking in all the development, from extraction through to our mining concepts. The project name is data-extraction-facebook. Since this is a research project, only Mark and I are allowed to do check-ins. All the development will be done by Mark and me, and Anjana will provide her Weka knowledge to compare the results from our data mining implementation with Weka.
Posted: February 28th, 2009 | Author: Anuradha Uduwage | Filed under: Facebook, Research Work | Tags: Data Mining, Facebook | No Comments »
I have created a project on Google Code named data-extraction-facebook. All the code related to our mining research, including the extraction, will be checked in there. All the code is released under the GNU General Public License. Since this is a research project, only team members have check-in access, so most of the development will be done by Mark Henning and me.
Posted: February 25th, 2009 | Author: Anuradha Uduwage | Filed under: Facebook, Research Work | Tags: Data Extraction, Data Mining, Facebook | 2 Comments »
After a couple of stabs at facebook-java-api, I was able to build my first Facebook application, which happens to be the first stage of developing an app to extract data from Facebook for my Data Mining research. The API has three types of RestClients that you can invoke to do your development. In my case I use FacebookXMLRestClient.
At some point I will write a tutorial-style blog post on building a Facebook application, but for now I will just post the screenshots. I have a feeling that going forward I will run into an issue, because my intention is to populate the data model while I am extracting data. Since this is a web approach, I have to use Hibernate to get the database populated. A research project like this, where I want to spend more time on developing our mining concepts than on ORM technologies, is not the best place to experiment with my Hibernate skills. So at some point I have to figure out a way to make this a desktop application and use a regular JDBC connection to do the data population.
As of now, the application I wrote logs in and extracts the IDs of all my Facebook friends. The next step would be to iterate over each friend's user ID and get their friends.
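That next step can be sketched roughly as follows. The `fetchFriends` function is a hypothetical stand-in for whatever call returns the friend IDs of a user; it is not a facebook-java-api method, and whether the API allows fetching friends of friends at all is not something this sketch assumes to know.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of a two-level friend crawl.
final class FriendCrawlSketch {

    // Returns a map from user id to that user's friend ids, covering
    // the starting user and each of the starting user's friends.
    static Map<Long, List<Long>> crawl(long self,
                                       Function<Long, List<Long>> fetchFriends) {
        Map<Long, List<Long>> graph = new LinkedHashMap<>();
        List<Long> myFriends = fetchFriends.apply(self);
        graph.put(self, myFriends);
        for (Long friend : myFriends) {
            // Skip ids already fetched so that each user is requested once.
            if (!graph.containsKey(friend)) {
                graph.put(friend, fetchFriends.apply(friend));
            }
        }
        return graph;
    }
}
```

Passing the lookup in as a function keeps the crawl logic testable without a live Facebook session.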

[Screenshot: Facebook Data Extraction]
Here is a snapshot of the code. Very soon I will be hosting this project as a Google Code project and will put the link out.
[Screenshot: Facebook Data Extraction]
[Screenshot: FB App for Data Extraction]
It looks like I will be more involved with the facebook-java-api team. So far I think we need more documentation on the Java API. Also, I don't know why Facebook decided to stop supporting the Facebook Java API.