Dimension to analyze strength of the relationship

Posted: February 7th, 2011 | Filed under: Data Mining, Facebook, Research Work | 1 Comment

As a grad student I read many publications, and as a requirement for one of my grad classes I read a paper from the University of Illinois at Urbana-Champaign. The title of the paper was Predicting Tie Strength With Social Media. It was very interesting, and I was fascinated by how social media can be used (to some extent) to predict the strength of a relationship. But I do have my own opinion on this paper, and here it is.

Finding a dimension to analyze the strength of a relationship is not an easy task. Human behavior and the human thought process are very complicated. Emotions and feelings are two things that change at a rapid pace. It would have been nice to identify the users' mood while they were answering, because mood can affect emotions and feelings, which in turn may change the reported strength of the relationship.

Until the results section, I was questioning how the paper would determine the relationship between a couple, most interestingly between a husband and wife. They could very well open Facebook accounts and never communicate via Facebook. But the results section really answered my question, because the outliers they identify seem to have higher relationship strength. I think this is a reasonable argument, and I find it very true. It did fascinate me how these outliers could very well represent people who are married, engaged, or in a relationship. So can we identify relationship status based on the strength of the relationship between two people?

I don’t agree with how they used education as a social distance variable. I don’t think it can be used as a measurement of a relationship. In the real world, people don’t become friends by asking about each other’s level of education. A prime example is the Asymmetric Friendship that they identified. Most grad students end up having a great friendship with their advisor that continues throughout their lifetime. They may end up with a PhD upon graduation, but not necessarily.

But overall this paper was really interesting. It also reminded me of the movie The Social Network. According to the movie, Facebook started out by ranking pictures of friends. This research is not that far from it. ;)


Pivoting 3 Million user records with respect to FB User and Type, Subtype

Posted: May 1st, 2009 | Filed under: Facebook, Java Ruled | No Comments

So far our Data Mining implementation for Facebook data is going really well. I am currently working on an algorithm to identify 1 and n item sets in our raw data set using the extraction I built earlier. But I just finished writing a sweet little piece of code to pivot 3 million user records.

Our implementation has a DB.java file that handles all our DB calls and direct SQL stuff. I know what you are thinking: we could have used some fancy Hibernate, but this is fast-paced development, so we don't have time to work with Hibernate stuff.

// connect to the database, any database connection changes should take
// place in DB.java
DB db = new DB();
DB db2 = new DB();
db.init();
db2.init();

Call DB.java to get the distinct type/subtype Facebook groups, and dump them into a Java String Vector.

                        // get the distinct (type, subtype) pairs
                        Vector<Vector<String>> grpTypeSubType = new Vector<Vector<String>>(
                                        db.getTypeSubType());

                        // print all the type/subtype pairs as column headers
                        //System.out.print("UserId, ");
                        ps1.print("UserId, ");
                        for (Vector<String> pair : grpTypeSubType) {
                                // skip missing pairs instead of calling a method on null
                                if (pair != null) {
                                        System.out.print(pair.toString() + ",");
                                        ps1.print(pair.toString() + ",");
                                }
                        }

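Here the fun begins: this loop ran for 38 hours and finished the pivoting for 3 million records. It issues one count query per user for every type/subtype pair, which is likely where most of that time goes.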

                        // get user id vector and find type and subtype for each userid.
                        Vector<Long> users = new Vector<Long>(db.getDistinctUID());                    
                       
                        ResultSet fbUsers = db.getUsers();
                        System.out.println();
                        ps1.println();
                        System.out.println("Start Pivot..");
                        while (fbUsers.next()) {
                                Long userId = fbUsers.getLong(1);
                                //System.out.print(userId + ",");
                                ps1.print(userId + ",");

                                String groupType = null;
                                String typeSubtype = null;
                                int count = 0;
                               
                                for (Vector<String> vs : grpTypeSubType) {
                                        groupType = vs.get(0);
                                        typeSubtype = vs.get(1);
                                        if(groupType != null && typeSubtype != null) {
                                                count = db2.getTypeSubTypeCount(userId, groupType, typeSubtype);
                                                if (count > 0) {
                                                        //System.out.print("Y, ");
                                                        ps1.print("Y, ");
                                                } else {
                                                        //System.out.print("N, ");
                                                        ps1.print("N, ");
                                                }
                                        }
                                }
                                //System.out.println();
                                ps1.println();
                        }
                        fbUsers.close();
                        ps1.close();
                        System.out.println("Done pivot, check the file");
                }
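For comparison, here is a minimal sketch of the same Y/N pivot done in one pass over the raw membership rows, rather than one count query per cell. It is not the code from the project: the class PivotSketch, the Membership row type, and the "type/subtype" column label are all assumptions made for illustration. It also assumes the per-user sets fit in memory, and a user with no valid memberships gets no row at all.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PivotSketch {

    /** Hypothetical raw row: one user belongs to one group of this type/subtype. */
    static final class Membership {
        final long userId;
        final String type;
        final String subType;

        Membership(long userId, String type, String subType) {
            this.userId = userId;
            this.type = type;
            this.subType = subType;
        }
    }

    /** Returns CSV lines: a header, then one Y/N row per user, in first-seen order. */
    static List<String> pivot(List<Membership> rows) {
        Set<String> columns = new LinkedHashSet<String>();
        Map<Long, Set<String>> byUser = new LinkedHashMap<Long, Set<String>>();

        // Single pass: collect the column labels and each user's memberships.
        for (Membership m : rows) {
            if (m.type == null || m.subType == null) {
                continue; // same rule as the loop above: skip incomplete pairs
            }
            String column = m.type + "/" + m.subType;
            columns.add(column);
            Set<String> seen = byUser.get(m.userId);
            if (seen == null) {
                seen = new HashSet<String>();
                byUser.put(m.userId, seen);
            }
            seen.add(column);
        }

        List<String> lines = new ArrayList<String>();
        StringBuilder header = new StringBuilder("UserId");
        for (String column : columns) {
            header.append(',').append(column);
        }
        lines.add(header.toString());

        // One output row per user, with Y/N for every column in header order.
        for (Map.Entry<Long, Set<String>> entry : byUser.entrySet()) {
            StringBuilder line = new StringBuilder().append(entry.getKey());
            for (String column : columns) {
                line.append(',').append(entry.getValue().contains(column) ? "Y" : "N");
            }
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        List<Membership> rows = new ArrayList<Membership>();
        rows.add(new Membership(1L, "Music", "Rock"));
        rows.add(new Membership(1L, "Sports", "Soccer"));
        rows.add(new Membership(2L, "Music", "Rock"));
        for (String line : pivot(rows)) {
            System.out.println(line);
        }
    }
}
```

Whether this would actually beat the 38-hour run depends on how the rows are fetched and how much memory is available, so treat it as a starting point rather than a measured improvement.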

You can get more information on our Data Extraction and FB Data Mining implementation at Google Code, where it is hosted under the GNU General Public License v3. You are also more than welcome to use the multiple preprocessed data sets that I formatted to fit applications like Weka.


Moving out from MovableType to WordPress

Posted: April 14th, 2009 | Filed under: Facebook, Technology | No Comments

For the last five years I had a good time with the Movable Type blog that I hosted on the University of Minnesota UThink blog server. But now that I have my own domain, it's time for me to move from UThink to WordPress. Plus, I wanted to customize my blog with Facebook Connect, and without root-level access on UThink it was a bit hard to get good customization. So I have decided to install WordPress, and I also plan to heavily customize it using fbconnect.

One other thought: I have seen Google’s OpenSocial approach, and so far I have only taken a peek at it, but the first thing I realized is that it has way better documentation than Facebook.

In fact, if you would like to move out from Movable Type, here is the link that might help you.


Hosting Facebook Data Mining Project in Google Code

Posted: March 14th, 2009 | Filed under: Data Mining, Facebook, Information Retrieval, Research Work | No Comments

I have created a Google Code project to host our Facebook-based Data Mining development. We will be checking in all the development, from extraction through to our mining concepts. The project name is data-extraction-facebook. Since this is a research project, only Mark and I are allowed to do check-ins. All the development will be done by Mark and me, and Anjana will provide her Weka knowledge to compare the results from our data mining implementation with Weka.


Created a Google Code Project for FB Data Mining

Posted: February 28th, 2009 | Filed under: Facebook, Research Work | No Comments

I have created a project in Google Code named data-extraction-facebook. All code related to our Mining research, including the extraction, will be checked in here. All the code is released under the GNU General Public License. Since this is a research project, only team members have check-in access, so most of the development will be done by Mark Henning and me.


Facebook Application to Extract Data for Data Mining

Posted: February 25th, 2009 | Filed under: Facebook, Research Work | 2 Comments

After a couple of stabs at facebook-java-api, I was able to build my first Facebook application, which happens to be the first stage of developing an app to extract data from Facebook for my Data Mining research. The API has three types of RestClients that you can invoke to do your development. In my case I use FacebookXMLRestClient.

At some point I will write a tutorial-type blog post on building a Facebook application, but for now I will just have the screenshots. I have a feeling that going forward I will run into an issue, because my intention is to populate the data model while I am extracting data. Since this is a web approach, I have to use Hibernate to populate the database. A research project like this, where I want to spend more time developing our mining concepts than on ORM technologies, is not the best place to experiment with my Hibernate skills. So at some point I have to figure out a way to make this a desktop application and use a regular JDBC connection to do the data population.

As of now, the application I wrote logs in and extracts the IDs of all my Facebook friends. The next step is to iterate over each friend's user ID and get their friends.
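That next step can be sketched roughly as follows. This is only an illustration of the two-level iteration, not the application's code: FriendSource is a hypothetical stand-in for whatever call returns a user's friend IDs (it is not a facebook-java-api type), and the in-memory map in fromMap exists only so the sketch runs without a Facebook session.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FriendCrawlSketch {

    /** Hypothetical stand-in for the call that returns a user's friend IDs. */
    interface FriendSource {
        List<Long> friendsOf(long userId);
    }

    /** Test helper: backs a FriendSource with an in-memory friend graph. */
    static FriendSource fromMap(final Map<Long, List<Long>> graph) {
        return new FriendSource() {
            public List<Long> friendsOf(long userId) {
                List<Long> friends = graph.get(userId);
                return friends != null ? friends : Collections.<Long>emptyList();
            }
        };
    }

    /** Fetches the start user's friends, then each of those friends' friends. */
    static Map<Long, List<Long>> crawlTwoLevels(long startUser, FriendSource source) {
        Map<Long, List<Long>> edges = new LinkedHashMap<Long, List<Long>>();
        List<Long> myFriends = source.friendsOf(startUser);
        edges.put(startUser, myFriends);
        for (Long friend : myFriends) {
            // Ask for each friend's list only once, even if an ID repeats.
            if (!edges.containsKey(friend)) {
                edges.put(friend, source.friendsOf(friend));
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        Map<Long, List<Long>> graph = new HashMap<Long, List<Long>>();
        graph.put(1L, Arrays.asList(2L, 3L));
        graph.put(2L, Arrays.asList(1L, 4L));
        System.out.println(crawlTwoLevels(1L, fromMap(graph)));
    }
}
```

Keeping the friend lookup behind an interface also means the same loop could later be pointed at a database-backed or API-backed source without changing the iteration itself.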

Facebook Data Extraction

Here is a snapshot of the code. Very soon I will be hosting this project as a Google Code project and will put the link out.

Looks like I will be more involved with the facebook-java-api team. So far I think we need more documentation on the Java API. Also, I don't know why Facebook decided to stop supporting the Facebook Java API.

