Saturday, February 8, 2014

Facebook vs Princeton vs Sumpter

In an earlier post I wrote about how useful the logistic growth equation is in modeling different social phenomena. So I thought it was fun when a logistic growth fitting war broke out between two Princeton PhD students and some Facebook researchers. John Cannarella and Joshua Spechler put up a paper on arXiv fitting an adapted SIR disease model to Google searches for MySpace and Facebook. Their SIR model is essentially two logistic growths put together: one for infection with Facebook and one for recovery afterwards. They predicted that Facebook was now facing a 'recovery' phase in which people would stop using it, and that this abandonment would spread through social infection.
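
To make the 'two logistic growths' picture concrete, here is a minimal Matlab sketch of an SIR-type model in which both joining and leaving spread by contact. It is one reading of the description above, not the code or the parameter values from the paper: beta, nu, N and the starting values are all made up for illustration. Early on, when almost nobody has left, the number of active users I grows roughly logistically; later, when almost everybody has joined, the number of leavers R grows roughly logistically.

```matlab
% Sketch of an SIR-type model where both adoption and abandonment
% spread by contact. All parameter values are hypothetical.
beta = 0.5;    % adoption rate per contact (assumed)
nu   = 0.3;    % abandonment rate per contact (assumed)
N    = 1000;   % population size (assumed)
T    = 200;    % number of days to simulate
dt   = 0.1;    % Euler time step in days
n    = round(T/dt);

S = zeros(1,n+1);   % people who have not yet joined
I = zeros(1,n+1);   % active users
R = zeros(1,n+1);   % people who have left
S(1) = N-2; I(1) = 1; R(1) = 1;   % one user and one ex-user seed both processes

for k = 1:n
    adopt   = beta*S(k)*I(k)/N;   % joining spreads from active users
    abandon = nu*I(k)*R(k)/N;     % leaving spreads from ex-users
    S(k+1) = S(k) - dt*adopt;
    I(k+1) = I(k) + dt*(adopt - abandon);
    R(k+1) = R(k) + dt*abandon;
end

t = (0:n)*dt;
plot(t,S,'b',t,I,'r',t,R,'k')
legend('Not yet joined','Active users','Left')
xlabel('Days')
ylabel('Number of people')
```

With values like these the active-user curve should rise and then fall again, which is the qualitative pattern the 'recovery' prediction rests on.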

This result was reported in several places in the media, despite not having gone through any peer review. Facebook researchers were quick to do the peer review for them, pointing out that a whole range of search indicators show that Princeton is in terminal decline. The main problem with the Princeton study, according to Facebook, is that it doesn't show any causation, just a correlation.

In my opinion, the Facebook researchers' review fails to seriously address the possibility that we might be seeing a 'recovery' from Facebook. Unlike Princeton or Air (their other example), Facebook is actually a social medium, and its success has spread through social contagion. There is therefore an underlying causal reason for applying an SIR model to Facebook. Where the Princeton study is limited, however, is in the data set it uses. To do this study properly, you would need access to the log-in events for the users of Facebook. Facebook have this data, of course, and I can't imagine they are about to hand it over to two graduate students at Princeton.

If Facebook do want to test whether they are in decline or not using log-in data, I can recommend the method Richard Mann and I developed to look at audience applause. We use exactly the same model as the Princeton guys, but fit it to individual events. Substitute 'likes' for hand claps and Facebook can find out how it's going for them.

Anyway, all this is a digression. I thought it might be illuminating to do a study of my own meteoric rise in popularity on Twitter. In January, I surpassed 100 followers. I am still somewhat behind another collective behavior Twitterer, Simon Garnier, who just announced that he has passed 1001 followers. On the right I show my increase in followers, with the logistic equation fitted to the data. I was shocked to find out that I have already entered a 'recovery' phase. According to the fit, my following has leveled off.

Luckily, none of these results are statistically significant and my ultimate following could be anywhere between 120 and 7 billion. So keep following!

If anyone out there wants to check their own following on Twitter, I include the Matlab code to do this below. All you have to do is enter the days since you started, along with the number of additional followers you gained on those days, and you too can find out whether you are still a social contagion.


4 comments:

  1. Matlab Code:

    %Followers for the days before I followed anyone else.
    beforefollow=[1 1 12 1 2 0 1 3 4 1 0 0 1]
    days=[1:13];

    %Followers from after I followed about 70 people
    afterfollow=[6 4 2 6 5 3 6 8 6 6 5 4 6 7 6 2 3 7]
    days2=[14:19 21 24 34 38 43 51 58 82 95 96 103 111]

    figure(1)
    prop=plot(cumsum(beforefollow),'x')
    set(prop,'LineWidth',4)
    set(prop,'MarkerSize',16)
    hold on
    prop=plot(days2,sum(beforefollow)+cumsum(afterfollow),'rx')
    set(prop,'LineWidth',4)
    set(prop,'MarkerSize',16)

    hold off
    set(gca,'TickDir','out')
    set(gca,'FontSize',16)
    ylabel('Number of followers')
    xlabel('Days since David joined Twitter')
    axis([0 120 0 120])

    %Put all observations together.
    allfollow=[beforefollow afterfollow]
    diffdays=[1 diff([days days2])]
    cumfollow=cumsum(allfollow)

    figure(2)

    dx=allfollow./diffdays;
    x=cumfollow;
    prop = plot(cumfollow,allfollow./diffdays,'kx')
    set(prop,'LineWidth',4)
    set(prop,'MarkerSize',16)
    set(gca,'TickDir','out')
    set(gca,'FontSize',16)
    ylabel('Increase in followers')
    xlabel('Number of Followers')
    hold on

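    %Fit the logistic equation: logistic growth means the daily increase
    %is a quadratic function of the current number of followers, so fit
    %a second-order polynomial to the increase per day.
    %P(3) is a constant term that the pure logistic equation does not have.
    P = polyfit(x,dx,2)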
    prop=plot([1:121],P(3)+P(2)*[1:121]+P(1)*[1:121].^2,'k')
    set(prop,'LineWidth',4)
    axis([0 120 0 8])


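    %Simulate the fitted model forward one day at a time, starting
    %from a single follower on day 1.
    simx=zeros(1,121);
    simx(1)=1;
    for t=1:120
        simx(t+1)=simx(t)+P(3)+P(2)*simx(t)+P(1)*simx(t)^2;
    end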

    figure(1)
    hold on
    prop=plot([1:121],simx,'k')
    set(prop,'LineWidth',4)
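
    %Added sketch, not part of the original analysis: estimate where the
    %fitted curve levels off. The fitted increase per day is zero at the
    %roots of the quadratic, so if the curve bends downwards (P(1)<0) the
    %largest real root is the predicted plateau. This is a point estimate
    %only and says nothing about the uncertainty of the fit.
    r=roots(P);
    r=r(imag(r)==0);
    if P(1)<0 && ~isempty(r)
        plateau=max(r)
    end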

  2. Don't worry, Twitter followers come in waves. Post a couple of pictures of cute kittens and you'll soon have more followers than me :-)

    1. Ha ha. I tried with penguins, but it didn't work.

  3. I think an interesting question to ask is whether Facebook, Google and the like have now reached a critical mass that prevents a quick fall. MySpace was very popular with teens but never had more than 100M users (I think they peaked at 95M), and it certainly never appealed to businesses as a promotion channel the way Facebook does today. Most of MySpace's user base was very volatile compared to the user base of Facebook, Google or Twitter today.
