May 28, 2009

Twitter Analytics using Mathematica

Twitter has been just about everywhere lately.  As its popularity continues to grow, how do you keep track of an expanding list of friends and followers?  As more companies use Twitter to promote their products, they need good tools to manage hundreds or thousands of contacts and to analyze their Twitter social graph.

Twitter has an open API, and Mathematica is the best analytical tool out there.  Mathematica’s image manipulation capability also makes it a very nice data visualizer.  So let us plug Twitter into Mathematica and play with the data.

In the examples, let’s use Wolfram_Alpha (my personal Twitter data is not very interesting).  WolframAlpha launched just recently, and they have been actively posting news and updates on Twitter.  Let’s first look at W|A’s recent tweets.

In[329]:=

twitter_analytics_1.gif

In[331]:=

twitter_analytics_2.gif

Here we import the user’s timeline and transform the list of XMLObjects into a list of rules mapping each attribute to its value.

Now we can visualize the time and frequency of the user’s recent tweets in a timeline, where the x-axis is the date and the y-axis is the time of day in seconds since midnight (e.g. 30000 s = 8:20 am).
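The y-axis conversion is simple arithmetic.  Here is a minimal sketch in Ruby (not the Mathematica from the notebook, which survives only as images); the helper name seconds_since_midnight is made up for illustration.

```ruby
# Convert a timestamp to the number of seconds elapsed since midnight.
# The method name is hypothetical; it is not taken from the notebook.
def seconds_since_midnight(time)
  time.hour * 3600 + time.min * 60 + time.sec
end

# 8:20 am is 8 * 3600 + 20 * 60 = 30000 seconds into the day.
tweet_time = Time.utc(2009, 5, 28, 8, 20, 0)
puts seconds_since_midnight(tweet_time)
```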

In[334]:=

twitter_analytics_5.gif

Out[336]=

twitter_analytics_6.gif

After seeing Mathematica’s visualizations, we want to analyze and play around with the friendship data.  Let us load the users Wolfram Alpha is following:

In[338]:=

twitter_analytics_7.gif

Out[338]=

twitter_analytics_8.gif

In[339]:=

twitter_analytics_9.gif

Out[339]=

twitter_analytics_10.gif

All friends’ profile data are loaded, and the size is consistent with the number on the website.  To apply a filter, say to look for friends whose screen names begin with “a”:

In[340]:=

twitter_analytics_11.gif

Out[341]=

twitter_analytics_12.gif

And pair each name with its profile picture.

In[342]:=

twitter_analytics_13.gif

Out[342]=

twitter_analytics_14.gif

One of the most important features of Wolfram Alpha is performing arithmetic and comparisons on objects.  For example, try comparing three stocks (GOOG, AAPL and MSFT), or adding up the nutritional value of a meal at McDonald’s.  I also want to compare a list of friends, which can easily be done by:

In[343]:=

twitter_analytics_15.gif

Out[343]=

twitter_analytics_16.gif




Now I’m interested in finding out which friends have the most followers, by sorting the friends by follower count and taking the top 20 results.
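The sort-and-take step can be sketched in a few lines of Ruby.  This is an illustration only, not the notebook code: the sample hashes are invented, and the keys simply mirror the screen_name and followers_count fields of a Twitter user profile.

```ruby
# Return the n friends with the highest follower counts, largest first.
# Each friend is assumed to be a hash with a :followers_count key.
def top_by_followers(friends, n = 20)
  friends.sort_by { |friend| -friend[:followers_count] }.first(n)
end

# Invented sample data, for illustration only.
sample = [
  { screen_name: "alpha", followers_count: 120 },
  { screen_name: "beta",  followers_count: 9000 },
  { screen_name: "gamma", followers_count: 450 }
]

top_by_followers(sample, 2).each { |friend| puts friend[:screen_name] }
```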

In[344]:=

twitter_analytics_17.gif

Visualize the results with a directed graph.  An arrow from user A to user B indicates that user A is following user B.

In[345]:=

twitter_analytics_18.gif

Out[345]=

twitter_analytics_19.gif

We can also visualize the same list of users in a bubble plot.  The x-axis is the number of users followed (friends), the y-axis is the number of followers, and the size of each image is determined by the ratio of followers to friends (i.e. the image is bigger when followers / friends is higher).
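The three values behind each bubble can be computed as below.  This is a rough Ruby sketch rather than the notebook code; the method name is hypothetical, and treating a user with zero friends as size 0.0 is an arbitrary choice made here to avoid dividing by zero.

```ruby
# Build one plot point per user: x = friends, y = followers,
# size = followers / friends.  Users are assumed to be hashes with
# :screen_name, :friends_count and :followers_count keys.
def bubble_points(users)
  users.map do |user|
    friends = user[:friends_count]
    followers = user[:followers_count]
    # Arbitrary choice: a user following nobody gets size 0.0.
    size = friends.zero? ? 0.0 : followers / friends.to_f
    { name: user[:screen_name], x: friends, y: followers, size: size }
  end
end

# Invented sample data, for illustration only.
points = bubble_points([{ screen_name: "alpha", friends_count: 50, followers_count: 200 }])
puts points.first[:size]
```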

In[346]:=

twitter_analytics_20.gif

Out[346]=

twitter_analytics_21.gif

Because of Wolfram Alpha, most programmers have probably seen what Mathematica can do.  Its applications are endless: I have been using it to analyze stock arbitrage opportunities, and we are also interested in analyzing the way people read comics on the iPhone.

The examples above were put together in spare hours (or minutes) over the last few nights, just for fun.  If anyone has suggestions about Twitter / Mathematica, let me know (@kevenlin).  This blog post from WolframBlog also shows other cool things, such as tweeting from Mathematica.

April 25, 2009
Newest addition to the family: Damian Lin


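// Returns the number of days between date1 and today, rounded to a whole day.
function days_from_today(date1) {
    var ONE_DAY = 1000 * 60 * 60 * 24;  // milliseconds in one day
    var today = new Date();
    var date1_ms = date1.getTime();
    var date2_ms = today.getTime();
    var difference_ms = Math.abs(date1_ms - date2_ms);
    return Math.round(difference_ms / ONE_DAY);
}
// Months are zero-based in JavaScript, so 3 means April (13 April 2009).
var birth = new Date();
birth.setFullYear(2009, 3, 13);
document.write("He's " + days_from_today(birth) + " days old today");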

April 24, 2009

Pair Trading using Mathematica

Pair Trading is one of the ideas that I have been very interested in, but I could not find the right tool for the analysis. Sure, most traders use Excel, but that simply involves too much copying and pasting of cells. Patrick showed me the cool things he could do with Mathematica, and I thought it might be the perfect tool for stock analysis.

So what is Pair Trading? In short, it is a market-neutral strategy used by hedge funds and investment banks. The trader first identifies two stocks whose prices have moved together in the past. When the spread between the stocks widens for any reason, the trader buys the underperforming stock and shorts the outperforming one. When the spread eventually converges, the trader closes the positions and takes the profit.

Mathematica has lots of impressive features for stock analysis. For example,


One line of code will grab GOOG prices since 2006 and plot them on a chart. Note that I don’t have to write parsers for stock data, because financial data is part of the larger set of curated data that comes with Mathematica.

Now let’s use Mathematica on Coca-Cola (KO) and Pepsi (PEP), a classic example of a correlated pair.



As expected, the pair has shown similar price movements over the last year. Now we plot the price of KO divided by the price of PEP and get the following chart. (Note: the blue curve is the KO / PEP pair ratio, the red line in the middle is the mean, the blue and yellow lines mark 1 standard deviation from the mean, and the red and green lines mark 2 standard deviations.)



The price ratio seems to oscillate around a mean, but at times it moves more than 2 standard deviations above or below that mean. The strategy is to execute a paired trade when the pair ratio is more than 2 standard deviations away from the mean in either direction. When the ratio reverts to the mean, we close the positions.
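The entry rule amounts to computing how many standard deviations the latest ratio sits from the mean. Here is a minimal Ruby sketch of that idea (not the Mathematica notebook code); the method names, the use of the population standard deviation, and the sample prices are all assumptions made for illustration.

```ruby
def mean(values)
  values.sum / values.size.to_f
end

# Population standard deviation (an assumption; the notebook may differ).
def std_dev(values)
  m = mean(values)
  Math.sqrt(values.sum { |v| (v - m)**2 } / values.size.to_f)
end

# Compare the latest A/B price ratio with its own history and return
# :short_a_long_b, :long_a_short_b or :no_trade.
def pair_signal(prices_a, prices_b, threshold = 2.0)
  ratios = prices_a.zip(prices_b).map { |a, b| a / b.to_f }
  sd = std_dev(ratios)
  return :no_trade if sd.zero?  # a flat ratio carries no signal

  z = (ratios.last - mean(ratios)) / sd
  if z >= threshold
    :short_a_long_b   # A looks expensive relative to B
  elsif z <= -threshold
    :long_a_short_b   # A looks cheap relative to B
  else
    :no_trade
  end
end

# Invented prices, for illustration only (not real KO / PEP data).
a = [10, 10, 10, 10, 10, 10, 10, 10, 10, 20]
b = [10] * 10
puts pair_signal(a, b)
```

With real data the ratio series would come from historical closing prices, and the look-back window becomes a parameter worth choosing carefully.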

As shown on the chart, the price ratio is currently around +2 standard deviations, so now might be a good time to SHORT KO / LONG PEP. For options trading, you would buy a KO put and a PEP call that expire at the same time.

Traders should be aware of the risk of drifting, which happens when the two correlated stock prices start to drift apart. The risk can be controlled if the trader closes the pair (and takes the loss) when the prices do not converge within a time interval (e.g. 6 months) or when the pair ratio moves beyond a tolerable limit (say +/- 3 standard deviations).
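Those exit rules can be written down as one small decision function. The Ruby sketch below is only an illustration: the method name, the 180-day default (standing in for "6 months") and the 3.0 stop level are assumptions taken from the examples in the text, and z-scores are assumed to be computed elsewhere.

```ruby
# Decide whether an open pair trade should be closed, and why.
# entry_z:   z-score of the ratio when the trade was opened
# current_z: z-score of the ratio now
# Returns :stop_loss, :time_stop, :reverted, or nil to keep holding.
def exit_reason(entry_z, current_z, days_held, max_days: 180, stop_z: 3.0)
  # The ratio has drifted past the tolerable limit: take the loss.
  return :stop_loss if current_z.abs >= stop_z
  # The pair has not converged within the allowed time.
  return :time_stop if days_held >= max_days
  # The ratio is back at (or past) the mean on the side we entered from.
  reverted = entry_z.positive? ? current_z <= 0 : current_z >= 0
  return :reverted if reverted

  nil
end

p exit_reason(2.1, 1.5, 30)
```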

Mathematica simplifies the analysis with its statistics and charting libraries and its curated finance data. Only 5 lines of code are needed to produce this analysis. It is also worth mentioning that Mathematica has awesome documentation to help you get started. If you have Mathematica, you can download the notebook.

January 22, 2009

Two reasons why most entrepreneurs fail

“by doing nothing, and by doing the wrong things.” - pg via Hacker News

December 11, 2008
Family Portrait

November 21, 2008
Ingredients for great beer.  Taken from Granville Island Brewery.

October 23, 2008
A classic metaphor from Taleb : A turkey is fed for a 1000 days - every day confirms to its statistical department that the human race cares about its welfare “with increased statistical significance”.  On the 1001 day, the turkey has a surprise. 

that code should be seen not as a static thing, like the answer to a math problem, but as an evolving effort to figure out the right question to be the answer to; and that it should thus be written to be easy to change.

— Paul Graham, on the one thing every software engineer should know

October 7, 2008
Update: Years are automatically updated.


function countup(yr, m, d) {
    // Months are zero-based in the Date constructor, so subtract 1.
    var past = new Date(yr, m - 1, d);
    var today = new Date();
    var MS_PER_YEAR = 365 * 24 * 60 * 60 * 1000;
    // Elapsed years, rounded to two decimal places.
    var difference = Math.round((today - past) * 100 / MS_PER_YEAR) / 100;

    document.write("Time flies.  It's been " + difference + " years since the girls were born");
}
// Use 7 rather than 07: a leading zero makes a legacy octal literal.
countup(2005, 7, 18);

Ack bundle in TextMate - Faster Find in Project

One thing that bothered me with TextMate was the Find in Project function.  While I was searching for a replacement, Olivier pointed me to Ack!, which is an excellent tool to replace find+grep.

While doing so, I came across a nifty Ack bundle for TextMate.  It’s REALLY FAST.  Enjoy!

September 19, 2008
Open source library for quantitative finance

Open source framework for modeling, trading and risk management in finance.  Includes Python bindings.  Despite the market turmoil, does anyone want to start a hedge fund?

(found via http://news.ycombinator.com/item?id=309198)

September 15, 2008
research done by mechanical turk

brilliant!  casual research by surveying random population using mechanical turk.

September 11, 2008

MemCache Flush All (for memcache_client )

Sometimes we might want to clear memcache without restarting the memcache server.

When calling Cache.get / set, we are really using a wrapper module, Cache, that simplifies access to the memcache client.

To flush memcache, call CACHE.flush_all
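To make the relationship between Cache and CACHE concrete, here is a rough Ruby sketch.  It is not the real wrapper: InMemoryClient is a made-up stand-in for the memcache client object (it only mimics get, set and flush_all), so the example runs without a memcache server.

```ruby
# Made-up stand-in for the memcache client, for illustration only.
class InMemoryClient
  def initialize
    @store = {}
  end

  def get(key)
    @store[key]
  end

  def set(key, value)
    @store[key] = value
  end

  # Drop every cached entry, as flush_all does on a real server.
  def flush_all
    @store.clear
  end
end

# In an application, CACHE would be the real client instance.
CACHE = InMemoryClient.new

# Thin wrapper module that forwards to the underlying client.
module Cache
  def self.get(key)
    CACHE.get(key)
  end

  def self.set(key, value)
    CACHE.set(key, value)
  end
end

Cache.set("greeting", "hello")
CACHE.flush_all
p Cache.get("greeting")  # nil, because the cache was flushed
```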

September 9, 2008
Contact form for iTunes support

This link is hidden so deeply that it always takes me a few minutes to dig it up again.

September 3, 2008

Ruby include, extend, require and load explained

The Ruby ‘include’ statement refers to a named module; it mixes the features of that module into a class or another module.

‘extend’ adds the features of a module to a single instance (object) at run time.
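A small example may make the difference clearer.  The module and class names below (Greeter, Person, Robot) are invented for illustration.

```ruby
module Greeter
  def greet
    "Hello from #{name}"
  end
end

# include: every instance of Person gains the module's methods.
class Person
  include Greeter
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

# Robot does not include Greeter.
class Robot
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

alice = Person.new("Alice")
r2 = Robot.new("R2")
c3 = Robot.new("C3")

# extend: only this one object gains the module's methods.
r2.extend(Greeter)

puts alice.greet
puts r2.greet
puts c3.respond_to?(:greet)
```

Only r2 picks up greet; other Robot instances such as c3 are unaffected.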

‘require’ loads external Ruby source files and binary extensions.  ‘require’ is similar to ‘load’, but it will not load a file that has already been loaded.

‘include’ and ‘require’ are essentially unrelated.  You may need a ‘require’ followed by an ‘include’ to use a module stored in an external file.
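The sketch below illustrates both points: ‘require’ skips a file it has already loaded while ‘load’ runs it again, and a ‘require’ followed by an ‘include’ brings in an externally stored module.  The file name shouter.rb, the Shouter module and the $shouter_loads counter are invented for this example; the file is written to a temporary directory so the snippet is self-contained.

```ruby
require "tmpdir"
require "fileutils"

# Write a tiny source file that counts how many times it is executed.
dir = Dir.mktmpdir
path = File.join(dir, "shouter.rb")
File.write(path, <<~'RUBY')
  $shouter_loads = ($shouter_loads || 0) + 1
  module Shouter
    def shout(text)
      text.upcase + "!"
    end
  end
RUBY

puts require(path)   # true: the file is read and executed
puts require(path)   # false: already loaded, so it is skipped
load(path)           # load always executes the file again
puts $shouter_loads  # the file has now run twice

# The module is available after require; include mixes it into a class.
class Announcer
  include Shouter
end

puts Announcer.new.shout("hi")

FileUtils.remove_entry(dir)
```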