April 28, 2009

Mining Twitter data

Who was the first to report the Mexico City earthquake?  I remember watching Twitter second by second, and @cjserrato was the first to report it (the tweet ID is 1630381373):

mexico city

Mining Twitter data is a huge challenge.  So far I have not seen much interesting data mining, text mining, or analytics work on Twitter data.  I have been playing with the data lately, and here is a thematic/topic graph I made, a visualization of all tweets from the last eight hours that are related to “mexico city”:

tweets of mexico city topic graph 

You can tell that “Swine Flu” is still at the center of all topics, whereas the earthquake is clustered alone off to the side.
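
For anyone who wants to try something similar, here is a minimal sketch of one way such a topic graph could be built; it is not necessarily how the graph above was made. It assumes a simple approach: link two terms whenever they appear in the same tweet, keep only links seen at least twice, and treat each connected group of terms as a topic cluster. The function names, the stop-word list, and the sample tweets are all made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical stop-word list; the query terms are dropped too, since
# they appear in every tweet and would link everything together.
STOP_WORDS = {"the", "a", "in", "of", "to", "is", "and", "mexico", "city"}

def cooccurrence_graph(tweets, min_count=2):
    """Return {(term_a, term_b): count} for term pairs sharing a tweet."""
    edges = Counter()
    for text in tweets:
        terms = {w.strip(".,!?:;\"'()#@").lower() for w in text.split()}
        terms = sorted(t for t in terms if t and t not in STOP_WORDS)
        edges.update(combinations(terms, 2))
    return {pair: n for pair, n in edges.items() if n >= min_count}

def clusters(edges):
    """Group terms into connected components of the graph."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(group)
    return groups

# Made-up sample tweets, for illustration only.
sample = [
    "Swine flu cases rising in Mexico City",
    "Mexico City schools closed, swine flu",
    "Swine flu: schools closed in Mexico City",
    "Earthquake shakes Mexico City",
    "Strong earthquake shakes Mexico City!",
]
graph = cooccurrence_graph(sample)
topic_groups = clusters(graph)
for group in topic_groups:
    print(sorted(group))
```

On this toy sample the swine flu terms end up in one group and the earthquake terms in a separate one, which is the same kind of separation visible in the graph. A real analysis would need a much better tokenizer and some weighting of terms.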

Have you seen any interesting Twitter analytics? (By the way, I do not mean Twitter metrics, counters, etc.)

Jeff Clark of NeoFormix has a great set of applications, the best I have found so far.  FlowingData is another one.

September 11, 2006

random thought 2

Rakesh’s Data Mining Definition

An Expansive Definition of Data Mining (Rakesh Agrawal, KDD06):
Deriving value from a data collection by studying and understanding the structure of the constituent data.

This is the closest in meaning to what I have in mind for “Datarology”. I particularly like the part of the definition where he used “understanding …” instead of “analyzing …”, because I believe synthesizing is as important an activity for Datarology as analyzing.

July 17, 2006

Douglas Lenat

Interesting to watch and listen to. His company is

watch the video: Video Presentation

