Analytics Strategist

March 28, 2007

BS in data analytics

Filed under: Datarology — Tags: , — Huayin Wang @ 5:57 pm

I am utterly disgusted by the bullsh**ing in the business intelligence and data analytics world. This is not to say that I think BS appears less in other areas, just that my biology responds more to what happens in my world.

It is hard to get a collection of all the BSs I have seen, for I choose not to remember them.

One I saw this morning:

“XXX helps YYY achieve 100m in annual online sale.”
— how much help is there?

March 27, 2007

You are what you speak …

Filed under: Random Thoughts — Tags: — Huayin Wang @ 8:04 pm

It actually matters a great deal what language you use: common language, programming language, or …

Can you think of a thing that is not in your language?

There is an interesting video related to this: commonlisp

March 26, 2007

Trading Analytics

Filed under: Datarology, Technology — Tags: — Huayin Wang @ 7:41 pm

What is the right analytics for trading?

I believe this is a wrong question.
(Of course, it makes sense in context, like everything else. But for those truly seeking, it is a prerequisite not to be limited by context, implicit or explicit.)

What is the most important contextual factor that has been ignored? The data available.

This applies to all data analytics. In fact, one of the blind spots of the field is the linkage between the data, and the form of the data, and the analytic techniques. What has been emphasized so far is the linkage between the problem and the analytics.

[Data (and its form)], [problem], and [analytical technique] are the three pillars of the field.

It is not just the “problem statement” that defines the context of data analytics; it is the combination of the “problem statement”, the data availability, and the form of the data that together define the analytical context.

So we have to begin with the data component before we start talking about trading analytics. And we know most people start on the wrong foot when they simply assume that they can only use the same data that everyone else is using. Finding the critical data to use is so important in trading analytics today that it matters more than which analytical technique you choose.

What’s next after search?

Filed under: Business — Tags: , — Huayin Wang @ 6:27 pm

In other words, what’s next after Google?

I believe it will be a technology that allows social search, in the form of an IM-like application with the following features:
It interfaces with Web, SMS, IM, cell phone, etc.
You can ask any question and get answers instantly, from people. The underlying technology optimally selects the channels on which to broadcast your questions, and people can subscribe to the channels they like.
There is a method of earning points through activities and feedback. Points can be earned, purchased, and used to purchase priority services.

Business model? Oh, I can’t even begin to think of all the money that would want to get in 🙂

March 23, 2007

What is Data, really?

Filed under: Datarology — Tags: , , — Huayin Wang @ 8:40 pm

The common definitions found on the web all share a similar construct:

Factual information, especially information organized for analysis or used to reason or make decisions. (Webster is similar.)

There are many versions of it that define data using the word “fact” or “factual information”. This is unacceptable. Data itself carries no assertion about its own quality; whether it is fact or not comes after the fact, as far as the definition of data is concerned.

Using “information” to define data is not proper either, for whether data is information or not depends on the users and on how they understand the data: data is more “primitive” than “information”, not the other way around.

I like the following better, although I am not perfectly happy with it: Data is a structured form consisting of datum. I like it because it does not imply any relationship between its explicit forms and the external world. It does not limit the structure to any particular kind: table, row, collection, independent observations, etc. are all artificial frames, not general enough to be considered in the definition. It also does not imply the presence of any external knowledge, preprocessing routines, or specialized observers.

Give me some examples, you ask. First of all, the simplest example of data is a datum, the atom as far as data is concerned. Because it is a datum, it should not have any sub-structure of its own, which is to say that it can have nothing but a name or a label. As to the form of a datum, it really does not matter, as long as it is looked at as the simplest data. A datum can have a name and a value.

Next, data can be a collection of observations (datum).

Next, a datum can have attributes which “describe” observations. Examples would be “continuous”, “discrete”, “ordered”, etc. An attribute may have a name and a value as well.

What we call a “data table” is just one common form of data. Other forms of data include: network data, transaction data, graph data, time series, text data.
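A minimal sketch of this picture in Python might look like the following. The class names `Datum` and `Attribute`, the field names, and the sample values (including the made-up symbol "ABC") are assumptions for illustration only, not a settled definition.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class Attribute:
    """Describes a datum, e.g. its scale: "continuous", "discrete", "ordered"."""
    name: str
    value: Any = None


@dataclass(frozen=True)
class Datum:
    """The atom of data: a name, optionally a value, optionally attributes."""
    name: str
    value: Any = None
    attributes: Tuple[Attribute, ...] = ()


# Simplest case: a datum that is nothing but a label.
label_only = Datum("x")

# A collection of observations, each described by an attribute.
scale = Attribute("scale", "continuous")
prices = [Datum("price", v, (scale,)) for v in (101.5, 102.0, 99.8)]

# Two different forms built from the same kind of atom (hypothetical values):
# a table-like row, and a small graph given as a list of edges.
row = {d.name: d.value for d in (Datum("symbol", "ABC"), Datum("price", 101.5))}
edges = [(Datum("A"), Datum("B")), (Datum("B"), Datum("C"))]
```

Nothing in `Datum` says whether the data ends up as a table, a graph, or a time series; the form only appears in how the atoms are arranged.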

“All of this is common sense”, you say. “What’s new?”

Well, all the common data analytics are analytics on “table-like” data. The analytics for the other forms of data are far behind. This is a problem, and it is an opportunity.

March 22, 2007

Stupidity in theories

Filed under: Random Thoughts — Tags: — Huayin Wang @ 3:01 pm

There is a funny Chan story about an erudite ancient Chinese scholar/poet, Su Dongpo, of the Song Dynasty:

One day, Su Dongpo went to see his friend Foyin, a Chan master at Golden Mountain Temple. Feeling very good after a totally relaxing meditation, Su asked Foyin, “What do you think of my sitting posture?” “Very magnificent. Like a Buddha!” Foyin replied. Su Dongpo felt even better. Foyin then asked Su Dongpo the same question, and he answered, teasingly, “Like a pile of bullsh**.” Foyin smiled and did not utter a single word.

Su Dongpo thought he had beaten the Chan Master Foyin because the Chan Master was wordless while being compared to a pile of bullsh**. He was so proud of himself that he told everyone he met, “Today I won.”

When his little sister heard about this, she said, “Brother, you just proved you are in an inferior state of mind! It is because the Chan Master’s mind is actually that of a Buddha that he could see you as a Buddha. As your mind is like a pile of bullsh**, you, of course, saw him as a pile of bullsh**.” Su Dongpo was speechless upon hearing this.

The self-reflective nature of “That” always amazes me. It is not that if I see bad people, I am bad; rather, at a higher level, the content of your mind reflects upon itself and limits itself when looked at from a higher perspective. “People can’t understand this because they are stupid.” Well, this could prevent you from discovering how to make yourself better understood by people.

During my many years working in the data mining and predictive modeling business, I have constantly heard people talk about how “you have to dumb down” things and compromise the power of your models if you want to sell them to clients, or marketing, or non-technical people, or any of those “not smart” people. It is unsettling and uncomfortable to me. If I can’t make sense of the “other minds” in my world, I can label them as “stupid”; but until the stupidity label disappears, I am unlikely to be able to make sense of those minds.

You may be surprised if I tell you that this post was triggered by a nice blog post by Dave Kellog, in which he speculated that some business trends can be explained by the single reason that “people do not like to buy products with magic components”. One of those trends happens to be the field of “data mining”, which never took off.

March 21, 2007

I made up the word: datarology

Filed under: Datarology — Tags: , — Huayin Wang @ 5:53 pm

My confession to all: I have a habit of making up new words. Some of them are interesting, while others feel like dried-up “luo bo”. I do not know which kind datarology belongs to.

I actually worked hard to come up with this word. Having been in the business of data mining/statistical modeling/pattern recognition/AI for over a decade, I am so tired of the awkward situation I am in whenever people ask me what I do.

What’s my profession, really?

I also feel funny when I read job specifications. The titles really run wild: statistician, data miner, data mining researcher, predictive modeler, research scientist, pattern recognition scientist, data analyst, etc. Deep in my heart, I know exactly what they are looking for, but it is just hard to put into words. If this is not a situation needing a new word, what is?

What would you call someone who understands how to get the crucial, intelligent, reliable knowledge out of data; who knows all the tricks and traps of working with often messy and unfriendly data, in its many forms; and who understands how the size, the dimension, and the bias in the data can affect what you can do with it and draw out of it?

People from statistics call them statisticians, but that is only part of their skills. Computer scientists label them data miners, but to me “mining” is hardly the most appropriate analogy for the things they do. They need to be as sophisticated as the most challenging science demands and to have the practical sense of a salesman. What would you call them, and what would you call this discipline?

It is such a growing field, much like computer programming 15 or 20 years ago: many companies are beginning to find that they need someone in a profession they never dreamed they would need. Now everyone gets data, a lot of data, in fact too much data. Once the need for a database to host it is satisfied, what to do with the data becomes the question.

10 or even 5 years ago, who would have thought that having too much data could be a problem? And how much that has changed! Yet many still believe that one just needs to look and think to get the most out of their data, waiting to be bitten once, twice, over and over, until they realize that they need someone who has this special knowledge and these skills.

What about numerology? Numerology is about numbers, while datarology is about data. There is one contrast though: numerology is about reading much out of little, at least on the surface. Datarology, on the other hand, is about reading little out of much data. This is about as much sense as I can make out of this word.

How could this possibly help anyone with anything? Well, it could if one day you see a job ad that says: Level-3 datarologist wanted!

March 15, 2007

Only One Thing is needed.

Filed under: Random Thoughts — Tags: — Huayin Wang @ 2:42 pm

The #1 skill for anyone doing anything.

I believe the most important skill, for anyone, is the quick intuitive assessment of the significance of things: how big the issue is, what the scale of the problem is, what the scope of its impact is, and what its strategic significance is.

It is this assessment that drives the next move: the level of attention and the amount of resources allocated to it. Without the right level of attention and the right amount of resources, you can hardly treat the issue with the right strategy.

The more complex the decision context is, and the less routine and predictable the incoming situation is, the more important this skill is. When facing such a situation, the best strategy is to staff your front line with the people who are best at this skill. This is hard to do, since those people also tend to be the most skilled at many other things.

One justification for this comes from the optimal allocation of attention/resources in decision theory, which says that your first meta-decision before any decision is how much attention and how many resources you are going to allocate to RESEARCHING the issue. And normally, that is based on your skill #1.
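One hedged way to make this concrete is the expected value of perfect information, a standard decision-theory quantity that I am bringing in here as an assumption about what “optimal allocation” could mean in practice. It gives an upper bound on what researching an issue can be worth. The action names, payoffs, and probabilities below are entirely made up for illustration.

```python
def expected_value(payoffs, probs):
    """Expected payoff of one action over the possible states."""
    return sum(p * v for p, v in zip(probs, payoffs))


def value_of_perfect_information(payoff_table, probs):
    """Upper bound on what research is worth.

    payoff_table maps each action to its payoffs, one per state;
    probs holds the probability of each state.
    """
    # Best we can do by committing to one action without any research.
    best_without = max(expected_value(row, probs) for row in payoff_table.values())
    # Best we could do if research told us the state before we choose.
    best_with = sum(
        probs[s] * max(row[s] for row in payoff_table.values())
        for s in range(len(probs))
    )
    return best_with - best_without


# Hypothetical issue: acting pays 100 if it works out, loses 60 if not.
payoffs = {"act": [100.0, -60.0], "ignore": [0.0, 0.0]}
probs = [0.5, 0.5]
evpi = value_of_perfect_information(payoffs, probs)
```

With these made-up numbers, `evpi` comes out to 30, so spending more than 30 on researching this issue could never pay off. A quick intuitive sense of that ceiling is what skill #1 supplies before any calculation is done.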

This skill is difficult to cultivate.

March 13, 2007

Leadership meta skill set

Filed under: Business — Tags: — Huayin Wang @ 10:00 pm

There is a need to define the set of meta-skills.

Most of the leadership skills that people use fall into two categories: either they are so general that they are almost as empty as “skills to do the right thing”, or they are so concrete that none of them is really critical under most circumstances.

What I wrote in the earlier post is really about the meta-skills.

On Leadership

Filed under: hbr review — Tags: — Huayin Wang @ 3:27 pm

Just what is needed in a business leader, or in fact any leader? Two things: knowing the right thing to do and having the ability to do it.

The knowing is more important than the ability to do: if you do not know the right thing to do, the ability to do anything is irrelevant! Whereas if you do not have the ability to do the thing right, the knowing may still lead you to find a way to get it done, maybe by a detour.

Games provide a great context for understanding the basics of “knowing the right move”.

I used to play the game of Go, which is an ancient Chinese board game, popularized later in Japan, South Korea and China. The rules are simple to learn but the strategies are difficult to master. There are things I learned from playing Go, that I think are generally useful.

Recognizing that the heart of decision making is evaluating and balancing alternative strategies, I have used the following set of 4 questions as my guide in many decisions:

1) How big is the issue – scale/impact?
2) What’s the chance of success – for each outcome?
3) What’s the optimal measure – passive/aggressive, quantity of investment?
4) When is the right timing?

The 4 questions must be seriously asked and played with, in that order.

The resulting strategy, when passed through the above process, will have fewer of the common problems that infect many organizations:

1) wasted resources working on small issues. While the results could be clear, convincing and even beautiful, they are nonetheless minor.
2) wrong estimates of potential outcomes, caused by wishful thinking and other factors
3) over/under investing
4) wrong timing, prioritization and sequencing

Coming back to leadership quality: as I see it, the ability of the leader to intuitively grasp, and make the right judgment on, the scale/impact of an issue is the most important of all.
