Analytics ROI

Here’s a great article on calculating the ROI on analytics activity in a company and some templates to help make your own model.

Making Use of Google Analytics Statistics

You can log in and look at statistics on your site all day long, but how can you take actionable steps to improve your site based on the statistics you’re looking at? The following are some techniques:
- Conversion Rate: This is the mother of all metrics to look at. It represents the percentage of users visiting your site that do what you were hoping they would do. In e-commerce it’s a customer who makes a purchase; on a contact form it’s a user filling in their information and sending it to you. Try to tweak your user interface and see if you can see a shift in conversion rate. Make an action button more prominent and see if you get results. Look at conversion rates broken down by referring page, search keywords, ads, geographic region, etc., see if you can detect specific patterns, and brainstorm ways to build on those trends.
- Bounce Rate: Measures the percentage of people who come to your website and leave “instantly”. If your page doesn’t provide all the information a user needs, then you may be in trouble.
- Exit Rate: Every user must exit at some point, but on what page do they exit the most? If it’s halfway through a guided tour, then you have a problem. If it’s on the “congratulations on making your purchase” page, then you’re in really good shape. This is a good spot to look for the holes in your site.
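
To make the three metrics above concrete, here is a rough sketch of how they could be computed by hand from raw page-view data. Everything in it is made up for illustration: the `sessions` list, the page names and the `GOAL_PAGE` constant are hypothetical, and I’m assuming a “bounce” means a visit that viewed exactly one page. Google Analytics calculates all of this for you, so treat this as a way to understand the numbers rather than a replacement for the reports.

```python
from collections import Counter

# Hypothetical data: each session is the ordered list of pages one visitor viewed.
sessions = [
    ["home", "product", "cart", "thanks"],
    ["home"],
    ["home", "tour-1", "tour-2"],
    ["product", "cart", "thanks"],
    ["home", "tour-1", "tour-2"],
]
GOAL_PAGE = "thanks"  # assumed page that marks a completed purchase


def conversion_rate(sessions, goal_page):
    """Share of sessions that reached the goal page."""
    converted = sum(1 for pages in sessions if goal_page in pages)
    return converted / len(sessions)


def bounce_rate(sessions):
    """Share of sessions that viewed exactly one page (assumed bounce definition)."""
    bounces = sum(1 for pages in sessions if len(pages) == 1)
    return bounces / len(sessions)


def exit_rates(sessions):
    """For each page: number of exits from it divided by number of views of it."""
    views = Counter(page for pages in sessions for page in pages)
    exits = Counter(pages[-1] for pages in sessions)
    return {page: exits[page] / views[page] for page in views}


if __name__ == "__main__":
    print("conversion rate:", conversion_rate(sessions, GOAL_PAGE))
    print("bounce rate:", bounce_rate(sessions))
    # List pages from highest to lowest exit rate to spot the leaks.
    for page, rate in sorted(exit_rates(sessions).items(), key=lambda kv: -kv[1]):
        print(f"exit rate for {page}: {rate:.2f}")
```

In this toy data the guided tour page `tour-2` has as high an exit rate as the `thanks` page, which is exactly the kind of hole the exit rate is meant to expose.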

Semantic Search à la Truevert

Interesting approach to semantic search. Information Week review of Truevert

Powerful New Search Engine – Powerset

Powerset has an interesting new search engine that claims to go beyond free text search and understand the meaning of your queries. While this is a claim that has been made many times (Autonomy, InXight), I will watch Powerset’s ability to handle a larger, more unstructured dataset before becoming convinced. It would seem that for the time being they are focusing on the public web. It will also be interesting to see if they sell their technology to companies wanting a search tool capable of searching internal private documents. In my opinion this is where powerful contextual understanding engines are most sorely needed.

Some background into the Bunk name….

History may be Bunk but here is some bunk history. With the last name of Bunk, I’ve been asked many times if the name got shortened on the boat over to the US. Well, it turns out it did. My great-grandparents Josef and Agnieszka Bak came over to America and at some point their name got changed to Bunk. Ellis Island has a great website where you can look up the records of passengers coming over. Below is the one for my great-grandma.
Arrival Record of Adnieszka Bunk
It is interesting to note on the manifest that she had not previously been in the US, so she was coming to join Josef at that time. His last name is also given in the list as Bak. I can find no record of his arrival; he probably came before Ellis was constructed. She came from Komorow, which I believe is located west of Krakow in what was then Silesia (a part of Germany, since Poland didn’t exist at that time). My 2nd cousin Jeff remembers that Mom (my great-grandma) said she remembered her father hating the Prussians (i.e. Germans?), which would make sense if they both came from somewhere in Silesia.

Free DMTA Tools

One of my favorite free visualization tools is Treemap.
If you were to put together a data portal to constantly evaluate the trends of data in the agency this would be the type of thing that gives a nice overall snapshot of the data and the direction it is going. It also serves as a nice starting point for performing a specific analysis on a set of data. Here is a great example of what Treemaps can do.
The one thing to know about this example is that they have taken the free concept and initial source code of Treemap and extended it beyond its original capability, gearing it specifically to financial analysis. They charge a license fee if you want to use their extended version.
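
If you have never seen how a treemap is drawn, the core idea is simply to carve up a rectangle so that each item gets an area proportional to its value. The sketch below is my own minimal illustration of one such split (often called “slice and dice”); it is not taken from the Treemap source code, and the function name `slice_layout` and the sample `budget` data are hypothetical.

```python
def slice_layout(items, x, y, width, height, horizontal=True):
    """Split one rectangle into strips whose areas are proportional to each value.

    items is a list of (name, value) pairs; the result is a list of
    (name, x, y, width, height) rectangles that tile the input rectangle.
    """
    total = sum(value for _, value in items)
    rects = []
    offset = 0.0
    for name, value in items:
        if horizontal:
            strip = width * value / total  # strip width for this item
            rects.append((name, x + offset, y, strip, height))
        else:
            strip = height * value / total  # strip height for this item
            rects.append((name, x, y + offset, width, strip))
        offset += strip
    return rects


if __name__ == "__main__":
    # Hypothetical data set: three categories and their sizes.
    budget = [("a", 50), ("b", 30), ("c", 20)]
    for rect in slice_layout(budget, 0.0, 0.0, 100.0, 10.0):
        print(rect)
```

A full treemap applies the same split again inside each rectangle for the next level of the hierarchy, usually alternating between horizontal and vertical strips.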
Traditional statistical/analytic techniques are provided for free by the R project. The R project is modeled after the S project and contains most of the same functionality. R can perform linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, and present the results graphically.
The next few free tools are as much source code APIs as they are tools. They are powerful but difficult to use, and you need some grounding in text and data mining theory to use them effectively.
The first tool is Kea. It performs text key phrase extraction. Think of it as a way to figure out key concepts in unstructured text.
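
To give a feel for what key phrase extraction means, here is a deliberately naive sketch that pulls out runs of non-stopwords and ranks them by how often they occur. It makes no attempt to reproduce how Kea works internally; the stopword list, the scoring rule and the function names are all my own assumptions for illustration.

```python
import re
from collections import Counter

# Tiny, assumed stopword list; a real tool would use a much longer one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for",
             "on", "with", "that", "it", "as"}


def candidate_phrases(text, max_words=3):
    """Return every run of 1..max_words consecutive non-stopwords."""
    phrases = []
    # Punctuation ends a phrase, so split the text into chunks first.
    for chunk in re.split(r"[^\w\s]+", text.lower()):
        run = []
        # The trailing "" acts as a sentinel that flushes the last run.
        for word in chunk.split() + [""]:
            if word and word not in STOPWORDS:
                run.append(word)
                continue
            for n in range(1, max_words + 1):
                for i in range(len(run) - n + 1):
                    phrases.append(" ".join(run[i:i + n]))
            run = []
    return phrases


def key_phrases(text, top_n=5, max_words=3):
    """Rank candidates by (occurrences x words in phrase), ties alphabetical."""
    counts = Counter(candidate_phrases(text, max_words))
    scores = {phrase: n * len(phrase.split()) for phrase, n in counts.items()}
    return sorted(scores, key=lambda p: (-scores[p], p))[:top_n]


if __name__ == "__main__":
    sample = "Data mining finds patterns. Data mining is the core of text mining."
    print(key_phrases(sample, top_n=3))
```

Multiplying the count by the phrase length is just a crude way to stop single common words from crowding out the multi-word concepts you usually care about.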
The next tool is Weka. Weka is by far the best free (open source) software package I have seen for data mining. It contains the majority of the data mining methods discussed in the workshop (data pre-processing, classification, regression, clustering, association rules, and visualization).
Here is an interesting article about an example of someone using Weka and Kea to mine, organize and analyze an internet mailing list’s archives.
Note: it’s been translated from German, so the wording is a bit off.
The first chapter of their results is available online.
One more worth mentioning in the free tool category is JFreeChart, a free Java class library for graphing.
As you can see, not all DMTA has to be expensive.

Statistical Data Mining Tutorials

Data Mining Tutorials

Starlight Information Visualization System

From Battelle Corporation (the people who created the CD) comes a software product that, as they say on their website,

couples advanced information modeling and management functionality with a visualization-oriented user interface