Monday, December 21, 2009

Want instant results? Concentrate on improving the worst performers.

"He who joyfully marches to music in rank and file has already earned my contempt. He has been given a large brain by mistake, since for him the spinal cord would suffice." -- Albert Einstein

Concentrating on the worst performers can be particularly fruitful in a well-managed organization, where performance is distributed in a close-to-random manner.

Mind you, we are not talking real improvements here. We are talking about putting yourself in a situation where there is nothing but upside.

Here is how you can spot the right kind of "worst" performers. They are usually smaller than the average performer: whether it is smaller regions or smaller segments, groups with a smaller sample size tend to have higher variance and, therefore, a higher probability of being an outlier, including the "worst" kind. You also need to make sure there are no systemic factors driving the poor performance; otherwise the simple randomness of the world may not be enough to compensate, and you would actually have to do something to improve the performance of your "worst" group. So, check the averages over several time periods to confirm that your worst performers merely had a particularly hard time in the period in which you picked them.

After you have validated the random (or at least semi-random) nature of your "worst" performing group, you can create a project to improve how it is doing. You would not expect the "worst" performer group to stay exactly the same every period, would you? If not, there is a pretty good chance that next period a fair number of the worst performers will be "pulled" closer to the average, or out of the "worst" group entirely. Congratulations, you just got yourself a "real" quantifiable result!
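To make the mechanism concrete, here is a minimal simulation sketch (the region sizes and the single true conversion rate are made up for illustration) showing that the "worst" group tends to be dominated by the smaller, noisier units, and that it drifts back toward the average the very next period with no intervention at all:

```python
import numpy as np

# Hypothetical setup: 50 regions of varying size, all sharing the same true
# underlying conversion rate -- so any "worst performers" are pure noise.
rng = np.random.default_rng(42)
sizes = rng.integers(100, 5000, size=50)   # smaller regions produce noisier rates
true_rate = 0.05

def observed_rates(sizes, rate, rng):
    """Simulate one period of per-region observed conversion rates."""
    return rng.binomial(sizes, rate) / sizes

period_1 = observed_rates(sizes, true_rate, rng)
period_2 = observed_rates(sizes, true_rate, rng)

# Pick the bottom 5 regions in period 1 and see how they do in period 2.
worst = np.argsort(period_1)[:5]
print("Avg size of 'worst' regions:", sizes[worst].mean(), "vs overall:", sizes.mean())
print("Their avg rate, period 1:", period_1[worst].mean())
print("Their avg rate, period 2:", period_2[worst].mean())  # drifts back toward 5%
```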

Remember, there is always a bottom 5% to work on!

Simplicity is king

"If you can't explain it simply, you don't understand it well enough." -- Albert Einstein


Today I had a conversation about a very interesting churn model we may try to build. The model will let us assess the impact of different factors on churn; one of those factors is price, or, to be precise, pricing changes. When the conversation turned to the problem at hand, which is to quantify the impact of the most recent price change, I had to explain that I do not want to put this price change into the model. This is anathema to anyone with an interest in econometrics, but the choice is driven not so much by scientific truth as by communication, i.e. being able to explain your results. Though adding the most recent data would improve the model, it is unlikely to help those with little knowledge of regression understand the issue at hand. Having a known coefficient is good, but it is hard to explain what that coefficient means to a layperson. Even if you express it as an elasticity, say, "your churn goes up by 1.5% for every percent of a price increase," it does not quite mean anything to most executives.

The alternative approach we agreed upon was to build the model on the data from before the price increase, and then determine the churn baseline for every segment we are tracking. Then we can compare post-price-change churn to that baseline to show the difference. For example: this group of customers had a 2% rate increase, and their churn was 6% compared to the 3% we would have expected with no price increase. That is something people can understand.
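A minimal sketch of that baseline comparison might look like the following. The file name, column names, and price-change date are hypothetical, and a real baseline model would also control for seasonality, tenure, and so on rather than using a simple pre-period average:

```python
import pandas as pd

# Hypothetical monthly churn data; file and column names are made up for illustration.
df = pd.read_csv("churn_by_segment.csv")  # columns: month, segment, customers, churned
df["churn_rate"] = df["churned"] / df["customers"]

price_change_month = "2009-10"  # assumed date of the rate increase

# Baseline: average churn per segment before the price change.
baseline = (df[df["month"] < price_change_month]
            .groupby("segment")["churn_rate"].mean()
            .rename("baseline_churn"))

# Observed churn per segment after the price change.
post = (df[df["month"] >= price_change_month]
        .groupby("segment")["churn_rate"].mean()
        .rename("post_change_churn"))

report = pd.concat([baseline, post], axis=1)
report["excess_churn"] = report["post_change_churn"] - report["baseline_churn"]
print(report)  # e.g. "6% observed vs 3% expected" for a given segment
```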

Another example of simplification to aid communication is a correlation analysis I did a few years ago. For every variable X correlated with my output Y (sales), I would create a bar chart of Y by grouping subjects into "low X", "medium X" and "high X". This spoke better than any scatterplot or correlation coefficient. The one exception is correlation over time between two variables: shown on a nice chart and visibly moving together, they make the best case for making executives feel smart.
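For illustration, here is roughly how that binning could be done with pandas. The data below is synthetic; in practice X and sales would come from your own dataset:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Illustrative data: a driver X loosely correlated with sales Y.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 100 + 20 * x + rng.normal(scale=30, size=500)
df = pd.DataFrame({"X": x, "sales": y})

# Bin X into three equal-sized groups and show average sales per group.
df["X_group"] = pd.qcut(df["X"], q=3, labels=["low X", "medium X", "high X"])
df.groupby("X_group", observed=True)["sales"].mean().plot(kind="bar")
plt.ylabel("Average sales")
plt.show()
```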

Friday, December 11, 2009

When comparing, make sure the groups are representative

Sometimes people call them "matched", which is a layman's term for representative. So, why do they have to be matched? Because non-representative groups can be so misleading that your analysis results may come out the opposite of what they should be.

Here is a quick example. Let's say you have two groups of customers, each consisting of customers of two types/segments. Sometimes you may not even be aware that there are two types of customers in your groups. Let's assume those segments exhibit different behavior; the example behavior I chose was churn, but it could be anything. Say we applied some sort of treatment to Group #2, and its churn went down by 1% in both segments. We are trying to use Group #1 to establish a baseline (that is, what would have happened to Group #2, to the best of our knowledge, had we not applied the treatment). However, because the composition of the groups is not representative of each other, we get exactly the opposite result for the total: Group #2 appears to have higher, not lower, churn. See the table below.

                        Group 1    Group 2    Difference
Segment #1 customers      1,000      5,000
Segment #1 churn           5.0%       4.0%        -1.0%
Segment #2 customers      5,000      1,000
Segment #2 churn           2.0%       1.0%        -1.0%
Total customers           6,000      6,000
Total churn                2.5%       3.5%        +1.0%

Wednesday, December 9, 2009

It's easy to calculate a number; it's much harder to tell what it means

We all know how it usually starts: we need to know X. So, let's say X = 25%. Is that bad? Is it good? Too high? Too low? In my practice, when I put the number in context with all the other numbers that shed light on what it means... people think it is too much, that they do not need "all of that". They just need one number. Then, when they get the number, they start asking questions about "all of that". The circle is now complete. If anyone knows a way out of this conundrum, please let me know.

Tuesday, December 8, 2009

My take on the classic

"Every truth passes through three stages before it is recognized. In the first it is ridiculed, in the second it is opposed, in the third it is regarded as self-evident" - Arthur Schopenhauer

Here is my version: "Every truth passes through three stages before it is recognized. In the first it is ignored, in the second it becomes a fad, in the third it is forgotten".

Every business is seasonal

At least, I have yet to see one that is not. Out of all the basic analytical concepts, seasonality is the one I find underestimated the most. On more than a few occasions I have heard that "our business is not that seasonal" when, in reality, seasonal swings may explain up to 80% of the variation in sales. Always check for seasonality.
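One quick way to run that check is a classical seasonal decomposition. The sketch below assumes a monthly sales series in a hypothetical CSV file and uses statsmodels; the "share of variation" it prints is only a rough indicator, not a formal test:

```python
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Hypothetical monthly sales series; file and column names are illustrative.
sales = pd.read_csv("monthly_sales.csv", parse_dates=["month"], index_col="month")["sales"]

# Decompose into trend, seasonal, and residual components (12-month cycle).
result = seasonal_decompose(sales, model="additive", period=12)

# Rough share of the variation attributable to the seasonal component.
seasonal_share = result.seasonal.var() / sales.var()
print(f"Seasonality accounts for roughly {seasonal_share:.0%} of the variation")
result.plot()
```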

Monday, October 26, 2009

Scientific Principles of Direct Marketing Optimization

Here is the list of my principles so far:
  1. Always use a control group, preferably randomized. A control group that is representative of your treatment group takes care of everything else going on in the marketplace, including your own campaigns, and also acts as the “great equalizer”, compensating for even the worst metrics and response models.
  2. Maximize lift, not response. Lift is the difference in response between the treated and control groups; that is what you are actually trying to impact.
  3. Optimal frequency is often more powerful and more important than optimal segmentation. Ideally, you want to optimize (i.e. maximize lift by) frequency within each segment, but if you are unsure where to start testing, start with frequency. None of the segmentation work will be insightful unless your frequency is within shooting range of the optimum.
  4. Test, test, test. It’s one of the easiest and simplest ways to learn.
  5. When testing, have a hypothesis, then design the test around it. Sending a customized piece to a segment is a great idea, until you realize that you did not send your regular piece to the same audience at the same time, and thus cannot tell whether the customized piece would have done better than the regular one.
  6. Track your treated group against the control group for a while to understand how long the impact of your mailing lasts. Some people want to use LTV, because it makes for a higher ROI, but a true, measurable difference traceable to a direct mail piece rarely lasts more than a few months, even though the average customer lifetime may be measured in years.
  7. When choosing the size of the control group, first understand what kind of difference would justify the effort (i.e. the break-even lift), then determine a sample size that makes that difference statistically significant (a sketch follows this list). If you’re measuring with a yardstick, it’s hard to detect half an inch of difference.
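As a rough illustration of principle 7, the sketch below sizes the groups for an assumed 3% control response rate and a 0.5-point break-even lift, using a standard two-proportion power calculation from statsmodels; the specific numbers are made up for the example:

```python
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

# Assumed figures: 3% baseline response in the control group, and a
# 0.5-point lift as the smallest difference that pays for the campaign.
control_rate = 0.03
break_even_lift = 0.005

# Effect size for the difference between the two response rates.
effect = proportion_effectsize(control_rate + break_even_lift, control_rate)

# Sample size per group to detect that lift at 5% significance and 80% power.
n_per_group = NormalIndPower().solve_power(effect_size=effect,
                                           alpha=0.05, power=0.8,
                                           alternative="larger")
print(f"Roughly {n_per_group:,.0f} customers per group to detect a "
      f"{break_even_lift:.1%} lift")
```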