Better Understanding that “Big Data” – Analysis and Interpretation

Dr. Brian Monger

Managers are expected to be expert not only at managing in the present, but also at predicting what will happen in the future.  They are expected to take information and use it to forecast, budget, and calculate risk for their organisations, and to communicate projections that support decisions.

How is it possible for a manager to estimate future events with any degree of certainty when so many different outcomes are possible?  The answer is found in being able to undertake effective analysis of data they have gained from their research, through applying some of the basic techniques of statistical analysis.

To be able to stand behind their conclusions and decisions with confidence, managers need to be able to back up estimates with reasonable assumptions that others will accept.  Using numerical information correctly will improve their ability to communicate within the organisation.

The use of statistical techniques is a necessity.  By evaluating data, you can assign significance, discover trends, and project likely future events.

Numbers by themselves do not constitute information.  The meaning those numbers give is information.

Analysing marketing research.

In marketing research, the analysis and interpretation process explains the meaning of the data.  Interpretation means making inferences about the meaning and implications of the research findings and drawing conclusions about what the variables studied imply for management.

From a management perspective the qualitative meaning of the data and the managerial implications are an important aspect of the interpretation.

Estimating with confidence

Managers analyse collected data to produce information that will assist in addressing the management problem at hand.

To do this they can choose from many analytical techniques and tools.  However, given the nature of this text, it is not feasible to cover them all.  This chapter aims to give a working overview and leave the fuller details to more specific texts on statistics.

For those who wish to pursue the matter further, there are many software packages available to assist with complex statistical analysis.

Data and Information

Data has no value by itself.  Data is raw; it has to be ‘converted’ into useful information, and this conversion is done through analysis.

Making Data Usable

If managers could predict the future with total consistency and accuracy, they wouldn’t have to go to the office.  They could do better at Crown Casino, on the stock market, or at the race track.  But because the possible outcomes of future events are random or uncertain, prediction is not a precise art.

Narrowing the odds, though, is possible.  Identifying the degree of probability connected to a future event can improve business judgement.  Intuition and common sense are still very strong tools, but managers will face decisions that cannot be made easily because the possible outcome involves a number of variables and a lot of uncertainty.  The mathematical techniques involved in probability analysis will help them improve their decision-making skills.

What makes statistical analysis useful is its ability to quantify uncertainty, thus enabling managers to make better decisions.  

Mathematics and Statistics are neither the Answer nor a Guarantee

An important point to keep in mind: The math involved in estimating the future, no matter how scientifically executed, is a secondary skill.  Of greater importance is the premise on which a study is based.  Because an analysis must begin with a series of numbers, the risk is that assumptions may be flawed.  Thus, before applying the techniques involved, examine your assumptions and test the information you use.

Analysis is only one of many tools.  Intuition, common sense, and business experience will affect every decision made.  Probability analysis is helpful in narrowing down the likely outcomes in the future, but it is not the last word.

There are no guarantees

Even when the odds favour one outcome to an extreme, there is no certainty that the outcome will follow the averages.  Opportunity and risk are constant companions in business.

Data Analysis – the basic elements

Transforming data into information

Data transformation is the process of changing the original form of the (raw) data to a format suitable for performing a data analysis that will satisfy the research objectives.

Data analysis converts data into a format (information) that helps managers better understand the situation and make better decisions.

Even qualitative (non-quantitative by nature) research needs analysis.  What you see or perceive is not enough.  Managers need skills to interpret the input and output data.  In fact, much qualitative research can be quantified using scaling techniques.

In using statistics in data analysis, we rely on probability (the laws of chance).

Probability

Very little in life is a certainty.  Everything we do in business is based on the chances of a successful outcome.

The term probability refers to a mathematical estimate of the likelihood of a particular outcome in a situation where there are a number of different possible outcomes. If, for example, we know that in a certain situation (e.g., a roll of a die) there is a determined number of possible outcomes (e.g., six possible faces of the die), we can calculate the probability of a particular outcome or of a particular sequence of outcomes if the event is repeated.

Explanations of probability usually start out with the analogy to a coin toss.  If a coin is tossed once, we have a 50-50 chance that heads will come up.  If the coin is tossed 100 times, the average outcome will be 50 heads and 50 tails.  But it’s not certain that that exact result will be obtained each time.  It’s only the average.
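The gap between "the average" and "each time" can be put into numbers. The sketch below is illustrative only: it assumes a fair coin and independent tosses, and the function names are made up for this example. It calculates the chance of getting exactly 50 heads in 100 tosses, then simulates a number of 100-toss runs to show how much individual runs wander around the average.

```python
import random
from math import comb


def prob_exact_heads(n_tosses: int, n_heads: int) -> float:
    """Probability of exactly n_heads heads in n_tosses tosses of a fair coin."""
    return comb(n_tosses, n_heads) / 2 ** n_tosses


def simulate_heads(n_tosses: int, rng: random.Random) -> int:
    """Count the heads in one simulated run of n_tosses fair coin tosses."""
    return sum(rng.random() < 0.5 for _ in range(n_tosses))


if __name__ == "__main__":
    # Exact calculation: roughly 0.08, so a perfect 50-50 split is the
    # single most likely result but still happens in under 1 run in 10.
    print(f"P(exactly 50 heads in 100 tosses) = {prob_exact_heads(100, 50):.3f}")

    rng = random.Random(1)  # fixed seed so the simulation is repeatable
    runs = [simulate_heads(100, rng) for _ in range(1000)]
    print("Heads in the first five runs:", runs[:5])
    print("Average heads per run:", sum(runs) / len(runs))
```

The individual runs vary from one to the next, while the average over many runs settles close to 50.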

In a business, things are not as simple.  For example, a manager may want to determine the probability that sales volume will be greater or less than previous years’ averages.  As long as nothing changes (conditions are the same for each year), there is a 50-50 chance of either event taking place.  But it’s likely that conditions today are not the same as in the past, so a 50-50 test would not apply.

The coin toss example makes the point about the theory of probability, but it doesn’t show how to apply the theory in a business decision.  Few of the problems to be solved are so clear that only one of two results will occur.  In the real world we face many variables.

Managers can make ‘guestimates’ based on chance or previous experience, but they should also consider a number of other possibilities, such as:

• Competition will be stronger in the future than in the past.

• Perhaps fewer customers will be available during the coming year.

• Some competitors may have an inside advantage on certain jobs.

• Economic conditions will be better or worse than in the past.

Probabilities form a model for what is likely to happen in the real world.

It is important to stress at the outset that probability is a theoretical calculation of the likelihood of a particular outcome. Thus, it is based, not upon observation of actual events (e.g., listing the results from several tosses of the coin) but on calculations which are based on certain assumptions (e.g., that the coin or die toss is fair, that there are no hidden factors, like cheating, influencing the outcome).

Because the probability of an outcome is a theoretical calculation, knowing the probability does not mean that we can always predict a particular outcome with certainty (please understand this very important point). Often the result of a particular event or series of events will not match the theoretical probabilities.

Probability theory has a number of important uses. A simple but powerful application is its ability to assist us in establishing the (mathematical) likelihood of a particular outcome and thus to advise ourselves of the prudence of a certain action (e.g., betting).

We can express the probability of an event mathematically in a number of ways, but the most common is as a decimal from 0 to 1. A probability of 1 (p = 1) is certainty that the event will occur. A probability of 0 (p = 0) is certainty that the event will not occur. Every other probability falls somewhere between these two limits.

The probability that a single coin toss will produce a head on top is expressed as follows: p = .5. We might also express this as 5 out of 10 or 50 percent. The decimal notation is the most common.

One of the major applications of probability is to compare the theoretical probability of a particular outcome (calculated mathematically) with the observed result of an event. If we find that the observed frequency of a particular occurrence departs widely from the theoretical probability (i.e., the result is occurring much more or much less frequently than we would expect from our calculations), we may have good reason to investigate the event more carefully.

Significance

When the probability of an event (theoretically calculated) is very low and yet the event occurs, we are entitled to be suspicious or alarmed at the result. Such moments are an invitation to investigate the event to see if there are hidden factors at work (e.g., some hitherto unknown cause). When do such moments occur? At what point is the probability so low that you should investigate further?

Statisticians have set a conventional standard for how much an observed result has to differ from the theoretical expectation in order for the result to be considered significant (i.e., worthy of notice or of further investigation, because it is not likely to have occurred by chance). That level is p = .05 (or 5 results in 100). If the probability that a result at least this extreme could have occurred by chance is greater than .05, we do not usually pay much attention to it. If p is less than .05, however, then the result is considered significant and warrants analysis, because something other than chance may be affecting the outcome.
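To see how the .05 convention works in practice, consider a coin suspected of being unfair. The sketch below is one possible way to do the check, not a prescribed method: it assumes that "by chance" means a fair coin with independent tosses, it counts results at least as far from the expected 50-50 split in either direction (a two-sided test), and its function names are hypothetical.

```python
from math import comb


def two_sided_p_value(n_tosses: int, n_heads: int) -> float:
    """Chance that a fair coin gives a result at least as far from
    n_tosses / 2 heads as the observed n_heads, in either direction."""
    distance = abs(n_heads - n_tosses / 2)
    extreme_outcomes = sum(
        comb(n_tosses, k)
        for k in range(n_tosses + 1)
        if abs(k - n_tosses / 2) >= distance
    )
    return extreme_outcomes / 2 ** n_tosses


def is_significant(p_value: float, alpha: float = 0.05) -> bool:
    """Apply the conventional cut-off: significant if p is below alpha."""
    return p_value < alpha


if __name__ == "__main__":
    for heads in (55, 60, 61, 70):
        p = two_sided_p_value(100, heads)
        print(f"{heads} heads in 100 tosses: p = {p:.4f}, "
              f"significant = {is_significant(p)}")
```

Under these assumptions, 60 heads in 100 tosses falls just short of the .05 line while 61 heads crosses it, which shows how sharp (and how arbitrary) a fixed cut-off can be.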

The justification for this convention of a probability of .05 as the dividing line between significant and non-significant results is methodological and practical.

The basis upon which probability rests is as follows: we assume that the phenomenon we wish to investigate can be described in terms of a given number of different possible but equally probable outcomes. For instance, in a simple coin toss there are two equally probable outcomes, heads or tails. In one roll of a die, there are six equally probable outcomes, represented by the six sides of the die, each with a number of dots from 1 to 6.

Multiple outcomes (e.g., so many heads in consecutive coin tosses or so many sixes in consecutive rolls of a single die) can be calculated from the probability of a single event (for example, multiplying the number of repeated events by the probability of the single outcome in one event gives the number of times we expect that outcome to occur). Events which cannot be described in terms of equally probable outcomes (perhaps due to bias or to too many unknown variables) are beyond the scope of this simple approach.
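Two calculations on repeated events come up often, and both follow from the single-event probability. The sketch below is illustrative and assumes that the repeated events are independent of each other (one toss does not affect the next); the function names are made up for this example. Exact fractions are used so the results are easy to read.

```python
from fractions import Fraction


def prob_repeated(single_event_prob: Fraction, repetitions: int) -> Fraction:
    """Probability that the same outcome occurs on every one of
    `repetitions` independent events."""
    return single_event_prob ** repetitions


def expected_count(single_event_prob: Fraction, trials: int) -> Fraction:
    """Average number of times the outcome is expected over `trials` events."""
    return single_event_prob * trials


if __name__ == "__main__":
    head = Fraction(1, 2)  # one head on a fair coin
    six = Fraction(1, 6)   # one six on a fair die

    print("Three heads in a row:", prob_repeated(head, 3))    # 1/8
    print("Two sixes in a row:", prob_repeated(six, 2))       # 1/36
    print("Expected heads in 100 tosses:", expected_count(head, 100))  # 50
```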

The probability of any specific outcome is given by the number which is equal to the number of possibilities which involve that outcome divided by the total number of possibilities. For example, in a simple coin toss, there is only one possibility that a head will appear, and the total number of possible outcomes is two. Thus, the probability that a head will appear is 1 divided by 2 or .5.
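The counting rule just described can be written as a small function. This is a minimal sketch that assumes every outcome in the list is equally probable, as the text requires; `classical_probability` is a hypothetical name chosen for the example.

```python
from fractions import Fraction
from typing import Callable, Iterable


def classical_probability(is_favourable: Callable[[object], bool],
                          sample_space: Iterable[object]) -> Fraction:
    """Number of outcomes that involve the event, divided by the total
    number of outcomes. Assumes all outcomes are equally probable."""
    outcomes = list(sample_space)
    favourable = [o for o in outcomes if is_favourable(o)]
    return Fraction(len(favourable), len(outcomes))


if __name__ == "__main__":
    coin = ["heads", "tails"]
    die = range(1, 7)  # faces 1 to 6

    print("Head on one toss:", classical_probability(lambda o: o == "heads", coin))  # 1/2
    print("Six on one roll:", classical_probability(lambda o: o == 6, die))          # 1/6
    print("Even number on one roll:", classical_probability(lambda o: o % 2 == 0, die))  # 1/2
```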

Data Description

There are two applications of statistics: (1) to describe characteristics of the population or sample (descriptive statistics) and (2) to generalise from the sample to the population (inferential statistics).

Data description is about how we can represent data (unprocessed information) in a useful way to help us understand it.  We are trying to see any underlying patterns or shapes in a pile of raw numbers.

The primary purpose of inferential statistics is to be able to make a judgement about the whole population of interest on the basis of the sample actually studied.

Descriptive Statistics

Marketing researchers edit and code data to provide input that will result in tabulated (tables) information.  With this input, we can statistically describe project results.  All forms of analysis attempt to portray data so that the results may be studied and interpreted in a brief and meaningful way.

Descriptive analysis refers to the transformation of the raw data into a form that will make them easy to understand and interpret.  Not only do managers need to summarise the meaning of today’s numbers, they also need to relate the information as part of a trend.  Are the numbers better or worse than they were last month or last year?  And what is likely to happen in the future?  Such analysis rearranges, orders, and manipulates data to provide descriptive information.
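As a small illustration of descriptive analysis, the sketch below summarises a series of monthly sales figures and relates each month to the one before it. The figures are invented purely for the example, and the function names are hypothetical; only Python's standard `statistics` module is used.

```python
from statistics import mean, median, stdev


def summarise(values: list[float]) -> dict[str, float]:
    """Basic descriptive statistics for a list of numbers."""
    return {
        "mean": mean(values),
        "median": median(values),
        "std_dev": stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def month_on_month_change(values: list[float]) -> list[float]:
    """Difference between each figure and the one before it."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


if __name__ == "__main__":
    # Hypothetical monthly sales (units sold), made up for illustration.
    monthly_sales = [120, 135, 128, 150, 142, 160]

    for name, value in summarise(monthly_sales).items():
        print(f"{name}: {value:.1f}")
    print("Change on previous month:", month_on_month_change(monthly_sales))
```

The summary answers "what do today's numbers look like?", while the month-on-month changes answer "are the numbers better or worse than last month?".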

As the analysis progresses, researchers generally apply the tools of inferential statistics.

Statistical analysis

Statistical analysis helps managers answer one of two questions:

1. Does the result of the research differ significantly from another result or from what was expected (the ‘hypothesis’)?

2. Is the data reasonably predictable or is it just chance?

In developing an analysis, a manager needs to treat the research data in an appropriate way.

Dr Brian Monger is Executive Director of MAANZ International and an internationally known consultant with over 45 years of experience assisting both large and small companies with their projects.  He is a specialist in negotiation and behaviour.  He is also a highly effective and experienced trainer and educator.


These articles are usually taken from notes from a MAANZ course.  If you are interested in obtaining the full set of notes (and a PowerPoint presentation) please contact us – info@marketing.org.au

Also check out other articles on http://smartamarketing2.wordpress.com

MAANZ International website http://www.marketing.org.au

Smartamarketing Slideshare (http://www.slideshare.net/bmonger)
