The Five Stages of Data Analysis




  • Generally, this process begins with descriptive analytics. This is the process of describing historical trends in data. Descriptive analytics aims to answer the question “what happened?” This often involves measuring traditional indicators such as return on investment (ROI). The indicators used will be different for each industry. Descriptive analytics does not make predictions or directly inform decisions. It focuses on summarizing data in a meaningful and descriptive way.
  • The next essential part of data analytics is advanced analytics. This part of data science takes advantage of advanced tools to extract data, make predictions and discover trends. These tools include classical statistics as well as machine learning. Machine learning technologies such as neural networks, natural language processing, sentiment analysis and more enable advanced analytics. This information provides new insight from data. Advanced analytics addresses “what if?” questions.
  • The availability of machine learning techniques, massive data sets, and cheap computing power has enabled the use of advanced analytics in many industries. The collection of big data sets is instrumental in enabling these techniques. Big data analytics lets businesses draw meaningful conclusions from complex and varied data sources, something made possible by advances in parallel processing.
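As a small illustration of the descriptive step described in the first point above, ROI can be computed with the standard formula (gain minus cost, divided by cost) and summarized per category. This is only a sketch: the campaign names and figures are made up for illustration, and the helper name `roi` is an assumption rather than anything from a specific tool.

```python
def roi(gain: float, cost: float) -> float:
    """Return on investment as a fraction: (gain - cost) / cost."""
    if cost == 0:
        raise ValueError("cost must be non-zero")
    return (gain - cost) / cost


# Hypothetical figures, invented for illustration: name -> (revenue, spend)
campaigns = {
    "email": (1500.0, 1000.0),
    "search": (2400.0, 2000.0),
    "display": (900.0, 1000.0),
}

# Descriptive summary: what happened in each campaign?
roi_by_campaign = {name: roi(gain, cost) for name, (gain, cost) in campaigns.items()}

for name, value in roi_by_campaign.items():
    print(f"{name}: {value:.0%}")
```

The summary only describes past performance. It does not predict future ROI or recommend where to spend next.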

Types of Data Analytics

Data analytics is a broad field. There are four primary types of data analytics: descriptive, diagnostic, predictive and prescriptive analytics. Each type has a different goal and a different place in the data analysis process. These are also the primary data analytics applications in business.

  • Descriptive analytics helps answer questions about what happened. These techniques summarize large datasets to describe outcomes to stakeholders. By developing key performance indicators (KPIs), these strategies can help track successes or failures. Metrics such as return on investment (ROI) are used in many industries. Specialized metrics are developed to track performance in specific industries. This process requires the collection of relevant data, processing of the data, data analysis and data visualization. This process provides essential insight into past performance.
  • Diagnostic analytics helps answer questions about why things happened. These techniques supplement more basic descriptive analytics. They take the findings from descriptive analytics and dig deeper to find the cause. The performance indicators are further investigated to discover why they got better or worse. This generally occurs in three steps:
    • Identify anomalies in the data. These may be unexpected changes in a metric or a particular market.
    • Collect data related to these anomalies.
    • Use statistical techniques to find relationships and trends that explain these anomalies.
  • Predictive analytics helps answer questions about what will happen in the future. These techniques use historical data to identify trends and determine if they are likely to recur. Predictive analytical tools provide valuable insight into what may happen in the future. They draw on a variety of statistical and machine learning techniques, such as neural networks, decision trees, and regression.
  • Prescriptive analytics helps answer questions about what should be done. By using insights from predictive analytics, data-driven decisions can be made. This allows businesses to make informed decisions in the face of uncertainty. Prescriptive analytics techniques rely on machine learning strategies that can find patterns in large datasets. By analyzing past decisions and events, the likelihood of different outcomes can be estimated.

    These types of data analytics provide the insight that businesses need to make effective and efficient decisions. Used in combination they provide a well-rounded understanding of a company’s needs and opportunities.
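The three diagnostic steps listed above (identify anomalies, collect related data, look for a relationship) can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the z-score threshold, the function names, and the weekly figures are all assumptions made up for the example.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=2.0):
    """Return the indices of values whose z-score magnitude exceeds threshold."""
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Step 1: identify anomalies in a metric (hypothetical weekly sales).
weekly_sales = [100, 102, 98, 101, 99, 100, 60, 101]
anomalous_weeks = find_anomalies(weekly_sales)

# Step 2: collect data related to the anomaly (hypothetical outage hours).
outage_hours = [0, 0, 0, 0, 0, 0, 12, 0]

# Step 3: look for a statistical relationship that might explain it.
r = pearson(weekly_sales, outage_hours)

print("anomalous weeks:", anomalous_weeks)
print(f"correlation between sales and outage hours: {r:.2f}")
```

A strong correlation like this suggests a candidate explanation but does not prove the cause. It tells the analyst where to dig further.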

    Step One: Ask The Right Questions

    So you’re ready to get started. With no time to waste in discovering what makes your customers or employees tick, you quickly set out to collect as much data as you can get your hands on by digging through records and surveys. The more the better, right?

    Before you start collecting data, you need to first understand what you want to do with it. Take some time to think about a specific business problem you want to address or consider a hypothesis that could be tested with data. From there, you’ll create a set of measurable, clear, and concise questions that will help you address that problem or hypothesis.

    For example, an advertiser who wants to boost their client’s sales may ask if customers are likely to purchase from them after seeing an ad. Or an HR director who wants to reduce turnover might want to know why their top employees are leaving their company.

    Starting with a clear objective is an essential step in the data analysis process. By recognizing the business problem that you want to solve and setting well-defined goals, you’ll find it way easier to decide on the data you need.

    Step # 3. Tabulation:

    Tabulation is a part of the technical process in the statistical analysis of the data. The essential element in tabulation is the summarization of results in the form of statistical tables.

    It is only when raw data are divided into groups and counts made of the number of cases falling in these various groups, that it is possible for the researcher to determine what his results mean and to convey his findings to the consumer in a form which can be readily understood.

    Tabulation naturally depends on establishing categories for raw data and on editing and coding the responses (punching and running the cards through machines for mechanical tabulation, and sorting and tallying for hand tabulation).
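The idea of coding raw responses into categories and counting the cases in each group can be sketched in modern terms as follows. This is an illustrative sketch only: the coding scheme, the category names, and the sample responses are hypothetical, and software counting here stands in for the card-based methods the text describes.

```python
from collections import Counter

# Hypothetical coding scheme: raw answer -> category
CODES = {
    "strongly agree": "favourable",
    "agree": "favourable",
    "neutral": "neutral",
    "disagree": "unfavourable",
    "strongly disagree": "unfavourable",
}


def tabulate(raw_responses):
    """Edit (normalize) and code each response, then count cases per category."""
    coded = [CODES.get(r.strip().lower(), "uncodable") for r in raw_responses]
    return Counter(coded)


# Made-up raw responses, including inconsistent casing and an unusable answer.
responses = ["Agree", "neutral", "Strongly Agree", "disagree", "agree ", "n/a"]
table = tabulate(responses)

for category, count in table.most_common():
    print(f"{category:<14}{count}")
```

Keeping an explicit "uncodable" category makes editing problems visible in the table instead of silently dropping those responses.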

    Experienced researchers generally develop tabulation plans at about the same time as they draft or construct the data-collection instruments and make sampling plans. Inexperienced researchers seldom concern themselves with tabulation plans until the data have been collected. Of course, it is impossible for the researcher to foresee the entire range of tabulation that will be subsequently desired.

    He should be familiar enough with his research problem or the subject of investigation to be able to draw up tables that will provide answers to the questions which gave rise to the study. The researcher should be able to prepare adequate tabulation plans if he uses the findings of earlier studies which have elements in common with the one for which the plans are being drawn.

    In exploratory studies, a better and safer procedure is to pretest the data-collection instrument on a sample of the population of the type that would be covered in the final study. This way, some clues as to what kind of tabulation would be meaningful can generally be obtained.

    Tabulation may be done entirely by manual methods; this is known as hand tabulation. Alternatively, it may be done by mechanical methods, using fast automatic machines for the bulk of the data; this process is known as mechanical tabulation.

    Before he draws up detailed tabulation plans for his study, the researcher must decide which method of tabulation he will use. This decision will be based on various considerations such as cost, time, personnel, etc.

    Both hand tabulation and mechanical tabulation have their respective merits and limitations. A researcher alert to these merits and demerits is in a better position to decide which method would be suited to his problem.

    (1) Mechanical tabulation involves much clerical work and specialized operations. Of course, it facilitates speed but the speed may not always be an adequate compensation for extra clerical work.

    (2) If the number and types of tables desired are not decided upon before tabulation work is begun, machine tabulation may be more expedient. But if hand tabulation is to be efficient, the order in which the various sorts and counts will be made must be determined in advance of tabulation.

    (3) A major advantage of machine tabulation is that it facilitates cross-classifications. In large-scale studies where many variables are to be correlated or cross-classified, machine tabulation is generally preferable.

    It is for this reason that mechanical tabulation is used in studies requiring many intercorrelations among variables. But if the total number of respondents is small, counting them manually according to the cross-classificatory principle may be relatively economical.
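Cross-classification means counting cases by two variables at once, so that each cell of the table holds the number of respondents falling in a given row category and column category. A minimal sketch, assuming hypothetical records with made-up `year` and `attitude` fields (neither comes from the text):

```python
from collections import Counter


def cross_tabulate(records, row_key, col_key):
    """Count records by every (row category, column category) pair."""
    return Counter((rec[row_key], rec[col_key]) for rec in records)


# Hypothetical respondents, invented for illustration.
records = [
    {"year": "first", "attitude": "favourable"},
    {"year": "first", "attitude": "favourable"},
    {"year": "first", "attitude": "unfavourable"},
    {"year": "second", "attitude": "favourable"},
    {"year": "second", "attitude": "unfavourable"},
    {"year": "second", "attitude": "unfavourable"},
]

counts = cross_tabulate(records, "year", "attitude")

# Print the table with one line per row category.
rows = sorted({row for row, _ in counts})
cols = sorted({col for _, col in counts})
for row in rows:
    cells = ", ".join(f"{col}={counts[(row, col)]}" for col in cols)
    print(f"{row}: {cells}")
```

With only a handful of respondents, the same table could be tallied by hand, which is the trade-off the paragraph above describes.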

    (5) If it is desired to keep the data in a form ready for new tabulation at relatively short notice, punch cards are typically useful. Mechanical tabulation is useful for periodic studies or surveys in which the same type of information has to be collected at frequent intervals.

    (6) The process of sorting and counting is less likely to produce errors if done by machine than if done by hand. Errors, of course, can and do arise in machine tabulation and when they do, they are often very difficult to identify and check.

    Any errors discovered at the coding, editing or field-work stages of the survey may hold up machine-tabulation work. It is often desirable, therefore, to proceed with hand tabulation alongside the field-work.

    (7) Cost of tabulation operations is an important concern of the researcher. Machine tabulation is often much more expensive, since the cost of punch cards, charges for punching and verifying, machine charges for sorting and tabulating, and the expense of hiring specialized machine operators often add up to much more than the costs involved in hand tabulation.

    (8) Another important consideration is time. In mechanical tabulation the tabulation itself is done in a very short time, but the preparatory stages, the training and supervision, and the possible non-availability of certain types of machines on hire, which dislocates the work, may all contribute to wastage of time.

    Step # 4. Statistical Analysis of Data:

    In research, we are not concerned with each individual respondent. The purpose of research is broader than this. That is, we wish to know much more than simply that a given respondent, for example, has an extremely favourable attitude toward disarmament and that another respondent has a moderately unfavourable attitude toward the same issue. Information about individuals alone is not enough.

    Social science research is generally directed toward providing information about a particular population of respondents, mostly via a sample. The sample of the totality might be asked certain questions related to the problem of our study, or be subjected to some form of observation.

    Let us suppose that we have asked a sample of a thousand college students studying in ‘post-graduate’ classes a series of questions with a view to securing information about their study habits. Our research would thus be directed toward providing information about the ‘population’ of ‘post-graduate’ students of which the thousand cases are a sample.

    As a necessary step to characterizing this ‘population’, we would have to describe or summarize the information about study habits that we have obtained from the sample. Tabulation is just a part of this step. In addition, we must estimate the reliability of generalizations about the ‘population’ made from the obtained data. Statistical methods are useful in fulfilling both these ends.
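Both ends mentioned above, summarizing the sample and estimating how reliable a generalization to the population is, can be illustrated with a sample mean, its standard error, and an approximate 95% confidence interval. This is a sketch under stated assumptions: the daily study hours are made up, the function name `summarize` is hypothetical, and the interval uses the normal approximation (z = 1.96), which suits a large sample such as the thousand cases in the example rather than the tiny one used here for readability.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(sample, z=1.96):
    """Return (mean, standard error, approximate confidence interval)."""
    n = len(sample)
    m = mean(sample)                # describes the sample
    se = stdev(sample) / sqrt(n)    # estimated variability of the sample mean
    return m, se, (m - z * se, m + z * se)


# Hypothetical daily study hours reported by eight respondents.
study_hours = [2.0, 3.5, 1.0, 4.0, 2.5, 3.0, 2.0, 3.0]

m, se, (low, high) = summarize(study_hours)
print(f"sample mean: {m:.3f} hours")
print(f"standard error: {se:.3f}")
print(f"approximate 95% interval: {low:.3f} to {high:.3f}")
```

The standard error shrinks as the sample grows, which is why a sample of a thousand supports much firmer statements about the population than a sample of eight.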

