



UNIT I INTRODUCTION TO BUSINESS ANALYTICS
Analytics and Data Science – Analytics Life Cycle – Types of Analytics – Business Problem Definition – Data Collection – Data Preparation – Hypothesis Generation – Modeling – Validation and Evaluation – Interpretation – Deployment and Iteration

Business Analytics:
Definition: Business intelligence (BI) can be defined as a set of processes and technologies that convert data into meaningful and useful information for business purposes. While some believe that BI is a broad subject that encompasses analytics, business analytics, and information systems (Bartlett, 2013, p. 4), others believe it is mainly focused on collecting, storing, and exploring large organizational databases for information useful to decision-making and planning (Negash, 2004).

Business analytics, or simply analytics, is the use of data, information technology, statistical analysis, quantitative methods, and mathematical or computer-based models to help managers gain improved insight about their business operations and make better, fact-based decisions. Business analytics is "a process of transforming data into actions through analysis and insights in the context of organizational decision making and problem solving." Business analytics is supported by various tools, such as Microsoft Excel and various Excel add-ins, commercial statistical software packages such as SAS or Minitab, and more complex business intelligence suites that integrate data with analytical software.

Tools and techniques of business analytics are used across many areas in a wide variety of organizations to improve the management of customer relationships, financial and marketing activities, human capital, supply chains, and many other areas. Leading banks use analytics to predict and prevent credit fraud. Manufacturers use analytics for production planning, purchasing, and inventory management. Retailers use analytics to recommend products to customers and optimize marketing promotions. Pharmaceutical firms use it to get life-saving drugs to market more quickly. The leisure and vacation industries use analytics to analyze historical sales data, understand customer behavior, improve Web site design, and optimize schedules and bookings. Airlines and hotels use analytics to dynamically set prices over time to maximize revenue. Even sports teams use business analytics to determine both game strategy and optimal ticket prices. Among the many organizations that use analytics to make strategic decisions and manage day-to-day operations are Harrah's Entertainment, the Oakland Athletics baseball and New England Patriots football teams, Amazon.com, Procter & Gamble, United Parcel Service (UPS), and Capital One bank. It has been reported that nearly all firms with revenues of more than $100 million use some form of business analytics.
Some common types of decisions that can be enhanced by using analytics include:
Pricing (for example, setting prices for consumer and industrial goods, government contracts, and maintenance contracts)
Customer segmentation (for example, identifying and targeting key customer groups in retail, insurance, and credit card industries)
Merchandising (for example, determining brands to buy, quantities, and allocations)
Location (for example, finding the best location for bank branches and ATMs, or where to service industrial equipment)

Analytics and Data Science

Business Analytics:
Business analytics is the statistical study of business data to gain insights.
Uses mostly structured data.
Does not involve much coding; it is more statistics oriented.
The whole analysis is based on statistical concepts.
Studies trends and patterns specific to business.
Top industries where business analytics is used: finance, healthcare, marketing, retail, supply chain, telecommunications.

Data Science:
Data science is the study of data using statistics, algorithms and technology.
Uses both structured and unstructured data.
Coding is widely used; this field is a combination of traditional analytics practice with good computer science knowledge.
Statistics is used at the end of the analysis, following coding.
Studies almost every trend and pattern related to technology.
Top industries/applications where data science is used: e-commerce, finance, machine learning, manufacturing.

What is the concept of business analytics?
Business analytics bridges the gap between information technology and business by using analytics to provide data-driven recommendations. The business part requires deep business understanding, while the analytics part requires an understanding of data, statistics and computer science.

What does a business analyst do?
A business analyst acts as a communicator, facilitator and mediator, and seeks the best ways to improve processes and increase effectiveness through technology, strategy, analytic solutions, and more.

What are the skills of a business analyst?
Interpretation: Business managers manage a vast amount of data. As a business analyst, you should have the ability to clean data and make it useful for interpretation.
Data visualization and storytelling: Data visualization is an evolving discipline, and Tableau defines data visualization as a graphical representation of data and information. A business analyst uses visual elements such as charts, graphs and maps, and provides an accessible way to see and understand trends, outliers and patterns in data.
Analytical reasoning ability: Consists of logical reasoning, critical thinking, communication, research and data analysis. A business analyst requires these to apply descriptive, predictive and prescriptive analytics in business situations to solve business problems.
Mathematical and statistical skills: The ability to collect, organize and interpret numerical data is used for modeling, inference, estimation and forecasting in business analytics.
Written and communication skills: Strong communication skills make it easier to influence the management team, recommend improvements and increase business opportunities.

What is data science?
Data science is the study of data using statistics, algorithms and technology. It is the process of using data to find solutions and predict outcomes for a problem statement.

What does a data scientist do?
Data scientists apply machine-learning algorithms to numbers, text, images, videos and audio, and draw various understandings from them. According to Hugo Bowne-Anderson, writing in the Harvard Business Review, "Data scientists lay a solid data foundation in order to perform robust analytics. Then they use online experiments, among other methods, to achieve sustainable growth." Finally, they build machine learning pipelines and personalized data products to better understand their business and customers and to make better decisions. In other words, in tech, data science is about infrastructure, testing, machine learning, decision-making, and data products.

What are the skills of a data scientist?
The core skills required in data science are as follows:
Statistical analysis: You should be familiar with statistical tests and likelihood estimators for a keen sense of pattern and anomaly detection.
Computer science and programming: Data scientists encounter massive datasets. To uncover answers to problems, you will have to write computer programs and should be proficient in programming languages such as Python, R and SQL.
Machine learning: As a data scientist, you should be familiar with algorithms and statistical models that enable a computer to learn from data automatically.
Multivariable calculus and linear algebra: This mathematical knowledge is needed for building a machine learning model.
Data visualization and storytelling: After you have the data, you have to present your findings. Data scientists use data visualization tools to communicate and describe actionable insights to technical and non-technical audiences.

Business Intelligence vs. Analytics: (comparison figure omitted)

Business Analytics Life Cycle:
Gather: Extract data from your database through advanced queries using Structured Query Language (SQL), which supports extracting, transforming and loading data in preparation for analytics model development.
Clean: Data cleaning or cleansing identifies and removes errors, corruptions, inconsistencies or outliers that can affect data accuracy and, ultimately, its functionality.
Visualize and analyze the data: Data visualization, presented in statistical graphs, plots, charts or other visual contexts, descriptively summarizes the information so you can apply the right structured data model for analysis. Specialized analytics systems based on your model can provide the analysis and statistics.
Statistics and algorithms: This can include correlation analysis, hypothesis testing, and regression analysis to see whether simple predictions can be made.
Machine learning: Decision trees, neural networks and logistic regression are all machine learning-based predictive analytics techniques that can turn data into proactive solutions. As you go through data analysis, different comparisons can derive different insights (i.e., applying differing machine learning approaches). With enhanced analysis capabilities, you can realize new opportunities and develop innovative solutions to your business questions.
Validate the analysis: Ready to ask more questions? Examining "Is it accurate?", "Is it appropriate?" and running "what-if" scenarios will help you determine whether your analysis is valid and on the right track. Statistical analysis, inference and predictive modeling are applied to define and validate target parameters, leading to an optimal solution and the model most aligned to your business goal.
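To make the life cycle concrete, here is a minimal Python sketch of the gather, clean, visualize, model and validate steps. It assumes a hypothetical sales.csv file with ad_spend and revenue columns; pandas, matplotlib and scikit-learn stand in for whichever tools your team actually uses.

```python
# Minimal sketch of the analytics life cycle: gather, clean,
# visualize, model, validate. The file and column names are
# illustrative assumptions, not a fixed convention.
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Gather: in practice this is often a SQL extract; here, a flat file.
df = pd.read_csv("sales.csv")

# Clean: drop duplicates and rows with missing values.
df = df.drop_duplicates().dropna(subset=["ad_spend", "revenue"])

# Visualize and analyze: a scatter plot to eyeball the relationship.
df.plot.scatter(x="ad_spend", y="revenue")
plt.savefig("ad_spend_vs_revenue.png")

# Statistics / machine learning: a simple regression model.
X, y = df[["ad_spend"]], df["revenue"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LinearRegression().fit(X_train, y_train)

# Validate: check accuracy on held-out data before trusting the model.
print("R^2 on held-out data:", r2_score(y_test, model.predict(X_test)))
```

The held-out test split is the "validate the analysis" step in miniature: the model is only trusted to the extent it predicts data it was not fitted on.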
Types of Analytics:
Business analytics begins with the collection, organization, and manipulation of data and is supported by three major components:

Descriptive analytics. Most businesses start with descriptive analytics: the use of data to understand past and current business performance and make informed decisions. Descriptive analytics is the most commonly used and most well-understood type of analytics. These techniques categorize, characterize, consolidate, and classify data to convert it into useful information for the purposes of understanding and analyzing business performance. Descriptive analytics summarizes data into meaningful charts and reports, for example, about budgets, sales, revenues, or costs. This process allows managers to obtain standard and customized reports, then drill down into the data and make queries to understand, for example, the impact of an advertising campaign, review business performance to find problems or areas of opportunity, and identify patterns and trends in data. Typical questions that descriptive analytics helps answer are "How much did we sell in each region?" "What was our revenue and profit last quarter?" "How many and what types of complaints did we resolve?" "Which factory has the lowest productivity?" Descriptive analytics also helps companies classify customers into different segments, which enables them to develop specific marketing campaigns and advertising strategies.

Predictive analytics. Predictive analytics seeks to predict the future by examining historical data, detecting patterns or relationships in these data, and then extrapolating these relationships forward in time. For example, a marketer might wish to predict the response of different customer segments to an advertising campaign, a commodities trader might wish to predict short-term movements in commodities prices, or a skiwear manufacturer might want to predict next season's demand for skiwear of a specific color and size. Predictive analytics can predict risk and find relationships in data not readily apparent with traditional analyses. Using advanced techniques, predictive analytics can help detect hidden patterns in large quantities of data, segmenting and grouping data into coherent sets to predict behavior and detect trends. For instance, a bank manager might want to identify the most profitable customers, predict the chances that a loan applicant will default, or alert a credit-card customer to a potentially fraudulent charge. Predictive analytics helps to answer questions such as "What will happen if demand falls by 10% or if supplier prices go up 5%?" "What do we expect to pay for fuel over the next several months?" "What is the risk of losing money in a new business venture?"

Prescriptive analytics. Many problems, such as aircraft or employee scheduling and supply chain design, simply involve too many choices or alternatives for a human decision maker to effectively consider. Prescriptive analytics uses optimization to identify the best alternatives to minimize or maximize some objective. Prescriptive analytics is used in many areas of business, including operations, marketing, and finance. For example, we may determine the best pricing and advertising strategy to maximize revenue, the optimal amount of cash to store in ATMs, or the best mix of investments in a retirement portfolio to manage risk. The mathematical and statistical techniques of predictive analytics can also be combined with optimization to make decisions that take into account the uncertainty in the data. Prescriptive analytics addresses questions such as "How much should we produce to maximize profit?" "What is the best way of shipping goods from our factories to minimize costs?" "Should we change our plans if a natural disaster closes a supplier's factory, and if so, by how much?"
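As a concrete illustration of prescriptive analytics as optimization, the sketch below chooses a product mix to maximize profit under capacity limits. The profit margins and constraint figures are invented for illustration, and SciPy's linear programming solver stands in for whatever optimizer a business would actually use.

```python
# Prescriptive analytics sketch: choose quantities x1, x2 of two
# products to maximize profit subject to capacity constraints.
# All numbers below are invented illustrative data.
from scipy.optimize import linprog

# Maximize 40*x1 + 30*x2  ->  linprog minimizes, so negate the profits.
profit = [-40, -30]

# Constraints: machine hours 2*x1 + 1*x2 <= 100, labor x1 + 3*x2 <= 90.
A_ub = [[2, 1], [1, 3]]
b_ub = [100, 90]

result = linprog(c=profit, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2)
print("Optimal mix:", result.x, "maximum profit:", -result.fun)
```

This directly answers a prescriptive question of the "what is the best product mix to maximize profit, given fixed capacity?" kind.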
Business Problem Definition:
Sometime during their lives, most Americans will receive a mortgage loan for a house or condominium. The process starts with an application. The application contains all pertinent information about the borrower that the lender will need. The bank or mortgage company then initiates a process that leads to a loan decision. It is here that key information about the borrower is provided by third-party providers. This information includes a credit report, verification of income, verification of assets, verification of employment, and an appraisal of the property, among others. The result of the processing function is a complete loan file that contains all the information and documents needed to underwrite the loan, which is the next step in the process.

Underwriting is where the loan application is evaluated for its risk. Underwriters evaluate whether the borrower can make payments on time, can afford to pay back the loan, and has sufficient collateral in the property to back up the loan. In the event the borrower defaults on the loan, the lender can sell the property to recover the amount of the loan. But if the amount of the loan is greater than the value of the property, then the lender cannot recoup its money. If the underwriting process indicates that the borrower is creditworthy, has the capacity to repay the loan, and the value of the property in question is greater than the loan amount, then the loan is approved and will move to closing. Closing is the step where the borrower signs all the appropriate papers agreeing to the terms of the loan.

In reality, lenders have a lot of other work to do. First, they must perform a quality control review on a sample of the loan files, which involves a manual examination of all the documents and information gathered. This process is designed to identify any mistakes that may have been made or information that is missing from the loan file. Because lenders do not have unlimited money to lend to borrowers, they frequently sell the loan to a third party so that they have fresh capital to lend to others. This occurs in what is called the secondary market. Freddie Mac and Fannie Mae are the two largest purchasers of mortgages in the secondary market. The final step in the process is servicing. Servicing includes all the activities associated with providing customer service on the loan, such as processing payments, managing property taxes held in escrow, and answering questions about the loan. In addition, the institution collects various operational data on the process to track its performance and efficiency, including the number of applications, loan types and amounts, cycle times (time to close the loan), bottlenecks in the process, and so on.
Many different types of analytics are used:

Descriptive analytics. This focuses on historical reporting, addressing such questions as:
How many loan applications were taken in each of the past 12 months?
What was the total cycle time from application to close?
What was the distribution of loan profitability by credit score and loan-to-value (LTV), which is the mortgage amount divided by the appraised value of the property?

Predictive analytics. Predictive modeling uses mathematical, spreadsheet, and statistical models, and addresses questions such as:
What impact on loan volume will a given marketing program have?
How many processors or underwriters are needed for a given loan volume?
Will a given process change reduce cycle time?

Prescriptive analytics. This involves the use of simulation or optimization to drive decisions. Typical questions include:
What is the optimal staffing to achieve a given profitability constrained by a fixed cycle time?
What is the optimal product mix to maximize profit constrained by fixed staffing?

The mortgage market has become much more dynamic in recent years due to rising home values, falling interest rates, new loan products, and an increased desire by home owners to utilize the equity in their homes as a financial resource. This has increased the complexity and variability of the process and the need for lenders to proactively use the data that are available to them as a tool for managing their business. To ensure that the process is efficient, effective and performed with quality, data and analytics are used every day to track what is done, who is doing it, and how long it takes.

Data for Business Analytics:
Data: numerical or textual facts and figures that are collected through some type of measurement process.
Information: the result of analyzing data; that is, extracting meaning from data to support evaluation and decision making.
Examples of data sources and uses: (figure omitted)

Data Collection
Data collection is the methodological process of gathering information about a specific subject. In general, there are three types of consumer data:
First-party data, which is collected directly from users by your organization
Second-party data, which is data shared by another organization about its customers (or its first-party data)
Third-party data, which is data that's been aggregated and rented or sold by organizations that don't have a connection to your company or users

Although there are use cases for second- and third-party data, first-party data (data you've collected yourself) is more valuable because you receive information about how your audience behaves, thinks, and feels, all from a trusted source. Data can be qualitative (meaning contextual in nature) or quantitative (meaning numeric in nature). Many data collection methods apply to either type, but some are better suited to one over the other.

In the data life cycle, data collection is the second step. After data is generated, it must be collected to be of use to your team. After that, it can be processed, stored, managed, analyzed, and visualized to aid in your organization's decision-making. The data collection method you select should be based on the question you want to answer, the type of data you need, your timeframe, and your company's budget.

There are seven data collection methods used in business analytics:

1. Surveys
Surveys are physical or digital questionnaires that gather both qualitative and quantitative data from subjects. One situation in which you might conduct a survey is gathering attendee feedback after an event.
This can provide a sense of what attendees enjoyed, what they wish was different, and areas where you can improve or save money during your next event for a similar audience. Because they can be sent out physically or digitally, surveys present the opportunity for distribution at scale. They can also be inexpensive; running a survey can cost nothing if you use a free tool. If you wish to target a specific group of people, partnering with a market research firm to get the survey into the hands of that demographic may be worth the money. Something to watch out for when crafting and running surveys is the effect of bias, including:

Collection bias: It can be easy to accidentally write survey questions with a biased lean. Watch out for this when creating questions to ensure your subjects answer honestly and aren't swayed by your wording.
Subject bias: Because your subjects know their responses will be read by you, their answers may be biased toward what seems socially acceptable. For this reason, consider pairing survey data with behavioral data from other collection methods to get the full picture.

2. Transactional Tracking
Each time your customers make a purchase, tracking that data can allow you to make decisions about targeted marketing efforts and understand your customer base better. Often, e-commerce and point-of-sale platforms allow you to store data as soon as it's generated, making this a seamless data collection method that can pay off in the form of customer insights.

3. Interviews and Focus Groups
Interviews and focus groups consist of talking to subjects face-to-face about a specific topic or issue. Interviews tend to be one-on-one, while focus groups are typically made up of several people. You can use both to gather qualitative and quantitative data. Through interviews and focus groups, you can gather feedback from people in your target audience about new product features. Seeing them interact with your product in real time and recording their reactions and responses to questions can provide valuable data about which product features to pursue.

4. Observation
Observing people interacting with your website or product can be useful for data collection because of the candor it offers. If your user experience is confusing or difficult, you can witness it in real time. Yet setting up observation sessions can be difficult. You can use a third-party tool to record users' journeys through your site, or observe a user's interaction with a beta version of your site or product. While less accessible than other data collection methods, observations enable you to see firsthand how users interact with your product or site. You can leverage the qualitative and quantitative data gleaned from this to make improvements and double down on points of success.

5. Online Tracking
To gather behavioral data, you can implement pixels and cookies. These are both tools that track users' online behavior across websites and provide insight into what content they're interested in and typically engage with. You can also track users' behavior on your company's website, including which parts are of the highest interest, whether users are confused when using it, and how long they spend on product pages. This can enable you to improve the website's design and help users navigate to their destination. Inserting a pixel is often free and relatively easy to set up. Implementing cookies may come with a fee, but it could be worth it for the quality of data you'll receive.
Once pixels and cookies are set, they gather data on their own and don't need much maintenance, if any. It's important to note: tracking online behavior can have legal and ethical privacy implications. Before tracking users' online behavior, ensure you're in compliance with local and industry data privacy standards.

6. Forms
Online forms are beneficial for gathering qualitative data about users, specifically demographic data or contact information. They're relatively inexpensive and simple to set up, and you can use them to gate content or registrations, such as webinars and email newsletters. You can then use this data to contact people who may be interested in your product, build out demographic profiles of existing customers, and assist in remarketing efforts, such as email workflows and content recommendations.

7. Social Media Monitoring
Monitoring your company's social media channels for follower engagement is an accessible way to track data about your audience's interests and motivations. Many social media platforms have analytics built in, but there are also third-party social platforms that give more detailed, organized insights pulled from multiple channels. You can use data collected from social media to determine which issues are most important to your followers. For instance, you may notice that the number of engagements dramatically increases when your company posts about its sustainability efforts.

Data Preparation
What is data preparation in business analytics?
Data preparation is the process of gathering, combining, structuring and organizing data so it can be used in business intelligence (BI), analytics and data visualization applications. Data preparation, also sometimes called "pre-processing," is the act of cleaning and consolidating raw data prior to using it for business analysis. It might not be the most celebrated of tasks, but careful data preparation is a key component of successful data analysis. Data preparation is often referred to informally as data prep. It's also known as data wrangling, although some practitioners use that term in a narrower sense, to refer to cleansing, structuring and transforming data; that usage distinguishes data wrangling from the data preprocessing stage.

Purposes of data preparation
One of the primary purposes of data preparation is to ensure that raw data being readied for processing and analysis is accurate and consistent, so the results of BI and analytics applications will be valid. Data is commonly created with missing values, inaccuracies or other errors, and separate data sets often have different formats that need to be reconciled when they're combined. Correcting data errors, validating data quality and consolidating data sets are big parts of data preparation projects.

Data preparation also involves finding relevant data to ensure that analytics applications deliver meaningful information and actionable insights for business decision-making. The data is often enriched and optimized to make it more informative and useful; for example, by blending internal and external data sets, creating new data fields, eliminating outlier values and addressing imbalanced data sets that could skew analytics results. In addition, BI and data management teams use the data preparation process to curate data sets for business users to analyze. Doing so helps streamline and guide self-service BI applications for business analysts, executives and workers.

What are the benefits of data preparation?
Data scientists often complain that they spend most of their time gathering, cleansing and structuring data instead of analyzing it. A big benefit of an effective data preparation process is that they and other end users can focus more on data mining and data analysis, the parts of their job that generate business value. For example, data preparation can be done more quickly, and prepared data can automatically be fed to users for recurring analytics applications.

Done properly, data preparation also helps an organization do the following:
ensure the data used in analytics applications produces reliable results;
identify and fix data issues that otherwise might not be detected;
enable more informed decision-making by business executives and operational workers;
reduce data management and analytics costs;
avoid duplication of effort in preparing data for use in multiple applications;
get a higher ROI from BI and analytics initiatives.

Steps in the data preparation process
Data preparation is done in a series of steps. There's some variation in the data preparation steps listed by different data professionals and software vendors, but the process typically involves the following tasks (a code sketch follows the list):

Data collection. Relevant data is gathered from operational systems, data warehouses, data lakes and other data sources. During this step, data scientists, members of the BI team, other data professionals and end users who collect data should confirm that it's a good fit for the objectives of the planned analytics applications.
Data discovery and profiling. The next step is to explore the collected data to better understand what it contains and what needs to be done to prepare it for the intended uses. To help with that, data profiling identifies patterns, relationships and other attributes in the data, as well as inconsistencies, anomalies, missing values and other issues, so they can be addressed.
Data cleansing. Next, the identified data errors and issues are corrected to create complete and accurate data sets. For example, as part of cleansing data sets, faulty data is removed or fixed, missing values are filled in and inconsistent entries are harmonized.
Data structuring. At this point, the data needs to be modeled and organized to meet the analytics requirements. For example, data stored in comma-separated values (CSV) files or other file formats has to be converted into tables to make it accessible to BI and analytics tools.
Data transformation and enrichment. In addition to being structured, the data typically must be transformed into a unified and usable format. For example, data transformation may involve creating new fields or columns that aggregate values from existing ones. Data enrichment further enhances and optimizes data sets as needed, through measures such as augmenting and adding data.
Data validation and publishing. In this last step, automated routines are run against the data to validate its consistency, completeness and accuracy. The prepared data is then stored in a data warehouse, a data lake or another repository, and is either used directly by whoever prepared it or made available for other users to access.
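Here is a minimal pandas sketch of the profiling, cleansing, structuring, transformation and validation steps above, assuming a hypothetical orders_raw.csv file with invented column names (order_date, amount, region).

```python
# Data preparation sketch: profile, cleanse, structure, transform,
# validate. File and column names are illustrative assumptions.
import pandas as pd

raw = pd.read_csv("orders_raw.csv")

# Discovery and profiling: inspect types, missing values, statistics.
print(raw.dtypes, raw.isna().sum(), raw.describe(), sep="\n")

# Cleansing: remove duplicates, fill missing amounts, harmonize labels.
clean = raw.drop_duplicates().copy()
clean["amount"] = clean["amount"].fillna(clean["amount"].median())
clean["region"] = clean["region"].str.strip().str.title()

# Structuring / transformation: parse dates, add an aggregate field.
clean["order_date"] = pd.to_datetime(clean["order_date"], errors="coerce")
clean["order_month"] = clean["order_date"].dt.to_period("M")

# Validation and publishing: assert basic consistency, then store.
assert clean["amount"].ge(0).all(), "negative order amounts found"
clean.to_csv("orders_prepared.csv", index=False)
```

In a real pipeline the final step would publish to a warehouse or data lake rather than a local file; the point is that validation runs before anyone consumes the prepared data.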
What are the challenges of data preparation?
Data preparation is inherently complicated. Data sets pulled together from different source systems are highly likely to have numerous data quality, accuracy and consistency issues to resolve. The data also must be manipulated to make it usable, and irrelevant data needs to be weeded out. As noted above, it's a time-consuming process: the 80/20 rule is often applied to analytics applications, with about 80% of the work said to be devoted to collecting and preparing data and only 20% to analyzing it. Common challenges include:

Inadequate or nonexistent data profiling. If data isn't properly profiled, errors, anomalies and other problems might not be identified, which can result in flawed analytics.
Missing or incomplete data. Data sets often have missing values and other forms of incomplete data; such issues need to be assessed as possible errors and addressed if so.
Invalid data values. Misspellings, other typos and wrong numbers are examples of invalid entries that frequently occur in data and must be fixed to ensure analytics accuracy.
Name and address standardization. Names and addresses may be inconsistent in data from different systems, with variations that can affect views of customers and other entities.
Inconsistent data across enterprise systems. Other inconsistencies in data sets drawn from multiple source systems, such as different terminology and unique identifiers, are also a pervasive issue in data preparation efforts.
Data enrichment. Deciding how to enrich a data set (for example, what to add to it) is a complex task that requires a strong understanding of business needs and analytics goals.
Maintaining and expanding data prep processes. Data preparation work often becomes a recurring process that needs to be sustained and enhanced on an ongoing basis.

Hypothesis Generation
What is hypothesis generation?
Hypothesis generation is an educated "guess" at the various factors that are impacting the business problem that needs to be solved using machine learning. In short, you make wise assumptions about how certain factors would affect the target variable and, in the process that follows, you try to prove or disprove them using various statistical and graphical tools.

Hypothesis generation in business analytics involves formulating potential explanations or predictions that can be tested using data. This process is crucial for making informed decisions, identifying opportunities, and solving business problems. Here's a detailed description:

1. Understanding the Context
This is the foundational step, where you gather all the necessary background information about the business and the specific problem or opportunity you are addressing. It involves:
Identifying the problem or opportunity: Clearly articulate the business challenge or the opportunity you want to explore. For instance, the problem could be declining sales, and the opportunity could be expanding into a new market.
Understanding the business environment: Gain a comprehensive understanding of the industry, market trends, competitors, and internal business processes. This helps in framing relevant and realistic hypotheses.

2. Data Exploration
Data exploration involves a thorough examination of the available data to gain insights and identify patterns that may inform your hypotheses. This step includes:
Descriptive analysis: Summarize and describe the main features of the data using measures like mean, median, mode, and standard deviation.
Data visualization: Create visual representations of the data, such as charts, graphs, and heatmaps, to easily spot trends and anomalies.
Statistical analysis: Apply statistical techniques to understand relationships between variables, distributions, and potential outliers.
3. Brainstorming Potential Hypotheses
Based on the understanding of the context and initial data exploration, you brainstorm potential hypotheses. Hypotheses should be:
Specific: Clearly defined and narrow in scope.
Testable: Formulated in a way that they can be tested using available data and analytical methods.
Relevant: Directly related to the business problem or opportunity.
Examples: "Offering a 20% discount on weekends will increase sales by 15%." "Customers who receive personalized email marketing will have a 25% higher retention rate."

4. Prioritizing Hypotheses
Since not all hypotheses can be tested at once due to resource constraints, prioritize them based on:
Relevance to business goals: Hypotheses that align closely with strategic objectives should be given priority.
Feasibility: Consider the availability of data, tools, and resources required to test the hypothesis.
Impact: Evaluate the potential impact on the business if the hypothesis is confirmed.

5. Designing Experiments or Analytical Tests
Design a plan to test the prioritized hypotheses. This involves:
Defining metrics: Establish specific, quantifiable metrics to measure success, for instance, an increase in sales or a reduction in churn rate.
Selecting a methodology: Choose appropriate methods such as A/B testing, regression analysis, or controlled experiments.
Collecting data: Ensure that you have or can collect the necessary data. This might involve setting up new data collection processes or utilizing existing data sources.

6. Analyzing Results
After conducting the experiments or tests, analyze the results to evaluate the hypotheses. This step includes:
Statistical testing: Use statistical methods to determine the significance of the results. Techniques like t-tests, chi-square tests, and ANOVA might be used.
Interpreting results: Understand what the data reveals about the hypothesis. Are the results in line with expectations? Are there any unexpected findings?
Drawing conclusions: Decide whether to accept or reject the hypothesis based on the analysis.

7. Iteration and Refinement
Hypothesis generation is an ongoing process. Based on the results of the tests:
Refine hypotheses: Adjust existing hypotheses or generate new ones if initial hypotheses are not supported by the data.
Implement changes: If a hypothesis is confirmed, take the necessary actions to implement the findings in the business operations.
Continuous monitoring: Keep track of the impact of implemented changes and iterate the process as needed for continuous improvement.

8. Communicating Findings
Effectively communicate the results and insights gained from hypothesis testing to relevant stakeholders. This involves:
Reporting: Create detailed reports that outline the methodology, analysis, and conclusions. Reports should be clear, concise, and tailored to the audience.
Visual presentations: Use visual aids like charts, graphs, and infographics to present findings in an easily understandable format.
Actionable recommendations: Provide clear and actionable recommendations based on the results, outlining the next steps for the business.

Example Scenario: Improving Customer Retention
Context: The company wants to improve customer retention rates.
Data Exploration: Analyze customer purchase history, feedback, and churn rates to identify patterns.
Hypotheses: "Offering personalized discounts to frequent customers will reduce churn by 15%."
Prioritization: This hypothesis is prioritized due to its direct impact on retention and feasibility.
Designing Tests: Implement a personalized discount campaign and measure its effect on retention rates.
Analyzing Results: Use statistical tests to compare retention rates before and after the campaign.
Iteration: Refine the approach based on results and continuously monitor retention rates.
Communication: Present findings and recommendations to the marketing and sales teams.

By following this structured process, businesses can effectively address challenges and capitalize on opportunities through data-driven insights.
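As a hedged illustration of the "Analyzing Results" step in a scenario like this, a two-sample t-test can compare a metric between a discount group and a control group. The numbers below are invented sample data, not results from any real campaign.

```python
# Hypothesis-testing sketch: did the discount group spend more on
# average than the control group? Data values are invented.
from scipy import stats

control = [12.1, 10.4, 11.8, 9.9, 10.7, 11.2, 10.1, 11.5]
discount = [13.0, 12.2, 12.8, 11.9, 13.4, 12.5, 12.9, 13.1]

t_stat, p_value = stats.ttest_ind(discount, control)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# Accept or reject the null hypothesis of "no effect" at the 5% level.
if p_value < 0.05:
    print("Evidence that the discount changed average spend.")
else:
    print("No significant difference detected.")
```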
Data Validation
Data validation is related to data quality. Data validation can be a component of measuring data quality, ensuring that a given data set is supplied from information sources that are of the highest quality, authoritative and accurate. Data validation is also used as part of application workflows, including spell checking and rules for strong password creation.

Why validate data?
For data scientists, data analysts and others working with data, validating it is very important. The output of any given system can only be as good as the data the operation is based on. These operations can include machine learning or artificial intelligence models, data analytics reports and business intelligence dashboards. Validating the data ensures that the data is accurate, which means all systems relying on a validated data set will be as well. Data validation is also important for data to be useful for an organization or for a specific application operation. For example, if data is not in the right format to be consumed by a system, then the data can't be used easily, if at all. As data moves from one location to another, different needs for the data arise based on the context in which the data is being used. Data validation ensures that the data is correct for specific contexts. The right type of data validation makes the data useful.

What are the different types of data validation?
Multiple types of data validation are available to ensure that the right data is being used. The most common types of data validation include the following:
Data type validation is common and confirms that the data in each field, column, list, range or file matches a specified data type and format.
Constraint validation checks whether a given data field input fits a specified requirement within certain ranges. For example, it verifies that a data field has a minimum or maximum number of characters.
Structured validation ensures that data is compliant with a specified data format, structure or schema.
Consistency validation makes sure data styles are consistent. For example, it confirms that all values are listed to two decimal points.
Code validation is similar to a consistency check and confirms that codes used for different data inputs are correct.

Performing data validation
Among the most basic and common ways that data is used is within a spreadsheet program such as Microsoft Excel or Google Sheets. In both Excel and Sheets, the data validation process is a straightforward, integrated feature; both have a menu item listed as Data > Data Validation. By selecting the Data Validation menu, a user can choose the specific data type or constraint validation required for a given file or data range. ETL (extract, transform and load) and data integration tools typically integrate data validation policies to be executed as data is extracted from one source and then loaded into another. Popular open source tools, such as dbt, also include data validation options and are commonly used for data transformation. Data validation can also be done programmatically in an application context for an input value. For example, as an input variable such as a password is sent, a script can check that it meets constraint validation for the required length.
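Below is a small sketch of programmatic validation that mirrors the type, constraint and consistency categories above. The field names and rules are invented for illustration, not taken from any particular system.

```python
# Programmatic data validation sketch: type, constraint and
# consistency checks on a record. Fields and rules are invented.
def validate_record(record: dict) -> list[str]:
    errors = []

    # Data type validation: age must be an integer.
    if not isinstance(record.get("age"), int):
        errors.append("age must be an integer")

    # Constraint validation: password length within a set range.
    pwd = record.get("password", "")
    if not 8 <= len(pwd) <= 64:
        errors.append("password must be 8-64 characters")

    # Consistency validation: amounts stored with two decimal places.
    amount = str(record.get("amount", "0.00"))
    whole, _, frac = amount.partition(".")
    if len(frac) != 2:
        errors.append("amount must have exactly two decimal places")

    return errors

# This deliberately invalid record fails all three checks.
print(validate_record({"age": "30", "password": "short", "amount": "9.5"}))
```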
Interpretation
What is data interpretation?
Data interpretation refers to the process of using diverse analytical methods to review data and arrive at relevant conclusions. The interpretation of data helps researchers to categorize, manipulate, and summarize the information in order to answer critical questions.

The importance of data interpretation is evident, and this is why it needs to be done properly. Data is very likely to arrive from multiple sources and has a tendency to enter the analysis process with haphazard ordering. Data analysis tends to be extremely subjective; that is to say, the nature and goal of interpretation will vary from business to business, likely correlating to the type of data being analyzed. While there are several different types of processes that are implemented based on the nature of the individual data, the two broadest and most common categories are "quantitative analysis" and "qualitative analysis." Yet before any serious data interpretation inquiry can begin, it should be understood that visual presentations of data findings are irrelevant unless a sound decision is made regarding scales of measurement.

Before any serious data analysis can begin, the scale of measurement must be decided for the data, as this will have a long-term impact on data interpretation ROI. The varying scales include:
Nominal scale: non-numeric categories that cannot be ranked or compared quantitatively. Variables are exclusive and exhaustive.
Ordinal scale: categories that are exclusive and exhaustive but with a logical order. Quality ratings and agreement ratings are examples of ordinal scales (i.e., good, very good, fair, etc., or agree, strongly agree, disagree, etc.).
Interval scale: a measurement scale where data is grouped into categories with orderly and equal distances between the categories. There is always an arbitrary zero point.
Ratio scale: contains the features of all three scales, with a meaningful (non-arbitrary) zero point.

Once scales of measurement have been selected, it is time to select which of the two broad interpretation processes will best suit your data needs.
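As a small illustration, an ordinal scale can be represented in code with an ordered pandas categorical, which preserves the logical order that plain text labels lack. The ratings below are invented sample responses.

```python
# Ordinal scale sketch: ordered categoricals can be ranked and
# compared, unlike nominal labels. Sample values are invented.
import pandas as pd

ratings = pd.Categorical(
    ["good", "very good", "fair", "good", "very good"],
    categories=["fair", "good", "very good"],  # the logical order
    ordered=True,
)
s = pd.Series(ratings)

print(s.min(), s.max())     # fair, very good
print((s >= "good").sum())  # count of responses rated "good" or better
```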
How to Interpret Data
When interpreting data, an analyst must try to discern the differences between correlation, causation, and coincidence, as well as many other biases, while also considering all the factors involved that may have led to a result. There are various data interpretation methods one can use. The interpretation of data is designed to help people make sense of numerical data that has been collected, analyzed, and presented. Having a baseline method (or methods) for interpreting data will provide your analyst teams with a structure and a consistent foundation. Indeed, if several departments have different approaches to interpreting the same data while sharing the same goals, mismatched objectives can result. Disparate methods will lead to duplicated efforts, inconsistent solutions, wasted energy and, inevitably, time and money. In this part, we will look at the two main methods of interpretation of data: qualitative and quantitative analysis.

Qualitative Data Interpretation
Qualitative data analysis can be summed up in one word: categorical. With qualitative analysis, data is not described through numerical values or patterns, but through the use of descriptive context (i.e., text). Typically, narrative data is gathered by employing a wide variety of person-to-person techniques. These techniques include:
Observations: detailing behavioral patterns that occur within an observation group. These patterns could be the amount of time spent in an activity, the type of activity, and the method of communication employed.
Focus groups: grouping people and asking them relevant questions to generate a collaborative discussion about a research topic.
Secondary research: much like how patterns of behavior can be observed, different types of documentation resources can be coded and divided based on the type of material they contain.
Interviews: one of the best collection methods for narrative data. Inquiry responses can be grouped by theme, topic, or category. The interview approach allows for highly focused data segmentation.

Quantitative Data Interpretation
If quantitative data interpretation could be summed up in one word (and it really can't), that word would be "numerical." There are few certainties when it comes to data analysis, but you can be sure that if the research you are engaging in has no numbers involved, it is not quantitative research. Quantitative analysis refers to a set of processes by which numerical data is analyzed. More often than not, it involves the use of statistical modeling such as standard deviation, mean and median. Let's quickly review the most common statistical terms (a short example follows the list):
Mean: a mean represents a numerical average for a set of responses. When dealing with a data set (or multiple data sets), a mean will represent the central value of a specific set of numbers. It is the sum of the values divided by the number of values within the data set. Other terms that can be used to describe the concept are arithmetic mean, average and mathematical expectation.
Standard deviation: another statistical term commonly appearing in quantitative analysis. Standard deviation reveals the distribution of the responses around the mean. It describes the degree of consistency within the responses; together with the mean, it provides insight into data sets.
Frequency distribution: a measurement gauging the rate of a response's appearance within a data set. When using a survey, for example, frequency distribution can determine the number of times a specific ordinal-scale response appears (i.e., agree, strongly agree, disagree, etc.). Frequency distribution is extremely useful in determining the degree of consensus among data points.
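The following short sketch computes all three statistics on an invented sample of survey responses (satisfaction scores on a 1-5 ordinal scale).

```python
# Mean, standard deviation and frequency distribution of an invented
# sample of survey satisfaction scores.
import pandas as pd

scores = pd.Series([4, 5, 3, 4, 2, 5, 4, 3, 4, 5])

print("mean:", scores.mean())              # central value of the responses
print("standard deviation:", scores.std()) # spread around the mean
print("frequency distribution:")
print(scores.value_counts().sort_index())  # how often each score appears
```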
Typically, quantitative data is measured by visually presenting correlation tests between two or more variables of significance. Different processes can be used together or separately, and comparisons can be made to ultimately arrive at a conclusion. Other signature interpretation processes of quantitative data include:

Regression analysis: Essentially, regression analysis uses historical data to understand the relationship between a dependent variable and one or more independent variables. Knowing which variables are related and how they developed in the past allows you to anticipate possible outcomes and make better decisions going forward. For example, if you want to predict your sales for next month, you can use regression analysis to understand what factors will affect them, such as products on sale or the launch of a new campaign, among many others.
Cohort analysis: This method identifies groups of users who share common characteristics during a particular time period. In a business scenario, cohort analysis is commonly used to understand different customer behaviors.
Predictive analysis: As its name suggests, the predictive analysis method aims to predict future developments by analyzing historical and current data. Powered by technologies such as artificial intelligence and machine learning, predictive analytics practices enable businesses to spot trends or potential issues and plan informed strategies in advance.
Prescriptive analysis: Also powered by predictions, the prescriptive analysis method uses techniques such as graph analysis, complex event processing and neural networks, among others, to try to unravel the effect that future decisions will have, in order to adjust them before they are actually made. This helps businesses develop responsive, practical business strategies.
Conjoint analysis: Typically applied to survey analysis, the conjoint approach is used to analyze how individuals value different attributes of a product or service. This helps researchers and businesses to define pricing, product features, packaging, and many other attributes. A common use is menu-based conjoint analysis, in which individuals are given a "menu" of options from which they can build their ideal concept or product. In this way, analysts can understand which attributes people would pick above others and draw conclusions.
Cluster analysis: Last but not least, cluster analysis is a method used to group objects into categories (see the sketch after this list). Since there is no target variable when using cluster analysis, it is a useful method for finding hidden trends and patterns in the data. In a business context, clustering is used for audience segmentation to create targeted experiences, and in market research it is often used to identify age groups, geographical information, earnings, and so on.
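As one concrete illustration, here is a small cluster-analysis sketch that segments customers by age and annual spend using k-means. The data points are invented, and k-means stands in for whichever clustering method a study would actually choose.

```python
# Cluster analysis sketch: segment customers by age and annual spend.
# All data points are invented illustrative values.
import numpy as np
from sklearn.cluster import KMeans

customers = np.array([
    [22, 500], [25, 650], [47, 3200], [52, 2900],
    [31, 1200], [29, 1100], [60, 4000], [58, 3700],
])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(customers)
for point, label in zip(customers, kmeans.labels_):
    print(point, "-> segment", label)
```

Note that there is no target variable here: the segments emerge from the data itself, which is exactly the "hidden trends and patterns" point made above.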
Why Data Interpretation Is Important
The purpose of collection and interpretation is to acquire useful and usable information and to make the most informed decisions possible. From businesses to newlyweds researching their first home, data collection and interpretation provide limitless benefits for a wide range of institutions and individuals.

Data analysis and interpretation, regardless of the method and qualitative/quantitative status, may include the following characteristics:
Data identification and explanation
Comparing and contrasting of data
Identification of data outliers
Future predictions

Deployment and Iteration
Iterative Development Overview: DSDM (Dynamic Systems Development Method)
Iterative development is a process in which the Evolving Solution, or a part of it, evolves from a high-level concept to something with acknowledged business value. Each cycle of the process is intended to bring the part of the solution being worked on closer to completion and is always a collaborative process, typically involving two or more members of the Solution Development Team. Each cycle should:
Be as short as possible, typically taking a day or two, with several cycles happening within a Timebox
Be only as formal as it needs to be, in most cases limited to an informal cycle of thought, action and conversation
Involve the appropriate members of the Solution Development Team relevant to the work being done. At its simplest, this could be, for example, a Solution Developer and a Business Ambassador working together, or it could need involvement from the whole Solution Development Team, including several Business Advisors

Each cycle begins and ends with a conversation (in accordance with DSDM's principles to collaborate and to communicate continuously and clearly). The initial conversation is focused on the detail of what needs to be done. The cycle continues with thought: a consideration of how the need will be addressed. At its most formal, this may be a collaborative planning event, but in most cases thought will be limited to a period of reflection and very informal planning. Action then refines the Evolving Solution or a feature of it. Where appropriate, action will be collaborative. Once work is completed to the extent that it can sensibly be reviewed, the cycle concludes with a return to conversation to decide whether what has been produced is good enough or whether another cycle is needed. Depending on the organization and the nature of the work being undertaken, this conversation could range from an informal agreement to a formally documented demonstration, or a "show and tell" review with a wider group of stakeholders.

As iterative development proceeds, it is important to keep the agreed acceptance criteria for the solution, or the feature of it, in clear focus, in order to ensure that the required quality is achieved without the solution becoming over-engineered. An agreed timescale for a cycle of evolution may also help maintain focus, promote collaboration and reduce the risk of wasted effort.
