Week 21 Data Analytics PDF

Summary

This document discusses the concept of data analytics, covering its different types such as descriptive, diagnostic, predictive, and prescriptive analytics. It also details the role of data analytics, describing the data analysis process with its core components: data mining, data management, statistical analysis, and data presentation.

Full Transcript

Understanding the concept of data analytics

1. What is data analytics?

Data analytics is the analysis of raw data to identify trends and answer questions, and the term covers a wide range of techniques and goals. Despite that diversity, the data analytics process comprises components that are useful across many initiatives. Combined in a successful data analytics initiative, these components give a comprehensive picture of where you are, where you have been, and where you are heading.

2. Types of Data Analytics

Data analytics consists of four fundamental types: descriptive, diagnostic, predictive, and prescriptive analytics. Each type plays a distinct role in the data analysis process and has significant applications in business.

1. Descriptive Analytics:
Objective: Address questions about past events.
Method: Summarises large datasets to present outcomes to stakeholders, using Key Performance Indicators (KPIs) for tracking. Involves data collection, processing, analysis, and visualisation. Provides crucial insights into historical performance.

2. Diagnostic Analytics:
Objective: Investigate the reasons behind events. Builds upon descriptive analytics by exploring causation.
Steps: Identifies anomalies in the data (unexpected changes). Collects data related to the anomalies. Applies statistical techniques to uncover relationships that explain the anomalies.

3. Predictive Analytics:
Objective: Address questions about future occurrences. Uses historical data to identify trends and assess their likelihood of recurrence. Employs statistical and machine learning techniques (e.g., neural networks, decision trees, regression) for predictions. Offers valuable insights into future scenarios.

4. Prescriptive Analytics:
Objective: Determine actions to be taken. Uses insights from predictive analytics for data-driven decision-making. Enables informed decisions in uncertain situations.
Leverages machine learning strategies to identify patterns in large datasets, estimating the likelihood of various outcomes.

Integrating these analytics types gives businesses comprehensive insights and supports effective, efficient decision-making. Together, they contribute to a holistic understanding of a company's needs and opportunities, empowering organisations to navigate complexity with well-informed strategies.

3. What is the Role of Data Analytics?

Data analysts operate at the intersection of information technology, statistics, and business, using their expertise in these domains to contribute to the success of businesses and organisations. Their primary objective is to enhance efficiency and boost performance by identifying patterns in data.

The role of a data analyst encompasses various tasks within the data analysis pipeline. This involves working with data through four key steps: data mining, data management, statistical analysis, and data presentation. The importance and balance of these steps depend on the data being used and the goals of the analysis.

1. Data Mining:
Definition: An essential process involving the extraction of data from diverse sources, including written text, large databases, or raw sensor data.
Key Steps: Extract, Transform, and Load (ETL) procedures convert raw data into a usable format, preparing it for storage and analysis.
Note: Data mining stands out as the most time-intensive step in the data analysis pipeline.

2. Data Management or Data Warehousing:
Definition: Involves designing and implementing databases that give easy access to the outcomes of data mining.
Approach: Typically includes the creation and management of SQL databases, with increasing use of non-relational (NoSQL) databases.

3. Statistical Analysis:
Purpose: Enables analysts to derive insights from data using statistical and machine learning techniques.
Utilisation: Big data is used to build statistical models that reveal trends, which can then be applied to new data for predictions and informed decision-making.
Tools: Statistical programming languages like R or Python (with pandas) are vital, and open-source libraries such as TensorFlow facilitate advanced analysis.

4. Data Presentation:
Final Step: Involves sharing insights with stakeholders.
Key Tool: Data visualisation plays a crucial role in presenting data, as compelling visualisations help convey the narrative within the data, helping executives and managers grasp the significance of insights.

In summary, data analysts work through these key steps, applying their skills in data mining, data management, statistical analysis, and data presentation to contribute valuable insights that drive informed decision-making within organisations.

4. Why is data analytics important?

The applications of data analytics are extensive and play a pivotal role in optimising efficiency across diverse industries. The analysis of big data is instrumental in enhancing performance, enabling businesses to thrive in a highly competitive global landscape. One notable application is in financial institutions, where data analytics is used to detect and prevent fraud, thereby improving efficiency and reducing risk.

However, the impact of data analytics extends far beyond profit maximisation and return on investment. In healthcare (specifically health informatics), crime prevention, and environmental protection, data analytics provides critical information that contributes to positive changes in our world. While statistical methods and data analysis have traditionally been integral to scientific research, the advent of advanced analytic techniques and big data has ushered in a new era of insights.
These techniques can uncover trends within complex systems, and researchers are harnessing machine learning to safeguard wildlife, highlighting the transformative potential of data analytics in various domains.

The applications of data analytics seem boundless, fuelled by the vast amounts of data collected each day. This ongoing influx of data creates new opportunities to apply analytics to different facets of business, science, and everyday life, promising continued innovation and positive impact.

5. Data analytics techniques

Different methods are available for data analysis, each serving specific purposes:

a. Regression Analysis: Estimates relationships between variables by examining how a dependent variable relates to one or more independent variables. Useful for identifying trends and patterns, aiding in predictions and forecasting.

b. Monte Carlo Simulation: A computerised technique that generates models of potential outcomes and their probability distributions. It calculates the likelihood of each outcome, serving as an advanced risk analysis tool for better future forecasting and decision-making.
To get a better understanding of Monte Carlo simulation, review the following case: A case study using Monte Carlo simulation for risk analysis

c. Factor Analysis: Reduces numerous variables to a smaller number of factors by leveraging observed correlations. This technique condenses large datasets, uncovers hidden patterns, and explores abstract concepts.
Factor analysis in action: Using factor analysis to explore customer behaviour patterns in Tehran

d. Cohort Analysis: Breaks data into related groups (cohorts) for analysis, focusing on common characteristics within a defined timespan. Provides dynamic insights into customer behaviour over time within the context of the customer lifecycle. You can learn more about how to run cohort analysis using Google Analytics here.
Cohort analysis in action: How Ticketmaster used cohort analysis to boost revenue

e. Cluster Analysis: An exploratory technique that identifies structures within a dataset by sorting data points into internally homogeneous and externally heterogeneous groups (clusters). Offers insight into data distribution and serves as a pre-processing step for other algorithms.

f. Time Series Analysis: A statistical technique that identifies trends and cycles over time in a sequence of data points measuring the same variable at different time points. Enables the forecasting of future fluctuations in the variable of interest. For an in-depth look at time series analysis, refer to this introductory study on time series modelling and forecasting.
Time series analysis in action: Developing a time series model to predict jute yarn demand in Bangladesh

g. Sentiment Analysis: A qualitative analysis of textual data that interprets and classifies the emotions conveyed. Extracts insights from written or spoken expressions, particularly from customers.
Sentiment analysis in action: 5 Real-world sentiment analysis case studies

These diverse data analysis methods provide comprehensive tools for extracting meaningful insights from datasets across various domains. The selection of a specific technique depends on the nature of the data and the objectives of the analysis.

6. The data analysis process

To extract meaningful insights from data, data analysts follow a comprehensive step-by-step process. While detailed information can be found in our guide to the data analysis process, a concise summary of the phases typically includes:

1. Defining the Question: The initial step involves framing the analysis objective, often referred to as a 'problem statement.' This entails formulating a question related to a business problem to be addressed. Identifying the relevant data sources needed to answer the question is also essential.

2. Collecting the Data: After defining the objective, the focus shifts to developing a strategy for collecting and aggregating the necessary data. Decisions involve choosing between quantitative (numeric) and qualitative (descriptive) data and determining whether the data is first-party, second-party, or third-party data.

3. Cleaning the Data: The collected data undergoes a crucial cleaning phase before analysis. This step, often the most time-consuming for a data analyst, involves removing errors, duplicates, outliers, and unwanted data points, and structuring the data by fixing typos and layout issues. Addressing major gaps in the data is also part of this phase.

4. Analysing the Data: With the cleaned data in hand, the analysis phase commences. Choosing from the various methods outlined in this article, the data analyst determines the approach that best aligns with the defined objective. Options may include:
Descriptive analysis: Identifying past occurrences.
Diagnostic analysis: Understanding the reasons behind events.
Predictive analysis: Anticipating future trends based on historical data.
Prescriptive analysis: Providing recommendations for the future.

5. Visualising and Sharing Findings: As the analysis concludes, the next step is communicating the insights. Using data visualisation tools like Google Charts or Tableau, analysts present their findings in a comprehensible and engaging manner. This step marks the conclusion of the data analysis process, allowing for the effective sharing of information with others.
Learn more: 13 of the Most Common Types of Data Visualisation

7. The best tools for data analysis

Undoubtedly, each phase of the data analysis process demands a diverse set of tools to enable data analysts to extract valuable insights.
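
As a small illustration of how a general-purpose tool can support these phases, the sketch below uses plain Python (standard library only) for the cleaning and descriptive steps from section 6. The record layout, the field names `customer` and `amount`, and the rule that a repeated customer/amount pair counts as a duplicate are all assumptions made for this example; a real project would choose its own rules and would likely use a library such as pandas.

```python
from statistics import mean, median


def clean_records(records):
    """Drop rows with a missing or non-numeric amount and remove duplicates.

    Assumption for this sketch: two rows are duplicates when the normalised
    customer name and the amount are both equal.
    """
    seen = set()
    cleaned = []
    for rec in records:
        # Fix layout issues: strip whitespace and normalise case.
        customer = (rec.get("customer") or "").strip().lower()
        try:
            amount = float(rec.get("amount"))
        except (TypeError, ValueError):
            continue  # amount is missing or not a number
        if not customer:
            continue  # no usable customer name
        key = (customer, amount)
        if key in seen:
            continue  # duplicate row
        seen.add(key)
        cleaned.append({"customer": customer, "amount": amount})
    return cleaned


def describe(cleaned):
    """Descriptive summary of the cleaned amounts (expects at least one row)."""
    amounts = [row["amount"] for row in cleaned]
    return {"count": len(amounts), "mean": mean(amounts), "median": median(amounts)}


# Hypothetical raw data containing the kinds of problems named in step 3.
raw = [
    {"customer": " Alice ", "amount": "120.0"},
    {"customer": "alice", "amount": 120.0},  # duplicate after normalisation
    {"customer": "Bob", "amount": None},     # missing value
    {"customer": "Carol", "amount": "n/a"},  # not numeric
    {"customer": "Dave", "amount": "80"},
    {"customer": "Erin", "amount": 100},
]

cleaned = clean_records(raw)
summary = describe(cleaned)
print(summary)
```

Here the cleaning rules are deliberately strict: any row that cannot be parsed is dropped. Depending on the analysis goal, filling gaps or flagging rows for review may be more appropriate than discarding them.
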
Here are some notable tools frequently employed by data analysts:

7.1 Top Tools for Data Analysts

Microsoft Excel
Python
R
Jupyter Notebook
Apache Spark
SAS
Microsoft Power BI

These tools encompass a range of functionalities, from spreadsheet analysis and programming languages to interactive computing environments and advanced analytics platforms. Their versatility equips data analysts with the capabilities needed across the different phases of the data analysis process.

8. Seven top challenges in implementing data analytics

Analysing data comes with its own set of challenges, and understanding and addressing them is crucial for successful data analytics. Here are some familiar challenges faced by businesses in data analytics, along with potential solutions:

1. Collecting Meaningful Data:
Challenge: Overwhelming amounts of data from various channels can lead to analysis paralysis, hindering the identification of critical insights.
Solution: Define clear objectives for data collection, focusing on information that adds tangible value to the business.

2. Selecting the Right Tool:
Challenge: The abundance of tools available in the market can lead to confusion about which tool is best suited for specific tasks.
Solution: Conduct thorough research to understand the strengths and weaknesses of various tools, aligning them with specific analytical requirements.

3. Consolidating Data from Multiple Sources:
Challenge: Data originates from diverse sources, requiring consolidation for a comprehensive analysis.
Solution: Implement robust data integration strategies to harmonise information from disparate sources, ensuring uniformity in formats.

4. Quality of Data Collected:
Challenge: Incorrect or flawed data, often due to manual errors or inconsistencies in data entry, can compromise the reliability of analysis.
Solution: Enforce stringent data quality standards, conduct regular audits, and invest in data validation tools to minimise errors.

5. Building a Data Culture among Employees:
Challenge: Resistance and lack of support from employees, coupled with the absence of a data-driven culture, can impede successful analytics adoption.
Solution: Foster a culture that values data-driven decision-making, provide adequate training, and garner support from leadership to ensure alignment with organisational goals.

6. Data Security:
Challenge: The privacy and security of vast datasets are often overlooked, posing potential risks and vulnerabilities.
Solution: Prioritise data security, implement robust encryption measures, and establish protocols for secure data storage and access to mitigate the risk of unauthorised access.

7. Data Visualisation:
Challenge: Making sense of data requires effective visualisation to communicate insights to stakeholders.
Solution: Use advanced data visualisation tools to present complex information clearly and compellingly, making analytical findings easier to interpret.

Understanding and addressing these challenges enables businesses to harness the benefits of big data effectively. With a strategic approach, organisations can navigate these hurdles and derive valuable insights for informed decision-making and optimal return on investment.

9. Ethical frameworks to consider

Whether or not regulations addressing unfair algorithmic behaviour exist now or emerge in future, data scientists play a pivotal role in the ethics of data analytics. Fortunately, various public frameworks exist as valuable resources for data scientists and other professionals within organisations. These frameworks serve as guidance for avoiding the legal or ethical complications associated with automated or data-driven decision-making processes.
At its core, ethical behaviour entails avoiding harm to individuals and avoiding the creation of disadvantages for them, whether directly or indirectly. The examples of unconscious bias highlighted earlier underscore the pitfalls that even well-intentioned companies may face, finding themselves on the wrong side of an ethical issue and subject to public condemnation. Therefore, as a best practice, before embarking on any data analytics use case, take a step back and carefully consider the ethically relevant factors inherent in the specific case. This proactive approach ensures a thoughtful evaluation of potential ethical implications and helps safeguard against unintended consequences in the deployment of data-driven decision-making systems.

10. A practical guide to avoiding ethical missteps

Develop guidelines that spotlight critical areas requiring special attention and aid in evaluating the potential risks associated with a specific use case. This assessment involves a set of questions about who is affected by the decisions, how they are affected, and how far the repercussions extend. For instance, consider whether the data will be used to make decisions affecting the financial stability of a vulnerable population. If so, examine the potential impact on their behaviour. Additionally, establish a period for measuring whether the algorithm is causing any undesirable effects. These guidelines aim to ensure a comprehensive analysis of risks and ethical considerations within a given data analytics context.

11. Summary

Data analytics (DA) is the systematic examination of datasets to identify patterns and derive insights from the information they contain. Specialised systems and software increasingly facilitate this process.
The technologies and techniques employed in data analytics have become integral tools across various commercial industries, empowering organisations to make more informed and strategic business decisions.
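
To make one of the techniques from section 5 concrete, the following is a minimal sketch of regression analysis with a single independent variable, fitted by ordinary least squares using only the Python standard library. The function names `fit_line` and `predict` and the advertising data are hypothetical, chosen purely for illustration; in practice an analyst would more likely use a statistics library and would check how well the line fits before relying on a forecast.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(x, slope, intercept):
    """Apply the fitted line to a new value of the independent variable."""
    return slope * x + intercept


# Hypothetical data: advertising spend (independent) vs. units sold (dependent).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units_sold = [12.0, 14.0, 16.0, 18.0, 20.0]

slope, intercept = fit_line(ad_spend, units_sold)
forecast = predict(6.0, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, forecast={forecast:.2f}")
```

The fitted slope estimates how much the dependent variable changes per unit of the independent variable, which is the trend that the forecasting step then extends to new values.
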
