Full Transcript


Module 2 Cognitive Computing

Foundation of Cognitive Computing

Cognitive computing is a technology approach that enables humans to collaborate with machines. A cognitive system uses data to train, test, or score a hypothesis. Cognitive computing is the use of computerized models to simulate the human thought process in complex situations where the answers may be ambiguous and uncertain; in effect, it is an attempt to have computers mimic the way the human brain works. To accomplish this, cognitive computing draws on artificial intelligence (AI) and other underlying technologies: expert systems, neural networks, machine learning, deep learning, natural language processing (NLP), speech recognition, object recognition, and robotics. It uses self-learning algorithms, data analysis, and pattern recognition. Learning technology can be used for speech recognition, sentiment analysis, risk assessment, face detection, and more.

IBM's cognitive computer system is Watson. Other examples include Siri, Google Assistant, Cortana, and Alexa. Applications: healthcare, banking, finance and retail, logistics.

A cognitive system works on three fundamental principles:
1> Learn: The system leverages data to make inferences about a domain, a topic, a person, or an issue based on training and observations from all varieties, volumes, and velocities of data.
2> Model: To learn, the system needs to create a model or representation of a domain (which includes internal and potentially external data) and assumptions that dictate what learning algorithms are used.
3> Generate hypotheses: A cognitive system assumes that there is not a single correct answer. The most appropriate answer is based on the data itself.

Cognitive computing systems must have the following attributes/features:
1> Adaptive
2> Interactive
3> Iterative and stateful
4> Contextual

Adaptive: CC devices must be adaptive systems that are capable of learning as information changes and as goals and requirements evolve. They should be able to resolve ambiguity and tolerate unpredictability.
Interactive: These devices should be able to interact with other processors, devices, cloud services, and humans as well.
Iterative and stateful: They are able to define a problem by posing questions or finding additional source input if a problem statement is ambiguous or incomplete. They are also able to recall previous interactions in a process and return information that is suitable for the specific application at a given point in time.
Contextual: CC devices should be able to understand, identify, and extract contextual elements such as syntax, meaning, time, location, appropriate domain, regulations, user profiles, process, task, and goal. They can pull from multiple sources of information, including both structured and unstructured digital information, as well as sensory inputs (visual, gestural, auditory, or sensor-provided).

With cognitive computing, we bring together two disciplines: Cognitive Computing = Cognitive Science + Computer Science. Cognitive science is the science of the mind. Computer science is the scientific and practical approach to computation and its applications; it is the systematic technique for translating this theory into practice.

Cognitive Computing vs AI

Benefits or Advantages of Cognitive Computing

Following are the benefits or advantages of cognitive computing:
➨It helps in improvement of customer engagement and service.
This enhancement of customer experience is possible through cognitive applications such as cognitive assistants, social intelligence, personalized recommendations, and behavioral predictions.
➨It helps in enhancing employee productivity and the quality of service/product outcomes.
➨Every second, about 1.7 MB of data is generated per person on earth, and roughly 99.5% of that data is never analyzed. Once this data is analyzed, it can help unlock many business opportunities by identifying the right markets, new customer segments, new products to launch, and so on.
➨Enterprises can enhance operational efficiency by implementing cognitive applications such as predictive asset maintenance and automated replenishment systems.
➨It provides very accurate data analysis, which is why cognitive systems are employed in the healthcare industry.

Drawbacks or Disadvantages of Cognitive Computing
➨Security is one of the major concerns, as digital devices manage critical information in cognitive computing.
➨Another big hurdle is voluntary adoption by enterprises, governments, and individuals.
➨Change management is another challenge, as this technology has the power to learn and behave like humans; people fear that machines will replace humans in the future.
➨Cognitive computing based systems/products require lengthy development cycles.

Challenges of Cognitive Computing
Data privacy: Cognitive computing relies heavily on data analysis, raising concerns about the privacy and security of sensitive information.
Complexity: Implementing cognitive solutions can be complex and may require significant integration effort with existing systems.
Ethics and bias: Systems can perpetuate biases present in the training data, leading to unfair or discriminatory outcomes.

How does cognitive computing work?
Cognitive computing works through a combination of technologies and processes that aim to simulate human-like intelligence and decision-making.
Collection: The initial step involves gathering extensive datasets from various sources, encompassing both structured and unstructured data such as text, images, videos, and sensor readings.
Ingestion: The acquired data is then ingested into the cognitive computing system, where it is organized, categorized, and stored in a format conducive to effective analysis.
NLP: Natural Language Processing is a fundamental component that enables the system to comprehend and interpret human language, both written and spoken. NLP algorithms process textual data, extracting meaning and identifying relationships between words and concepts.
Learning: Cognitive computing then relies heavily on machine learning algorithms to scrutinize and glean insights from the ingested data. Two primary types of machine learning are employed: supervised learning, where the system is trained on labeled data associating inputs with known outputs, and unsupervised learning, where the system identifies patterns and relationships within the data without predefined labels.
Analysis: This machine learning capability is instrumental in facilitating pattern recognition within the cognitive system.
Predictions: Through the analysis of patterns, correlations, and trends within the data, the system gains an understanding of complex relationships, enabling it to make accurate predictions. A minimal sketch of this flow follows.
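The steps above can be illustrated with a deliberately tiny, hypothetical example. The sketch below (assuming scikit-learn is available; the labelled support tickets are invented) runs a miniature version of the flow: ingest a few text documents, extract bag-of-words features as a crude stand-in for NLP, train a supervised classifier on labelled examples, and predict the category of a new input.

```python
# Illustrative sketch only: collect/ingest -> NLP features -> supervised learning -> prediction
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# "Collection/ingestion": a tiny corpus of labelled support tickets (hypothetical data)
tickets = [
    "my card was charged twice",         # billing
    "refund has not arrived yet",        # billing
    "app crashes when I open settings",  # technical
    "cannot log in after the update",    # technical
]
labels = ["billing", "billing", "technical", "technical"]

# "NLP + supervised learning": bag-of-words features feeding a simple classifier
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(tickets, labels)

# "Prediction": route a new, unseen ticket
print(model.predict(["the latest update keeps crashing"]))  # expected: ['technical']
```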
This iterative process underscores the dynamic and evolving nature of cognitive computing, where continuous learning and adaptation are integral to enhancing system capabilities over time.

Applications of Cognitive Computing

Retail Industry
Cognitive computing helps the marketing team collect more data and then analyse it to make retailers more efficient and adaptive. This helps companies make more sales and provide personalized suggestions to customers. E-commerce sites have integrated cognitive computing very well: they collect some basic information from customers about the product they are looking for, then analyse the large amount of available data and recommend products to the customer. Through demand forecasting, price optimization, and website design, cognitive computing has given retailers the tools to build more agile businesses. Apart from e-commerce sites, cognition can be very useful for on-floor shopping as well. It helps retailers provide customers with personalized products (what they want, when they want it, and how they want it) to deliver meaningful experiences, reduce wastage and losses by stocking fresh products based on demand predicted in advance, and, by automating areas of the business, shorten cycle times, reduce effort, and improve efficiency.

Logistics
Cognition is the new frontier in transportation, logistics, and the supply chain. It helps at every stage of logistics, such as design decisions in the warehouse, warehouse management, warehouse automation, IoT, and networking. In the warehousing process, cognition helps in compiling storage codes, and automatic picking with automated guided vehicles and warehouse robots improves work efficiency. Logistics distribution uses cognition to plan the best path, improving the recognition rate and saving a great deal of labour. IoT helps with warehouse infrastructure management, optimizing inventory, and enhancing warehouse operations, and autonomous guided vehicles can be used for picking and putting operations. Apart from IoT, the other important technology is wearable devices, which help turn objects into sensors and augment human decision-making and warehouse operations. These devices have evolved from smartwatches to smart clothes, smart glasses, computing devices, exoskeletons, ring scanners, and voice recognition.

Banking and Finance
Cognition in the banking industry helps to improve operational efficiency, customer engagement and experience, and to grow revenues. Cognitive banking will reshape banking and financial institutions along three dimensions: deeper contextual engagement, new analytics insights, and enterprise transformation. We are already experiencing examples of such transformation for tasks like performing various banking transactions digitally, opening a new retail account, and processing claims and loans in minutes. The technology has proved very helpful in product management and customer service support. Cognitive banking provides customized support to customers; it helps in designing personalized investment plans based on whether the customer is risk-averse or a risk-taker. It also provides personalized engagement between the financial institution and the customer by dealing with each customer individually and focusing on their requirements.
Here, the computer intelligently understands the personality of the customer based on other content available online authored by the customer.

Power and Energy
"Smart power" is the new intelligent future. The oil and gas industry faces huge cost pressure to find, produce, and distribute crude oil and its byproducts, and it also faces a shortage of skilled engineers and technical professionals. Energy firms take various critical decisions in which huge capital is involved, such as which site to explore, how to allocate resources, and how much to produce. For a long time, these decisions were taken based on the data collected and stored and on the expertise and intuition of the project team. With cognitive computing, technologies process voluminous data to support decisions and learn from the results. This technology will help with important future decisions such as identifying commercially viable oil wells and ways to make existing power stations more efficient, and it will give existing power companies a competitive advantage.

Cyber Security
Cognitive algorithms provide end-to-end security platforms that detect, assess, research, and remediate threats. They help prevent cyber attacks (including cognitive hacking), making customers less vulnerable to manipulation and providing a technical means to detect misleading data and disinformation. With the increase in data volumes, the rise in cyber attacks, and the shortage of skilled cybersecurity experts, we need modern methods like cognitive computing to deal with these threats. Major security players in the industry have already introduced cognitive-based services for cyber threat detection and security analytics. Such cognitive systems not only detect threats but also assess systems, scan for vulnerabilities, and propose actions. The other side of the coin is that cognitive computing needs huge volumes of data, so securing the privacy of that data is also of utmost importance. To take full advantage of cognitive computing we need to build a large database of information while maintaining its confidentiality and preventing data leakage.

Healthcare
We have already briefly discussed the application of cognitive computing in healthcare. Recent advancements in cognitive computing help medical professionals make better treatment decisions, improve their efficiency, and improve patient outcomes. Such a system is self-learning: it uses machine learning algorithms, data mining techniques, visual recognition, and natural language processing, drawing on real-time patient information, medical transcripts, and other data. The system processes an enormous amount of data instantly to answer specific questions and make intelligent recommendations. Cognitive computing in healthcare links the functioning of humans and machines, where computers and the human brain truly overlap to improve human decision-making. This empowers doctors and other medical professionals to better diagnose and treat their patients and, above all, helps in planning customized treatment modules. Genome medicine, for example, is one area that has evolved through cognitive computing.

Education
Cognitive computing is going to change how the education industry works, and it has already started bringing a few of these changes.
It will change how schools, colleges, and universities function and will help provide personalized study material to students. Can you even imagine how fast a cognitive system can search a library or the journals and research papers in a digital library? A cognitive assistant can provide personal tutorials to students, guide them through coursework, and help students understand critical concepts at their own pace. It can also guide students in selecting courses depending on their interests, acting as a career counsellor. Cognitive computing will not only help students but will also help teachers, support staff, and administrative staff deliver better service and prepare student reports and feedback.

There are three essential components of a cognitive computing system:

1. A way of interpreting input: A cognitive computing system needs to answer a question or provide a result based on an input. That input might be a search term, a text phrase, a query asked in natural language, or a response to an action of some sort (for example, procurement of a product). The first thing a system needs to do is understand the context of the signal, for example location or speed of motion (in the case of smartphone personal assistants). Such context information enables the system to narrow down the potential responses to those that are most appropriate. Cognitive computing systems need to start somewhere: they need to "know" or expect something about the user to interpret the input. The more contextual clues that can be derived, defined, or implied, the easier it is to narrow the appropriate types of information to be returned.

2. A body of content/information that supports the decision: The purpose of cognitive computing is to help humans make choices and solve problems, but the system does not make up the answer; even synthesis of new knowledge is based on foundational knowledge. The "corpus" or domain of information is a key component, and the more effectively that information is curated, the better the end results. Knowledge structures are important: taxonomies and metadata are required, along with some form of information hygiene. To create a cognitive system, there need to be organizational structures for the content which give meaning to otherwise unstructured content. IBM Watson, for example, ingests many structured and semi-structured repositories of information: dictionaries, news articles and databases, taxonomies, and ontologies (for example, WordNet and DBpedia). These sources provide the information needed to respond to questions, forming the corpus of information that Watson draws upon. This initial modeling requires an investment of time and expertise to build the foundational elements from which the system can then synthesize responses. Each of these actions requires content modeling and metadata structures, along with use cases (supported by a customer engagement approach).

3. A way of processing the signal against the content/information corpus: This is where machine learning, for example, comes into play. ML has long been applied to categorization and classification approaches and to advanced text analytics. The processing might be in the form of a query/matching algorithm, or it may involve other mechanisms to interpret the query, transform it, reduce ambiguity, derive syntax, define word sense, deduce logical relationships, or otherwise parse and process the signal against the corpus. A small sketch of such query matching appears below.
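As an illustration of the query/matching idea (not Watson's actual pipeline), the hedged sketch below matches a question against a tiny invented corpus using TF-IDF vectors and cosine similarity, assuming scikit-learn is installed.

```python
# Illustrative sketch: process an input "signal" (a question) against a small corpus.
# The documents below are invented stand-ins for a curated knowledge base.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "Aspirin is commonly used to reduce fever and relieve minor aches.",
    "Strep throat is a bacterial infection that causes a sore throat and fever.",
    "Regular exercise improves cardiovascular health.",
]
question = "What might cause a fever and sore throat?"

vectorizer = TfidfVectorizer(stop_words="english")
doc_vectors = vectorizer.fit_transform(corpus)      # represent the corpus
query_vector = vectorizer.transform([question])     # represent the incoming signal

scores = cosine_similarity(query_vector, doc_vectors)[0]
best = scores.argmax()
print(f"best match (score {scores[best]:.2f}): {corpus[best]}")
```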
Machine learning has many "flavours": various types of supervised learning approaches (where a known sample or result is used to instruct the system what to look for), classes of unsupervised approaches (where the system is simply asked to identify or form patterns and outliers), and even combinations of these approaches at different stages of the process. An unsupervised learning approach could, for instance, identify hidden structures so that its output could be applied as a "training set" to other data sources. The key is to iteratively improve the system's performance over time by approximating an output and using it as an input for the next round of processing. In some cases, incorrect answers (as judged by a human or another data source) may be fed back in for the next time the system encounters the problem or question.

Infrastructure and Deployment Modalities
In a cognitive system it is critical to have a flexible and agile infrastructure to support applications that continue to grow over time. As the market for cognitive solutions matures, a variety of public and private data need to be managed and processed. In addition, organizations can leverage Software as a Service (SaaS) applications and services to meet industry-specific requirements. A highly parallelized and distributed environment, including compute and storage cloud services, must be supported.

Data Access, Metadata & Management Services → Because cognitive computing centers around data, it is not surprising that the sourcing, accessing, and management of data play a central role. Before adding and using that data, a range of underlying services is needed. Preparing the ingested data for use requires an understanding of the origins and lineage of that data, so there needs to be a way to classify its characteristics, such as when a text or data source was created and by whom. In a cognitive system these data sources are not static; a variety of internal and external data sources will be included in the corpus. To make sense of these sources, a set of management services must prepare the data to be used within the corpus. As in a traditional system, data has to be vetted, cleansed, and monitored for accuracy.

Corpus, Taxonomies & Data Catalogs → The corpus is the knowledge base of ingested data and is used to manage codified knowledge. Such data is primarily text-based (documents, textbooks, patient notes, customer reports, and such), but there is now also support for unstructured and semi-structured data (for example, videos, images, and sounds). Ontologies are often developed by industry groups to classify industry-specific elements such as standard chemical compounds, machine parts, or medical diseases and treatments. The corpus may include ontologies that specify entities and their relationships (it is often necessary to use a subset of an industry-based ontology, including only the data that pertains to the focus of the cognitive system). A taxonomy provides context within the ontology.

Data Analytics Services → These are the techniques used to develop an understanding of the data ingested and managed within the corpus. A set of advanced algorithms is applied to develop the model for the cognitive system; one such combination of unsupervised and supervised steps is sketched below.
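One hedged illustration of combining the "flavours" described above: an unsupervised step (k-means) discovers structure in unlabelled synthetic data, and its output is then reused as a training set for a supervised classifier. All names and data here are invented for illustration.

```python
# Illustrative sketch: unsupervised discovery feeding a supervised model (synthetic data).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Synthetic, unlabelled observations drawn from two loose groups
group_a = rng.normal(loc=[0, 0], scale=0.5, size=(100, 2))
group_b = rng.normal(loc=[3, 3], scale=0.5, size=(100, 2))
X = np.vstack([group_a, group_b])

# Unsupervised step: discover hidden structure (two clusters) without labels
pseudo_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Supervised step: train a classifier on the discovered "training set"
clf = LogisticRegression().fit(X, pseudo_labels)
print(clf.predict([[0.2, -0.1], [2.8, 3.1]]))  # one point near each group
```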
Typically, users can take advantage of structured, unstructured, and semi-structured data that has been ingested and begin to use sophisticated algorithms to predict outcomes, discover patterns, or determine next best actions. These services do not live in isolation; they continuously access new data from the data access layer and pull data from the corpus.

Continuous Machine Learning
Machine learning is the technique that gives a system the capability to learn from data without being explicitly programmed. Cognitive systems are not static; rather, models are continuously updated based on new data, analysis, and interactions. The procedure embraced by cognitive systems typically involves two sets of dynamics: (a) hypothesis generation and (b) hypothesis evaluation.

Hypothesis Generation and Evaluation
A hypothesis is a testable assertion based on evidence that explains some observed phenomenon. The goal is to look for evidence to support (or refute) hypotheses, which is typically accomplished through an iterative process of training on the data. Training may occur automatically based on the system's analysis of data, or it may incorporate human end users. After training, it begins to become clear whether the hypothesis is supported by the data. If it is not, the user has several options: for example, refining the data by adding to the corpus, or changing the hypothesis. Evaluating the hypothesis requires a collaborative process among the constituents that use the cognitive system. Just as with the creation of the hypothesis, the evaluation of results refines those results and trains the system again.

The Learning Process
To learn from data you need tools to process both structured and unstructured data. For unstructured textual data, NLP services can interpret and detect patterns to support a cognitive system. Unstructured data such as images, videos, and sound requires deep learning tools. Data from sensors is important in emerging cognitive systems: industries ranging from transportation to healthcare use sensor data to monitor speed, performance, failure rates, and other metrics, and then capture and analyze this data in real time to predict behavior and change outcomes.

Presentation and Visualization Services
Interpreting complex and often massive amounts of data requires new visualization interfaces. Data visualization is the visual representation of data as well as the study of data in a visual way; a bar chart or pie chart, for example, is a visual representation of underlying data. Patterns and relationships in data are easier to identify and understand when visualized with structure, color, and so on. The two basic types of data visualization are static and dynamic, and in either case there may also be a requirement for interactivity. Sometimes looking at a visualized representation of the data is not enough; you need to drill down, reposition, expand and contract, and so on. This interactivity enables you to "personalize" views of the data so that you can pursue non-obvious manifestations of data, relationships, and alternatives. Visualization may depend on color, location, and proximity; other critical factors include shape, size, and motion. Presentation services prepare results for output, while visualization services help communicate results by providing a way to demonstrate the relationships between data, as in the small sketch below.
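As a minimal illustration of a static visualization service (assuming matplotlib is installed; the hypothesis names and confidence values are invented), the sketch below renders a set of scored hypotheses as a simple bar chart.

```python
# Illustrative sketch: render hypothetical hypothesis-confidence results as a static chart.
import matplotlib.pyplot as plt

hypotheses = ["strep throat", "influenza", "common cold"]   # invented candidate answers
confidence = [0.62, 0.25, 0.13]                             # invented confidence scores

fig, ax = plt.subplots()
ax.bar(hypotheses, confidence, color="steelblue")
ax.set_ylabel("confidence")
ax.set_title("Candidate hypotheses ranked by confidence")
plt.tight_layout()
plt.show()
```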
A cognitive system brings text or unstructured data together with visual data to gain insights. In addition, images, motion, and sound are elements that need to be analyzed and understood. Making this data interactive through a visualization interface can help a cognitive system be more accessible and usable.

Cognitive Applications
A cognitive system must leverage underlying services to create applications that address problems in a specific domain. These applications, focused on solving specific problems, must engage users so that they gain insights and knowledge from the system. In addition, these applications may need to infuse processes to gain insight about a complex area such as preventive maintenance or treatment for a complex disease. An application may be designed to simulate the smartest customer service agent; the end goal is to turn an average employee into the smartest employee with many years of experience. A well-designed cognitive system provides the user with contextual insights based on their role, the process, and the customer issue they are solving. The solution should provide users with insights so that they make better decisions based on data that exists but is not easily accessible.

Building the Corpus
A corpus is a machine-readable representation of the complete record of a particular domain or topic. Experts in a variety of fields use a corpus or corpora for tasks such as linguistic analysis to study writing styles or even to determine the authenticity of a particular work. A number of questions have to be addressed early in the design phase for a cognitive computing system: Which internal and external data sources are needed for the specific domain areas and problems to be solved? Will external data sources be ingested in whole or in part? How can you optimize the organization of data for efficient search and analysis? How can you integrate data across multiple corpora? How can you ensure that the corpus is expanded to fill in knowledge gaps in your base corpus? How can you determine which data sources need to be updated, and at what frequency?

The choice of which sources to include in the initial corpus is critical. Sources ranging from medical journals to Wikipedia may now be efficiently imported in preparation for the launch of a cognitive system. In addition, it may be equally important to ingest information from videos, images, voice, and sensors. These sources are ingested at the data access layer. Other data sources may include subject-specific structured databases, ontologies, taxonomies, and catalogs.

Corpus Management: Regulatory and Security Considerations
Data sources and the movement of that data are increasingly heavily regulated, particularly for personally identifiable information. Some general issues of data policies for protection, security, and compliance are common to all applications, but cognitive computing applications learn and derive new data or knowledge that may also be subject to a growing body of state, federal, and international legislation. When the initial corpus is developed, it is likely that a lot of data will be imported using extract-transform-load (ETL) tools. These tools may have risk management, security, and regulatory features to help the user guard against data misuse or provide guidance when sources are known to contain sensitive data. The availability of these tools does not absolve the developers of the responsibility to ensure that the data and metadata are in compliance with applicable rules and regulations.
Protected data may be ingested (for example, personal identifiers) or generated (for example, medical diagnoses) when the corpus is updated by the cognitive computing system.

Bringing Data into the Cognitive System
Unlike in many traditional systems, the data that is ingested into the corpus is not static. You need to build a base of knowledge that adequately defines your domain space. You begin populating this knowledge base with data you expect to be important, and as you develop the model in the cognitive system, you refine the corpus. You will therefore continuously add to the data sources, transform those sources, and refine and cleanse them based on model development and continuous learning. A wide variety of data sources should be considered and potentially ingested; however, this does not mean that all sources will be of equal value.

Leveraging Internal and External Data Sources
Most organizations already manage huge volumes of structured data from their transactional systems and business applications, as well as unstructured data such as text contained in forms or notes and possibly images from documents or corporate video sources. Although some firms are writing applications to monitor external sources such as news and social media feeds, many IT organizations are not yet well equipped to leverage these sources and integrate them with internal data sources. Most cognitive computing systems will be developed for domains that require ongoing access to integrated data from outside the organization. Just as an individual learns to identify the right external sources to support decision making (from newspapers to network news to social media on the Internet), a cognitive computing system generally needs to access a variety of frequently updated sources to keep current about the domain in which it operates. Also, like professionals who must balance the news or data from these external sources against their own experience, a cognitive system must learn to weigh the external evidence and develop confidence in the source as well as the content over time. For example, a popular magazine with articles on psychology may be a valuable resource, but if it contains data that conflicts with a refereed journal article on the same topic, the system must know how to weigh the opposing positions.

Data Access and Feature Extraction Services
The data access level represents the main interfaces between the cognitive computing system and the outside world; any data to be imported from external sources must come through processes within this layer. Cognitive computing applications may leverage external data sources in formats as varied as natural language text, video images, audio files, sensor data, and highly structured data formatted for machine processing. The analogy to human learning is that this level represents the senses. The feature extraction layer has two tasks: first, it has to identify relevant data that needs to be analyzed, and second, it has to abstract data as required to support machine learning. The data access level is shown as separate from, but closely bound to, the feature extraction level to reinforce the idea that some data must be captured and then analyzed or refined before it is ready to be integrated into a corpus suitable for a particular domain. Any data that is considered unstructured, from video and images to natural language text, must be processed in this layer to find the underlying structure.
Feature extraction and deep learning refer to a collection of techniques, primarily statistical algorithms, used to transform data into representations that capture the essential properties in a more abstract form that can be processed by a machine learning algorithm.

Analytics Services
Analytics refers to a collection of techniques used to find and report on essential characteristics or relationships within a data set. In general, an analytic technique provides insights about the data to guide some action or decision. A number of packaged algorithms, such as regression analysis, are widely used within solutions. Within a cognitive system, a wide range of standard analytics components is available for descriptive, predictive, and prescriptive tasks within statistical software packages or in commercial component libraries. A variety of tools that support various tasks within cognitive computing systems are available, and a cognitive computing system generally has additional analytical components embedded in its machine learning cycle algorithms.

Machine Learning
Continuous learning without reprogramming is at the heart of all cognitive computing solutions. Although the techniques used to acquire, manage, and learn from data vary greatly, at their core most systems apply algorithms developed by researchers in the field of machine learning. Machine learning is a discipline that draws from computer science, statistics, and psychology.

Finding Patterns in Data
A typical machine-learning algorithm looks for patterns in data and then takes or recommends some action based on what it finds. A pattern may represent a similar structure (for example, elements of a picture that indicate a face), similar values (a cluster of values similar to those found in another data set), or proximity (how "close" the abstract representation of one item is to another). Proximity is an important concept in pattern identification or matching: two data strings representing things or concepts in the real world are "close" when their abstract binary representations have similar characteristics.

Supervised Learning
Supervised learning refers to an approach that teaches the system to detect or match patterns in data based on examples it encounters during training. The training data should include examples of the types of patterns or question-answer pairs the system will have to process. Learning by example, or modeling, is a powerful teaching technique that can be used to train systems to solve complex problems. After the system is operational, a supervised learning system also uses its own experience to improve its performance on pattern matching tasks. In supervised learning, the job of the algorithm is to create a mapping between input and output. The supervised learning model has to process enough data to reach the wanted level of validation, usually expressed as accuracy on the test data set. Both the training data and the independent test data should be representative of the type of data that will be encountered when the system is operational.

Reinforcement Learning
Reinforcement learning is a special case of supervised learning in which the cognitive computing system receives feedback on its performance to guide it toward a goal or good outcome. Unlike other supervised learning approaches, however, with reinforcement learning the system is not explicitly trained with sample data; instead, it learns to take the next action based on trial and error, as in the toy sketch below.
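The following is a toy, pure-Python sketch of the trial-and-error idea (not any specific product's algorithm): a Q-learning agent on a five-cell corridor learns, from reward feedback alone, that moving right toward the goal is the best policy.

```python
# Toy Q-learning sketch: learn by trial, error, and reward on a 5-cell corridor.
import random

n_states, actions = 5, [0, 1]          # action 0 = move left, 1 = move right
goal = n_states - 1
alpha, gamma, epsilon = 0.5, 0.9, 0.2  # learning rate, discount factor, exploration rate
Q = [[0.0, 0.0] for _ in range(n_states)]

for episode in range(200):
    state = 0
    while state != goal:
        # epsilon-greedy choice between exploring and exploiting current estimates
        if random.random() < epsilon:
            action = random.choice(actions)
        else:
            action = max(actions, key=lambda a: Q[state][a])
        next_state = max(0, state - 1) if action == 0 else min(goal, state + 1)
        reward = 1.0 if next_state == goal else 0.0
        # Q-learning update: nudge the estimate toward reward + discounted future value
        Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
        state = next_state

# After training, the greedy action in every non-goal cell is "move right" (action 1)
print([max(actions, key=lambda a: Q[s][a]) for s in range(goal)])
```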
Some typical applications of reinforcement learning include robotics and game playing. The machine learning algorithms assess the goodness or effectiveness of policies or actions and reward the most effective actions. A sequence of successful decisions results in reinforcement, which helps the system generate a policy that best matches the problem being addressed. Reinforcement learning for cognitive computing is most appropriate where a system must perform a sequence of tasks and the number of variables would make it too difficult to develop a representative training data set; for example, reinforcement would be used in robotics or a self-driving car. The learning algorithm must discover an association between the reward and the sequence of events leading up to the reward. The algorithm can then try to optimize future actions to remain in a reward state.

Unsupervised Learning
Unsupervised learning refers to a machine learning approach that uses inferential statistical modeling algorithms to discover, rather than detect, patterns or similarities in data. An unsupervised learning system can identify new patterns instead of trying to match a set of patterns it encountered during training. Unlike supervised learning, unsupervised learning is based solely on experience with the data rather than on training with sample data. It requires the system to discover which relationships among data elements or structures are important by analyzing attributes such as frequency of occurrence, context (for example, what has been seen or what has occurred previously), and proximity. Unsupervised learning is the best approach for a cognitive computing system when an expert or user cannot give examples of typical relationships or question-answer pairs as guides to train the system. This may be due to the complexity of the data, when there are too many variables to consider, or when the structure of the data is unknown (for example, evaluating images from a surveillance camera to detect which person or persons are behaving differently from the crowd). Unsupervised learning is also appropriate when new patterns emerge faster than humans can recognize them, so that regular training is impossible; for example, a cognitive computing system that evaluates network threats must recognize anomalies that may indicate an attack or vulnerability that has never been seen before.

Hypothesis Generation and Scoring
A hypothesis in science is a testable assertion, based on evidence, that explains some observed phenomenon or relationship between elements within a domain. The key concept here is that a hypothesis has some supporting evidence or knowledge that makes it a plausible explanation for a causal relationship; it isn't a guess. When a scientist formulates a hypothesis as an answer to a question, it is done in a way that allows it to be tested: the hypothesis actually has to predict an experimental outcome. An experiment or series of experiments that supports the hypothesis increases confidence in the ability of the hypothesis to explain the phenomenon. This is conceptually similar to a hypothesis in logic, generally stated as "if P then Q", where "P" is the hypothesis and "Q" is the conclusion. In the natural sciences we conduct experiments to test hypotheses; using formal logic we can develop proofs to show that a conclusion follows from a hypothesis (or that it does not).
In a cognitive computing system we look for evidence (experiences and data, or relationships between data elements) to support or refute hypotheses. That is the basis for scoring, or assigning a confidence level to, a hypothesis. If a cognitive computing hypothesis can be expressed as a logical inference, it may be tested using mechanical theorem-proving algorithms. Typically, however, cognitive computing applications solve problems in domains whose supporting data is not so neatly structured. These domains, such as medicine and finance, have rich bodies of supporting data that are better suited to statistical methods like those used in scientific experimental design.

Hypothesis Generation
The discussion of the scientific method noted that a hypothesis is formulated to answer a question about a phenomenon based on some evidence that makes it plausible. The experimental process is designed to test whether the hypothesis applies in the general case, not just to the evidence that was used to develop it. There are two key ways a hypothesis may be generated. The first is in response to an explicit question from the user, such as "What might cause my fever of 102 and sore throat?" The system may recognize that there are too many answers to be useful and request more information from the user to refine the set of likely causes. This approach to hypothesis generation is frequently used when the goal is to detect a relationship between cause and effect in a domain with a known set of causes and a known set of effects, but with so many combinations that mapping all causes to all effects is an intractable problem for humans to solve. The second type of hypothesis generation does not depend on a user asking a specific question; instead, the system constantly looks for anomalous data patterns that may indicate threats or opportunities. Detecting a new pattern creates a hypothesis based on the nature of the data. For example, if the system is monitoring network sensors to detect threats, a new pattern may create a hypothesis that this pattern is a threat, and the system must then find evidence to support or refute that hypothesis. If the system is monitoring real-time stock transactions, a new pattern of buying behavior may indicate an opportunity. In these systems, the type of hypotheses that will be generated depends on the assumptions of the system designers rather than on the actions of the users. Both types of application have the system generate one or more hypotheses based on an event; in the first case the event is a user question, and in the second it is driven by a change in the data itself.

Hypothesis Scoring
Cognitive computing systems build a corpus of relevant data for a problem domain. Then, in response to a user question or a change in the data, the system generates one or more hypotheses to answer the user's question or explain the new data pattern. The next step is to evaluate or score these hypotheses based on the evidence in the corpus, and then update the corpus and report the findings to the user or another external system. Hypothesis scoring is a process in which the representation of the hypothesis is compared with data in the corpus to see what evidence exists to support the hypothesis, and what may actually refute it (or rule it out as a valid possible explanation). Scoring or evaluating a hypothesis is a process of applying statistical methods to the hypothesis-evidence pairs to assign a confidence level to the hypothesis, as in the toy sketch below.
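A toy sketch of this idea follows. It is only illustrative: the hypothesis, evidence items, weights, and threshold are invented, and the logistic squashing of summed evidence weights is just one simple way to turn evidence into a 0-to-1 confidence score.

```python
# Toy hypothesis scoring: weighted evidence mapped to a confidence level and
# compared against a threshold. All names, weights, and thresholds are invented.
import math

def confidence(evidence_weights):
    """Map summed evidence weights (positive supports, negative refutes)
    to a 0..1 confidence score with a logistic squashing function."""
    return 1.0 / (1.0 + math.exp(-sum(evidence_weights)))

hypothesis = "patient has strep throat"
evidence = {
    "fever above 101F": +1.2,
    "sore throat": +0.8,
    "persistent cough": -1.0,   # cough argues against this hypothesis
}

score = confidence(evidence.values())
THRESHOLD = 0.8
print(f"{hypothesis}: confidence {score:.2f}")
if score < THRESHOLD:
    # Mirrors the behavior described in the text: ask for more evidence
    print("confidence below threshold -> request more evidence (e.g., a lab test)")
```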
The actual weights assigned to each piece of supporting evidence can be adjusted based on experience with the system and on feedback during training and during the operational phase. If none of the hypotheses scores above a predetermined threshold, the system may ask for more evidence (for example, a new diagnostic blood test) if that information could change the confidence in a hypothesis. Techniques for measuring the proximity of two data elements or structures for pattern matching, such as the fit between two hypothesis-evidence pairs, generally rely on a binary vector representation (such as a sparse distributed representation, or SDR) that can be manipulated using matrix algebra with readily available tools. The generation/scoring loop may be set to continue until a user is satisfied with the answer or until the system has evaluated all options.

Presentation and Visualization Services
As a cognitive computing system cycles through hypothesis generation and scoring, it may produce new answers or candidate answers for a user, and in some situations the user may need to provide additional information. How the system presents these findings or questions has a big impact on its usability in two ways. First, when presenting data supporting a hypothesis such as a medical diagnosis or a recommended vacation plan, the system should present the finding in a way that conveys the most meaning with the least effort on the part of the user, and support the finding with relevant evidence. Second, when the system requires additional details to improve its confidence in one or more hypotheses, it must request that data in a concise and unambiguous way. The general advantage of visualization tools is their capability to graphically depict relationships between data elements in ways that focus attention on trends and abstractions, rather than forcing the user to find these patterns in the raw data.

There are three main types of services available to accomplish these goals. Narrative solutions use natural language generation techniques to tell a story about the data or summarize findings in natural language; this is appropriate for reporting findings or explanations of the evidence used to arrive at a conclusion or question. Visualization services present data in non-text forms, including: graphics, ranging from simple charts and graphs to multidimensional representations of relationships between data; images, selected from the data to be presented or generated from an underlying representation (for example, if feature extraction detects a "face" object, a visualization service could generate a "face" or pictograph from a standard features library); and gestures or animation of data designed to convey meaning or emotion. Reporting services refers to functions that produce structured output, such as database records, that may be suitable for humans or machines.

Infrastructure
The infrastructure/deployment modalities layer represents the hardware, networking, and storage foundation for the cognitive computing application. As noted in the discussion of emerging neuromorphic hardware architectures ("Future Applications for Cognitive Computing"), most cognitive computing systems built in the next decade will primarily use conventional hardware.
The two major design considerations for cognitive computing infrastructure decisions are:

Distributed data management: For all but the smallest applications, cognitive computing systems can benefit from tools that leverage distributed external data resources and distribute their operational workloads. Managing the ongoing ingestion of data from a variety of external sources requires a robust infrastructure that can efficiently import large quantities of data; depending on the domain, this may be a combination of structured and unstructured data available for batch or streaming ingestion. Today, a cloud-first approach to data management is recommended to provide maximum flexibility and scalability.

Parallelism: The fundamental cognitive computing cycle of hypothesis generation and scoring can benefit enormously from a software architecture that supports parallel generation and scoring of multiple hypotheses, but performance ultimately depends on the right hardware. Allocating each independent hypothesis to a separate hardware thread or core is in most cases a requirement for acceptable performance as the corpus scales up and the number of hypotheses increases (a concurrency sketch appears at the end of this section). Although performance improvements should be seen within the system as it learns, the rate of data expansion in the corpus generally outpaces this improvement, which argues strongly for selecting a hardware architecture that supports relatively seamless expansion with additional processors.

Design Principles for Cognitive Systems
In a cognitive computing system, the model refers to the corpus and the set of assumptions and algorithms that generate and score hypotheses to answer questions, solve problems, or discover new insights. How you model the world determines what kinds of predictions you can make, what patterns and anomalies you can detect, and what actions you can take. The initial model is developed by the designers of the system, but the cognitive system updates the model and uses it to answer questions or provide insights. The corpus is the body of knowledge that machine learning algorithms use to continuously update that model based on experience, which may include user feedback.

A cognitive system is designed to use a model of a domain to predict potential outcomes. Designing a cognitive system involves multiple steps: it requires an understanding of the available data, the types of questions that need to be asked, and the creation of a corpus comprehensive enough to support the generation of hypotheses about the domain based on observed facts. A cognitive system is therefore designed to create hypotheses from data, analyze alternative hypotheses, and determine the availability of supporting evidence to solve problems. By leveraging machine learning algorithms, question analysis, and advanced analytics on relevant data, which may be structured or unstructured, a cognitive system can provide end users with a powerful approach to learning.

Cognitive Design Principles
A cognitive system should be able to understand any kind of data and communicate with multiple data sources to process structured and unstructured data and generate meaningful insights or recommendations. Natural language processing, text-to-speech conversion, and speech-to-text conversion are widely used in cognitive systems.
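To make the parallelism point above concrete, the hedged sketch below scores independent hypotheses concurrently with Python's standard concurrent.futures pool. The scoring function is a stand-in for real evidence retrieval and scoring; a CPU-bound scorer would typically use ProcessPoolExecutor instead of threads.

```python
# Illustrative sketch: score independent hypotheses concurrently, one task per hypothesis.
from concurrent.futures import ThreadPoolExecutor
import time

def score_hypothesis(hypothesis):
    time.sleep(0.1)          # stand-in for expensive evidence retrieval and scoring
    return hypothesis, 0.5   # dummy confidence value for illustration

hypotheses = [f"candidate answer {i}" for i in range(8)]

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(score_hypothesis, hypotheses))

for hypothesis, conf in results:
    print(hypothesis, conf)
```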
The process involved in designing a cognitive system is iterative and has seven basic steps, as shown in the figure. These are not strictly sequential steps, so during implementation we can move from one step to another depending on the results at each step. The maturity of a cognitive model is measured by the confidence level of the recommendations or actions taken by a cognitive process; hence, improving the confidence level of a system plays a vital role in making the cognitive process successful. Multiple factors can influence the confidence level, depending on the data the system has access to. In a typical scenario, the confidence level can be improved based on the human actions taken in response to the recommendations suggested by the cognitive system. These systems are self-evolving, yet they require human intervention for updating the context and improving the confidence level.

The Role of NLP in a Cognitive System
NLP is a set of techniques that extract meaning from text. These techniques determine the meaning of a word, phrase, sentence, or document by recognizing the grammatical rules, the predictable patterns within a language. They rely, as people do, on dictionaries, repeated patterns of co-occurring words, and other contextual clues to determine what the meaning might be. NLP applies the same known rules and patterns to make inferences about meaning in a text document. Further, these techniques can identify and extract elements of meaning, such as proper names, locations, actions, or events, and find the relationships among them, even across documents. They can also be applied to text within a database and have been used for more than a decade to find duplicate names and addresses or to analyze a comment or reason field in large customer databases.

The Importance of Context
Translating unstructured content from a corpus of information into a meaningful knowledge base is the task of NLP. Linguistic analysis breaks down the text to provide meaning, and the text has to be transformed so that the user can ask questions and get meaningful answers from the knowledge base. Any system, whether a structured database, a query engine, or a knowledge base, requires techniques and tools that enable the user to interpret the data. The key to getting from data to understanding is the quality of the information. With NLP it is possible to interpret data and the relationships between words. It is important to determine what information to keep and how to look for patterns in the structure of that information to distill meaning and context. NLP enables cognitive systems to extract meaning from text; phrases, sentences, and complete documents provide context so that you can understand the meaning of a word or term. This context is critical to assessing the true meaning of text-based data. Patterns and relationships between words and phrases in the text need to be identified to begin to understand the meaning and actual intent of a communication. When humans read or listen to natural language text, they automatically find these patterns and make associations between words to determine meaning and understand sentiment. There is a great deal of ambiguity in language, and many words can have multiple meanings depending on the subject matter being discussed or on how one word is combined with other words in a phrase, sentence, or paragraph. When humans communicate information there is an assumption of context, and there are many layers to the process of understanding meaning in context.
Various techniques are used, such as building a feature vector from any information that can be extracted from the document. Statistical tools help with information retrieval and extraction; these tools can help to annotate and label the text with appropriate references (that is, assigning a name to an important person in the text). When you have a sufficient amount of annotated text, machine learning algorithms can ensure that new documents are automatically assigned the right annotations.

Connecting Words for Meaning
The nature of human communication is complicated. Humans are always transforming the way language is used to convey information; two individuals can use the same words, and even the same sentences, and mean different things. We stretch the truth and manipulate words to shape meaning, so it is almost impossible to have absolute rules for what words mean on their own and what they mean within sentences. To understand language we have to understand the context of how words are used in individual sentences and what sentences and meanings come before and after them. We are required to parse meaning so that understanding is clear. It is not an easy task to establish context so that the individuals asking questions and looking for answers gain insights that are meaningful.

Understanding Linguistics
NLP is an interdisciplinary field that applies statistical and rules-based modeling of natural languages to automate the interpretation of the meaning of language. The focus is on determining the underlying grammatical and semantic patterns that occur within a language or a sublanguage (related to a specific field or market). Different expert domains, such as medicine or law, use common words in specialized ways, so the context of a word is determined not just by its meaning within a sentence but sometimes by understanding whether it is being used within a particular domain. For example, in the travel industry the word "fall" refers to a season of the year, while in a medical context it refers to a patient falling. NLP looks not just at the domain, but also at the levels of meaning that each of the following areas contributes to our understanding.

Language Identification and Tokenization
In any analysis of incoming text, the first step is to identify which language the text is written in and then to separate the string of characters into words (tokenization). Many languages do not separate words with spaces, so this initial step is necessary.

Phonology
Phonology is the study of the physical sounds of a language and how those sounds are uttered in a particular language. This area is important for speech recognition and speech synthesis, but not for interpreting written text. However, to understand, for instance, the soundtrack of a video or the recording of a call center call, not only is the pronunciation of the words important (regional accents such as British English or Southern United States English), but so are the intonation patterns.
A person who is angry may use the same words as a person who is confused; differences in intonation convey differences in emotion. When using speech recognition in a cognitive system, it is important to understand the nuances of how words are said and the meaning that articulation or emphasis conveys.

Morphology
Morphology refers to the structure of a word. Morphology gives us the stem of a word and its additional elements of meaning: Is it singular or plural? Is the verb first person, future tense, or conditional? This requires that words be partitioned into segments known as morphemes, which help bring understanding to the meaning of terms. This is especially important in cognitive computing, since human language rather than a computing language is the means of determining answers to questions. Elements in this context are identified and arranged into classes, including prefixes, suffixes, infixes, and circumfixes. For example, if a word begins with "non-" it carries a specific negative reference, and there is a large difference in meaning between the verb "come" and the verb "came." Prefixes and suffixes can be combined to form brand-new words with very different meanings. Morphology is also used widely in speech and language translation as well as in the interpretation of images. Although many dictionaries have been created to explain different constructions of words in various languages, it is impossible for these explanations ever to be complete (each human language has its own context and nuances that are unique), and in languages such as English, rules are often violated and new words and expressions are created every day. The process of interpreting meaning is aided by the inclusion of a lexicon, or repository of words and rules, based on the grammar of a specific language. For example, through a technique called part-of-speech tagging, it is possible to encapsulate certain words that have definitive meaning. This may be especially important in specific industries or disciplines: in medicine the term "blood pressure" has a specific meaning, but the words "blood" and "pressure" used independently can have a variety of meanings. Likewise, if you look at the elements of a human face, each component may independently not provide the required information.

Lexical Analysis
Lexical analysis, within the context of language processing, is a technique that connects each word with its corresponding dictionary meaning. However, this is complicated by the fact that many words have multiple meanings. Analyzing a stream of characters from a natural language produces a sequence of tokens (strings of text, categorized according to rules as symbols such as a number or a comma). Specialized taggers are important in lexical analysis; for example, an n-gram tagger uses a simple statistical algorithm to determine the tag that most frequently occurs in a reference corpus, as in the small sketch below. The analyzer (sometimes called a lexer) categorizes the characters according to the type of character string. When this categorization is done, the lexer is combined with a parser that analyzes the syntax of the language so that the overall meaning can be understood. The lexical syntax is usually a regular language whose alphabet consists of the individual characters of the source text; the phrase syntax is usually a context-free language whose alphabet consists of the tokens produced by the lexer.
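A small sketch of such an n-gram (unigram) tagger follows, assuming NLTK is installed along with its tokenizer and treebank resources (for example via nltk.download('punkt') and nltk.download('treebank')).

```python
# Illustrative sketch: a unigram tagger trained on a reference corpus tags each token
# with the tag that occurred most frequently for that word in the corpus.
import nltk

train_sents = nltk.corpus.treebank.tagged_sents()[:3000]
tagger = nltk.UnigramTagger(train_sents)

tokens = nltk.word_tokenize("The run was long, so we run again tomorrow.")
print(tagger.tag(tokens))
# "run" receives whichever tag was most frequent in the reference corpus; a context-aware
# tagger (e.g., nltk.BigramTagger with this one as backoff) can distinguish noun vs verb.
```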
Syntax and Syntactic Analysis
Syntax refers to the rules and techniques that govern sentence structure in a language. The capability to process the syntax and semantics of natural language is critical to a cognitive system because it is important to draw inferences about what language means based on the topic to which it is being applied. Therefore, although words may have a general meaning when used in conversation or written documents, the meaning may be entirely different when used in the context of a specific industry. For example, the word "tissue" is defined and understood differently based on the context of its use. In biology, a tissue is a group of biological cells that perform a specific function; however, a tissue can also be used to wrap a present or wipe a runny nose. Even within a domain context, there can still be word-sense ambiguity: in a medical context, "tissue" can be used in connection with skin or a runny nose. Syntactic analysis helps the system understand the meaning in context with how the term is used in a sentence. This syntactic analysis, or parsing, is the overall process of analyzing a string of symbols in a natural language according to a set of grammar rules. Within computational linguistics, parsing refers to the technique used by the system to analyze strings of words based on the relationships of those words to each other, in context with their use. Syntactic analysis is important in the question-answering process. For example, suppose you want to ask, "Which books were written by British women authors before the year 1800?" The parsing can make a huge difference in the accuracy of the answer. In this case, the subject of the question is books, so the answer would be a list of books. If, however, the parser assumed that "British women authors" was the topic, the answer would instead be a list of authors and not the books they wrote. (A brief parsing sketch illustrating this example appears after the discussion of construction grammars below.)

Construction Grammars
Although there are many different approaches to grammar in linguistics, construction grammar has emerged as an important approach for cognitive systems. The results of syntactic analysis are often represented in a grammar that is written as text; therefore, interpretation requires a grammar model that understands text and its semantics. Construction grammar has its roots in cognitively oriented linguistic analysis. It seeks to find the optimal way to represent relationships between structure and meaning. Construction grammar therefore assumes that knowledge of a language is based on a collection of "form and function pairings." The "function" side covers what is commonly understood as meaning, content, or intent; it usually extends over both of the conventional fields of semantics and pragmatics. Construction grammar was one of the first approaches that set out to search for a semantically defined deep structure and how it is manifested in linguistic structures. Therefore, each construction is associated with the principal building blocks of linguistic analysis, including phonology, morphology, syntax, semantics, pragmatics, discourse, and prosodic characteristics.
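As a brief sketch of how a syntactic parse surfaces these relationships, the question from the syntax discussion above can be run through a dependency parser. This assumes the spaCy library and its small English model are installed; the exact dependency labels vary by model and version.

import spacy

nlp = spacy.load("en_core_web_sm")  # small English model, downloaded separately
doc = nlp("Which books were written by British women authors before the year 1800?")

for token in doc:
    # Print each word, its dependency relation, and the word it attaches to.
    print(f"{token.text:10} {token.dep_:12} head = {token.head.text}")

The parse should mark "books" as the (passive) subject of "written," which is why a question-answering component built on it would return a list of books rather than a list of authors.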
Discourse Analysis
One of the most difficult aspects of NLP is to have a model that brings together individual data in a corpus or other information source so that there is coherency. It is not enough to simply ingest vast amounts of data from important information sources if the meaning, structure, and intention cannot be understood. Certain assertions may be true or false depending on the context. For example, people eat animals, but people are animals, and in general they don't eat each other. Timing is also important to understanding context. For example, during the 18th century, cigarette smoking was thought to be beneficial to the lungs. Therefore, a system ingesting an information source from that period of time would assume that smoking was a good thing. Without context there would be no way to know that the premise of that data was incorrect. Discourse is quite important in cognitive computing because it helps deal with complex issues of context. When a verb is used, it is important to understand what that verb is associated with in terms of reference. Within domain-specific sources of data you need to understand the coherence of related information sources. For example, what is the relationship between diabetes and sugar intake? What about the relationship between diabetes and high blood pressure? The system needs to be modeled to look for these types of relationships and context.

Pragmatics
Pragmatics is the aspect of linguistics that tackles one of the fundamental requirements for cognitive computing: the ability to understand the context of how words are used. A document, an article, or a book is written with a bias or point of view. For example, a writer discussing the importance of horses in the 1800s will have a different point of view than a writer addressing the same topic in 2014. In politics, two documents might discuss the same topic and take opposite sides of the argument. Both writers could make compelling cases for their point of view based on a set of facts. Without understanding the background of the writer, it is impossible to gain insight into meaning. The field of pragmatics provides inference to distinguish the context and meaning of what is being said. Within pragmatics, the structure of the conversation within text is analyzed and interpreted.

Techniques for Resolving Structural Ambiguity
Disambiguation is a technique used within NLP for resolving ambiguity in language. Most disambiguation techniques require the use of complex algorithms and machine learning. Even with these advanced techniques, there are no absolutes; resolution of ambiguity must always deal with uncertainties. We cannot have complete accuracy; instead, we rely on the probability of something being most likely to be true. This is true in human language and also in NLP. For example, the phrase "The train ran late" does not mean that the train could literally "run"; rather, the train was expected to arrive at the station later than scheduled. There is little ambiguity in this statement because it is a commonly known phrase. However, other phrases are easily misunderstood. For example, examine the phrase "The police officer caught the thief with a gun." One might decide that it was the police officer who used a gun to arrest the thief. However, it may well have been the thief who was using the gun to commit a crime. Sometimes the true meaning can be hidden inside a complicated sentence. Because cognitive computing is a probabilistic rather than a deterministic approach to computing, it is not surprising that probabilistic parsing is one way of solving disambiguation. Probabilistic parsing approaches use dynamic programming algorithms to determine the most likely interpretation of a sentence or string of sentences.
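To make this concrete, the sketch below uses a small probabilistic context-free grammar to choose between the two readings of the police officer sentence. It is a minimal illustration that assumes the NLTK library is installed; the toy grammar and its rule probabilities are invented for the example.

import nltk

# Toy probabilistic grammar; the rule probabilities are invented for illustration.
grammar = nltk.PCFG.fromstring("""
S   -> NP VP        [1.0]
PP  -> P NP         [1.0]
NP  -> Det N        [0.5]
NP  -> Det N PP     [0.5]
VP  -> V NP         [0.4]
VP  -> V NP PP      [0.6]
Det -> 'the' [0.7] | 'a' [0.3]
N   -> 'officer' [0.4] | 'thief' [0.4] | 'gun' [0.2]
V   -> 'caught'     [1.0]
P   -> 'with'       [1.0]
""")

sentence = "the officer caught the thief with a gun".split()
parser = nltk.ViterbiParser(grammar)  # dynamic programming over parse probabilities

# Returns the single highest-probability parse tree.
for tree in parser.parse(sentence):
    tree.pretty_print()

With these invented probabilities, the parser prefers attaching "with a gun" to the verb phrase (the officer used the gun); learning the rule probabilities from a treebank, or changing them, shifts which reading wins.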
Importance of Hidden Markov Models
One of the most important families of statistical models for both image and speech understanding is the Markov model. Increasingly, these models are fundamental to uncovering the hidden information inside images, voice, and video. It should by now be clear that gaining an understanding of the meaning that is often hidden within language is complicated. The human brain automatically copes with the fact that the real meaning of a sentence may be indirect. "The cow jumped over the moon" would seem impossible if the sentence were read literally; however, the sentence refers to a song for young children and is intended to be unrealistic and silly. The human mind weighs the probability that the sentence is meant as a literal action between the cow and the moon, and uses the context of the environment to settle on the right interpretation. The way systems interpret language requires a set of statistical models that are an evolution of a model developed by A. A. Markov in the early 1900s. Markov asserted that it was possible to determine the meaning of a sentence, or even a book, by looking at the frequency with which words occur in the text and the statistical probability that an answer is correct. The most important evolution of Markov's model for NLP and cognitive computing is the Hidden Markov Model (HMM). The premise behind HMMs is that recent data tells you more about your subject than the distant past, because the models are built on the foundations of probability. HMMs therefore help with prediction and filtering as well as smoothing of data. Hidden Markov Models are intended to interpret "noisy" sequences of words or phrases based on probabilistic states. In other words, the model takes a group of sentences or sentence fragments and determines the meaning. Using HMMs requires thinking about the sequence of the data. HMMs are used in many different applications, including speech recognition, weather prediction, and tracking the position of a robot in relation to its target. Markov models are therefore very important when you need to determine the exact position of data points in a very noisy data environment. Applying HMMs allows the user to model the data sequence with a probabilistic model. Within the model, an algorithm using supervised learning looks for repeating words or phrases that indicate the meaning and for how various constructs affect the likelihood that a meaning is true. Markov models assume that the probability of a sequence of words will help determine the meaning. A number of techniques are used in HMMs to estimate the probability that a word sequence has a specific meaning; for example, maximum likelihood estimation determines probabilities by normalizing counts over the content of a corpus. The value of HMMs is that they do the work of inferring the underlying state of sentences or sentence fragments. Therefore, as the models are trained on more and more data, they abstract constructs and meaning. The capability to generate probabilities of meaning and of state transitions is the foundation of HMMs and is important in the cognitive understanding of unstructured data. The models become more efficient in their ability to learn and to analyze new data sources.
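The following minimal sketch shows the core HMM idea in pure Python: hidden states (here, part-of-speech tags) are recovered from observed words with the Viterbi dynamic-programming algorithm. The states, words, and probabilities are toy values invented for illustration.

# Toy HMM: hidden states are part-of-speech tags, observations are words.
states = ["NOUN", "VERB"]
start_p = {"NOUN": 0.6, "VERB": 0.4}
trans_p = {  # P(next state | current state)
    "NOUN": {"NOUN": 0.3, "VERB": 0.7},
    "VERB": {"NOUN": 0.8, "VERB": 0.2},
}
emit_p = {   # P(word | state); invented values for the example
    "NOUN": {"flies": 0.4, "time": 0.5, "fast": 0.1},
    "VERB": {"flies": 0.6, "time": 0.1, "fast": 0.3},
}

def viterbi(words):
    # best[t][s] = (probability of the best path ending in state s at step t, previous state)
    best = [{s: (start_p[s] * emit_p[s].get(words[0], 1e-6), None) for s in states}]
    for word in words[1:]:
        column = {}
        for s in states:
            prob, prev = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s].get(word, 1e-6), p)
                for p in states
            )
            column[s] = (prob, prev)
        best.append(column)
    # Backtrack from the most probable final state.
    state = max(best[-1], key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return list(reversed(path))

print(viterbi(["time", "flies", "fast"]))  # ['NOUN', 'VERB', 'NOUN'] with these toy numbers

In a real system the transition and emission probabilities are estimated from a training corpus (for example, by maximum likelihood estimation), and the same machinery applies to speech frames, image features, or sensor readings rather than words.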
Although HMMs are the most prevalent method for understanding the meaning of sentences, another technique, called maximum entropy, is designed to establish probability through the distribution of words. To create the model, labeled training data is used to constrain it; this classifies the data. There are a number of approaches that are important in understanding a corpus in the context of its use in a cognitive system.

Word-Sense Disambiguation (WSD)
Not only do you have to understand a term within an ontology, it is also critical to understand the meaning of that word. This is especially complex when a single word may have multiple meanings depending on how it is used. Given this complexity, researchers have been using supervised machine learning techniques. A classifier is a machine learning approach that organizes elements or data items and places them into a particular class. Different types of classifiers are used depending on the purpose. For example, document classification is used to help identify where particular segments of text belong in your taxonomy. Classifiers are often used for pattern recognition and are therefore instrumental in cognitive computing. When a set of training data is well understood, supervised learning algorithms are applied. However, in situations in which the data set is vast and cannot easily be labeled, unsupervised learning techniques are often used to determine where clusters are occurring. Scoring of results is important here because patterns have to be correlated with the problem being addressed. Other methods may rely on a variety of dictionaries or lexical knowledge bases. This is especially important when there is a clear body of knowledge—in the health sciences, for example. There are many taxonomies and ontologies that define diseases, treatments, and the like. When these elements are predefined, it allows information to be interpreted into knowledge that supports decision making. For example, there are well-known characteristics of diabetes at the molecular level, a well-understood evolution of the disease, and well-tested, successful treatments.
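A minimal supervised word-sense classifier of the kind described above can be sketched with scikit-learn. This is an illustration only: the training sentences, the sense labels, and the choice of a bag-of-words Naive Bayes pipeline are assumptions invented for the example.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny labeled corpus for the ambiguous word "fall" (travel sense vs. medical sense).
sentences = [
    "book your fall vacation and enjoy the autumn foliage",
    "fall travel deals on flights and hotels this season",
    "the patient had a fall and fractured her hip",
    "nurses documented the fall risk assessment for the elderly patient",
]
labels = ["travel", "travel", "medical", "medical"]

# Bag-of-words features feeding a Naive Bayes classifier.
classifier = make_pipeline(CountVectorizer(), MultinomialNB())
classifier.fit(sentences, labels)

print(classifier.predict([
    "cheap fall flights to Rome",
    "the fall left the patient with a bruised knee",
]))  # expected: ['travel' 'medical'] on this toy data

In practice the training examples would come from an annotated corpus or a lexical knowledge base, and the feature set would include surrounding words, part-of-speech tags, and domain indicators.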
Semantic Web
NLP by itself provides a sophisticated technique for discovering the meaning of unstructured words in context with their usage. However, to be truly cognitive requires context and semantics. Ontologies and taxonomies are expressions of semantics. In fact, the capability to combine natural language processing with the semantic web enables companies to combine structured and unstructured data more easily than with traditional data analytic methods. The semantic web provides the Resource Description Framework (RDF), which is the foundational structure used on the World Wide Web for processing metadata (information about the origins, structure, and meaning of data). RDF is used as a way to find data more accurately than a typical search engine would, and it provides the ability to rate the available content. RDF also provides a syntax for encoding metadata with standards such as XML (Extensible Markup Language) that support interoperability between data sources. One of the benefits of schemas written in RDF is that they provide a single set of properties and classes for describing the RDF-generated schemas. The semantic web is therefore instrumental in providing a cognitive approach to gaining insights from a combination of structured and unstructured sources of information in a consistent way.

Applying Natural Language Technologies to Business Problems
Cognitive computing has found application in various real-world scenarios, enhancing human decision-making, automating complex tasks, and improving overall efficiency. Here are some examples of real-world use cases of cognitive computing:
➨ IBM Watson for Oncology analyzes medical literature, clinical trial data, and patient records to recommend personalized treatment options for cancer patients.
➨ Cognitive computing systems analyze vast amounts of financial data in real time to detect patterns and anomalies, helping financial institutions identify potentially fraudulent activities.
➨ Many companies use cognitive computing to create intelligent virtual assistants and chatbots that can understand natural language, answer customer queries, and provide personalized assistance.
➨ Cognitive systems help automate the recruitment process by analyzing resumes, screening candidates, and even conducting initial interviews, streamlining hiring.

Enhancing the Shopping Experience

Leveraging the Connected World of Internet of Things
As more and more devices, from cars to highways and traffic lights, are equipped with sensors, there will be the ability to make decisions about what actions to take as conditions change. Traffic management is a complex problem in many large metropolitan areas. If city managers can interact with sensor-based systems combined with unstructured data about events taking place (rallies, concerts, and snow storms), alternative actions can be evaluated. A traffic manager may want to ask questions about when to reroute traffic under certain circumstances. If that manager can use an NLP interface to a cognitive system, these questions can be answered in context with everything from weather data to the density of traffic to the time when an event will start. Individual domains such as traffic routing and weather prediction will each have their own Hidden Markov Models. In a cognitive system it is possible to correlate this data across domains and models. Matching this data with an NLP engine that interprets textual data can produce significant results. The NLP question-and-answer interface can help the human interact with this complex data to recommend the next best action or actions.

Voice of the Customer
The capability for companies to understand not only what their customers are saying but also how it will impact their relationship with those customers is increasingly important. One technique companies use to understand customer attitudes is sentiment analysis. Sentiment analysis combines text analytics, NLP, and computational linguistics to make sense of text-based comments provided by customers. For example, a company can analyze customer sentiment to predict sales for new product offerings from one of its divisions. However, a customer is not simply a customer of a single business unit. Many customers actually do business with several different business units within the same company. Creating a corpus of customer data across business units can enable the customer service representative to understand all interactions with customers. Many of these interactions will be stored as notes in customer care systems. These same customers may add comments to social media sites and send e-mail messages directly to the company complaining about problems. There are subtleties in how customers use language that need to be understood to get a clear indication of customer intent.
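As a minimal sketch of the sentiment-analysis step, the following uses the rule-and-lexicon VADER scorer that ships with NLTK to turn raw customer comments into positive or negative signals. It assumes the NLTK library is installed and the VADER lexicon has been downloaded; the sample comments are invented.

import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon")  # one-time download of the sentiment lexicon

analyzer = SentimentIntensityAnalyzer()
comments = [
    "The new checkout process is fantastic and so much faster.",
    "I have been on hold for two hours and nobody can fix my billing issue.",
]
for comment in comments:
    scores = analyzer.polarity_scores(comment)  # neg/neu/pos plus a compound score
    label = "positive" if scores["compound"] > 0 else "negative"
    print(label, round(scores["compound"], 2), comment)

A production voice-of-the-customer pipeline would combine scores like these with entity extraction and the customer's history across business units so that a representative sees the whole relationship, not a single comment.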
Fraud Detection
One of the most important applications for NLP and cognitive computing is the detection of fraud. Traditional fraud detection systems are designed to look for known patterns from internal and external threat databases. Determining risk before it causes major damage is the most important issue for companies dealing with everything from hackers to criminal gangs stealing intellectual property. Although companies leverage firewalls and all sorts of systems that put up barriers to access, these are not always effective. Smart criminals often find subtle techniques that go under the radar of most fraud detection systems. Having the capability to look for hidden patterns and for anomalies is critical to preventing an event from happening. In addition, by leveraging thousands of fraudulent claims documents, an insurance company can be better prepared to detect subtle indications of fraud. NLP-based cognitive approaches can enable the user to ask questions of a corpus of data that has been designed based on a model of both acceptable and unacceptable behavior. This corpus can be fed with new information about detected schemes happening somewhere in the world. Understanding not only the words but also their context across many data sources can be applied to fraud prevention. Understanding word sense in complex documents and communications can be significant in preventing fraud.

Taxonomies
A taxonomy is a hierarchical framework, or schema, for the organization of organisms, inanimate objects, events, and/or concepts. We see taxonomies daily as humans, and we don't give them much thought. Taxonomies are the facets, filters, and search suggestions commonly seen on modern websites. For example, books can be categorized as fiction and nonfiction at a high level. That may work in some instances, but in most cases that is too high a grouping level, so we further subdivide each high-level category until we are satisfied we have achieved an appropriate grouping level. Figure 1 shows an example of a taxonomy for books. Another taxonomy example is how you sort the documents on your computer: some may choose to start with a subject and then subdivide by year, while others may do the opposite. There is no absolute right or wrong with taxonomies, just degrees of appropriateness. The most important question to ask when creating a taxonomy is, "does this hierarchical grouping meet my needs?"

More formally, a taxonomy is a representation of the formal structure of classes or types of objects within a domain. Taxonomies are generally hierarchical and provide names for each class in the domain. They may also capture the membership properties of each object in relation to the other objects. The rules of a specific taxonomy are used to classify or categorize any object in the domain, so they must be complete, consistent, and unambiguous. This rigor in specification should ensure that any newly discovered object fits into one, and only one, category or object class. The concept of using a taxonomy to organize knowledge in science is well established. In fact, if you've ever started a guessing game by asking "is it animal, mineral, or vegetable?" you were using Linnaeus' 1737 taxonomy of nature.
Linnaeus called those three categories the "kingdoms" in his taxonomy. Everything in nature had to fall into one of those categories, which he further divided into class, order, genus, species, and variety. At any level in a taxonomy, there can be no common elements between classes. If there are, a new, common higher-level category is required. Members of any class in the taxonomy inherit all the properties of their ancestor classes. For example, if you know that humans are mammals, you know that they are endothermic (warm-blooded) vertebrates with body hair and that they produce milk to feed their offspring. Of course, you also know that humans breathe, but you know that because everything in the class of mammals belongs to the phylum Chordata, which are all animals, and animals respire. Inheritance simplifies representation in a taxonomy because common properties need be specified only at the highest level of commonality. In a cognitive computing system, the reference taxonomy may be represented as objects in an object-oriented programming language or in common data structures such as tables and trees. These taxonomies consist of rules and constructs that are not likely to change over time.

Ontologies
According to Wikipedia, an ontology "encompasses a representation, formal naming, and definition of the categories, properties, and relations between the concepts, data, and entities that substantiate one, many, or all domains of discourse." In other words, ontologies allow us to organize the jargon of a subject area into a controlled vocabulary, thereby decreasing complexity and confusion. Without ontologies, you have no frame of reference, and understanding is lost. As Robert Engles states in his blog post On the role and the whatabouts of Ontology, ontologies are "essential in modern architectural patterns to ensure data quality, governance, findability, interoperability, accessibility, and reusability." For example, an ontology will allow one to associate the Book taxonomy with the Customer taxonomy via relationships. An ontology is more challenging to create than a taxonomy because it needs to capture the interrelationships between business objects and concepts by encapsulating the language and terminology of the business area you are modeling. A properly created ontology will expose the understanding of how the elements in the model relate to each other. Based on this understanding, one can infer intent via the relationships. A virtual assistant like Alexa uses these relationships, along with phrases and synonyms of those phrases, to determine the user's intention. An ontology provides more detail than a taxonomy, although the boundary between them in practice is somewhat fuzzy. An ontology should comprehensively capture the common understanding—vocabulary, definitions, and rules—of a community as it applies to a specific domain. The process of developing an ontology often reveals inconsistent assumptions, beliefs, and practices within the community. It is important in general that consensus is reached, or at least that areas of disagreement in emerging fields be surfaced for discussion. In many fields, professional associations codify their knowledge to facilitate communications and common understanding. These documents may be used as the basis of an ontology for a cognitive computing system.
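As noted above, a reference taxonomy can be represented as objects in an object-oriented programming language so that members of a class automatically inherit the properties of their ancestor classes. A minimal Python sketch (the class and attribute names are invented for illustration):

class Animal:
    respires = True            # every animal inherits this property

class Chordate(Animal):
    has_spinal_cord = True

class Mammal(Chordate):
    endothermic = True         # warm-blooded
    has_body_hair = True
    produces_milk = True

class Human(Mammal):
    pass                       # nothing restated; everything is inherited

print(Human.respires)       # True -- inherited from Animal
print(Human.endothermic)    # True -- inherited from Mammal

Because common properties are specified only at the highest level of commonality, adding a new class (say, Dolphin) requires stating only what is new about it.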
Other Methods of Knowledge Representation
In addition to ontologies, there are other approaches to knowledge representation. Two examples are described in the following sections.

Simple Trees
A simple tree is a logical data structure that captures parent-child relationships. In a model where the relationships are rigid and formalized, a simple tree, implemented as a table with (element, parent) fields in every row, is an efficient way to represent knowledge. Simple trees are used frequently in data analytic tools and in catalogs. For example, a retailer's catalog may have 30 or 40 categories of products that it offers, and each category would have a series of elements that are members of that category.

The Semantic Web
Some members of the World Wide Web Consortium (W3C) are attempting to evolve the current web into a "semantic web" as described by Tim Berners-Lee et al. in Scientific American in 2001. In a semantic web, everything would be machine usable because all data would have semantic attributes, as described in the W3C's Resource Description Framework (RDF). The current web is basically a collection of structures or documents addressed by uniform resource locators (URLs). When you find something at an address, there is no required uniformity about how it is represented. By adding semantics, you would force structure into everything on the web. If we did have a semantic web, we could use more of what is on the web for a cognitive system without extensive preprocessing to uncover structural information.
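The simple tree described above can be held as a table of (element, parent) rows; the retailer categories and helper functions below are invented for illustration.

# (element, parent) rows; the root has parent None.
catalog = [
    ("Catalog", None),
    ("Electronics", "Catalog"),
    ("Televisions", "Electronics"),
    ("Laptops", "Electronics"),
    ("Clothing", "Catalog"),
    ("Shoes", "Clothing"),
]
parent_of = dict(catalog)

def ancestors(element):
    # Walk parent pointers up to the root.
    path = []
    while parent_of.get(element) is not None:
        element = parent_of[element]
        path.append(element)
    return path

def children(element):
    return [child for child, parent in catalog if parent == element]

print(ancestors("Laptops"))   # ['Electronics', 'Catalog']
print(children("Catalog"))    # ['Electronics', 'Clothing']

The same table can be stored in a relational database or loaded by an analytics tool, which is why this representation shows up so often in catalogs and reporting hierarchies.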
Applying Advanced Analytics to Cognitive Computing
Advanced analytics refers to a collection of techniques and algorithms for identifying patterns in large, complex, or high-velocity data sets with varying degrees of structure. It includes sophisticated statistical models, predictive analytics, machine learning, neural networks, text analytics, and other advanced data mining techniques. Some of the specific statistical techniques used in advanced analytics include decision tree analysis, linear and logistic regression analysis, social network analysis, and time series analysis. These analytical processes help discover patterns and anomalies in large volumes of data and can anticipate and predict business outcomes. Accordingly, advanced analytics is a critical element in creating long-term success with a cognitive system that can arrive at the right answers to complex questions and predict outcomes. This section explores the technologies behind advanced analytics and how they can be leveraged in a knowledge-driven cognitive environment. With the right level of advanced analytics, you can gain deeper insights and predict outcomes in a more accurate and insightful manner.

Cognitive analytics exploits the massive advances in high-performance computing by combining advanced artificial intelligence and machine learning techniques with data analytics approaches. As one description puts it, "Cognitive Analytics applies human-like intelligence to certain tasks, and brings together a number of intelligent technologies, including semantics, artificial intelligence algorithms, deep learning and machine learning. Applying such techniques, a cognitive application can get smarter and more effective over time by learning from its interactions with data and with humans." Several advantages can be obtained from leveraging cognitive analytics. They include:
➨ Improved customer interaction. Banik notes, "There are three areas where cognitive computing is useful for consumer interaction: enhanced client services; providing a tailored service; and guaranteeing a speedier response to consumer needs."
➨ Enhanced productivity. Menon notes there are four areas of productivity where cognitive analytics can be beneficial: improved decision-making and better planning; significant cost reductions; better learning experiences; and better security and governance.
➨ Business growth. Banik writes, "Cognitive analytics promotes corporate success by: increasing sales in new markets [and helping] launch new goods and services."

The next frontier, which comes with opportunities for enormous change, includes big data analytics and incorporates the technologies of machine learning and cognitive computing. There is a convergence of technologies cutting across analytics and artificial intelligence. One major push for this convergence is the change in the timing and immediacy of data. Today's applications often require planning and operational changes at a fast rate for businesses to remain competitive. Waiting 24 hours or longer for the results of a predictive model is no longer acceptable. For example, a customer relationship management application may require an iterative analytics process that incorporates current information from customer interactions and provides outcomes to support split-second decision making, ensuring that customers are satisfied. In addition, data sources are more complex and diverse. Therefore, analytic models need to incorporate large data sets, including structured, unstructured, and streaming data, to improve predictive capabilities. The multitude of data sources that companies need to evaluate to improve model accuracy includes operational databases, social media, customer relationship systems, web logs, sensors, and videos. Increasingly, advanced analytics is deployed in high-risk situations such as patient health management, machine performance, and threat and theft management. In these use cases the ability to predict outcomes with a high degree of accuracy can mean that lives are saved and major crises are averted. In addition, advanced analytics and machine learning are used in situations in which the large volume and fast speed of data that must be processed demand automation to provide a competitive advantage. Typically, human decision makers use the results of predictive models to support their decision-making capabilities and help them take the right action. There are situations, however, in which pattern recognition and analytic processes lead to action without any human intervention. For example, investment banks and institutional traders use electronic platforms for automated or algorithmic trading of stocks. Statistical algorithms are created to execute orders for trades based on pre-established policies without humans stepping in to approve or manage the trades. Automated trading platforms use machine learning algorithms that combine historical data and current data that may have an impact on the price of the stock. For example, a trading algorithm may be designed to adjust automatically based on social media news feeds. This approach can provide rapid insight as large volumes of current data are processed at incredibly fast speeds. Although taking action based on this early (and unverified) information may improve trading performance, the lack of human interaction can also lead to erroneous actions. For example, automated trading algorithms have responded to fake or misleading social media news feeds, leading to a rapid fall in the stock market. A human would, hopefully, have taken the time to check the facts.
The following two examples illustrate how companies are using machine learning and analytics to improve predictive capabilities and optimize business results.

Analytics and machine learning predict trending customer issues. The speed of social media can accelerate a small customer issue into a major complication before a company has time to react. Some companies decrease the time it takes to react to customer concerns by leveraging SaaS offerings that use machine learning to look for trends in social media conversations. The software compares the social media data to historical patterns and then continuously updates the results based on how the predicted pattern compares to actual results. This form of machine learning provides at least 72 hours of advance warning on trending issues before they are picked up by mainstream media. As a result, marketing and public relations teams can take early action to protect the company's brand and mitigate customer concerns. Media buyers use the service to quickly identify changing customer purchasing trends so that they can determine where they should place their ads in mobile applications and web environments.

Analytics and machine learning speed the analysis of performance to improve service level agreements (SLAs). Many companies find it hard to monitor IT performance at fast enough intervals to identify and fix small problems before they escalate and negatively impact SLAs. Using machine learning algorithms, companies can identify patterns of IT behavior and orchestrate their systems and operational processes to become more prescriptive. These systems can learn to adapt to changing customer expectations and requirements. For example, telecommunications companies need to anticipate and prevent network slowdowns or outages so that they can keep the network operating at the speeds required by their customers. However, it can be nearly impossible to identify and correct interruptions in bandwidth if network monitoring is not done at a sufficiently granular level. For example, with a new machine learning solution offered by Hitachi, telecoms can analyze large streams of data in real time. Hitachi's customers can combine analysis of historical data and real-time analysis of social media data to identify patterns in the data and make corrections in the performance of the network. There are many situations in which this would be helpful to customers. For example, if a streaming video application shows a popular sporting event and the game goes into overtime, an adaptive system could automatically add an additional 15 minutes of bandwidth support so that end users are provided with consistently high-quality service. Machine learning can help the system adapt to a variety of changes and unusual occurrences to maintain quality of performance.
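The mechanism in the first example, comparing current social-media activity with historical patterns, can be sketched in a few lines of Python. The mention counts, topics, and alert threshold are invented for illustration, and a real service would use streaming data and learned models rather than a simple z-score.

from statistics import mean, stdev

# Hourly mention counts per topic: 24 hours of history plus the current hour.
history = {
    "billing errors": [4, 5, 3, 6, 5, 4, 5, 6, 4, 5, 5, 4, 6, 5, 4, 5, 6, 5, 4, 5, 5, 4, 6, 5],
    "app crashes":    [2, 3, 2, 2, 3, 2, 3, 2, 2, 3, 2, 2, 3, 2, 3, 2, 2, 3, 2, 2, 3, 2, 3, 2],
}
current_hour = {"billing errors": 6, "app crashes": 19}

for topic, past in history.items():
    baseline, spread = mean(past), stdev(past)
    z = (current_hour[topic] - baseline) / spread  # how unusual is the current hour?
    if z > 3:
        print(f"ALERT: '{topic}' is trending (z = {z:.1f}); notify PR and marketing early")
    else:
        print(f"'{topic}' looks normal (z = {z:.1f})")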
Key Capabilities in Advanced Analytics
You can't develop a cognitive system without using some combination of predictive analytics, text analytics, and machine learning. It is through the application of the components of advanced analytics that data scientists can identify and understand the meaning of patterns and anomalies in massive amounts of structured and unstructured data. These patterns are used to develop the models and algorithms that help determine the right course of action for decision makers. The analytics process helps you understand the relationships that exist among data elements and the context of the data. Machine learning is applied to improve the accuracy of the models and make better predictions. It is an essential technology for advanced analytics, particularly because of the need to analyze big data sources that are primarily unstructured in nature.

The Relationship Between Statistics, Data Mining, and Machine Learning
Statistics, data mining, and machine learning are all included in advanced analytics. Each of these disciplines has a role in understanding data, describing the characteristics of a data set, finding relationships and patterns in that data, building a model, and making predictions. There is a great deal of overlap in how the various techniques and tools are applied to solving business problems. Many of the widely used data mining and machine learning algorithms are rooted in classical statistical analysis. The following highlights how these capabilities relate to each other.

Statistics
Statistics is the science of learning from data. Classical or conventional statistics is inferential in nature, meaning it is used to reach conclusions about the data (its various parameters). Although statistical modeling can be used for making predictions, the focus is primarily on making inferences and understanding the characteristics of the variables. The practice of statistics requires that you test your theory or hypothesis by looking at the errors around the data structure. You test the model assumptions (such as normality, independence, and constant variance) to understand what may have led to the errors. The goal is to have constant variance around your model. In addition, statistics requires you to do estimation using confidence values and significance testing: you test a null hypothesis and determine whether the results are statistically significant.

Data Mining
Data mining, which is based on the principles of statistics, is the process of exploring and analyzing large amounts of data to discover patterns in that data. Algorithms are used to find relationships and patterns in the data, and this information about the patterns is then used to make forecasts and predictions. Data mining is used to solve a range of business problems such as fraud detection, market basket analysis, and customer churn analysis. Traditionally, organizations have used data mining tools on large volumes of structured data, such as customer relationship management databases or aircraft parts inventories. Some analytics vendors provide software solutions that enable data mining of a combination of structured and unstructured data. Generally, the goal of data mining is to extract data from a larger data set for the purposes of classification or prediction. In classification, the idea is to sort data into groups. For example, a marketer might be interested in the characteristics of people who responded to a promotional offer versus those who didn't respond to the promotion. In this example, data mining would be used to extract the records of the customers who responded so that the characteristics shared by each group can be identified and used to predict who is likely to respond in the future.
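A minimal sketch of the classification idea in this marketing example, assuming scikit-learn is installed; the features, values, and choice of a decision tree are invented for illustration.

from sklearn.tree import DecisionTreeClassifier

# Toy customer records: [age, income in thousands, prior purchases]
X = [
    [25, 40, 1], [32, 60, 4], [47, 85, 6], [51, 90, 7],
    [23, 35, 0], [36, 55, 2], [58, 95, 9], [29, 45, 1],
]
# 1 = responded to the promotional offer, 0 = did not respond
y = [0, 1, 1, 1, 0, 0, 1, 0]

model = DecisionTreeClassifier(max_depth=2, random_state=0)
model.fit(X, y)

# Predict whether two new customers are likely responders.
print(model.predict([[45, 80, 5], [24, 38, 0]]))  # [1 0] with this toy data

In practice the training set would be extracted from the campaign history, and the learned splits would describe the characteristics that distinguish responders from non-responders.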
