AI, Digital Trends, and Web Evolution (PDF)

Summary

This document presents a lecture or presentation on Artificial Intelligence (AI), digital trends, and the evolution of the World Wide Web, from Web 1.0 to Web 4.0. It discusses different aspects of AI, including its definitions, the Turing test, the Chinese room thought experiment, and various AI types like narrow/general and generative AI. It also covers concepts like machine learning, deep learning, and data science.

Full Transcript


Business and Finance a.a. 2024/2025
Information Systems – Digital Trends and AI
Prof. Francesco Tornieri

Digital Trends
[Figure: overview diagram of digital trends – Cloud Computing, Big Data, Social Media/Web, AI, Internet of Things, Mobile, Mobile Web]

Web 1.0 – Static Web
Read-Only: early version of the internet with static web pages.
Content: information was published in a one-way manner.
Interaction: users could only read; there was no engagement and no dynamic interaction.
Examples: early websites like personal pages or company brochures.

Web 2.0 – Social Web
User Participation: emphasis on collaboration and user-generated content.
Content: interactive elements; users could create, comment, and share.
Platforms: rise of social media, blogs, and forums (e.g., Facebook, Wikipedia).
Characteristics: dynamic websites with high interactivity.

Web 3.0 – Semantic Web
Machine Understanding: data becomes more interconnected and contextual.
AI and Personalization: intelligent systems understand user needs better.
Key Technologies: AI, natural language processing, semantic metadata.
Examples: Google Knowledge Graph, voice assistants like Siri.

Web 4.0 – Ubiquitous Web
Seamless Experience: full integration of the web into daily life.
AI-Driven: AI actively interacts with and adapts to user behavior.
IoT Integration: connection of multiple devices for a holistic experience.
Goal: a highly predictive web that anticipates user needs in real time.

AI: What is it?
The popularity of AI in the media is partly due to the fact that people have started using the term when they refer to things that used to be called by other names. Almost anything, from statistics and business analytics to manually encoded if-then rules, can be seen called AI.
Reason 1: no officially agreed definition. Even AI researchers have no exact definition of AI.
Reason 2: the legacy of science fiction. The confusion about the meaning of AI is made worse by the visions of AI presented in various literary and cinematic works of science fiction.

AI: What is it? Turing
The very nature of the term "artificial intelligence" raises philosophical questions: does intelligent behavior imply or require the existence of a mind, and to what extent is consciousness replicable as computation?
Alan Turing (1912-1954) was an English mathematician and logician, rightfully considered the father of computer science. Turing's most prominent contribution to AI is his imitation game, which later became known as the Turing test. In the test, a human interrogator interacts with two players, A and B, by exchanging written messages (in a chat). If the interrogator cannot determine which player, A or B, is a computer and which is a human, the computer is said to pass the test. The argument is that if a computer is indistinguishable from a human in a general natural language conversation, then it must have reached human-level intelligence.

AI: What is it? The Chinese room
The idea that intelligence is the same as intelligent behavior has been challenged by some. The best-known counter-argument is John Searle's Chinese room thought experiment. Searle describes an experiment where a person who doesn't know Chinese is locked in a room. Outside the room is a person who can slip notes written in Chinese inside the room through a mail slot. The person inside the room is given a big manual with detailed instructions for responding to the notes she receives from the outside.
Searle argued that even if the person outside the room gets the impression that she is in a conversation with another Chinese-speaking person, the person inside the room does not understand Chinese. Likewise, his argument continues, even if a machine behaves in an intelligent manner, for example by passing the Turing test, it doesn't follow that it is intelligent or that it has a "mind" in the human way.

General vs narrow AI
When reading the news, you might see the terms "general" and "narrow" AI. Narrow AI refers to AI that handles one task. General AI, or Artificial General Intelligence (AGI), refers to a machine that can handle any intellectual task. All the AI methods we use today fall under narrow AI, with general AI being in the realm of science fiction.
A related dichotomy is "strong" and "weak" AI. Strong AI would amount to a "mind" that is genuinely intelligent and self-conscious. Weak AI is what we actually have, namely systems that exhibit intelligent behaviors despite being "mere" computers.

In addition to AI, there are several other closely related topics that are good to know at least by name: machine learning, deep learning, and data science.
Machine learning can be said to be a subfield of AI, which itself is a subfield of computer science (such categories are often somewhat imprecise, and some parts of machine learning could equally well, or better, belong to statistics). Machine learning enables AI solutions that are adaptive. A concise definition: systems that improve their performance in a given task with more and more experience or data.
Deep learning is a subfield of machine learning, which itself is a subfield of AI, which itself is a subfield of computer science. The "depth" of deep learning refers to the complexity of the mathematical model; the increased computing power of modern computers has allowed researchers to increase this complexity to levels that appear not only quantitatively but also qualitatively different from before.
Data science is a recent umbrella term (a term that covers several subdisciplines) that includes machine learning and statistics, as well as certain aspects of computer science such as algorithms, data storage, and web application development.

AI: Subareas (types)
The roots of machine learning are in statistics, which can also be thought of as the art of extracting knowledge from data, especially methods such as linear regression and Bayesian statistics. The area of machine learning is often divided into subareas according to the kinds of problems being attacked:
Supervised learning: we are given an input, for example a photograph with a traffic sign, and the task is to predict the correct output or label, for example which traffic sign is in the picture (speed limit, stop sign, etc.). In the simplest cases, the answers are in the form of yes/no (we call these binary classification problems; see the sketch after this list).
Unsupervised learning: there are no labels or correct outputs. The task is to discover the structure of the data: for example, grouping similar items to form "clusters", or reducing the data to a small number of important "dimensions". Data visualization can also be considered unsupervised learning.
Reinforcement learning: commonly used in situations where an AI agent, like a self-driving car, must operate in an environment and where feedback about good or bad choices is available with some delay. Also used in games where the outcome may be decided only at the end of the game.
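To make the binary classification case above concrete, here is a minimal sketch, assuming scikit-learn is available; the features (sign width, fraction of red pixels) and all values are invented toy data, not material from the lecture:

```python
# Minimal binary classification sketch (toy data; assumes scikit-learn)
from sklearn.linear_model import LogisticRegression

# Invented features per image: [sign width in pixels, fraction of red pixels]
X = [[30, 0.90], [32, 0.80], [28, 0.85],   # stop signs
     [60, 0.10], [55, 0.20], [58, 0.15]]   # other signs
y = [1, 1, 1, 0, 0, 0]   # yes/no labels provided by the human "supervisor"

model = LogisticRegression().fit(X, y)     # "train" on the labeled examples
print(model.predict([[29, 0.88]]))         # predict the label of a new input
```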
AI: Supervised learning
Instead of manually writing down exact rules to do the classification, the point of supervised machine learning is to take a number of examples, label each one with the correct label, and use them to "train" an AI method to automatically recognize the correct label for the training examples as well as (at least hopefully) any other images. This of course requires that the correct labels are provided, which is why we talk about supervised learning. The user who provides the correct labels is a supervisor who guides the learning algorithm towards the correct answers.

AI: Supervised learning - Example
Suppose we have a data set consisting of apartment sales data. For each purchase, we would have the price that was paid, the size of the apartment in square meters (or square feet, if you like), the number of bedrooms, the year of construction, and the condition (on a scale from "disaster" to "spick and span"). We could then use machine learning to train a model that predicts the selling price based on these features.

AI: Supervised learning - Pitfalls
There are a couple of potential mistakes that we'd like to make you aware of. They are related to the fact that unless you are careful with the way you apply machine learning methods, you could become too confident about their accuracy. The first thing to keep in mind, in order to avoid big mistakes, is to split your data set into two parts: the training data and the test data. We train the algorithm using only the training data; this gives us a model or a rule that predicts the output based on the input variables. While a model may be a very good predictor on the training data, that is no proof that it can generalize to any other data. This is where the test data comes in handy: we can apply the trained model to predict the outputs for the test data and compare the predictions to the actual outputs. (A sketch combining this split with the apartment example follows.)
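A minimal sketch putting the apartment example and the train/test split together, again assuming scikit-learn; every number below is invented toy data:

```python
# Apartment price prediction with a train/test split (toy data; assumes scikit-learn)
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Features per sale: [size in m^2, bedrooms, year of construction, condition 1-5]
X = [[45, 1, 1985, 3], [80, 3, 2001, 4], [62, 2, 1995, 2],
     [120, 4, 2015, 5], [55, 2, 1978, 3], [95, 3, 2008, 4]]
y = [150_000, 280_000, 195_000, 450_000, 160_000, 330_000]  # prices paid

# Hold out a test set so we measure generalization, not just fit to training data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0)

model = LinearRegression().fit(X_train, y_train)  # fit on training data only
print(model.score(X_test, y_test))                # R^2 score on unseen test data
```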
AI: Unsupervised learning
In unsupervised learning, the correct answers are not provided. This makes the situation quite different, since we can't build the model by making it fit the correct answers on training data. It also makes the evaluation of performance more complicated, since we can't check whether the learned model is doing well or not. Typical unsupervised learning methods attempt to learn some kind of "structure" underlying the data: for example, visualization, where similar items are placed near each other and dissimilar items further away from each other. It can also mean "clustering", where we use the data to identify groups or "clusters" of items that are similar to each other but dissimilar from data in other clusters.

AI: Unsupervised learning - Example
Supermarkets collect data on purchasing behavior through loyalty cards to better understand their customers. The manager can visualize the data in a graph where customers with similar habits are represented by closely located points. The manager can also apply clustering to group customers into clusters based on similar purchasing behavior, such as "health food enthusiasts" or "fish lovers". The machine learning method creates the clusters but does not automatically provide labels for them: this task is left to the user.
[Figure: clusters of customer data points, each marked with its centroid]
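A minimal sketch of the supermarket clustering idea, assuming scikit-learn's KMeans; the purchase counts are invented toy data:

```python
# Clustering customers by purchasing behavior (toy data; assumes scikit-learn)
from sklearn.cluster import KMeans

# Invented rows: [vegetable purchases per month, fish purchases per month]
purchases = [[20, 1], [18, 2], [22, 0],    # might end up as "health food enthusiasts"
             [3, 15], [2, 12], [4, 14]]    # might end up as "fish lovers"

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(purchases)
print(kmeans.labels_)            # which cluster each customer was assigned to
print(kmeans.cluster_centers_)   # the centroids; naming the clusters is left to us
```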
AI: Deep learning
Deep learning refers to certain kinds of machine learning techniques where several "layers" of simple processing units are connected in a network, so that the input to the system is passed through each one of them in turn. This architecture has been inspired by the processing of visual information in the brain, coming through the eyes and captured by the retina. The depth allows the network to learn more complex structures without requiring unrealistically large amounts of data.

A neural network, whether biological or artificial, consists of a large number of simple units, neurons, that receive and transmit signals to each other. The neurons are very simple processors of information, consisting of a cell body and wires that connect the neurons to each other. Most of the time, they do nothing but sit still and watch for signals coming in through the wires.

Isolated from its fellow neurons, a single neuron is quite unimpressive, capable of only a very restricted set of behaviors. When connected to each other, however, the system resulting from their concerted action can become extremely complex. To find evidence for this, look no further than (to use legal jargon) "Exhibit A": your brain! The behavior of the system is determined by the ways in which the neurons are wired together. Each neuron reacts to incoming signals in a specific way that can also adapt over time. This adaptation is known to be the key to functions such as memory and learning.

Compared to how computers traditionally work, neural networks have certain special features:
1. In a traditional computer, information is processed in a central processor (CPU), which can only focus on doing one thing at a time. The CPU retrieves the data to be processed from the computer's memory and stores the result back in memory, so data storage and processing are handled by two separate components: the memory and the CPU. In neural networks, the system consists of a large number of neurons, each of which can process information on its own; instead of a CPU processing each piece of information one after the other, the neurons process vast amounts of information simultaneously.
2. Data storage (memory) and processing aren't separated the way they are in traditional computers. The neurons both store and process information, so there is no need to retrieve data from memory for processing. Data can be stored short term in the neurons themselves (they either fire or not at any given time) or, for longer-term storage, in the connections between the neurons.
3. Neural networks use special hardware that can process many pieces of information at the same time. This is called parallel processing. Incidentally, graphics processors (graphics processing units, GPUs) have this capability, and they have become a cost-effective solution for running massive deep learning methods.

The layers of a neural network:
INPUT LAYER: receives the initial input data and stores it in a tensor. This layer is responsible for converting raw data into a format that the model can process.
HIDDEN LAYER: performs mathematical calculations on the input data, applying nonlinear transformations. A neural network can have more than one hidden layer, with each layer helping the model improve its ability to learn complex patterns from the data.
OUTPUT LAYER: returns the desired final predictions. This is the last step of the model and provides results based on the processing performed in the hidden layers.

AI: Deep learning - DEMO
MODEL: ResNet18
– Input Layer: 64 neurons
– Hidden Layers: layers with 64, 128, 256, and 512 neurons
– Output Layer: 1000 neurons, corresponding to the 1000 classes of the ImageNet dataset

Generative AI
Generative AI is a type of artificial intelligence designed to create new content, such as text, images, audio, or video, based on models trained on pre-existing data. It uses advanced neural networks, often based on deep learning techniques, to learn the structure and features of data and generate original and realistic outputs. Applications of generative AI range from creating text, images, or music to simulating complex scenarios and designing new concepts.
Generative AIs have the following main characteristics:
– Creation of new content: they can generate new information, such as text, images, music, or videos, based on existing data.
– Generative models: they use statistical models or advanced neural networks, such as deep neural networks (e.g., GPT), to learn patterns from training data and create realistic outputs.
– Creativity and variability: they can produce original and diverse outputs, enabling the generation of personalized and varied content every time.
– Unsupervised or semi-supervised learning: they often learn the structure of data without the need for labels, using techniques like deep learning.
– Wide applications: they are used in fields such as text creation, image generation, music, data simulation, design, and much more.

AI: ChatGPT?
How It Works: the GPT engine is built on an architecture known as the Transformer, introduced in 2017 by Google. This architecture is particularly effective for processing sequences of text.
Transformer and Attention: at the core of the Transformer is the self-attention mechanism, which allows the model to assign "weights" (or attention) to different words in a sentence while generating responses. This mechanism helps the model determine which parts of the input are most important for producing a coherent answer.
Training Phases: the model is trained in two stages:
1. Pre-training: the model is trained on large amounts of text, where it learns to predict the next word in a sequence. It doesn't understand the meaning of words but learns to recognize statistical patterns and relationships between them.
2. Fine-tuning: the model can then be further adapted for specific tasks (like answering questions or translating text) using specialized datasets.

AI: ChatGPT? TOKEN
A token represents a small unit of text (depending on the context), such as a word, part of a word, or even a single character. Tokens are essential to how the GPT model works, as it processes sentences by breaking the text down into these minimal units. Before GPT can process text, it first divides it into tokens through a process called tokenization.
Example: "Today I am in Milan and it is raining" becomes [Today], [I], [am], [in], [Milan], [and], [it], [is], [raining]. Depending on the complexity of the sentence or the language, tokens might break a word into parts (e.g., "raining" could become [rain], [ing]).
Why? The model doesn't work with words directly but with numerical sequences (numeric vectors). Each token is converted into a vector using a technique called embedding, which assigns each token a numerical representation that captures its position in the "meaning" space within a vast matrix. (A sketch of tokenization and embedding lookup follows.)
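A minimal sketch of tokenization followed by embedding lookup. The vocabulary, the hardcoded subword split, and the random vectors are all invented for illustration; real GPT models use a learned subword tokenizer (byte-pair encoding) and trained embedding matrices:

```python
# Toy tokenization + embedding lookup (invented vocabulary and vectors)
import numpy as np

vocab = {"Today": 0, "I": 1, "am": 2, "in": 3, "Milan": 4,
         "and": 5, "it": 6, "is": 7, "rain": 8, "ing": 9}

def tokenize(text):
    # Crude stand-in for BPE: keep known words, split "raining" -> "rain" + "ing"
    out = []
    for word in text.split():
        if word in vocab:
            out.append(word)
        else:
            out.extend([word[:-3], word[-3:]])  # hardcoded toy subword split
    return out

tokens = tokenize("Today I am in Milan and it is raining")
ids = [vocab[t] for t in tokens]                 # token -> integer id
print(tokens, ids)

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(vocab), 8))    # one 8-dim vector per vocab entry
vectors = embeddings[ids]                        # the numeric sequence the model sees
print(vectors.shape)                             # (10, 8): 10 tokens, 8 dimensions
```

In a trained model the vectors are not random: tokens used in similar contexts ("today", "tomorrow") end up close together in the embedding space.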
AI: ChatGPT? TOKEN LIMITS
Every GPT model has a limit on the number of tokens it can handle at one time, known as the maximum context length. For instance, the GPT-4 engine can handle up to 8,192 tokens in a single request. This includes both the tokens from your input and the tokens of the generated output. If this limit is exceeded, the extra tokens are truncated, which can result in incomplete or irrelevant answers.

AI: ChatGPT? Practical Example
Sentence: "Today I am in Milan and the weather is bad."

STEP 1: TOKENIZATION
[Today], [I], [am], [in], [Milan], [and], [the], [weather], [is], [bad]
Each common word becomes a separate token, like "Today", "I", "in". Words like "Milan", which the model recognizes, remain a single token.

STEP 2: EMBEDDING (Token Vectorization)
After tokenization, the model works not directly with words but with numbers. Each token is mapped to a numeric vector through a process called embedding. These numbers represent the token's position in the semantic space.
– For example, the vectors for "today" and "tomorrow" will be closer to each other than those for "today" and "elephant".
An embedding is a numeric representation of a word (or token) in a vector space. Each word is mapped to a vector, and these vectors capture various semantic features, such as:
– relationships between similar words,
– distances between words with different meanings,
– grammatical aspects like tense, gender, or plurality.
These numerical values are used by the model to calculate relationships and similarities between tokens. They don't have direct meaning like traditional mathematical values but are learned during the pre-training process.

STEP 3: ATTENTION MECHANISM
Attention is a mathematical tool that the GPT model uses to determine how much each word (token) should influence the other words in a sentence. The meaning of a word depends on the context in which it appears; it is not enough to consider only its individual embedding.
Example: the word "bad" by itself can have multiple meanings:
– "bad" as a weather phenomenon (e.g., "bad weather"),
– "bad" as describing a bad person.
In a sentence like "The weather is bad", context clues help us understand that "bad" refers to the weather. The attention matrix helps the model capture this by assigning greater weight to the relationship between "bad" and "weather" than to the other words in the sentence.

STEP 3: ATTENTION MATRIX
The diagonal of the matrix is equal to 1 (each token gives full weight to itself: maximum correlation). The off-diagonal values represent how much one token influences the others.
Example: in the row for "weather", the token "bad" has a higher attention value (0.40) because the two are related. (A numeric sketch of this computation follows.)
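A minimal numpy sketch of the self-attention computation behind such a matrix. The projection matrices here are random toy values, so the weights won't reproduce the lecture's 0.40 example; the sketch only shows the mechanics (each row is a softmax, so it sums to 1):

```python
# Scaled dot-product self-attention sketch (random toy weights; numpy only)
import numpy as np

tokens = ["The", "weather", "is", "bad"]
d = 8                                          # embedding dimension (toy choice)
rng = np.random.default_rng(0)
X = rng.normal(size=(len(tokens), d))          # one embedding per token

# In a real Transformer, Q and K come from learned projections of the embeddings
Wq, Wk = rng.normal(size=(d, d)), rng.normal(size=(d, d))
Q, K = X @ Wq, X @ Wk

scores = Q @ K.T / np.sqrt(d)                  # pairwise similarity of tokens
weights = np.exp(scores)
weights /= weights.sum(axis=1, keepdims=True)  # softmax: each row sums to 1

# Row i = how much attention token i pays to every token (including itself)
print(np.round(weights, 2))
```

With trained weights, the row for "weather" would put a relatively high value on "bad", which is exactly the relationship the lecture's example matrix illustrates.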
