Summary

This presentation provides an overview of Natural Language Processing (NLP). It details core concepts such as segmentation, tokenization, stop-word removal, stemming, lemmatization, part-of-speech tagging, and named entity recognition, using examples and diagrams.

Full Transcript

How do these systems manage to sound and seem so human-like? How do they respond to me so intelligently, and how are they so articulate? The magic of Natural Language Processing!

Natural Language Processing (NLP) refers to the branch of Artificial Intelligence that gives machines the ability to read, understand and derive meaning from human languages. This is achieved by combining the fields of Linguistics and Computer Science to decipher language structure and rules, and to build models that can comprehend, break down and extract significant details from text and speech.

Vast Amounts of Data

Through their daily interactions on public social media, humans transfer vast quantities of data to each other, and most of this data is freely available. This data is extremely useful for understanding human behavior and customer habits. Data analysts and machine learning experts use it to give machines the ability to mimic human linguistic behavior. This helps save millions in manpower and time: you don't always need to have a person present at the other end of a phone.

Ubiquitous & Disruptive

NLP is a lot more widespread than you may realize; you use it every day in seemingly normal and insignificant situations:

Don't know how to correctly spell a word? Autocorrect has you covered (e.g. Grammarly).
Need to see if your article or thesis will get flagged for plagiarism? A plagiarism checker will search the web for published documents that match your work line by line.
Need to make or take a call, or have your SMS read to you while driving? Autopilot/Google Assistant, Apple CarPlay and Android Auto make all of this easy.
Other Voice Assistants: Amazon vs Apple vs Google vs …

NLP Process

Most of the techniques used in NLP are simple grammar techniques that we were taught in school, so NLP is actually pretty easy to learn (and to teach to machines):

i) You start off with a document or an article.
ii) To make your algorithm understand what is going on in it, you need to process it into a form that the machine can easily comprehend. This is no different from teaching a child to read for the first time:
   a) You start off by performing segmentation, which breaks the entire document down into its constituent sentences. You can do this by splitting the article at sentence-ending punctuation such as full stops, question marks and exclamation marks.
   b) For the algorithm to understand these sentences, we need to explain the words in each sentence individually, so we break the sentence down into its constituent words and store them. This is called tokenizing, and each word is called a token.
   c) We can make the learning process faster by getting rid of non-essential words that do not add much meaning to our statement and are only there to make it sound more cohesive. Words such as “are” and “the” are called stop words.

Now that we have the basic form of our document, we need to explain it to our machine:

a) We first explain that some words, like “skipping”, “skips” and “skipped”, are the same word with different suffixes added. Stripping those affixes to reach the common stem (“skip”) is called stemming.
b) We also identify the base word for different tenses, moods, genders etc. This is called lemmatization, from “lemma”, the base (dictionary) form of a word; for example, “better” has the lemma “good”.
c) Next we explain the concept of nouns, verbs, articles and other parts of speech to the machine by adding these tags to our words. This is called part-of-speech tagging.
d) Then we introduce our machine to pop culture references and everyday names by flagging names of movies, important personalities, locations etc. that may occur in the document. This is called named entity tagging (also known as named entity recognition).
e) Once we have our base words and tags, we use a machine learning algorithm like Naive Bayes to teach our model human sentiment and speech.

Ref: https://www.youtube.com/watch?v=CMrHM8a3hqw
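
The first preprocessing steps described above (segmentation, tokenizing, stop-word removal and stemming) could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the presenter's implementation: the function names, the tiny stop-word list and the crude suffix-stripping rule are all assumptions made for the example, and a real project would more likely rely on an NLP library.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "are", "is", "the", "to", "of", "in", "was"}


def segment(text):
    """Split text into sentences at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def tokenize(sentence):
    """Lowercase a sentence and return its word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [token for token in tokens if token not in STOP_WORDS]


def stem(token):
    """Crude suffix stripping; a toy stand-in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            base = token[: -len(suffix)]
            # Undo a doubled final letter left behind, e.g. "skipp" -> "skip".
            if suffix != "s" and len(base) >= 2 and base[-1] == base[-2]:
                base = base[:-1]
            return base
    return token


def preprocess(text):
    """Run every step and return one list of stems per sentence."""
    return [
        [stem(token) for token in remove_stop_words(tokenize(sentence))]
        for sentence in segment(text)
    ]


if __name__ == "__main__":
    sample = "The dog was skipping over the fence. He skips every day!"
    for stems in preprocess(sample):
        print(stems)
```

The toy stemmer maps “skipping”, “skips” and “skipped” to “skip”, but it will mangle many other words (it has no dictionary), which is one reason lemmatization is treated as a separate step.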
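
The final step, using Naive Bayes to learn sentiment from the processed words, might look something like the sketch below. It is a bare-bones multinomial Naive Bayes with add-one (Laplace) smoothing written in plain Python; the `train` and `predict` names and the four-example training set are hypothetical and exist only to show the idea.

```python
import math
from collections import Counter, defaultdict

# Made-up training data: (stemmed tokens, sentiment label).
TRAINING = [
    (["love", "movie"], "positive"),
    (["great", "act"], "positive"),
    (["hate", "movie"], "negative"),
    (["bad", "act"], "negative"),
]


def train(examples):
    """Collect the counts a multinomial Naive Bayes model needs."""
    label_counts = Counter()            # number of documents per label
    word_counts = defaultdict(Counter)  # word frequencies per label
    vocab = set()
    for tokens, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def predict(model, tokens):
    """Return the label with the highest log-probability for the tokens."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokens:
            if token not in vocab:
                continue  # ignore words never seen during training
            # Add-one smoothing keeps unseen (label, word) pairs above zero.
            likelihood = (word_counts[label][token] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    model = train(TRAINING)
    print(predict(model, ["love", "act"]))
```

Working in log-probabilities avoids multiplying many small numbers together, and the smoothing term means a single unseen word cannot rule out a label on its own.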
