Morphological Parsing Explained

Questions and Answers

What is the primary goal of morphological parsing?

  • To identify the syntactic structure of a sentence.
  • To translate a word from one language to another.
  • To find the constituent morphemes within a word. (correct)
  • To check the spelling of a word.

Why is morphological parsing considered a necessary step in many NLP applications?

  • It simplifies the process of information retrieval.
  • It enhances the accuracy of machine translation.
  • It aids in effective spelling checking.
  • All of the above. (correct)

What does it mean when morphological parsing provides more than one lexical level representation for a given word?

  • The word is a proper noun.
  • The word is misspelled.
  • The parsing algorithm has failed.
  • The word is ambiguous. (correct)

Which component is NOT essential for building a morphological parser?

  • Syntactic Analyzer. (correct)

What is the role of 'morphotactics' in morphological parsing?

  • Explaining morpheme ordering within words. (correct)

What do 'orthographic rules' primarily address in the context of morphological parsing?

  • The spelling changes when morphemes combine. (correct)

In the context of morphology and FSAs, what does it mean to 'accept' a string?

  • To recognize the string as being in the language. (correct)

Why is it inefficient to list all possible forms of words in a language when using FSAs for morphology?

  • Many languages have a vast number of word forms, making it impractical. (correct)

How are computational lexicons typically structured for use in morphological parsing?

  • With stems, affixes, and morphotactic information. (correct)

What is the primary purpose of using Finite State Automata (FSA) in spell checking?

  • To identify correctly spelled words without listing every form. (correct)

What is the significance of 'derivational morphology'?

  • It involves creating new words by adding derivational affixes. (correct)

What is the main capability added by Finite State Transducers (FSTs) compared to Finite State Automata (FSAs)?

  • FSTs can recognize and generate pairs of strings, whereas FSAs only accept or reject strings. (correct)

Regarding morphological parsing with FSTs, what does the 'lexical level' represent?

  • A simple concatenation of morphemes. (correct)

In two-level morphology, what is the purpose of the 'surface level'?

  • To represent the actual spelling of the word. (correct)

What does it mean to say that a Finite State Transducer (FST) 'reads one string and generates another'?

  • The FST maps the input string to an output string based on defined rules. (correct)

In the formal definition of a finite state transducer, what does the symbol $\Sigma_i$ represent?

  • The input alphabet. (correct)

Within the formal definition of an FST, what does 'Q' signify?

  • A finite set of states. (correct)

What is a 'regular relation' in the context of Finite State Transducers (FSTs)?

  • A set of pairs of strings. (correct)

In FST conventions, what does the notation 'c:$\epsilon$' on a transition typically indicate?

  • Reading 'c' and writing nothing. (correct)

When using FSTs, what is the typical role of the 'first symbol' on a transition?

  • To read from one tape. (correct)

What is indicated by the 'other symbols' on a transition when using FSTs?

  • They specify writing to the second tape. (correct)

According to the materials, what is the purpose of orthographic rules in NLP?

  • To govern spelling variations when morphemes combine. (correct)

A rule stating that '-y changes to -ie before -s' is an example of what kind of rule?

  • An orthographic rule. (correct)

What is the role of 'intermediate tapes' in multi-tape machines?

  • To handle irregular spelling changes. (correct)

What is one of the key purposes of multi-tape machines in morphology?

  • Managing irregularities in spelling. (correct)

In the context of Multi-Level Tape Machines, what is the lexical level used for?

  • Storing the base form of words and their features. (correct)

What is the function of the Intermediate level in Multi-Level Tape Machines?

  • Acting as a bridge in transforming lexical forms to surface forms. (correct)

Within Multi-Level Tape Machines, which level directly represents the final, displayed form of a word?

  • Surface tape. (correct)

In a multi-level tape machine, what data transformation takes place between the intermediate and surface levels?

  • Handling spelling changes. (correct)

According to the provided content, what is the relationship between the lexical, intermediate, and surface levels?

  • One machine transduces between the lexical and the intermediate level, and another handles spelling changes to the surface tape. (correct)

Flashcards

Morphological parsing

Finding the constituent morphemes in a word.

Lexical form

A representation of a word's more abstract form.

Ambiguity in parsing

Situations where a word has multiple possible parsings.

Lexicon

List of stems and affixes with basic information.

Morphotactics

Model of morpheme ordering within a word.

Orthographic rules

Spelling rules that model changes when morphemes combine.

Accept (in FSA context)

Accepts strings that are in the language.

Reject (in FSA context)

Rejects strings that are not in the language.

Lexicon as an FSA

Stores words, captures inflectional and derivational morphology.

Lexical Level (Morphology)

The simple concatenation of morphemes making up a word.

Surface Level (Morphology)

The actual spelling of the final word.

Transducer

A transducer maps between one set of symbols and another.

Finite-State Transducer (FST)

Recognizes or generates pairs of strings.

FST function

Reads one string and generates another.

Regular language

Set of strings.

Regular relation

A set of pairs of strings.

Multi-Tape Machines

Adding more tapes and using the output of one machine as the input to the next.

Multi-Level Tape Machines

Use one machine to transduce between the lexical and the intermediate level, and another machine to handle the spelling changes to the surface tape.

Study Notes

Parsing/Generation vs. Recognition

  • Machines can recognize strings in a language.
  • Recognition isn't always sufficient.
  • Parsing assigns a structure to a string. An example is converting the surface form "ate" to the parsed form "eat +V +PAST".
  • Production/generation produces a surface form from a structure, like converting "eat +V +PAST" to "ate".
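
The three tasks above can be contrasted with a toy lookup table. This is only a sketch: the table entries and the function names `recognize`, `parse`, and `generate` are illustrative assumptions, and a real system would use finite-state machines rather than listing every form.

```python
# Toy table of lexical form -> surface form (entries are illustrative).
LEXICAL_TO_SURFACE = {
    "eat +V +PAST": "ate",
    "eat +V": "eat",
    "cat +N +PL": "cats",
}
SURFACE_TO_LEXICAL = {
    surface: lexical for lexical, surface in LEXICAL_TO_SURFACE.items()
}


def recognize(surface):
    """Recognition: is this string in the language at all?"""
    return surface in SURFACE_TO_LEXICAL


def parse(surface):
    """Parsing: surface form -> structure (None if unknown)."""
    return SURFACE_TO_LEXICAL.get(surface)


def generate(lexical):
    """Generation: structure -> surface form (None if unknown)."""
    return LEXICAL_TO_SURFACE.get(lexical)
```

Recognition only answers yes or no, while parsing and generation are the same mapping read in opposite directions.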

Morphological Parsing

  • This process finds constituent morphemes in a word.
  • Parsing "foxes" breaks it into "fox" and "-es".
  • Examples of morphological parsing include labeling morphemes with category labels, where "cats" becomes "cat +N +PL".
  • Morphological parsing is useful in information retrieval, machine translation, and spelling checking.

Finite-State Morphological Parsing

  • It aims to find the lexical form of a word from its surface form.
  • Ambiguity is possible, where a word can have multiple lexical-level representations.

Building a Morphological Parser

  • A lexicon provides a list of stems and affixes with basic information.
  • Morphotactics models morpheme order, determining how morpheme classes can combine within a word.
  • Orthographic rules model spelling changes during morpheme combination, such as the y → ie rule that turns "city + -s" into "cities".
  • Words consist of stems and affixes; affixes may be prefixes, suffixes, circumfixes, or infixes.

Morphology and FSAs

  • FSAs are used to capture facts about morphology.
  • FSAs must accept strings in the language and reject those that are not.
  • Using an FSA avoids listing every form of every word in the language, which would be inefficient and sometimes impossible.

Simple Rules

  • Regular singular nouns are accepted as they are.
  • Regular plural nouns are the singular form followed by "-s".
  • Irregular nouns are accepted as they are.

Finite-State Morphological Parsing: Lexicon and Morphotactics

  • A Lexicon serves as a repository for words.
  • Computational lexicons comprise lists of stems and affixes, along with a representation of the morphotactics.
  • The most common way of modeling morphotactics is the finite-state automaton.

Plugging in Words

  • Replace class names such as "reg-noun" with FSAs to recognize all words within that class.
  • The resulting FSA is defined at the level of individual letters.
  • FSAs can then solve the problem of morphological recognition: determining if an input string forms a valid English word.

Notes on FSA Use

  • The FSA only confirms validity: it does not give the lexical representation and does not address orthographic issues.
  • Note: because it ignores spelling rules, such an FSA can accept a misspelled word such as "foxs".
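
A minimal sketch of such a letter-level recognizer follows. The word lists and the names `build_noun_fsa` and `accepts` are assumptions made for illustration. The morphotactics are the simple rules above: a regular noun optionally followed by the plural "s", and irregular forms accepted as they are.

```python
def build_noun_fsa(reg_nouns, irreg_sg, irreg_pl):
    """Build a letter-level FSA. Returns (transitions, accepting); state 0 is the start."""
    transitions = {}   # (state, letter) -> next state
    accepting = set()

    def add_path(start, letters):
        state = start
        for letter in letters:
            key = (state, letter)
            if key not in transitions:
                transitions[key] = len(transitions) + 1   # fresh state id
            state = transitions[key]
        return state

    for stem in reg_nouns:
        stem_end = add_path(0, stem)
        accepting.add(stem_end)                  # reg-noun alone (singular)
        accepting.add(add_path(stem_end, "s"))   # reg-noun followed by plural s
    for word in list(irreg_sg) + list(irreg_pl):
        accepting.add(add_path(0, word))         # irregular forms, as they are
    return transitions, accepting


def accepts(fsa, word):
    """Follow one transition per letter; accept if we end in an accepting state."""
    transitions, accepting = fsa
    state = 0
    for letter in word:
        state = transitions.get((state, letter))
        if state is None:
            return False
    return state in accepting


NOUN_FSA = build_noun_fsa(["fox", "cat", "dog"], ["goose", "mouse"], ["geese", "mice"])
```

With these lists, `accepts(NOUN_FSA, "cats")` and `accepts(NOUN_FSA, "geese")` hold, "gooses" is rejected, and "foxs" is accepted while "foxes" is rejected, which is exactly the orthographic gap noted above.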

Generalizations

  • Sets of related words can be built with derivational morphology.
  • ADJ + -ity → NOUN.
  • ADJ or NOUN + -ize → VERB.

Morphological Parsing with FST

  • FSTs produce "cat +N +PL" from the input "cats."
  • Two-level morphology represents a word as a correspondence between a lexical level (simple morpheme concatenation) and the surface level (actual spelling).
  • Mapping rules map letter sequences like "cats" on the surface level to sequences of morphemes and features like "cat +N +PL" on the lexical level.

Morphological Parsing with FST (Cont.)

  • The mapping between levels uses a finite-state transducer (FST).
  • A transducer maps one set of symbols to another.
  • An FST is a two-tape automaton that recognizes or generates pairs of strings, which makes it more general than an FSA.
  • An FSA defines a formal language, while an FST defines a relation between sets of strings.
  • Viewing an FST: A machine reads one string and generates another.

Finite State Transducers (FSTs)

  • Put simply, an FST is an FSA with a second tape added, which means extra symbols on its transitions.
  • One tape is used to read "cats" while the other writes "cat +N +PL".

Finite State Transducers (Cont.)

  • A finite-state transducer is defined by the tuple (Σi, Σo, Q, q0, F, Δ):
  • Σi is the input alphabet.
  • Σo is the output alphabet.
  • Q is a finite set of states.
  • q0 is the start state, where q0 ∈ Q.
  • F is the set of accepting states, where F ⊆ Q.
  • Δ is the transition relation, Δ ⊆ (Q × Σi) × (Q × Σo): from a state and an input symbol it gives the possible next states and output symbols.

Regular Relations

  • A regular language is defined as a set of strings.
  • Regular relation: a set of pairs of strings, e.g., {a:1, b:2, c:2}.
  • An example has an input alphabet Σi = {a, b, c} and an output alphabet Σo = {1, 2}.
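
As a sketch, the pairs above can be read as the arcs of a one-state transducer that loops on itself, so the relation extends from single symbols to whole strings. That reading, and the names `outputs_for` and `inputs_for`, are assumptions made for illustration.

```python
from itertools import product

# The symbol pairs from the notes, written as (input, output).
PAIRS = {("a", "1"), ("b", "2"), ("c", "2")}


def outputs_for(string):
    """All output strings related to an input string, symbol by symbol."""
    choices = [[o for i, o in PAIRS if i == ch] for ch in string]
    return {"".join(combo) for combo in product(*choices)}


def inputs_for(string):
    """All input strings related to an output string (the inverse relation)."""
    choices = [[i for i, o in PAIRS if o == ch] for ch in string]
    return {"".join(combo) for combo in product(*choices)}
```

Here `outputs_for("abc")` gives a single string, but `inputs_for("12")` gives two, because both b and c are paired with 2. This is why an FST is said to define a relation rather than a function.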

Transitions

  • c:c means read a c on one tape and write a c on the other.
  • +N:ε means read a +N symbol on one tape and write nothing on the other.
  • +PL:s means read +PL and write an s on the other.
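
The transition labels above are enough to build a tiny transducer for one word. The sketch below assumes a deterministic machine stored as a dictionary. The state numbers and the name `transduce` are illustrative, and the empty string stands in for ε.

```python
# Arcs as (state, input symbol) -> (output string, next state); "" stands for ε.
ARCS = {
    (0, "c"): ("c", 1),     # c:c
    (1, "a"): ("a", 2),     # a:a
    (2, "t"): ("t", 3),     # t:t
    (3, "+N"): ("", 4),     # +N:ε   read +N, write nothing
    (4, "+PL"): ("s", 5),   # +PL:s  read +PL, write s
}
FINAL_STATES = {5}


def transduce(symbols):
    """Read symbols from the first tape and return what is written to the second."""
    state, output = 0, []
    for symbol in symbols:
        if (state, symbol) not in ARCS:
            return None                      # no arc: the input is rejected
        written, state = ARCS[(state, symbol)]
        output.append(written)
    return "".join(output) if state in FINAL_STATES else None
```

Calling `transduce(["c", "a", "t", "+N", "+PL"])` writes "cats". An input that stops early or uses an unknown symbol returns `None`.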

Typical Uses

  • Read from one tape using the first symbol on the machine transitions.
  • Write to the second tape using the other symbols on the transitions.

Example

  • Find an FST that takes the lexical form box +N +PL and produces the intermediate form box^s#.
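
One possible answer to this exercise is sketched below, under stated assumptions: stem letters are copied without consulting a lexicon, +N writes nothing, +PL writes "^s#", and a hypothetical +SG tag writes only "#". The names `tokenize`, `arc`, and `lexical_to_intermediate` are illustrative.

```python
def tokenize(lexical):
    """'box +N +PL' -> ['b', 'o', 'x', '+N', '+PL']"""
    parts = lexical.split()
    return list(parts[0]) + parts[1:] if parts else []


def arc(state, symbol):
    """Return (output, next_state) for one arc, or None if no arc applies."""
    if state == 0 and len(symbol) == 1 and symbol.isalpha():
        return symbol, 0      # x:x      copy a stem letter
    if state == 0 and symbol == "+N":
        return "", 1          # +N:ε     write nothing
    if state == 1 and symbol == "+PL":
        return "^s#", 2       # +PL:^s#  morpheme boundary, s, word boundary
    if state == 1 and symbol == "+SG":
        return "#", 2         # +SG:#    word boundary only (assumed tag)
    return None


def lexical_to_intermediate(lexical):
    state, output = 0, []
    for symbol in tokenize(lexical):
        step = arc(state, symbol)
        if step is None:
            return None
        written, state = step
        output.append(written)
    return "".join(output) if state == 2 else None   # state 2 is the only final state
```

`lexical_to_intermediate("box +N +PL")` yields the intermediate form box^s#. Turning that into the surface spelling is left to the orthographic rules.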

Spelling Rules and FSTs

  • Consonant doubling: a single final consonant is doubled before -ing/-ed, e.g., beg/begging.
  • E deletion: a silent e is dropped before -ing and -ed, e.g., make/making.
  • E insertion: e is added after -s, -z, -x, -ch, -sh before -s, e.g., watch/watches.
  • Y replacement: -y changes to -ie before -s and to -i before -ed, e.g., try/tries.
  • K insertion: verbs ending in vowel + -c add -k before -ing/-ed, e.g., panic/panicked.
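
As an illustration, the E insertion rule can be written as a rewrite over intermediate forms such as box^s#, where ^ marks a morpheme boundary and # the end of the word. The sketch uses a regular expression as a stand-in for the rule's FST. The pattern and the name `intermediate_to_surface` are assumptions, not the transducer itself.

```python
import re

# After s, z, x, ch or sh, a morpheme boundary that precedes the final s becomes e.
E_INSERTION = re.compile(r"(s|z|x|ch|sh)\^(?=s#)")


def intermediate_to_surface(intermediate):
    """Apply E insertion, then erase the boundary symbols ^ and #."""
    with_e = E_INSERTION.sub(r"\1e", intermediate)
    return with_e.replace("^", "").replace("#", "")
```

So box^s# becomes "boxes" and watch^s# becomes "watches", while cat^s# is left as "cats" because the rule's context does not match.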

Multi-Tape Machines

  • Multi-tape machines add more tapes, using the output of one machine as the input to the next, to handle complications.
  • Intermediate tapes with intermediate symbols let them handle irregular spelling changes.
  • Generativity: nothing is privileged about the direction of a transducer.
  • One tape can be written while the other is read, or vice versa.
  • Running the machine one way is generation; running it the other way is analysis (parsing).
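
The point about direction can be sketched by running one arc table both ways. The arc list, the +SG tag, and the names `transduce` and `invert` are illustrative assumptions. The empty string stands for ε, and the search assumes the machine has no ε-cycles.

```python
def transduce(arcs, start, finals, symbols):
    """Return every output sequence the arcs allow for the given input symbols."""
    results = set()
    stack = [(start, 0, ())]              # (state, input position, output so far)
    while stack:
        state, pos, out = stack.pop()
        if pos == len(symbols) and state in finals:
            results.add(out)
        for src, sym_in, sym_out, dst in arcs:
            if src != state:
                continue
            emitted = out + ((sym_out,) if sym_out else ())
            if sym_in == "":              # ε input: move without reading
                stack.append((dst, pos, emitted))
            elif pos < len(symbols) and symbols[pos] == sym_in:
                stack.append((dst, pos + 1, emitted))
    return results


def invert(arcs):
    """Swap the two tapes: every in:out arc becomes out:in."""
    return [(src, sym_out, sym_in, dst) for src, sym_in, sym_out, dst in arcs]


ARCS = [
    (0, "c", "c", 1), (1, "a", "a", 2), (2, "t", "t", 3),
    (3, "+N", "", 4),      # +N:ε
    (4, "+PL", "s", 5),    # +PL:s
    (4, "+SG", "", 5),     # +SG:ε (assumed singular tag)
]
FINALS = {5}
```

Generation reads the lexical tape, as in `transduce(ARCS, 0, FINALS, ["c", "a", "t", "+N", "+PL"])`. Analysis reads the surface tape of the inverted machine, as in `transduce(invert(ARCS), 0, FINALS, list("cats"))`.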

Multi-Level Tape Machines

  • One machine transduces between the lexical and intermediate level.
  • Another machine handles spelling changes to the surface tape.
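
The two-machine cascade can be sketched as two composed functions. Both are plain Python stand-ins for the transducers, not FSTs themselves. The +SG tag, the simplified y replacement pattern (consonant + y before the plural s), and all function names are assumptions made for illustration.

```python
import re


def lexical_to_intermediate(lexical):
    """First machine: 'city +N +PL' -> 'city^s#' (tags become boundary symbols)."""
    stem, _category, number = lexical.split()
    suffix = {"+SG": "#", "+PL": "^s#"}[number]
    return stem + suffix


def intermediate_to_surface(intermediate):
    """Second machine: apply a spelling rule, then erase the boundary symbols."""
    # Simplified y replacement: consonant + y becomes ie before the plural s.
    rewritten = re.sub(r"(?<=[^aeiou])y\^(?=s#)", "ie", intermediate)
    return rewritten.replace("^", "").replace("#", "")


def generate(lexical):
    """Cascade: the output tape of the first machine is the input of the second."""
    return intermediate_to_surface(lexical_to_intermediate(lexical))
```

For example, `generate("city +N +PL")` passes through city^s# and ends as "cities", whereas `generate("boy +N +PL")` stays "boys" because the y follows a vowel.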
