Questions and Answers
What is the primary goal of morphological parsing?
- To identify the syntactic structure of a sentence.
- To translate a word from one language to another.
- To find the constituent morphemes within a word. (correct)
- To check the spelling of a word.
Why is morphological parsing considered a necessary step in many NLP applications?
- It simplifies the process of information retrieval.
- It enhances the accuracy of machine translation.
- It aids in effective spelling checking.
- All of the above. (correct)
What does it mean when morphological parsing provides more than one lexical level representation for a given word?
- The word is a proper noun.
- The word is misspelled.
- The parsing algorithm has failed.
- The word is ambiguous. (correct)
Which component is NOT essential for building a morphological parser?
What is the role of 'morphotactics' in morphological parsing?
What do 'orthographic rules' primarily address in the context of morphological parsing?
In the context of morphology and FSAs, what does it mean to 'accept' a string?
Why is it inefficient to list all possible forms of words in a language when using FSAs for morphology?
How are computational lexicons typically structured for use in morphological parsing?
What is the primary purpose of using Finite State Automata (FSA) in spell checking?
What is the significance of 'derivational morphology'?
What is the main capability added by Finite State Transducers (FSTs) compared to Finite State Automata (FSAs)?
Regarding morphological parsing with FSTs, what does the 'lexical level' represent?
In two-level morphology, what is the purpose of the 'surface level'?
What does it mean to say that a Finite State Transducer (FST) 'reads one string and generates another'?
In the formal definition of a finite state transducer, what does the symbol $\Sigma_i$ represent?
Within the formal definition of an FST, what does 'Q' signify?
What is a 'regular relation' in the context of Finite State Transducers (FSTs)?
In FST conventions, what does the notation 'c:$\epsilon$' on a transition typically indicate?
When using FSTs, what is the typical role of the 'first symbol' on a transition?
What is indicated by the 'other symbols' on a transition when using FSTs?
According to the materials, what is the purpose of orthographic rules in NLP?
A rule stating that '-y changes to -ie before -s' is an example of what kind of rule?
What is the role of 'intermediate tapes' in multi-tape machines?
What is one of the key purposes of multi-tape machines in morphology?
In the context of Multi-Level Tape Machines, what is the lexical level used for?
What is the function of the Intermediate level in Multi-Level Tape Machines?
Within Multi-Level Tape Machines, which level directly represents the final, displayed form of a word?
In a multi-level tape machine, what data transformation takes place between the intermediate and surface levels?
According to the provided content, what is the relationship between the lexical, intermediate, and surface levels?
Flashcards
Morphological parsing
Finding the constituent morphemes in a word.
Lexical form
A representation of a word's more abstract form.
Ambiguity in parsing
Situations where a word has multiple possible parsings.
Lexicon
A repository of words: a list of stems and affixes, together with basic information about them.
Morphotactics
A model of morpheme ordering: which classes of morphemes can follow other classes inside a word.
Orthographic rules
Spelling rules that model changes when morphemes combine (e.g., city + -s becomes cities).
Accept (in FSA context)
The FSA recognizes the input string as a word in the language.
Reject (in FSA context)
The FSA determines that the input string is not a word in the language.
Lexicon as an FSA
Modeling the lexicon and its morphotactics as a finite-state automaton.
Lexical Level (Morphology)
Represents a word as a simple concatenation of morphemes and features, e.g., cat +N +PL.
Surface Level (Morphology)
Represents the actual spelling of the word, e.g., cats.
Transducer
A machine that maps one set of symbols to another.
Finite-State Transducer (FST)
A two-tape automaton that recognizes or generates pairs of strings.
FST function
Reads one string and generates another, defining a relation between sets of strings.
Regular language
A set of strings.
Regular relation
A set of pairs of strings.
Multi-Tape Machines
Machines with additional tapes, where one machine's output tape serves as the next machine's input.
Multi-Level Tape Machines
Machines that transduce between the lexical, intermediate, and surface levels.
Study Notes
Parsing/Generation vs. Recognition
- Machines can recognize strings in a language.
- Recognition isn't always sufficient.
- Parsing assigns a structure to a string. An example is converting the surface form "ate" to the parsed form "eat +V +PAST".
- Production/generation produces a surface form from a structure, like converting "eat +V +PAST" to "ate".
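A minimal sketch of the two directions, assuming a hypothetical lookup table in place of a real analyzer (the table and function names are invented for illustration):

```python
# Hypothetical lookup table standing in for a real morphological analyzer.
PARSES = {
    "ate": "eat +V +PAST",
    "cats": "cat +N +PL",
}

def parse(surface: str) -> str:
    """Parsing: map a surface form to its structured analysis."""
    return PARSES[surface]

def generate(analysis: str) -> str:
    """Generation: map a structured analysis back to a surface form."""
    inverse = {v: k for k, v in PARSES.items()}
    return inverse[analysis]

print(parse("ate"))              # eat +V +PAST
print(generate("eat +V +PAST"))  # ate
```

Recognition, by contrast, would only check membership (is "ate" a word?), without producing any structure.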
Morphological Parsing
- This process finds constituent morphemes in a word.
- Recognizing "foxes" involves breaking it into "fox" and "-es".
- Examples of morphological parsing include labeling morphemes with category labels, where "cats" becomes "cat +N +PL".
- Morphological parsing is useful in information retrieval, machine translation, and spelling checking.
Finite-State Morphological Parsing
- It aims to find the lexical form of a word from its surface form.
- Ambiguity is possible, where a word can have multiple lexical-level representations.
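As a concrete illustration of ambiguity (the word and its analyses below are a standard textbook-style example, not taken from these notes), an ambiguous surface form simply maps to more than one lexical-level representation:

```python
# Hypothetical analyzer output: 'books' can be the plural noun
# 'book +N +PL' or the 3rd-person-singular verb 'book +V +3SG'.
ANALYSES = {
    "books": ["book +N +PL", "book +V +3SG"],
    "cats":  ["cat +N +PL"],
}

def parse_all(surface: str) -> list[str]:
    """Return every lexical-level representation of the surface form."""
    return ANALYSES.get(surface, [])

print(parse_all("books"))  # ['book +N +PL', 'book +V +3SG'] -- ambiguous
print(parse_all("cats"))   # ['cat +N +PL']
```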
Building a Morphological Parser
- A lexicon provides a list of stems and affixes with basic information.
- Morphotactics models morpheme order, determining how morpheme classes can combine within a word.
- Orthographic rules model spelling changes during morpheme combination, such as changing "city + -s" to "cities" (the y → ie rule).
- Words consist of stems and affixes; affixes may be prefixes, suffixes, circumfixes, or infixes.
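The three components can be sketched directly in code. This is an illustrative toy, not the parser from the notes; the data-structure names and the tiny lexicon are invented for the example:

```python
# Toy lexicon: stems and affixes with basic category information.
LEXICON = {
    "stems": {"cat": "N", "city": "N"},
    "affixes": {"-s": "+PL"},
}

# Toy morphotactics: a noun stem may be followed by plural -s or by nothing.
MORPHOTACTICS = {"N": ["-s", ""]}

def apply_orthography(stem: str, affix: str) -> str:
    """Toy orthographic rule: y -> ie before -s (city + -s -> cities)."""
    if affix == "-s" and stem.endswith("y"):
        return stem[:-1] + "ies"
    return stem + affix.lstrip("-")

# Enumerate the surface forms licensed by the three components.
for stem, cat in LEXICON["stems"].items():
    for affix in MORPHOTACTICS[cat]:
        print(apply_orthography(stem, affix))   # cats, cat, cities, city
```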
Morphology and FSAs
- FSAs are used to capture facts about morphology.
- FSAs must accept strings in the language and reject those that are not.
- This avoids listing every form of every word in the language, which would be inefficient and sometimes impossible.
Simple Rules
- Regular singular nouns are accepted as they are.
- Regular plural nouns usually have an "-s" at the end.
- Irregular nouns are accepted as they are.
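These simple rules can be encoded as a small class-level FSA. The state names, class labels, and transition table below are an illustrative reconstruction, not the exact automaton from the notes:

```python
# Class-level FSA for English noun inflection (toy version).
# States: q0 (start), q1 (after a regular noun stem), q2 (accepting).
TRANSITIONS = {
    ("q0", "reg-noun"):      "q1",
    ("q0", "irreg-sg-noun"): "q2",
    ("q0", "irreg-pl-noun"): "q2",
    ("q1", "plural-s"):      "q2",
}
ACCEPTING = {"q1", "q2"}   # q1 accepts a bare regular noun

def accepts(classes: list[str]) -> bool:
    """Run the FSA over a sequence of morpheme-class labels."""
    state = "q0"
    for c in classes:
        state = TRANSITIONS.get((state, c))
        if state is None:
            return False   # no transition: reject
    return state in ACCEPTING

print(accepts(["reg-noun", "plural-s"]))       # True  ("cats")
print(accepts(["irreg-pl-noun"]))              # True  ("geese")
print(accepts(["irreg-sg-noun", "plural-s"]))  # False ("gooses")
```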
Finite-State Morphological Parsing: Lexicon and Morphotactics
- A Lexicon serves as a repository for words.
- Computational lexicons comprise lists of stems and affixes, along with a representation of the morphotactics.
- The most common way of modeling morphotactics is the finite-state automaton.
Plugging in Words
- Replace class names such as "reg-noun" with FSAs to recognize all words within that class.
- The resulting FSA operates at the level of individual letters.
- FSAs can then solve the problem of morphological recognition: determining if an input string forms a valid English word.
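Expanding the class arcs into letter-level sub-automata can be approximated with a trie built from word lists. The word lists and helper names below are invented for this sketch:

```python
# Build a letter-level recognizer (a trie) from toy word lists,
# standing in for the sub-FSAs that replace class arcs like "reg-noun".
REG_NOUNS = {"cat", "fox"}
IRREG_NOUNS = {"goose", "geese"}

def build_trie(words):
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node["#"] = True          # end-of-word marker (an accepting state)
    return root

TRIE = build_trie(REG_NOUNS | IRREG_NOUNS | {w + "s" for w in REG_NOUNS})

def recognize(word: str) -> bool:
    """Morphological recognition: is the string a word of the language?"""
    node = TRIE
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    return "#" in node

print(recognize("cats"))  # True
print(recognize("foxs"))  # True -- wrong; no orthographic rules yet (see note below)
```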
Notes on FSA Use
- The FSA only confirms validity without giving the lexical representation, and also doesn't address orthographic issues.
- Note: without orthographic rules, an FSA like this can accept an incorrect form such as "foxs".
Generalizations
- Sets of related words can be built with derivational morphology.
- ADJ + -ity → NOUN.
- ADJ or NOUN + -ize → VERB.
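A sketch of how such derivational patterns might be represented; the rule table and examples are invented for illustration and ignore spelling adjustments:

```python
# Toy derivational rules: which categories a suffix attaches to,
# and the category of the result.
DERIVATION = {
    "-ity": ({"ADJ"}, "NOUN"),
    "-ize": ({"ADJ", "NOUN"}, "VERB"),
}

def derive(stem: str, cat: str, suffix: str):
    """Return (new_word, new_category) if the rule applies, else None."""
    inputs, output = DERIVATION[suffix]
    if cat in inputs:
        return stem + suffix.lstrip("-"), output
    return None

print(derive("formal", "ADJ", "-ity"))      # ('formality', 'NOUN')
print(derive("modern", "ADJ", "-ize"))      # ('modernize', 'VERB')
print(derive("formality", "NOUN", "-ity"))  # None (NOUN + -ity is not licensed)
```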
Morphological Parsing with FST
- FSTs produce "cat +N +PL" from the input "cats."
- Two-level morphology represents a word as a correspondence between a lexical level (simple morpheme concatenation) and the surface level (actual spelling).
- Mapping rules map letter sequences such as "cats" at the surface level to morpheme-and-feature sequences such as "cat +N +PL" at the lexical level.
Morphological Parsing with FST (Cont.)
- The mapping between levels uses a finite-state transducer (FST).
- A transducer maps one set of symbols to another.
- An FST is a two-tape automaton that recognizes or generates pairs of strings with a more general function than an FSA.
- An FSA defines a formal language, while an FST defines a relation between sets of strings.
- One way to view an FST: a machine that reads one string and generates another.
Finite State Transducers (FSTs)
- A simple way to think of it: an FST adds another tape, i.e., extra (output) symbols on the transitions.
- One tape is used to read "cats" while the other writes "cat +N +PL".
Finite State Transducers (Cont.)
- A finite-state transducer is defined by the tuple (Σi, Σo, Q, q0, F, Δ):
- Σi is the input alphabet.
- Σo is the output alphabet.
- Q is a finite set of states.
- q0 is the start state, where q0 ∈ Q.
- F is the set of accepting states, where F ⊆ Q.
- Δ is the transition relation, Δ : Q × Σi → Q × Σo (in a given state, reading an input symbol yields a next state and an output symbol).
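The tuple can be mirrored directly as a data structure; a minimal sketch with invented field names and a toy transducer:

```python
from typing import NamedTuple

class FST(NamedTuple):
    sigma_i: set[str]                      # input alphabet  (Σi)
    sigma_o: set[str]                      # output alphabet (Σo)
    states: set[str]                       # Q, a finite set of states
    start: str                             # q0 ∈ Q
    accepting: set[str]                    # F ⊆ Q
    delta: set[tuple[str, str, str, str]]  # (state, input, next_state, output) ∈ Δ

# A tiny one-state transducer that copies 'a' and rewrites 'b' as 'c'.
toy = FST(
    sigma_i={"a", "b"},
    sigma_o={"a", "c"},
    states={"q0"},
    start="q0",
    accepting={"q0"},
    delta={("q0", "a", "q0", "a"), ("q0", "b", "q0", "c")},
)
print(toy.start, len(toy.delta))  # q0 2
```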
Regular Relations
- A regular language is defined as a set of strings.
- Regular relation: a set of pairs of strings, e.g., {a:1, b:2, c:2}.
- An example has an input alphabet Σi = {a, b, c} and an output alphabet Σo = {1, 2}.
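Written out directly, the example relation is literally a set of string pairs, in contrast to a regular language, which is a set of strings:

```python
# The regular relation from the example: Σi = {a, b, c}, Σo = {1, 2}.
SIGMA_I = {"a", "b", "c"}
SIGMA_O = {"1", "2"}
RELATION = {("a", "1"), ("b", "2"), ("c", "2")}   # set of pairs of strings

# A regular language, by contrast, is just a set of strings.
LANGUAGE = {"a", "b", "c"}
```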
Transitions
- c:c means read a c on one tape and write a c on the other.
- +N:ε means read a +N symbol on one tape and write nothing (ε) on the other.
- +PL:s means read +PL and write an s on the other.
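Putting these transition conventions together gives a small transducer for the "cats" example; the tokenization of the lexical string and the function name are illustrative assumptions:

```python
# Transition outputs for the lexical -> surface direction of the 'cats' example.
# Ordinary letters are copied (c:c); '+N' writes nothing (+N:ε); '+PL' writes 's'.
OUTPUT = {"+N": "", "+PL": "s"}

def transduce(lexical_symbols):
    """Map a sequence of lexical symbols to the surface string."""
    return "".join(OUTPUT.get(sym, sym) for sym in lexical_symbols)

print(transduce(["c", "a", "t", "+N", "+PL"]))  # cats
```

Run in the other direction, the same symbol pairs read "cats" and emit "cat +N +PL".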
Typical Uses
- Read from one tape using the first (input) symbol on each transition.
- Write to the second tape using the remaining (output) symbols on the transitions.
Example
- Find an FST that takes the lexical form "box +N +PL" and produces the intermediate form "box^s#".
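A worked sketch of that example, where "^" marks a morpheme boundary and "#" a word boundary; the symbol-by-symbol output assignment below is one reasonable reconstruction, not the exact transducer from the notes:

```python
# Lexical -> intermediate outputs: letters copy, '+N' maps to nothing,
# '+PL' maps to '^s#' ('^' = morpheme boundary, '#' = word boundary).
OUTPUT = {"+N": "", "+PL": "^s#"}

def to_intermediate(lexical_symbols):
    """Map lexical symbols to the intermediate-level string."""
    return "".join(OUTPUT.get(sym, sym) for sym in lexical_symbols)

print(to_intermediate(["b", "o", "x", "+N", "+PL"]))  # box^s#
```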
Spelling Rules and FSTs
- Consonant doubling: a single final consonant is doubled before -ing/-ed, e.g., beg/begging.
- E deletion: a silent e is deleted before -ing and -ed, e.g., make/making.
- E insertion: e is inserted after -s, -z, -x, -ch, -sh before the suffix -s, e.g., watch/watches.
- Y replacement: -y changes to -ie before -s, and to -i before -ed, e.g., try/tries.
- K insertion: verbs ending in vowel + -c add a -k, e.g., panic/panicked.
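These rules are normally compiled into FSTs; as a plain-Python stand-in, here is a sketch of two of them (e-insertion and y-replacement) applied to intermediate forms, with the regexes invented for the example and no claim of full coverage:

```python
import re

def apply_spelling_rules(intermediate: str) -> str:
    """Toy stand-in for the spelling-rule transducers."""
    s = intermediate
    # E insertion: add e after s, z, x, ch, sh before the -s suffix (fox^s# -> foxes).
    s = re.sub(r"(s|z|x|ch|sh)\^s#", r"\1es", s)
    # Y replacement: -y changes to -ie before -s (city^s# -> cities).
    s = re.sub(r"y\^s#", r"ies", s)
    # Default: just delete the boundary symbols.
    return s.replace("^", "").replace("#", "")

print(apply_spelling_rules("fox^s#"))   # foxes
print(apply_spelling_rules("city^s#"))  # cities
print(apply_spelling_rules("cat^s#"))   # cats
```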
Multi-Tape Machines
- They add more tapes, using one machine's output tape as the next machine's input, to handle complications.
- They also handle irregular spelling changes via intermediate tapes with intermediate symbols.
- Generativity: nothing is privileged about the directions; the machine can write to one tape and read from the other, or vice versa.
- Generation runs one way; analysis (parsing) runs the other.
Multi-Level Tape Machines
- One machine transduces between the lexical and intermediate levels.
- Another machine handles the spelling changes, transducing from the intermediate level to the surface tape.
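Conceptually, the cascade composes the two transduction steps sketched in the earlier examples; the function names and the symbol choices below are illustrative assumptions:

```python
import re

# Step 1: lexical level -> intermediate level
# (morpheme/feature symbols become letters plus boundary symbols).
def lexical_to_intermediate(symbols):
    out = {"+N": "", "+PL": "^s#"}
    return "".join(out.get(s, s) for s in symbols)

# Step 2: intermediate level -> surface level
# (apply spelling rules, then drop the boundary symbols).
def intermediate_to_surface(s):
    s = re.sub(r"(s|z|x|ch|sh)\^s#", r"\1es", s)
    s = re.sub(r"y\^s#", r"ies", s)
    return s.replace("^", "").replace("#", "")

# Cascading the two machines: generation runs lexical -> surface.
lexical = ["f", "o", "x", "+N", "+PL"]
intermediate = lexical_to_intermediate(lexical)   # 'fox^s#'
surface = intermediate_to_surface(intermediate)   # 'foxes'
print(intermediate, "->", surface)
```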