Tolkien Schema Modeling Deck Prototype PDF

Summary

This document provides an introduction to meaning schema modeling. It describes the workflow and components for creating a meaning schema, along with some useful tools. Several topics and processes related to creating and implementing meaning schemas are discussed, such as generating KScorer signals and modeling queries to evaluate the schema's components and functionality.

Full Transcript


Meaning schema modeling: An introduction to the workflow
SLS-Tolkien@, April 2022, go/sls-modeling-deck

Deck Overview & Modeling Process Breakdown
- What is a Meaning Schema
- MS modeling workflow: Process overview; Checking pre-existing schema coverage; Modeling a schema; Using the Model tool; Slot configuration; Scoping collections and overlapping entities; Adding/Expanding grammar; Showcase examples; Intent Lab Companion; Results-in-OSRP (RiO); MRF and RSE
- Running Tests: Generate KScorer signals; Run KE/SxS/QU evals; Run a demo
- Rating diffs and GAP analysis
- Review and submission process
- Appendix: Useful tools; Additional tools; Resources

What is a Meaning Schema (MS)

A meaning schema is a semantic structure meant to express a particular concept in a way that emulates the human understanding of language. Meaning schemas allow Google to derive meaning, and therefore user intention, from an utterance. A more precise ability to derive meaning means much higher confidence in the quality of the answer (not necessarily fulfillment). Meaning schemas used to be called intents (and often still are).

The main components of a MS are:
- Slot(s)
- Proto (Grammar) examples
- Signals
- Fulfillment

What is Meaning Schema modeling?

Meaning schema modeling is the process by which we create a new meaning schema or improve an existing one, ensuring that it complies with a set of unified modeling principles. Given schemas' stated function as "semantic building blocks", it is fundamental that we maximise the composability of the entire Meaning Catalog, in order to allow for the successful interpretation of increasingly complex queries.

What is Grammar?

Grammar refers to the grouping of utterances added to a schema, for instance examples of queries. Grammar helps train machine learning models to learn when a query or utterance should trigger a schema.

MS modeling workflow: Process overview

1. Create New Workspace: create a dedicated workspace.
2. (Re)Model the Schema: configure slots, add descriptions and curate the most suitable collections for the MS.
3. Create/Expand Grammar: define the most pragmatic patterns for the schema, plus induction examples.
4. Run Evals: create KScorer signals (if needed) and calculate GAP via Evals.
5. Submit the Workspace: send the workspace for review, get LGTM, submit the workspace.
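To make the components listed under "What is a Meaning Schema (MS)" more concrete, here is a small, purely illustrative Python sketch. It is not the real internal representation: the class names (MeaningSchema, Slot, QueryExample), the field names, and the /collection/films collection are invented for illustration; only the schema name CastOfVisualNarrativeWork and the example query come from this deck.

```python
# Hypothetical, simplified sketch of the pieces that make up a meaning schema.
# Class and field names are invented here purely to illustrate the concepts
# described above; this is NOT the real proto definition.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Slot:
    """A named argument of the schema, scoped to one or more KG collections."""
    name: str
    description: str
    collections: List[str] = field(default_factory=list)


@dataclass
class QueryExample:
    """A grammar example: an utterance plus the slot values it annotates."""
    text: str
    slot_values: Dict[str, str]
    data_tags: List[str] = field(default_factory=lambda: ["induction"])


@dataclass
class MeaningSchema:
    """Main components of a MS: slots, grammar examples, signals, fulfillment."""
    name: str
    slots: List[Slot]
    grammar_examples: List[QueryExample]
    has_kscorer_signals: bool = False  # whether KScorer signals exist for it
    has_fulfillment: bool = False      # meaning can be derived without fulfillment


# Example: a toy schema for "cast of <film>" queries.
cast_schema = MeaningSchema(
    name="CastOfVisualNarrativeWork",
    slots=[Slot("work", "The film or show whose cast is requested",
                ["/collection/films"])],  # collection name is an assumption
    grammar_examples=[QueryExample("the dark knight cast",
                                   {"work": "The Dark Knight"})],
)
print(cast_schema.name, [s.name for s in cast_schema.slots])
```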
Checking pre-existing schema coverage

We can use Query Debugger to check whether there is a schema that already covers the entities and grammar we are planning to use. For example, if we intend to have a schema focused on the cast of a film, we could try the following query in QDB: "The Dark Knight cast". By doing this, we can see that there is already a schema named CastOfVisualNarrativeWork which answers our query and, therefore, there is no need to create a new one. Be sure to try out multiple pragmatic queries covering your future schema's patterns, to ensure that you are not creating a duplicate schema. However, if you do end up creating a new one:

Create New Workspace

- Click 'New Workspace'.
- Name the workspace appropriately.
- Add a quick description including the MS being modeled and the pipeline it came from (useful template).
- Add any relevant editors (TLs should be added by default), and add anyone relevant to CC (this can be done later).
- Add the bug number(s) from Buganizer.
- Click 'Save Workspace' to create the workspace.

Modeling a schema

Start typing the name of the schema you want to model. The list will update with the most relevant options. If the MS already exists in the catalog, it will show up below. If it doesn't exist, you can click on + Create "schema". There are rules that should be followed when choosing the name for a new schema; they can be found in go/meaning-style.

Using the Model tool

- Pick the Management area(s).
- Add any further appropriate Tags. Tags are used to provide metadata about the schema. They should be mentioned on your bug with your task guidelines.
- Add a description according to the current guidelines.
- Tick the "Enabled for Loose Parser" box. The Loose Parser parses all types of queries; it combines annotations to produce structured interpretations.
- Tick the "Enable answerless voting" box only if specifically requested. Enabling answerless voting allows the MS to participate in the voting stage as if it had a valid answer for the query. In general, a schema should not be enabled for answerless voting; if it is, the author should provide concrete reasons why it is needed and evals (e.g. QU Eval and/or KE SxS) showing the benefits. All schemas that have answerless voting enabled MUST include a schema-level note with the tag answerless_voting and a justification of why; other schemas enabling answerless voting must provide a QU Eval with positive VGUP@1 or another strong justification. Please consult this doc for more detailed guidance.

Slot configuration

Name and description of the slot(s): to decide the scope of the slots of the schema, refer to the following g3doc. As a general guideline, the slot name should describe the relationship between the slot and the schema itself. For more information on how to name the slots and fill in the descriptions, refer to the existing guidelines.

Other value types (horizontal grammar)

Horizontal grammars are cross-domain grammars for recognizing phrases referring to people, times, places, numbers, currencies, and so on. Generally, a small number of meaning schemas are associated with these horizontal subgrammars; for example, Time is associated with the DateTime subgrammar, Person with Person, and so on.
These connections are made once, at the meaning schema level (via Self). Because these horizontal grammars are a limited set, there won't be a frequent need to assign them when modeling schemas. There is one common use case: specifying that a meaning schema's Self is "string type".

Adding collections or MIDs: Defining collections

Defining the correct set of collections for your schema starts with defining the most pragmatic query for it. Ask yourself what user query should trigger your schema and come up with a potential query; e.g. "healthcare professions" is a pragmatic query for the schema FieldOfStudyProfessions. Navigate to Hume and search for "healthcare". Choose the most relevant result representing "healthcare" from your query and go to its "/m13n/collections" section. That way you can find all collections that contain the MID for "healthcare". Select the collections you find relevant for your MS and, if needed, repeat these steps with some other queries.

Scoping collections and overlapping entities

Using Scopy to find relevant collections: Scopy lets you easily find out which entities are associated with specific grammar patterns, and it shows the traffic for each collection. For example, if we wanted to see which entities are most commonly associated with the grammar "songs", we could run that pattern in Scopy. This results in a lot of musical artists being shown, so we can narrow down the scope: by defining a slot, we can make it so that only entities from specific collections are shown in combination with the provided grammar. We can therefore limit the entities shown to /collection/musical_artists by adding a slot named "Entity" with that collection inside it.

Checking top traffic: the Top Traffic tool gives a better idea of what collections and grammar to use. For this specific example, we will use a fictional schema which has an "$Entity professions" induction pattern. Top Traffic can give us a hint about what kind of entities, and which respective collections, would be a good fit for the schema.

Finding overlapping entities for collections in Hume: we already know that the same MIDs (entities) may be part of more than one collection; thus many KG collections overlap, that is, share the same entities. Hume can show entities featured in more than one collection. If we want to fetch the exact list of MIDs shared between collections, we can use the Hume search page. When using a collection as a constraint, it has to be specified as /m13n/collections/collection_hrid in the Hume query interface. For example, to find the overlapping entities between /collection/actors and /collection/singers, go to https://hume.google.com/graph/search and set two similar constraints on the right. The output will be a list of all entities that are both actors and singers. If you instead change the "==" sign to "!=" for one of the collections, you will get a list of the entities that are only actors, or only singers.
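The "==" versus "!=" constraint behaviour described above amounts to set intersection versus set difference. The short Python sketch below uses made-up MIDs and does not query Hume; it only illustrates what each constraint combination returns.

```python
# Illustrative only: toy MID sets standing in for two KG collections.
# (These MIDs are invented; in practice the lists come from Hume's graph search
# with constraints of the form /m13n/collections/<collection_hrid>.)
actors = {"/m/0001", "/m/0002", "/m/0003"}    # /collection/actors
singers = {"/m/0002", "/m/0003", "/m/0004"}   # /collection/singers

# Both constraints set with "==": entities that are in BOTH collections.
both = actors & singers           # {'/m/0002', '/m/0003'}

# One constraint flipped to "!=": entities in one collection but not the other.
only_actors = actors - singers    # {'/m/0001'}
only_singers = singers - actors   # {'/m/0004'}

print(sorted(both), sorted(only_actors), sorted(only_singers))
```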
Adding/Expanding grammar

Expand Tool

The Expand tool is the main tool used to add patterns matching your schema in a Search stack. The tool allows you to mark patterns as good or bad, or to fetch further patterns predicted to be good or bad based on the ones already added. It is important to have defined the slots for your Meaning Schema by the time you get to the Expand tool. Any patterns you add here will receive the "induction" data tag, meaning that they will be used in inducing and training the grammar that will trigger on a Search stack.

- Go to the Expand tool and navigate to your Meaning Schema. Make sure the filters are set to the correct workspace.
- Optional: if you added query examples in the Model tool, be sure to import them using the prompt at the top of the page.
- Add a new pattern by clicking on "+ Examples". You will have to provide an example of what the slot could resolve to. Separate patterns by line in order to add multiple patterns at once.
- Add a couple more initial patterns and mark them as good. Marking a pattern as good means it will trigger your Meaning Schema on a Search stack; marking a pattern as bad will blocklist the pattern.
- Use the "Fetch and Rank" functionality. This will suggest patterns based on what you have already marked as good; predicted patterns will be light green. Prediction is currently based on Loose Parsing. If something is predicted in the Expand tool, it will also trigger your Meaning Schema in production, so you do not need to explicitly mark predicted patterns as good. You can continue to "Fetch and Rank" as many times as you wish; it will keep using the information from queries you have thumbed up or thumbed down to find related queries.

Please note: for editing patterns or changing slot annotations we use the Query Examples tool.

Query Examples

We use the Query Examples (il/queries) tool to view, add, edit or delete query examples.

Thumbing: helps you evaluate whether a query is in scope (mark it as good) or out of scope (mark it as bad). If a pattern is a positive query pattern, choose "Thumbs up"; if it is a negative query pattern, choose "Thumbs down". Be purposeful when adding negative examples: they are not required, so you should have a specific reason in mind when adding them.

Autoannotate: click "Autoannotate" if you want the Intent Lab to guess the correct slots, or annotate the slots manually. After you add the example, click on the word(s) that represent a slot's value, then choose the slot name from the drop-down.

Data tags: induction, training, pattern, rse_edit.
- induction is the most common tag, as LP generates parsing models from "induction" examples.
- training is given to all examples used for building Orbit & KBST models.
- pattern is required when the utterance includes slot patterns rather than natural words.
- rse_edit is applied to examples that use the Rich Semantics Editor, most commonly compositional examples. These have annotations that should be reviewed/spot-checked manually.
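As a rough mental model of what slot annotation in Query Examples produces, here is a hedged Python sketch. The dictionary layout, the span representation, and the "work" slot name are assumptions made for illustration; this is not the tool's actual data model.

```python
# Hypothetical sketch of a slot-annotated query example (not the real data
# model). It mirrors the ideas above: which span of the utterance fills which
# slot, and which data tags the example carries.
example = {
    "text": "the dark knight cast",
    "annotations": [
        # slot name -> (start, end) character span inside the text
        {"slot": "work", "span": (0, 15)},
    ],
    # induction: used by Loose Parsing to induce grammar
    # training:  used for building Orbit & KBST models
    # pattern:   utterance contains slot patterns rather than natural words
    # rse_edit:  annotated via the Rich Semantics Editor (compositional cases)
    "data_tags": ["induction", "training"],
}

def slot_values(ex):
    """Return the surface string each annotated slot resolves to."""
    return {a["slot"]: ex["text"][a["span"][0]:a["span"][1]]
            for a in ex["annotations"]}

print(slot_values(example))  # {'work': 'the dark knight'}
```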
Differences between the Expand Tool and Query Examples

Expand Tool:
- When adding an unmarked pattern, it automatically predicts whether any of the already thumbed-up induction patterns cover the new pattern, by turning it light green.
- It issues a warning when adding a duplicate pattern.
- "Fetch and rank" can be used to generate more potential patterns automatically.
- It cannot edit or delete patterns.

Query Examples:
- Can edit, delete and add any kind of pattern.
- Can edit patterns via the Rich Semantics Editor for MRF.
- Can change tags and annotations.

Showcase examples

Showcase examples are a unique type of example that serve the following purposes:
- helping to clarify the scope of the schema, via the use of in-scope (positive) and out-of-scope (negative) examples;
- exemplifying prototypical query patterns and slot combinations for a schema;
- illustrating how each slot should be annotated;
- providing referential queries for NLU training and evaluation data development.
To add a showcase example, you can either create a brand-new example with that specific tag via the Query Examples tool, or add the tag to an already existing example by typing the tag name.

Intent Lab Companion

The Intent Lab Companion is a tool that flags potential problems within the workspace you are working on, or issues that were already present in the schema currently being modified. It can be accessed within Intent Lab by clicking on the icon on the right side of the screen.

MRF and RSE

We need Meaning Representation (MRF) to de-couple natural language expressions from feature code. You don't want to be examining the raw string of a user query or utterance deep in your feature code; you want to be looking at a semantic abstraction of the string. Meaning representation provides exactly this: it is an indirect vocabulary for modeling the query or utterance in a way that is independent of feature or product requirements.

Definitions: the Meaning Representation Formalism (MRF) is an intermediate language between natural language and database queries, C++ code or machine code. The Rich Semantics Editor (RSE) is a tool integrated into Query Examples, designed for users with query modeling experience to build complex semantic representations (MRF trees, aka IQL trees) for a query. How to use it: Rich Semantics Editor Guide, Meaning Representation Guide.

Composition

Composition is a modeling technique which allows you to use independent schemas to cover complex user queries. There are two kinds of composition you can use in meaning representation (see the illustrative sketch below):
- Nesting schemas (using a complex semantic structure as an argument within another semantic structure).
- Operators (combining semantic structures using one of a small number of logical operations).
More details here: https://g3doc.corp.google.com/nlp/meaning/g3doc/composition.md?cl=head
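The sketch below illustrates the two kinds of composition with toy, MRF-like trees built from nested Python dictionaries. The tree layout and the schema names FilmsByDirector and Collection (as well as the MID used) are invented for illustration; Intersect and CastOfVisualNarrativeWork are names mentioned in this deck, but the real MRF serialization looks different.

```python
# Purely illustrative, simplified "MRF-like" trees built from nested dicts.
# Only the two composition mechanisms named above are shown.

# (1) Nesting: a complex semantic structure used as an argument (slot value)
# of another semantic structure.
nested = {
    "schema": "CastOfVisualNarrativeWork",
    "args": {
        # the slot value is itself a full interpretation, not just an entity
        "work": {
            "schema": "FilmsByDirector",       # hypothetical schema name
            "args": {"director": "/m/0123x"},  # hypothetical MID
        },
    },
}

# (2) Operators: combining semantic structures with a small set of logical
# operations (Intersect, RelatedTo, Sort, HorizontalCounting, DateRestrict, ...).
intersected = {
    "schema": "Intersect",
    "args": {
        "inputs": [
            {"schema": "Collection", "args": {"id": "/collection/actors"}},
            {"schema": "Collection", "args": {"id": "/collection/singers"}},
        ],
    },
}

def pretty(tree, depth=0):
    """Print the tree so the nesting of schemas is easy to see."""
    print("  " * depth + tree["schema"])
    for value in tree["args"].values():
        children = value if isinstance(value, list) else [value]
        for child in children:
            if isinstance(child, dict) and "schema" in child:
                pretty(child, depth + 1)

pretty(nested)
pretty(intersected)
```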
Nesting schemas

To allow a schema to be composable (nested) with others, you should modify its answer_type slot. There are two types of nesting:
- association with a collection;
- by direct call.
Adding a collection or a schema name in a regular slot of another schema will enable composition. More details here: https://scholar.harvard.edu/files/yxiang/files/ho10-comp-sem.pdf

Common Operators

Operators are meaning schemas that are "special" in two ways: they have special logic throughout the stack (parsing, fulfillment, typechecking, etc.), and they are designed to be reusable, i.e. they apply to a broader class of queries than normal schemas. The most common operators are:
- Intersect and RelatedTo (intersection of related entities; they usually work together)
- Sort (sorting a list by an attribute)
- HorizontalCounting (counting equivalent entities)
- DateRestrict (working with a specific time range)
More info: go/mrf-operators. See also: how to access the RSE tool, and how to build an MRF tree (NOTE: access to the video is restricted to sls-tolkien members).

Running Tests

Generate KScorer signals

KScorer (Knowledge Scorer) is used to score, suppress and rank interpretations. Depending on the task, you have three options: generate KScorer signals, use already existing ones, or do not generate KScorer signals for the MS. For more guidance regarding signal generation, please refer to go/sls-signals.

For generating KScorer signals (as well as running evals, demos, etc.) we use JobRunner. If you are working on a schema that already has signals, or you are regenerating signals after making changes to your new schema, the "Regenerate training examples" box should be ticked to ensure that the training examples are a good fit for the current modeling of the schema. Please note: before running any job from JobRunner, always sync your workspace.

If you need to generate KScorer signals, choose the schema you need to generate signals for. If you need to generate signals for schemas that are being worked on, tick the ones you need by scrolling down the options. If KScorer is not required for the MS you are working on, simply untick the Intent Signals box. DO NOT forget to sync before starting the job.

You will be notified via email of the success or failure of the signal generation. It can fail for two different reasons:
- It fails immediately if fewer than 100 training examples were generated for any MS; you will receive an email about this.
- It had enough training examples but failed for another reason; the email should contain more information about this error.

Run evals

KE_Eval is the primary tool for evaluating the overall impact of changes to KE Search features. You must run KE_Eval for every change you make in KE. When re-running evals after making changes, be sure to do it via JobRunner instead of re-running old workflows.

KE_SxS is a secondary tool for evaluating the impact of changes. KE_SxS is similar to KE_Eval, but runs targeted analyses instead of a single comprehensive analysis. When running a SxS, be sure to do it without fallback. You can change the locale if needed. In the preset dropdown, choose the preset you need; in most cases it will be "Search KE SxS (UIGE)". DO NOT forget to sync before starting the job.
QU_Eval provides a direct measure of query understanding quality at the interpretation generation and ranking level, before resolution, fulfillment, or UI. QU Evals are well suited for evaluating:
- interpretation generation or ranking changes;
- interpretations that don't have fulfillments or features yet, especially as a supplement to regular feature-based evals.
DO NOT forget to sync before starting the job.

Run a demo

A demo helps to evaluate how the implemented changes are working. This is especially important when evals return 0 diffs and we need to showcase the impact of our changes. A demo is usually started from JobRunner:
- Navigate to JobRunner.
- Choose the same experiment (by clicking on the pencil icon).
- Tick the Demo box.
- Choose the KE SxS (UIGE) preset.
- Press 'Run'.

Another way to start a demo is through your Flower workspace: go to your Flower experiment and press 'Restore demo'. Or you can start a new demo: open your Flower experiment, press 'Start another' from your SxS eval, rename it to demo, change 'Analysis_type' to demo, and run the experiment. DO NOT forget to sync before starting the job.

Additional Flower functionalities to run evals

Other useful functionalities provided by Flower are the ability to:
- schedule the start date of an eval;
- start a new eval from an existing workflow in a separate workspace;
- add additional owners to a Flower workspace. For our team the relevant group is mdb/sls-tolkien. When adding a new owner to a workspace, use the syntax user/LDAP for individual people and mdb/GROUP for groups.

Rating Diffs and GAP Analysis

After your evals are done, you need to rate the results. More often than not, your KE Eval will return 0 diffs; whenever this is not the case, the diffs that correspond to the random set should be rated for the locale(s) you are working with. To find them, go to your Flower workspace, find the SxS, click on 'Product eval report' and open the link for the SxS.

For rating we use Query Debugger, a tool to view and debug Google Search Results Pages (SRP) and Google Assistant responses. It supports both side-by-side (SxS) comparison and single-page debugging. Query Debugger is used for:
- viewing and debugging the results of KE_Eval (SxS diffs);
- live debugging of queries and comparing results from different (incl. custom) search stacks;
- collecting user ratings for each of the diffs.
After viewing and debugging the results of the SxS diffs, go to the EAS link of your SxS Flower workflow and look for the SLS GAP Analysis tab. It will generate and store your GAP automatically.

Review and submission process

Send your workspace for review

Before sending your WS for review, please check that the WS description and links are correctly updated. A reviewer from the relevant team is selected automatically; this applies to external reviews only. For the internal review, you can add mdb/sls-tolkien as a reviewer group or type the LDAP of a reviewer manually. Some projects have specific internal review procedures, which should be followed instead of these general guidelines. Click 'Request review' to send the WS for review. One (or two) internal reviews and one external review are usually needed before a WS can be submitted.

Review process

First you get an LGTM from an internal reviewer, depending on your task. After that, you will need to send the workspace to be approved by the person who owns the Management Area for the MS you are working on. Relevant ENG owner groups should be provided automatically for you.
(Optional) You can tick the 'Reviewer can submit' box if you might be unable to submit the WS yourself (e.g. due to PTO), to ensure it is submitted within the deadline.

Sending to review: click 'Request review' at the top left of your WS page, select the approvers' list from the menu and click 'Request review'. This step will automatically assign a reviewer to your WS.

Submit and check if your work has reached prod

Submission: once your workspace has been LGTMed, you may submit it. Submitting can take a few moments, because a couple of presubmit checks need to take place. If the presubmit checks cause the submission to fail, you will need to address the problem(s) causing the failure.

When will grammar reach production? Various binaries need to release new versions in order for your changes to become available in production. Please click here to understand how to check if your changes are in prod.

Appendix

Useful tools
- Meaning Explorer: il/me on Intent Lab
- Model: il/model on Intent Lab
- Expand: il/expand on Intent Lab
- Query Examples: il/qe on Intent Lab
- JobRunner: il/jr on Intent Lab
- Flower: flower/
- Query Debugger: querydebugger/
- Scopy: scopy/
- Hume: hume/

Additional tools
- Alfredo: go/sls-alfredo (@brunomartins)
- Jaquim: go/jaquim (@brunomartins)

Resources
- Meaning Representation Guide
- Life of a Meaning Schema
- Core Concepts
- Meaning Schema Scope and Granularity
- Getting started with Collections
- Introduction to Evals
- KScorer signals for SLS
- Search Reviewer Guide
- Terminology Guide

Q&A

Modeling Quiz: after completing the module, please make sure to take the Modeling Quiz. This is mandatory for every new onboardee in order to track completion of the module.

Training Feedback: please provide feedback about this training module using the survey below. THANK YOU! Module Feedback Form. Module Code: [MT1]. Suggestions, feedback, or a change request? Please send your comments or requests via the link below and we will get back to you as soon as we can. Click here to open a ticket to the L&D team. Thank you!

Contributors: SLS-Tolkien@, asyav@, barretoj@
Review Date: October 24, 2023
Reviewer: @barretoj
