Updated SBS Guidelines PDF
Document Details
Uploaded by ValiantBoltzmann
Mohamed bin Zayed University of Artificial Intelligence
Summary
This document provides guidelines for rating search results based on user satisfaction. It details various result types, grading processes, and common mistakes to avoid. The document also outlines satisfaction principles and explains the importance of considering user needs and query context.
Full Transcript
Search Satisfaction Guidelines
A guide to providing satisfaction ratings for search results
Version 1.2

Introduction
  Search Needs and Satisfaction
  The Query
  Steps in the Grading Process
  Definitions
Result Validation
  Wrong Language
  Content Unavailable
  Inappropriate
Satisfaction Principles
  Satisfaction Scale
  Degrees of Separation
  Think About the Meaning, Not Just Matching Words
  Consider User Effort
  Consider Source Quality
Overview of Result Types
  Web Results
  Apps
  Maps
  Stocks
  Dictionary
  Weather
  Sports
  News
  Web Images
  Web Video
  Answers and Knowledge
  Flights
  Movies/TV Shows/Books/Music
How to Assign Ratings
  When to Grade Highly Satisfying (HS)
  When to Grade Satisfying (S)
  When to Grade Somewhat Satisfying (SS)
  When to Grade Not Satisfying (NS)
Grading Specific Situations & Result Types
  Ambiguous Queries (Multiple Interpretations)
  Locale Sensitivity
  English Results in Non-English Locales
  Redirected Pages
  Apps
  News
  Maps
  Web Video
  Dictionary, Stocks, Weather, Knowledge / Answers, Sports
  Web Results (also called Suggested Web Sites)
  Web Images
Common Grading Mistakes
  Failing to Use Web Search
  Failing to Visit Destination Page
  Ignoring Time and Place
  Ignoring Conceptual Distance
  Ignoring Relevance Grading Principles
Examples: Satisfaction Rating
  Highly Satisfying
  Satisfying Examples
  Somewhat Satisfying Examples
  Not Satisfying Examples
Other Aspects Related to Search Satisfaction Grading
  Overall Preference Rating (OPR)
  Writing Comments
  OPR & Comment Examples

Introduction

Search Needs and Satisfaction

Search engine users are trying to accomplish a task (or achieve a goal) that requires some information or quick access to some other resource, such as an app.
A user’s information need or search need is defined as the information or resource that the user needs in order to accomplish their task. The user’s query is an attempt to express that need to the search engine. If the search results enable the user to accomplish their task, we say that the search need is satisfied.

We say that a result is satisfying if it satisfies the search need of a query. Results can be more satisfying or less satisfying depending on how well or how completely they satisfy the need.

A search need → a search query
A search query → results returned

You may assume all searches are made on an Apple iOS mobile device.

A search service may return many different types of results. How are these graded? What is a satisfying search result? In these guidelines we talk about what constitutes a search query, the different types of results, and how to grade them. In addition we describe some typical grading tasks that use the principles learned in satisfaction grading.

The Query

[Figure: A query and its associated information in the grading interface.]

The grading interface displays each query together with additional information that provides useful context. As shown in the figure above, this includes the following components:

- The query itself.
- Web Search links you will use to research the possible intents and interpretations of the query.
- The language of the user. We do not want to return results in other languages.
- The location of the user. We want to return results appropriate for their area (e.g. locations of business).
- Date of query. We want to return results that are relevant in time.

⚠ Unless you have been specifically instructed otherwise, skip to the next task if any of the above information about the query is missing.

Steps in the Grading Process

The grading of results consists of the following steps.

1. Click on the Google and Bing web search links and scan the results to make sure you understand what the query is about. Keep in mind queries can have more than one meaning.
2. Validate the result to make sure it can be graded, as explained in the “Result Validation” section. Following step (1) is crucial for correct validation.
3. Assign the satisfaction rating per the guidelines outlined in:
   - Relevance Principles
   - Assigning a Satisfaction Rating
   - Special Situations

When assigning your grade, be on the lookout for common mistakes! Details can be found in “Common Grading Mistakes.”

⚠ Search engines often correct query spelling errors and/or predict (“autocomplete”) what a partially typed query was intended to be. If the web search results show results for a corrected or autocompleted version of the query, you should grade your result as if the user typed the corrected or completed query. Examples:

- Query is “fac,” result is “facebook.com”. Grade as if the query was “facebook.”
- Query is “ted cruise,” result is a Wikipedia page about U.S. senator Ted Cruz. Grade as if the query was “ted cruz.”

Definitions

The following terms are used throughout these guidelines:

Term: Named Entity
Definition: A person, place, organization, business, product, service, or event whose name would normally be capitalized in English. (This includes fictional entities.)
Examples: Stephen Curry; Yellowstone National Park; Jupiter; Médecins Sans Frontières; Starbucks; Post-It Notes; Skype; Super Bowl LI; Boxer Rebellion; Frodo Baggins

Term: Knowledge Term
Definition: A word or phrase describing a concept or object of study (other than a named entity) that users may wish to learn more about. Knowledge terms may come from any field of study, including: science, technology, mathematics, medicine, history, philosophy, literature, art, economics, etc. They are most often noun phrases, but may also be other parts of speech.
Examples: photosynthesis; elephant; ROC curve; linear algebra; cancer; oligarchy; veto; existentialism; metaphor; impressionism; interest rate

Term: Official Site
Definition: A website provided by a named entity (or their employer or organization) that represents how they want to be presented to the world online.
Examples:
- Microsoft (company): www.microsoft.com
- U.S. Internal Revenue Service (government organization): www.irs.gov
- Taylor Swift (performer): www.taylorswift.com
- Henry Louis Gates Jr. (professor at Harvard University): https://aaas.fas.harvard.edu/people/henry-louis-gates-jr

Term: Official Online Presence
Definition: A generalization of official site that includes not just official sites but also other online “homes” provided by an entity and existing on commercial services such as social networks. This may include: a Twitter feed, Facebook page, YouTube channel, Instagram feed, or other similar platform.
Examples:
- https://twitter.com/StephenKing
- https://www.youtube.com/user/therock
- https://www.instagram.com/badbunnypr/

Term: Chain Business
Definition: A business (or organization) that consists of many locations that all provide basically the same product or service, AND where its customers’ (or users’) primary interaction with the business happens in person at those locations.
Examples: Starbucks; Taco Bell; Party City; California Department of Motor Vehicles

Term: Visually Distinctive Entity
Definition: Anything whose concept or identity can be usefully conveyed by a visual image. People and places are visually distinctive entities, but so are certain tools, geometric figures, geological or architectural features, and visual artworks.
Examples: Jacinda Ardern; Taj Mahal; ball-peen hammer; dodecahedron; mesa; flying buttress; “The Thinker” (sculpture by Rodin)

Result Validation

Before you can grade the satisfaction of a result, you’ll be asked to indicate whether there are any problems that would prevent you from judging it. There are three types of result problems you’ll be asked to identify: wrong language, content unavailable, and inappropriate.

Wrong Language

A result is in the wrong language if it is neither in English nor in the language of the user’s locale.

However, there are a few exceptions that are NOT considered wrong language results:

1. Result (e.g. amazon.co.jp) is the same country-specific site as requested by the query (“amazon.co.jp”), even if the requested site is not in your locale.
2. Query and result are in the same language, even though it’s not the primary language for this locale.
3. User is visiting another country, query is for a local business or attraction, result is in the language of the visited country (i.e. where query was submitted), and there is no equivalent result in the user’s own locale language.
4. Query is in a foreign language and result is in locale language, but query is also the name of a popular song, movie, business, etc. in the current locale (e.g. “viva la vida” query in en-US).

⚠ English results are never considered Wrong Language.

Content Unavailable

Flag result as content unavailable in any of these situations:

- A result is a web/news or videos result but does not show a page when clicked.
- Result requires log-in or subscription to access, specifically where the user would be able to see the content of the page by logging in, but you cannot.
- The browser presents a dialog box warning of a privacy or security issue on the page.
- Required information for this result type is missing (e.g. no distance shown for Maps result).

⚠ Even if there is enough content to provide a rating but the page is behind a pay-wall/log-in, please check the Content Unavailable flag.
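The Wrong Language rule and its exceptions can be read as a short decision procedure. The following is a minimal Python sketch of that reading, not part of any grading tool. The `ResultContext` class, its field names, and the use of two-letter language codes are all assumptions made for illustration; in practice the grader judges each condition by inspection.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResultContext:
    """Hypothetical record of what a grader knows about one result."""
    result_language: str                         # e.g. "ja"
    locale_language: str                         # e.g. "en" for en-US
    query_language: Optional[str] = None         # language of the query, if known
    query_requests_this_site: bool = False       # exception 1 applies
    local_result_while_travelling: bool = False  # exception 3 applies


def is_wrong_language(ctx: ResultContext) -> bool:
    """Return True if the result should be flagged as Wrong Language."""
    # English results are never considered Wrong Language.
    if ctx.result_language == "en":
        return False
    # A result in the locale language is fine. This also covers exception 4
    # (foreign-language query, locale-language result).
    if ctx.result_language == ctx.locale_language:
        return False
    # Exception 1: the query asked for this country-specific site.
    if ctx.query_requests_this_site:
        return False
    # Exception 2: query and result share a language.
    if ctx.query_language is not None and ctx.query_language == ctx.result_language:
        return False
    # Exception 3: traveller looking for a local business or attraction,
    # with no equivalent result in their own locale language.
    if ctx.local_result_while_travelling:
        return False
    return True


# Japanese page shown to an en-US user who typed an English brand name.
print(is_wrong_language(ResultContext("ja", "en", query_language="en")))  # True
# Query "amazon.co.jp" returning amazon.co.jp (exception 1).
print(is_wrong_language(ResultContext("ja", "en", query_requests_this_site=True)))  # False
```

The checks are ordered so that the cheapest and most common cases (English, locale language) are settled first, and the exceptions only matter for results that would otherwise be flagged.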
Inappropriate

A result is considered inappropriate if it has any of the following: pornography, adult advertising/services, sex toys, illegal drugs, hate speech, gambling, spam/phishing, pirated content (including those posing as free video streaming services), or gore/shock.

In general, we want to connect users with useful content for their topic of interest while protecting them from being exposed to harmful information summarized below.

- Hateful: the result should not advocate discriminatory content that intentionally attacks someone’s dignity. This can include references or commentary about religion, race, sexual orientation, gender, national/ethnic origin, or other targeted groups.
- Violent or harmful: the result should not intentionally incite imminent violent, physically dangerous, or illegal activities, nor provide information that leads to immediate harm.
- Sexually explicit: the result should not have overtly sexual or pornographic material, defined by Webster’s Dictionary as “explicit descriptions or displays of sexual organs or activities that are principally intended to stimulate erotic without sufficient aesthetic or emotional feelings.”
- Contradicting expert consensus on public interest topics: the result should not contradict well-established or expert consensus on a popular topic or issue. This includes misleading or inaccurate information.
- Spam: Results that are malicious, deceptive, or manipulative. Examples: pages that contain phishing schemes, install viruses, or attempt to artificially boost their relevance (e.g., link farming, keyword stuffing, etc). Results that do not contain original and useful content. Examples: pages with content scraped from Wikipedia or otherwise automatically-created content.
- Illegal: We also manually remove reported results in those circumstances that are required by law in the corresponding locale (e.g., images of child abuse, content related to sex trafficking, copyright infringement, etc.) and when action is required to keep people safe (e.g., involuntary posting of sensitive personal information, etc). Movie streaming sites such as those posing as free movies are also part of this category.

⚠ Content that might otherwise be considered inappropriate is acceptable if it occurs in a medical, educational, fine art, or journalistic context, and should not be flagged (e.g. Wikipedia).

Examples

- User searched for [tinyzone] and the result is https://tinyzonetv.to/ which contains pirated content.
- User searched for [sdc.com] and result is http://sdc.com/, or user searched [olga 24k gold] and the result is https://www.lelo.com/blog/olga-24k-gold-review/. Both results contain adult advertising and should be flagged. Irrespective of whether the user was searching for this, these results need to be flagged.

Satisfaction Principles

Satisfaction Scale

When judging how satisfying each result is, you’ll use the following scale:

Highly Satisfying
Almost all users would want to see this result. It’s authoritative, accurate, up-to-date, and addresses the most likely search need(s). If the user is asking a specific question, the result gives the correct answer clearly and concisely.

Satisfying
Many users would be interested in seeing this result. Satisfying results often provide supplementary information that is “one step away” from the query topic. For example, if the query is a restaurant, it might be a review of the restaurant; if the query is a company, it might be the current stock price, or news about the company.

Somewhat Satisfying
Some users may find this result useful, but it’s probably not what most searchers were looking for. It’s often only indirectly related to the search need or assumes an uncommon interpretation of the query.

Not Satisfying
This result has nothing to do with the query, or provides incorrect information, and should not be shown. All results flagged as “Inappropriate”, “Content Unavailable”, or “Wrong Language” should be rated as Not Satisfying.

Degrees of Separation

Results are often associated with concepts in the real world, and different concepts are connected by their relationships. For example, the concept of the singer “Beyoncé” is related to the concept of her album “Lemonade,” which in turn is related to a review of the album in Rolling Stone magazine, which is related to the author of the review, Rob Sheffield. Each time we pass through one of these relationships, we increase the distance from the original concept.

Table: Degrees of Separation

Highly Satisfying
  Query “Beyoncé”: Beyoncé’s official website.
  Query “Rolling Stone Lemonade album review”: The review of the album.

Satisfying
  Query “Beyoncé”: Her “Lemonade” album on iTunes.
  Query “Rolling Stone Lemonade album review”: The album.

Somewhat Satisfying
  Query “Beyoncé”: A Rolling Stone magazine review of the album.
  Query “Rolling Stone Lemonade album review”: The singer’s official site and Rob Sheffield’s Twitter.

Not Satisfying
  Query “Beyoncé”: The reviewer Rob Sheffield’s Twitter.
  Query “Rolling Stone Lemonade album review”: Random article from same issue of Rolling Stone.

We can think of these relationships as “degrees of separation,” so in this example, the review of the Lemonade album is two degrees of separation from Beyoncé.
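As a rough illustration of this idea, the distance from the query concept can be mapped onto the four-level scale by stepping down one grade per relationship traversed. This is a minimal sketch under that assumption only. The function name `grade_for_degrees` is hypothetical, and the other principles in these guidelines (source quality, user effort, result type) also affect the final grade.

```python
# The four satisfaction levels, ordered from best to worst.
SCALE = [
    "Highly Satisfying",
    "Satisfying",
    "Somewhat Satisfying",
    "Not Satisfying",
]


def grade_for_degrees(degrees: int) -> str:
    """Map degrees of separation from the query concept to a starting grade."""
    if degrees < 0:
        raise ValueError("degrees of separation cannot be negative")
    # Three or more degrees all land on the lowest level.
    return SCALE[min(degrees, len(SCALE) - 1)]


# Query "Beyoncé":
print(grade_for_degrees(0))  # official website -> Highly Satisfying
print(grade_for_degrees(1))  # "Lemonade" album -> Satisfying
print(grade_for_degrees(2))  # Rolling Stone review -> Somewhat Satisfying
print(grade_for_degrees(3))  # reviewer's Twitter -> Not Satisfying
```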
When grading results, each degree of separation from the concept mentioned in the query (that is, the number of relationships you have to traverse to get to the result) lowers the grade by one level. See the Degrees of Separation table above.

Think About the Meaning, Not Just Matching Words

Note that some highly satisfying results may not contain all (or even any) of the query words; what matters is the meaning. For example:

- The result www.premierleague.com/home is highly satisfying for the query “english premier league soccer” even though that result doesn’t contain the words “english” or “soccer.”
- The result https://music.apple.com/us/album/25/1544494115 is highly satisfying for the query “adele’s third album,” even though it doesn’t contain the word “third.”

It’s also possible for a result to contain all the query words and not be satisfying. For example:

- The result https://en.wikipedia.org/wiki/My_Girl_Has_Gone (a web page about a song from the 1960s) is not satisfying for the query “gone girl,” even though the result contains both query words. Gone Girl is the title of a book and movie from the 2010s, and the song result is clearly not what the user intended.

Consider User Effort

When the user is looking for specific information, a result that displays this information directly is preferable to a regular web result. For example, if the query is “how old is Obama”, then a Knowledge card that directly displays his age without requiring any user action is better than a web result that the user needs to click on, wait for it to load, and scroll through to find the desired information.

Consider Source Quality

Sources of results, including web sites and news providers, can have large differences in quality. When you are grading a result, particularly if the user’s query is looking for specific information, pay attention to the quality of the source (see table “Source Quality”). For example, if you are interested in getting news about an event that happened in a certain city, a story in that city’s newspaper is generally more reliable than a blog post by a random person who doesn’t live there. If the source of a result is low quality, you should assign a lower grade than you would have otherwise.

Table: Source Quality

Writing
  High Quality: Professionally written, clear and understandable.
  Low Quality: Unclear, hard to read, filled with grammatical and spelling errors.

Motivation
  High Quality: Neutral point of view, or makes point of view clear.
  Low Quality: Has “hidden agenda,” such as pretending to offer information while actually trying to sell its services.

Reputation
  High Quality: Well-known and well-respected among those who provide this kind of service.
  Low Quality: Unknown (or known to be unreliable and untrustworthy).

Use of Citations
  High Quality: If offering scientific or medical information, cites sources.
  Low Quality: Makes medical or scientific claims without citations or evidence.

Overview of Result Types

There are many types of search results. Some results, when clicked, take you to a web page. Some others reveal rich user experiences when clicked. Others are self-contained (not clickable) and answer search needs directly in the information presented, without the need for further user action. Rating advice is given in the sections How to Assign Ratings and Special Advice for Result Types.

Web Results
By far the most common result type. These ‘cards’ usually have an icon with a brief title of the webpage and are designed to be clicked by the user, who is then taken to the corresponding website.

Maps
These results help the user navigate to a place. Usually they have the address and the distance from the user. If it’s a business it often has hours of operation.

Apps
Cards that take the user to the Apple app store (or open an app on the device). Usually they have an icon of the app and the star ratings.

Stocks
This card provides financial information related to stocks. It should show the ticker symbol, the company name and the stock price. When the user interacts with this card, detailed stock information such as historic price graphs is displayed.

Weather
This card shows the temperature of a location (and sometimes other weather conditions). When the user taps this card, they are shown detailed multi-day weather forecasts.

Sports
These cards are meant to display sports scores, or latest scores for a team (and dates of upcoming matches).
[Figure: example Sports cards.]

Dictionary
This card shows the definition of a word. When the user interacts with this card it provides detailed usage.

News
These are often types of web results that are restricted to news sites (sports, fashion, political and so on). They usually have an ‘age of news’ indicator at the bottom. They are designed to be clicked on and take the user to the destination news site.

Web Video
The user can click on these results, which play a video (usually taken from video channels such as YouTube and Vimeo).

Web Images
Groups of images clustered together. Usually the user doesn’t interact with the images and they provide visual information about the search query.

Flights
This will display flight status such as arrival time, departure time and destinations. When the user taps on this result, detailed information about arrival/departure gates and baggage claims is displayed.

Answers and Knowledge
Users ask questions (implicit, explicit, grammatically incorrect) about a concept or knowledge term or general knowledge question. Knowledge cards can return exact answers or rich experiences about knowledge concepts and entities.
(Note, the term “Knowledge” might not appear) Query: Where is Olympics 2024 Query: macron Query: Bubonic plague Query: haiku Result Validation Page 17 of 58 fl Movies/TV Shows/Books/Music Cards that provide the user a very rich experience for example to watch movies/tv show, learn about the cast, social media links, links to media related sites (e.g IMDB), listen to music, get lyrics for songs, read books. They usually show a picture, popularity ratings etc. Some examples: Result Validation Page 18 of 58 How to Assign Ratings When to Grade Highly Satisfying (HS) ⚠ Note that some types of results can never be HS. Almost all users would want to see this result. Itʼs authoritative, accurate, up-to-date, and addresses the most likely search need(s). News results can never be HS, because people have di erent preferences for where they get their news, so we can’t say that almost all users would want to see a given story If the user is asking a speci c question, the result gives the correct Results for advice or recommendation queries (e.g.,“how to lose weight”, “chicken parmesan recipe”, answer clearly and concisely. “best beatles song”, “thai restaurant”) can never be HS, because we don’t know if almost all users would agree with the recommendation. When to grade Highly Satisfying Rule If query is And result is Description Examples Query is the name of a well-known app; result is the a. Query is “facebook”, result is the Facebook app. 1 App Query Official App app with that name b. Query is “calculator,” result is the built-in Calculator app. App Regularly Used Query is the name of a business; result is an app a. Query is “b of a,” result is the Bank of America mobile banking app. 2 Business to Interact with regularly used to interact with that business. See b. Query is “dominos,” result is the Domino’s Pizza app, which allows Business details under “Apps” in “Additional Guidance”. users to place orders. Query is looking for a specific location / business / a. 
Query is “1234 market street sf”; result is a Map for that exact address institution / point of interest, or the closest example b. Query is “new york public library”; result is a Map to that location of a chain business / type of business, and the c. Query is “larry and joe’s”; result is a Map to a restaurant with that name result showed that location on a map. in the same town where user is located 3 Maps Query Closest Map d. Query is “closest lowe’s”; result is a Map showing the Lowe’s store Queries with a map intent often have a distance location closest to the user’s location. qualifier e.g. "nearest", "closest", "near me". Also e. Query is “starbucks”; result is a Map showing the closest Starbucks such queries often relate to business where one branch. must physically go to e.g. gas stations, cinema halls How to Assign Ratings Page 19 of 58 fi ff Rule If query is And result is Description Examples a. Query is “facebook,” result is Facebook’s official website, facebook.com. b. Query is “taylor swift,” result is the singer’s official website, taylorswift.com. c. Query is “charli d’amelio” (social media personality/vlogger), result is her TikTok channel. Official Online Query is a named entity; result is an official online 4 Named Entity d. Query is “joe biden,” result is his Twitter profile https://twitter.com/ Presence presence for that entity if it has one. JoeBiden. e. Query is “empire falls book,” result is publisher’s official page for the book, https://www.penguinrandomhouse.com/books/159148/empire- falls-by-richard-russo/9780375726408/. f. Query is “captain fantastic,” result is official web site for the movie, https://bleeckerstreetmedia.com/captainfantastic. a. Query is “taylor swift” (singer), result is https://en.wikipedia.org/wiki/ Taylor_Swift. b. Query is “nope” (2022 movie), result is https://en.wikipedia.org/wiki/ Nope_(film). c. Query is “iliad” (ancient epic poem), result is https://en.wikipedia.org/ wiki/Iliad d. 
Query is “the school of athens” (Renaissance painting by Raphael), result is https://en.wikipedia.org/wiki/The_School_of_Athens Query is a named entity; result is the wikipedia Wikipedia or Other e. Query is “marie curie” (Nobel-prize-winning scientist); result is https:// page for that entity, a page from another 5 Named Entity Authoritative en.wikipedia.org/wiki/Marie_Curie authoritative reference, or a knowledge card about Reference f. Query is “angkor wat” (ancient temple complex in Cambodia); result is that entity. https://en.wikipedia.org/wiki/Angkor_Wat g. Query is “aristotle,” result is a page about the philosopher from the Stanford Encyclopedia of Philosophy h. Query is “jurassic world dominion,” result is https://www.imdb.com/ title/tt8041270/, IMDB page about that movie. i. Query is “mike trout,” result is page of this player’s official statistics in the Baseball Reference, https://www.baseball-reference.com/players/t/ troutmi01.shtml. How to Assign Ratings Page 20 of 58 Rule If query is And result is Description Examples Query is a knowledge term or general request to a. Query is “linguistics”; result is https://en.wikipedia.org/wiki/Linguistics learn about a subject; result is the wikipedia page b. Query is “what causes diabetes,” result is a page about that disease for that term, a page from another authoritative from the Mayo Clinic website (https://www.mayoclinic.org/diseases- Wikipedia or Other reference, or a knowledge card. Common for Knowledge Term or conditions/diabetes/symptoms-causes/syc-20371444). 6 Authoritative medical queries. “Learn About” Query c. Query is “utilitarianism,” result is a Dictionary info card giving the Reference definition of the term. Note that if “X” is a knowledge term, queries such d. Query is “challenger disaster” (historical event); result is https:// as “what is X?” or “tell me about X” still count as a en.wikipedia.org/wiki/Space_Shuttle_Challenger_disaster knowledge term queries. a. 
Query is “when did wwi end,” result is a direct answer or info card that says “November 11, 1918” b. Query is “dodgers score,” result is a sports info card that shows the current score of the Dodgers’ baseball game in progress, or (if no game is in progress), the final score of the most recent game they Query is asking for a specific piece of information played. Explicit Correct that has a simple right answer, and the result c. Query is “msft quote,” result is an info card showing the latest stock 7 Exact Question Answer showed that information directly without the need price for Microsoft (which has the stock symbol MSFT). for further user action. d. Query is “jet blue 334,” result is an info card showing the current status of that airline flight. e. Query is “define attenuated,” result is an info card showing the definition of that word. f. Query is “weather boston", result is an info card showing current weather for that city. a. Query is “nelson mandela,” result is the following set of images: Query is (or asks about) a visually distinctive entity, Visually Distinctive 8 Web Image and result is a high quality web image set showing Entity that entity. How to Assign Ratings Page 21 of 58 When to Grade Satisfying (S) Many users would be interested in seeing this result. Satisfying results often provide supplementary information that is “one step away” from the query topic. For example, if the query is a restaurant, it might be a review of the restaurant; if the query is a company, it might be the current stock price, or news about the company. Here are some common situations where a result is Satisfying: When to grade Satisfying Rule If query is And result is Description Examples Query is the name of an app, result is a variant version (e.g., a. Query is “candy crush saga,” result is app store result for 1 App Name Variant of App “Pro” or “Lite”) of or sequel to that app, or another “candy crush friends,” a newer game in the same series. 
complementary app from the same vendor. a. Query is “currency converter,” result is “My Currency Query is a description of a type of app or function that app App Performing Converter” app. 2 App Description needs to perform; result is an app (or web app) that performs That Function b. Query is “time in different countries,” result is https:// that function. www.timeanddate.com/worldclock/. Query is the name of a performer (singer, actor, etc.) or a. Query is “taylor swift,” result is Apple Music result for singer’s Performer’s/ creator (author, composer, artist, etc.); result is a 3 Performer/Creator recent album “Lover,” https://music.apple.com/us/album/lover/ Creator’s Work representation of their work (album, song, movie, book, etc.), 1468058165. where user can view/hear/download/stream/learn about it. Query is the name of a creative work (music album, movie, a. Query is “fleabag,” result is https://en.wikipedia.org/wiki/ 4 Creative Work Performer/Creator etc.); result is a representation of the creator/performer (e.g., Phoebe_Waller-Bridge, the wikipedia page about the creator and artist’s official site). star of that television series. How to Assign Ratings Page 22 of 58 Rule If query is And result is Description Examples a. Query is “jbl bluetooth speaker,” result is page of matching items from electronics retailer Best Buy. b. Query is “empire falls book,” result is Amazon’s detail page for that book, https://www.amazon.com/Empire-Falls- Query is the name of a product (which may be media item Richard-Russo/dp/0375726403 such as a book, movie, song, etc.); result is a page from a 5 Product Reputable Vendor c. Query is “captain fantastic,” result is iTunes store page for well-known site where the item can be purchased, that movie, https://itunes.apple.com/us/movie/captain- downloaded, or streamed. fantastic/id1127934488 d. 
Query is “taylor swift lover album,” result is Spotify page to stream that album, https://open.spotify.com/album/ 3rYkgtFOo9AlPaeKTtn6pM Query is a named entity, result is an authoritative page (other a. Query is “facebook,” result is news story “Facebook agrees to 6 Named Entity News than official online presence) providing news about that pay FTC $5 billion fine for various privacy violations,” dated entity. the same day the search was performed. Query is asking for specific piece of information with a simple a. Query is “barack obama age,” result is https:// Embedded Correct right answer, and the result contains that answer, but the en.wikipedia.org/wiki/Barack_Obama. 7 Exact Question Answer user has to take an action (e.g., follow link to destination b. Query is “cambridge library hours,” result is https:// page and read it) to get the answer. www.cambridgema.gov/cpl/hoursandlocations. a. Query is “ebola,” result is New York Times news story “Ebola Knowledge Term or Query is a knowledge term or request to learn about a 8 News Outbreak in Congo Is Declared a Global Health Emergency,” “Learn About” Query subject, result is relevant and timely news about that subject. published the same day search was performed. Query is the name of a chain business; result is a Map Secondary Maps a. Query is "dunkin", [in location Sunnyvale, CA], map result 9 Chain Business showing a nearby branch of business, but not the closest Result presents San Jose, CA location, 6.8 miles from the user. one. Query is a type of business, or a product or service; result is a. Query is “thai food” [in location Cambridge, MA], result Maps or Multiple map entry or an official website for a business of that type or is http://www.thesimilans.com, official site for local Thai 10 Type of Business Official Websites that offers that product/service. In the Maps case, business restaurant. must be nearby. b. Query is “thai restaurant”; result is a nearby thai restaurant. 
When to Grade Somewhat Satisfying (SS)

Some users may find this result useful, but it’s probably not what most searchers were looking for. It’s often only indirectly related to the search need or assumes an uncommon interpretation of the query.

Rule 1
If query is: Chain Business/Type of Business
And result is: Moderately Distant Maps Result
Description: Query is the name of a chain business or a type of business; result is a Map showing a branch of the business that is not nearby, but still accessible (perhaps up to an hour’s drive away).
Examples:
a. Query is "starbucks", user is in San Jose, CA, result is a map result for Starbucks, 17 miles away in Fremont, CA.

Rule 2
If query is: Type of Business/Organization
And result is: Official Website of More Distant Instance
Description: Query is a type of business or organization; result is the official website of an instance of this business or organization that is not nearby, but is still accessible.
Examples:
a. Query is “vietnamese restaurant” [in Cupertino, CA]; result is https://www.slanteddoor.com, the official site of a particular Vietnamese restaurant in San Francisco, CA, 50 miles from the user.

Rule 3
If query is: Company/Product/Named Entity
And result is: Related Site/Video/App
Description: Query is the name of the entity; result is not their official website, but is a site, page, video, or app related to their business. For example, this might be a 3rd party site about that company or its products, or a site for a competing product or service.
Examples:
a. Query is “zillow”, result is the video “Living Large in a Tiny Home” from Zillow’s YouTube channel.
b. Query is “sonicare” (brand of electric toothbrush), result is website for Oral-B (a competing brand of electric toothbrush).
c. Query is “billy idol” (singer), result is Wikipedia page for Generation X, a band from the 1970s he was in before he became famous.

Rule 4
If query is: Named Entity or Event
And result is: Stale but Valid News Story
Description: Query is the name of an event or named entity; result is a news story about an earlier event or early news about the entity. The news story must still be valid.
Examples:
a. Query is “super bowl news,” result is a news story “Patriots Come from Behind to Defeat Falcons in Super Bowl LI.” The story is still accurate, but it describes something that happened in 2017, not in the most recent or upcoming Super Bowl.

Rule 5
If query is: General Query
And result is: Overly Specific Result
Description: Query is the name of a general concept or event (such as a TV show); result is about a specific instance of that concept or event (such as a particular episode of that show).
Examples:
a. Query is “dogs”, result is Wikipedia page for the dog breed Beagle.
b. Query is “suits” (a TV show that ran for 9 seasons), result is https://www.peacocktv.com/watch-online/tv/suits/8003089882869075112/seasons/5, a page where viewers can stream the 5th season.

Rule 6
If query is: App Name
And result is: Google Play Result
Description: Query is the name of an app; result is that app on the Google Play store website. Since users are conducting their search on an Apple iOS device, we can assume most of them do not want an Android app as a result.
Examples:
a. Query is “slickdeals”, result is https://play.google.com/store/apps/developer?id=Slickdeals&hl=en.

When to Grade Not Satisfying (NS)

This result has nothing to do with the query, provides incorrect information, or fails the validation step, and should not be shown.

Rule 1
If query is: Any Query
And result is: Flagged During Validation Step
Description: Result was flagged as Wrong Language, Content Unavailable, or Inappropriate during validation step.
Examples:
a. Query is “uniqlo”; user is in en-US; result is “https://www.uniqlo.com/jp/ja/” which is in Japanese and was flagged as Wrong Language.

Rule 2
If query is: Any Query
And result is: Off-Topic Result
Description: Result that is not about the query topic. Note that in some cases the URL may appear to be about the query, but clicking through shows that the destination page is not related.
Examples:
a. Query is “samsung tv”, result is web page for Samsung washing machine.
b. Query is “obama age”, result gives the age of Joe Biden.
c. Query is “Messi goals” (Messi is a soccer player), result is total goals by Barcelona (his team).
d. Query is “target stores”, result is about an Ace Hardware store location.

Rule 3
If query is: Local Intent Query
And result is: Unreasonably Distant Result
Description: Query indicates or assumes nearby location, result is so geographically distant that it makes no sense to show it.
Examples:
a. Query is “starbucks” [in San Francisco, CA], result is a Maps result for a Starbucks in San Diego, CA, 500 miles away.
b. Query is “airport” [in Boston, MA], result is official website for Heathrow Airport in London, UK.

Rule 4
If query is: Explicitly Locale-Sensitive Query
And result is: Wrong Locale Result
Description: Query explicitly seeks result from a specific locale; result pertains to a locale different from the one specified.
Examples:
a. Locale is en_US, query is “kit kat japan,” result is https://www.hersheys.com/kitkat/en_us/home.html

Rule 5
If query is: Implicitly Locale-Sensitive Query
And result is: Wrong Locale Result
Description: Query does not mention a locale, but the user need implicitly requires results from the user's locale; result pertains to a locale different from the user's locale.
Examples:
a. Locale is en_US, query is “ticketmaster,” result is UK-specific Ticketmaster app.
b. Locale is en_IN, query is “do I need a visa to visit japan,” result is US government page https://travel.state.gov/content/travel/en/international-travel/International-Travel-Country-Information-Pages/Japan.html

Rule 6
If query is: Exact Answer Query
And result is: Missing or Incorrect Answer
Description: Query is asking for a specific answer; result is an info card that correctly identifies what the query is asking, but then fails to give that answer.
Examples:
a. Query is “dmx real name,” result is an info card that says “dmx birth name: dmx” (which is incorrect).

Rule 7
If query is: Any Query
And result is: Result Fails to Load / Inaccessible
Description: Result is a blank page, a parked domain, a 404 error, something unavailable in user’s country, or anything else where the content has been removed or is inaccessible.
Examples:
a. Query is “bisq restaurant cambridge”, result is http://www.bisqcambridge.com
b. Query is “brokerbot”; result is http://brokerbot.com

Grading Specific Situations & Result Types

Ambiguous Queries (Multiple Interpretations)
Locale Sensitivity
English Results in Non-English Locales
Redirected Pages
Apps
News
Maps
Web Video
Dictionary, Stocks, Weather, Knowledge / Answers, Sports
Web Results (also called Suggested Web Sites)
Web Images

Ambiguous Queries (Multiple Interpretations)

While most queries express several different user intents, some queries are also ambiguous in what they refer to (e.g., “apple” could be a company or a fruit). In this case you should still grade the result, using the following additional guidelines.

If you're not sure whether there is a dominant interpretation, look at the web search results for the query. If most of the highly ranked results on the first page are for one interpretation, then you should consider that to be the dominant interpretation.

Multiple Interpretations

Type: Dominant Interpretation Exists. When one interpretation is much more popular than the others.
Description: Dominant Interpretation: If a result is for the dominant interpretation, you should grade using the normal guidelines.
Examples:
1. The query is "allegiant", result is the official website for the airline. Grade as HS, since the dominant interpretation of the query is the airline.
2. The query is "apple", result is a map result for the Apple store near the user, but not the closest. Grade as S, since the dominant interpretation of the query is the technology company.

Type: Dominant Interpretation Exists. When one interpretation is much more popular than the others (cont’d).
Description: Secondary Interpretation: If a result would be relevant (HS/S/SS) for a secondary interpretation, you should grade it as “SS”.
Examples:
1. Query is “michael jordan”, result is IMDB page for actor Michael B. Jordan. Grade as SS, since the dominant interpretation of the query is a different person, the former NBA basketball player.
2. Query is “american eagle”, result is home page of web developer americaneagle.com. Grade as SS (rather than HS), since the dominant interpretation of the query is clothing retailer American Eagle Outfitters.
3. Query is “golden retriever”, result is a song titled Golden Retriever. Grade as SS (rather than S/HS), since the song is not the dominant interpretation of the query. The dog breed is the dominant interpretation for this query.

Type: Multiple Interpretations, None Dominant. When there are two or more interpretations of similar popularity.
Description: Sometimes there are several reasonable interpretations but none of them are dominant. In that case you should grade normally for all of them, except that results that would have been HS if there were only one (or one dominant) interpretation should be graded S instead. That’s because if we can’t say which interpretation is dominant, we can’t say that any result is one that nearly all users would want to see.
Examples:
1. Query is “um athletics” (location is Texas), result is home page for the University of Miami athletics program. Grade as S (rather than HS), because “um athletics” could equally well refer to the University of Michigan or University of Maryland athletics programs, among others.
2. Query is “um athletics,” result is a photo gallery showing some athletic facilities under construction at the University of Michigan. Grade normally: it’s SS, because although it relates to the query, it’s not what most users doing that search are looking for.
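The three ambiguity rules above can be summarized as a small decision function. The following is only an illustrative sketch, not part of the official grading process: the names `Grade`, `grade_ambiguous`, `interpretation`, and `has_dominant` are hypothetical, and the function assumes the grader has already decided the grade the result would earn under normal guidelines for the interpretation it matches.

```python
from enum import IntEnum


class Grade(IntEnum):
    """Satisfaction scale, ordered from lowest to highest (hypothetical encoding)."""
    NS = 0
    SS = 1
    S = 2
    HS = 3


def grade_ambiguous(normal_grade: Grade, interpretation: str, has_dominant: bool) -> Grade:
    """Adjust a grade for an ambiguous query.

    normal_grade: grade the result would get under the normal guidelines
        for the interpretation it matches.
    interpretation: "dominant" or "secondary" (ignored when has_dominant is False).
    has_dominant: whether one interpretation is much more popular than the others.
    """
    if has_dominant:
        if interpretation == "dominant":
            # Dominant interpretation: grade using the normal guidelines.
            return normal_grade
        # Secondary interpretation: any relevant result (HS/S/SS) becomes SS.
        return Grade.SS if normal_grade > Grade.NS else Grade.NS
    # No dominant interpretation: grade normally, but cap HS at S.
    return min(normal_grade, Grade.S)


# "michael jordan" -> IMDB page for Michael B. Jordan (secondary interpretation)
print(grade_ambiguous(Grade.HS, "secondary", True).name)
# "um athletics" -> University of Miami athletics home page (no dominant interpretation)
print(grade_ambiguous(Grade.HS, "dominant", False).name)
```

Encoding the scale as ordered integers is a design convenience here: it lets the "cap at S" rule be expressed as a simple `min`.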
Locale Sensitivity

Scenario: Explicitly Locale-Sensitive. Query explicitly specifies that user is seeking results from a locale that differs from their current location.
Grade: Results that do not pertain to the locale specified in the query should be automatically graded as “NS”.
Examples: Query is “amazon france”. The user is in EN-GB locale. The result is https://amazon.co.uk. Grade as NS, since the Amazon page in the UK is not what the user is searching for.

Scenario: Implicitly Locale-Sensitive. Query does not explicitly ask for results in a particular locale, but the user need is inherently locale-specific (e.g., local law information, country-specific merchant sites, nearby real-world business).
Grade: Any results from a different locale (even if they’re in the correct language) should be automatically graded as “NS”.
Examples: Query is “ticketmaster”; user is located in US. Result is ticketmaster.co.uk. Grade as NS, since user did not express any interest in UK events.

Scenario: Mildly Locale-Sensitive. Query does not explicitly ask for results in a particular locale, but those in other locales may be somewhat less useful.
Grade: Foreign results (as long as they’re in the correct language) should be SLIGHTLY penalized by assigning a grade one level lower than you would normally give.
“HS” results should be downgraded to “S”
“S” results should be downgraded to “SS”
“SS” results should be downgraded to “NS”
“NS” results should remain as “NS”
Examples: Query is “vaccine recommendations”. User’s locale is en-US, and the result is https://www.nhs.uk. The NHS is the UK's National Health Service that provides health care to all British residents. Since different countries provide different medical advice for their residents, the UK's advice would be less useful to a US resident than advice from a US medical agency. The result should be SLIGHTLY penalized from S, down to SS.

Scenario: Not Locale-Sensitive. Results from any locale would be equally useful for this query.
Grade: Grade result without regard to locale.
Examples: Query is “tennis news.” User is in en-US; result is news from the BBC about the latest results from the Wimbledon tennis tournament.

English Results in Non-English Locales

English is a widely-understood second language in many countries, and all our international graders are fluent in it. For this reason, rather than simply marking an English result in a non-English locale as “wrong language,” graders should go ahead and grade the result, with the following locale-specific considerations. You will need to use your own knowledge of the locale to decide which guideline to apply.

Scenario: The user’s locale is one where most users understand English fluently (e.g., ES-US) and would likely be interested in English-language results.
Grade: Grade the result normally, the same way you would if it were in the locale language.

Scenario: The user’s locale is one where many users understand English fluently (e.g., Western Europe) and would possibly be interested in English-language results.
Grade: Grade the result one level lower than you would if it were in the locale language.
⚠ Results that would have been NS should still be graded as NS.

Scenario: The user’s locale is one where relatively few users understand English fluently and would be unlikely to be interested in English-language results.
Grade: Grade the result as NS.

Redirected Pages

If the displayed URL of a result gets redirected to a different URL, then you should grade the page you’re redirected to as if that were the result.

Apps

When a user clicks one of these results, it takes them to the app store (usually the Apple App Store) or opens the app if it is present on the device.
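As a brief aside before the app guidance, the grade adjustments from the Locale Sensitivity and English Results in Non-English Locales tables earlier in this section can be sketched in code. This is a minimal illustration only: the names `Grade`, `downgrade_one_level`, `apply_locale_sensitivity`, and `apply_english_result_rule`, and the string labels for each scenario, are hypothetical and simply mirror the table rows. The grader still has to judge which scenario applies.

```python
from enum import IntEnum


class Grade(IntEnum):
    """Satisfaction scale, ordered from lowest to highest (hypothetical encoding)."""
    NS = 0
    SS = 1
    S = 2
    HS = 3


def downgrade_one_level(grade: Grade) -> Grade:
    """HS -> S, S -> SS, SS -> NS; NS stays NS."""
    return Grade(max(grade - 1, Grade.NS))


def apply_locale_sensitivity(grade: Grade, sensitivity: str, matches_locale: bool) -> Grade:
    """Adjust a normally assigned grade for locale.

    sensitivity: "explicit", "implicit", "mild", or "none".
    matches_locale: True if the result pertains to the locale the user needs
        (the locale named in the query for "explicit", otherwise the user's own).
    """
    if sensitivity == "none" or matches_locale:
        return grade
    if sensitivity in ("explicit", "implicit"):
        return Grade.NS  # wrong-locale results are automatically NS
    if sensitivity == "mild":
        return downgrade_one_level(grade)  # slight penalty
    raise ValueError(f"unknown sensitivity: {sensitivity}")


def apply_english_result_rule(grade: Grade, english_fluency: str) -> Grade:
    """Adjust the grade of an English result shown in a non-English locale.

    english_fluency: "most", "many", or "few", describing how many users
        in the locale understand English fluently.
    """
    if english_fluency == "most":
        return grade
    if english_fluency == "many":
        return downgrade_one_level(grade)
    if english_fluency == "few":
        return Grade.NS
    raise ValueError(f"unknown fluency level: {english_fluency}")


# "vaccine recommendations" in en-US with an NHS (UK) result that would otherwise be S
print(apply_locale_sensitivity(Grade.S, "mild", matches_locale=False).name)
```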
App Rating Guidance

Rule: Rule 1 under HS refers to cases where the query is the name of a well-known app, that is, a service that is best known as an app.
Additional Details: Examples: Instagram, Spotify, and Candy Crush
⚠ A well-known app is not the same thing as a well-known company!

Rule: Rule 3 under HS refers to cases where the query is a business and the result is an app “regularly used to interact with that business.” Meaning, the app is a common way that customers or clients perform the ordinary tasks they need to do business with that company.
⚠ Just because a company has an app does not mean that it’s regularly used to interact with that business. For example, the query “dell” refers to the name of a computer company. But their app “Dell@Retail 2019” is described as “a chance for our global retail partners to immerse themselves in the design,
Additional Details:
1. If the query is the name of a bank, then the app should allow the user to perform mobile banking tasks.
2. If the query is the name of a restaurant chain, then the app should allow the user to order food at that restaurant.
3. If the query is the name of an airline, then the app should allow the user to make reservations, choose their seat assignment, and check flight status.
4. If the query is the name of a retail chain, th