Lecture 2 - Advanced Web Search PDF

Summary

This document is a lecture on advanced web search techniques. It introduces search engines and how they work, then covers truncation, quotation marks, wild cards and search operators. It also discusses advanced search features and limiting searches by date, language or document type.

Full Transcript

Lecture 2
Advanced Web Search

What You'll Learn
- Introduction to search engines
- Truncation, quotation marks and wild cards
- Search operators
- Creating search statements
- Advanced search features
- Limiting searches by date, language or document type

How Search Engines Work
To most people, Internet search engines means World Wide Web search engines. The World Wide Web, commonly known as the Web, is an information system in which documents and other web resources are identified by Uniform Resource Locators (URLs), may be interlinked by hypertext, and are accessible over the Internet.

Before the World Wide Web became the most visible part of the Internet, there were already search engines in place to help people find information on the Internet. Popular names at the time included Gopher and Archie. In the early 1990s, getting serious value from the Internet meant knowing how to use Gopher, Archie, Veronica and the rest.

The Web and other services of the Internet
Search Engines - These tools are really a part of the World Wide Web and are often used when looking for information, because the Web has grown so large and has no inherent organizational structure.
E-mail - Exchanging electronic letters, messages, and small files.
Chat - IRC (Internet Relay Chat) for live discussions on the Internet.
Hosting - Making information available to others on the Internet.
FTP - File Transfer Protocol is a common method of transferring files between computers via the Internet.
Mailing Lists - E-mail messages forwarded to everyone on a special interest list.

Other services of the Internet besides the Web
Telnet - Creation of a dumb terminal session to a host computer in order to run software applications on the host system.
Usenet - Newsgroups for receiving news and sending out announcements.

Web Search

Spider Web
Google is a search engine that uses the spider (web crawler) method to find the websites that are relevant to what you are interested in.

Advanced Search Techniques
Advanced search options are a set of useful features offered by most search engines and search tools on the Web. Advanced search lets the searcher narrow a search with a series of filters, e.g. language, proximity, domain, etc.

Multimedia Searching
Searching for multimedia such as videos and pictures, and how to use someone else's media legally.

How Search Engines Work
A search engine tells you where a file or document can be found. But first, the file must be located. Search engines use special software robots known as spiders to build lists of the words found on millions of Web pages. When a spider is building its lists of words, the process is referred to as Web crawling.

The usual starting points for the spiders are heavily used servers and very popular pages. The spider begins with a popular site, indexes the words on its pages and follows every link found within the site. Spiders first look for words in the title, subtitles, meta tags and other positions of relative importance, which are given special consideration during a user search.

The Google spider was designed to index every significant word on a page, leaving out the articles "a," "an" and "the". Other search engines, such as AltaVista, take different approaches.

How Search Engines Work - Assignment
Explain how the Google web search engine works during a page search on the Web. State and explain three ways of improving page indexing during a Google web search.
NB: After submission, be prepared to present your answer to the class.
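
The crawling and indexing steps described under How Search Engines Work can be sketched in a few lines of Python. This is only a toy illustration, not how Google's spider is actually implemented: the PAGES dictionary is a hypothetical stand-in for the Web, the example.com addresses are made up, and the crawl function is a name invented for this sketch. It starts at one page, records every word except the articles "a", "an" and "the", and follows each link it finds.

```python
import re
from collections import defaultdict

# Articles that, according to the lecture, the Google spider leaves out.
STOP_WORDS = {"a", "an", "the"}

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "http://example.com/": (
        "The history of the search engine",
        ["http://example.com/spiders"],
    ),
    "http://example.com/spiders": (
        "A spider builds an index of words",
        ["http://example.com/"],
    ),
}


def crawl(start_url, pages):
    """Visit every page reachable from start_url and map word -> set of URLs."""
    index = defaultdict(set)
    to_visit = [start_url]
    seen = set()
    while to_visit:
        url = to_visit.pop()
        if url in seen or url not in pages:
            continue  # skip pages already indexed or not in our toy web
        seen.add(url)
        text, links = pages[url]
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            if word not in STOP_WORDS:
                index[word].add(url)
        to_visit.extend(links)  # follow every link found on the page
    return index


index = crawl("http://example.com/", PAGES)
print(sorted(index["of"]))     # pages containing the word "of"
print("the" in index)          # articles are never indexed
```

A search for a word then becomes a simple lookup in the index, which is why a search engine can answer quickly without re-reading every page.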

Data Centers
All online activities, including web searches, are made possible by data centers around the world.

Basic Searching
Narrowing Searches - Only use key words, to help narrow the number of results.
Don't type long sentences - USE KEY WORDS.
Use sites based upon what you are looking for - If you are looking for videos, use YouTube; if you are looking for images, use images.google.com.

Truncation and Quotation Marks
The two most helpful advanced search techniques are:
1) Quotation Marks
2) Truncation or Wild Card

Quotation Marks
Quotation marks are used around phrases. By using quotation marks, you are telling the computer to bring back only pages with the terms you typed in the exact order you typed them.
Example: "health care reform" instead of health AND care AND reform

For example, if you are interested in finding information on social networking, it is best to search for "social networking" in quotation marks. Otherwise, the computer might search for social AND networking and return many more irrelevant results.

Adding a Minus
Adding a minus sign directly in front of a word tells the search engine to leave out pages related to that word. For example, if I am interested in finding information on social networking, I can add the words that I do not want the search engine to include in the search.
E.g.: social media networking -Twitter

Truncation and Wild Card symbols
These are used to widen search results, which ensures you don't miss relevant records. Most databases are not intelligent - they just search for exactly what you type in. Truncation and wild card symbols enable you to overcome this limitation. These symbols can be substituted for letters to retrieve variant spellings and word endings.

A wild card symbol replaces a single letter - useful to retrieve alternative spellings and simple plurals e.g.

Truncation and Wild Card symbols
Truncation means to chop off. When you truncate, you chop off the end of the word so the computer can search for multiple endings. For example, if your research question includes the keyword education, you can truncate it so that the computer will find all of the word-ending variations.
Educat* will find:
- Education
- Educate
- Educated
- Educating

Truncation - Hint
Be careful where you place the truncation symbol. Educate* will not find education or educating, although it will find educate and educated.
Truncation will not find synonyms (e.g. scien* will not find the words botany, biology, or astronomy), although it may bring up articles on those topics IF they include the words science, scientific, or scientist.

Search Operators - Boolean
Also known as Boolean operators, search operators allow you to include multiple words and concepts in your searches.
AND - Retrieves records containing both words. It narrows your search. Some databases automatically connect keywords with AND. E.g. Finance AND Accounting
OR - Retrieves records containing either word. It broadens your search. You can use this to include synonyms in your search. E.g. marketing OR advertising
NOT - Retrieves records containing your first word but excludes those containing the second.

Advanced Search
LINK: https://www.google.com/advanced_search
Through this page, you can do some of the filtering explained earlier by typing the required text in the text boxes and clicking the "Advanced Search" button. Clicking the "Advanced Search" button without typing anything brings up the web page shown on the next slide.
Here, one can search in some local languages, as shown under "Google offered in:…". Clicking the "I'm Feeling Lucky" button displays the Google Doodle Archives page.

Searching by date or language
Many databases allow you to limit your search in various ways. Limits are usually available on advanced search screens, or you can apply them after doing your keyword search.
Examples of the types of limits you can apply include:
-by date
-by language
-by publication type (e.g. journal articles, chapters in books, review articles that provide detailed summaries of research, book reviews)

Advanced Search

Searching by File Types - Examples
.doc
site:domainname.com filetype:doc
Example: site:ug.edu.gh filetype:doc
.pdf
e.g.: site:www.ug.edu.gh filetype:pdf
.ppt
e.g.: Web tutorials filetype:ppt
.gov
e.g.: Ghana government site:.gov (.gov is a domain, not a file type, so it is matched with site: rather than filetype:)

Other Advanced Search Operators

allintext
This operator helps you find pages where all the terms you are looking for show up in the text of the page. It is not pin-accurate, however, because it does not require the terms to appear close together on the page.
e.g. university of ghana allintext:accommodation

intext
This is a more global operator that allows you to find terms showing up anywhere on a web page, such as the title, the page itself, the URL, and elsewhere.
e.g. university of ghana intext:accommodation

allintitle
This search operator is a great way to find blogs that match the content you are writing about. For example, you could use allintitle to research what others are doing on a particular topic, and then write your post to be better than theirs.
e.g. allintitle:banku and okro

intitle
This is a narrower operator that helps you find more targeted results for specific search phrases. If you wanted to find pages that are all about "banku and okro", for example, you would use it as follows:
e.g. intitle:banku and okro

allinurl
This operator allows you to find pages with your requested search terms within the URL in internal search pages. For example, say you wanted to research pages on a site that had the terms "banku and okro" in the URL. You would use the following:
e.g. allinurl:banku and okro

inurl
If you wanted to find pages on a site that have your targeted search term in the URL and the second term in the content of the page, you could use this operator.
e.g. inurl:banku and okro

site
This is used to restrict a search to a specific site or domain. Put "site:" in front of the site or domain.
e.g. site:youtube.com or site:.gov

related
This is used to search for related sites. To search for related sites, put "related:" in front of a web address you already know.
e.g. related:www.ug.edu.gh

info
This is used to get details of a site. To get details about a site, put "info:" in front of the site address.
e.g. info:www.ug.edu.gh

cache
This is used to see Google's cached version of a site. To get Google's cached version of a site, put "cache:" in front of the site address.
e.g. cache:www.ug.edu.gh

Google Scholar
Google Scholar provides a simple way to broadly search for scholarly literature. From one place, you can search across many disciplines and sources: articles, theses, books, abstracts and court opinions, from academic publishers, professional societies, online repositories, universities and other web sites.

Multi Media Search Techniques

Search by voice
Google Voice Search or Search by Voice is a Google product that allows users to use Google Search by speaking on a mobile phone or computer, i.e. have the device search for data upon entering information...

Legally Using Pictures or Videos
People who own multimedia files may release them under Creative Commons licenses, which state whether and how others may use their media. Media marked All Rights Reserved can only be used with permission from the owners.

Google Images
You can use a picture as your search to find related images from around the web.

Conclusion
More Reliable Searching - Narrowing searches can return better, more reliable sites when working on projects or papers.
Less Time Wasted - Narrowing searches reduces the time spent searching when working on projects or papers.

NEXT WEEK – MS_EXCEL
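
As a recap of the narrowing techniques from this lecture, the sketch below composes a search string from an exact phrase, excluded words, a site and a file type. It is an illustrative helper only: build_query and search_url are names made up for this example, and the assumption that Google's results page reads the query from a parameter called q is not something the lecture itself states.

```python
from urllib.parse import urlencode


def build_query(keywords="", phrase=None, exclude=(), site=None, filetype=None):
    """Compose a search string from the operators covered in the lecture."""
    parts = []
    if keywords:
        parts.append(keywords)
    if phrase:
        parts.append(f'"{phrase}"')               # quotation marks: exact phrase
    parts.extend(f"-{word}" for word in exclude)  # minus: leave these words out
    if site:
        parts.append(f"site:{site}")              # limit to one site or domain
    if filetype:
        parts.append(f"filetype:{filetype}")      # limit to one document type
    return " ".join(parts)


def search_url(query):
    # Assumption: the results page takes the query in the "q" parameter.
    return "https://www.google.com/search?" + urlencode({"q": query})


q1 = build_query(phrase="social networking", exclude=["Twitter"])
q2 = build_query(site="www.ug.edu.gh", filetype="pdf")
print(q1)              # "social networking" -Twitter
print(q2)              # site:www.ug.edu.gh filetype:pdf
print(search_url(q2))
```

The same strings can simply be typed into the search box; building them in code is only useful when the same narrowed search has to be repeated many times.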
