Joshua Foster HIST 390 Blog

Sep 10

How does Google get its results so quickly?

  • A “spider” crawls through the web, following links from page to page
  • Pre-built index: Google compiles a massive index of every word it has found (numbers included), so nothing has to be scanned at search time
  • Pre-ranked pages: a “voting” system in which a link counts as a vote, and votes from higher-ranked webpages count for more
  • The page title, bold text, etc. are indicators of what is important on a webpage
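The pre-built index in the list above can be sketched in a few lines. This is only a toy illustration, not Google's actual code: the page names, the sample text, and the `build_index` function are all made up for the example.

```python
import re
from collections import defaultdict

def build_index(pages):
    """Map each word (or number) to the set of pages that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        # Lowercase the text and pull out runs of letters and digits.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index

# Hypothetical pages standing in for what a spider would collect.
pages = {
    "a.html": "History of the web in 1991",
    "b.html": "Web search history",
}
index = build_index(pages)

print(sorted(index["history"]))  # ['a.html', 'b.html']
print(sorted(index["1991"]))     # ['a.html']
```

Because the index is built ahead of time, answering a search is just a lookup of the word instead of a scan of every page.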

Earlier search engines put at the top whichever page mentioned the search term the most times.
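A rough sketch of that older count-the-mentions approach, assuming a made-up `rank_by_count` function and two invented pages. It also shows why the approach is easy to game: a page that just repeats the word wins.

```python
def rank_by_count(pages, term):
    """Order page names by how many times the term appears, most first."""
    term = term.lower()
    counts = {
        name: text.lower().split().count(term)
        for name, text in pages.items()
    }
    return sorted(counts, key=counts.get, reverse=True)

# Hypothetical pages: one real essay, one page stuffed with the keyword.
pages = {
    "essay.html": "A careful essay on Rome and its long history",
    "spam.html": "rome rome rome rome cheap tickets rome",
}

print(rank_by_count(pages, "Rome"))  # ['spam.html', 'essay.html']
```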

Ranking is always a problem when it comes to search engines: “What does it cover, what does it not?”

Each search engine uses its own ranking algorithm to decide which pages appear first.

“Web doesn’t work like a set of historical documents.”

“Pages are interrelated through the link.”

“Web is an interrelated set of votes.”
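The “votes” idea can be sketched as a small loop in which every page repeatedly passes its score along its links. This is a simplified version in the spirit of PageRank, not Google's real algorithm. The `link_votes` function, the three-page web, the 0.85 damping value, and the 20 rounds are all assumptions for illustration, and every page here must link to at least one other listed page.

```python
def link_votes(links, rounds=20, damping=0.85):
    """Score pages by the votes (links) they receive.

    links maps each page to the list of pages it links to.
    A vote from a high-scoring page is worth more than one from a low-scoring page.
    """
    pages = list(links)
    score = {p: 1.0 / len(pages) for p in pages}
    for _ in range(rounds):
        # Every page starts each round with a small base score.
        new = {p: (1 - damping) / len(pages) for p in pages}
        for page, targets in links.items():
            # A page splits its current score evenly among the pages it links to.
            for target in targets:
                new[target] += damping * score[page] / len(targets)
        score = new
    return score

# Hypothetical three-page web: both other pages link back to "home".
links = {
    "home": ["news", "about"],
    "news": ["home"],
    "about": ["home"],
}
scores = link_votes(links)
print(max(scores, key=scores.get))  # home
```

“home” comes out on top because it collects the votes of both other pages, while each of them only gets half of home's vote.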

Google has scanned 20,000,000 books. Only about 200,000,000 books are believed to exist in any language, so roughly 10% of all books are online.

Search engines ignore stop words like ‘The’, ‘A’, ‘An’, ‘Of’, ‘In’, ‘On’.
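Dropping stop words is easy to picture in code. A minimal sketch, using only the six stop words listed above (real engines use their own lists) and a made-up `strip_stop_words` function:

```python
# The stop words from the notes; a real search engine's list would differ.
STOP_WORDS = {"the", "a", "an", "of", "in", "on"}

def strip_stop_words(query):
    """Return the words of the query that are not stop words."""
    return [w for w in query.lower().split() if w not in STOP_WORDS]

print(strip_stop_words("The history of the web"))  # ['history', 'web']
```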

 

How can you be more specific when searching?

  • Quotation marks look for an exact phrase in a document
  • - (minus sign) in front of a word excludes it
  • ~ can sometimes mean synonyms
  • + forces a word to be included
  • Metadata (data about data, such as the publisher’s name or year of publication)
  • * is a wildcard (matches any word)
  • AND and OR combine terms (AND requires both, OR accepts either)
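To see how a few of these operators narrow a search, here is a toy matcher for quoted phrases, the minus sign, and +. It is only a sketch under simple assumptions: the `matches` function and the sample sentence are invented, phrases are matched as plain substrings, and ~, *, AND, and OR are left out.

```python
import re

def matches(query, text):
    """Return True if the text satisfies the query.

    Supports "exact phrases", -excluded words, and +required words.
    Plain words are treated as required too.
    """
    text = text.lower()
    words = set(re.findall(r"[a-z0-9]+", text))

    # Every quoted phrase must appear word for word.
    for phrase in re.findall(r'"([^"]+)"', query):
        if phrase.lower() not in text:
            return False

    # Handle what is left once the quoted phrases are removed.
    rest = re.sub(r'"[^"]+"', " ", query)
    for token in rest.lower().split():
        # Accept a typed hyphen or the en dash that blogs often substitute.
        excluded = token[0] in "-\u2013"
        word = token.lstrip("+-\u2013")
        if not word:
            continue
        if excluded and word in words:
            return False
        if not excluded and word not in words:
            return False
    return True

doc = "The Roman Empire fell in 476, long after the Roman Republic."
print(matches('"roman empire" -byzantine', doc))  # True
print(matches('"roman empire" -republic', doc))   # False
print(matches('+476 rome', doc))                  # False
```

The last query fails because “rome” never appears as its own word, even though “roman” does.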

“Literacy of the modern world…”
