I had a great experience last week that brought together my work with kids and with adults.
One of my favorite searcher hang-outs is Irina Shamaeva’s
Boolean Strings group on LinkedIn. Cutting-edge professionals
in recruiting and sourcing make heavy use of the special syntax of search
engines, and so tend to gather online and discuss strategies. Since they are
excited to talk all day about unexpected ways to tweak various search tools to
get what you want, I can think of few places I’d rather be!
By chance, a Boolean Strings user asked a great question on the same day
I’d dealt with it in a third-grade classroom: Why is Google returning results containing substitute search terms, related to the ones I entered…some of the time?
I teach that using a search engine
successfully depends on choosing, for your search query, exactly the words
that appear on the page you want. However, many adults (and kids) push
back, noting that Google automatically searches for more words than they type
in.
Take, for example, one child who was preparing
to write a paper for Model UN. He was searching [cape verde tb]. (Remember, the
[ ] replicate a search box into which you type your query, so this search is
simply entered into Google as cape verde
tb.) I noted that a lot of documents would spell out tuberculosis, so shouldn’t he do the same? He simply pointed at
his screen, where Google had returned results with either tb or tuberculosis, and showed
me that wherever either term appeared it was bolded, indicating Google
had used it as a search term.
The problem, we discovered on further
investigation, was that Google did
make the substitution of a related term for tb,
but it did not make an automatic
substitution for Cape Verde, which is
often referred to by its Portuguese name, Cabo
Verde. The same happened to the student researching the clothing worn by
the Egyptian god Geb: Google automatically substituted dress for clothing, but costume appeared unbolded in his results, and so we knew Google had not searched for it. Likewise, there was the
sourcer who got results for CNM when he searched [Certified Nurse Midwife], but
did not get PA when he searched [Physician Assistant]. (Thanks for the
inspiration, George!)
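If you would rather not eyeball the bolding, the same check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes you have a result snippet saved as HTML and that matched terms are wrapped in `<b>` tags (my assumption about the markup, not something Google documents). The names `BoldCollector`, `bolded_terms`, and `unmatched`, and the sample snippet, are made up for this example.

```python
from html.parser import HTMLParser


class BoldCollector(HTMLParser):
    """Collects words found inside <b>...</b> tags (assumed snippet markup)."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # how many <b> tags we are currently inside
        self.terms = set()  # lowercased bolded words seen so far

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "b" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            for word in data.split():
                # Drop surrounding punctuation so "dress," still counts as "dress".
                cleaned = word.strip(".,;:!?()\"'").lower()
                if cleaned:
                    self.terms.add(cleaned)


def bolded_terms(snippet_html):
    """Return the set of words that appear in bold in the snippet."""
    collector = BoldCollector()
    collector.feed(snippet_html)
    collector.close()
    return collector.terms


def unmatched(snippet_html, candidates):
    """Return the candidate terms that never appear in bold."""
    bold = bolded_terms(snippet_html)
    return [term for term in candidates if term.lower() not in bold]


# A made-up snippet, not a real Google result.
snippet = "Ancient Egyptian <b>clothing</b>: the <b>dress</b> and costume of the gods"
print(unmatched(snippet, ["clothing", "dress", "costume"]))
```

Any term this reports as unmatched is one you cannot assume Google searched for on your behalf.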
What is happening? Generally speaking, Google tracks every search done
by users (not who did them, just what they did). Some of the back-end programming notes when many, many people run the
same searches, and after a while it starts to subtly and automatically tweak your search terms. For example, after millions of people search for [tuberculosis]
and then search again for [tb], or even [tuberculosis OR tb], Google
"learns" that they mean the "same" thing, and will look for
them both. However, in the example above, while untold numbers of people may
have taught Google about the tb/tuberculosis connection, apparently there have
not yet been enough [cape OR cabo verde] searches to link those two ideas, so
Google is not able to compensate. The same programming is used, with the same inconsistencies, to fuel the "related terms" search (e.g., [chess ~glossary]).
A helpful feature, but inconsistent. Even dangerous to quality results
if you depend on it. So what is a user to do?
As annoying as it sounds, if you don’t know for a fact precisely what
substitutions Google is making on your behalf, you must use OR in your
searches. The student who already observed the connection between tb and tuberculosis can simply use [tb]. But for a term like clothing,
until that student knows what is being substituted, he should always try a string of search
terms like [clothing OR dress OR costume OR wearing OR wore].
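For anyone who builds these strings often, here is a minimal sketch in Python of assembling an OR query from lists of alternatives. The function name `or_query` is hypothetical, and the sketch assumes each alternative is a single word and that OR applies only to the words directly on either side of it, as in the [cape OR cabo verde] example above.

```python
def or_query(*groups):
    """Build a query string; each group is one term or a list of alternatives."""
    parts = []
    for group in groups:
        # A bare string is one fixed term; a list is a set of alternatives.
        alternatives = [group] if isinstance(group, str) else list(group)
        parts.append(" OR ".join(alternatives))
    return " ".join(parts)


print(or_query("geb", ["clothing", "dress", "costume", "wearing", "wore"]))
# geb clothing OR dress OR costume OR wearing OR wore

print(or_query(["cape", "cabo"], "verde", ["tb", "tuberculosis"]))
# cape OR cabo verde tb OR tuberculosis
```

The resulting string is what you would type into the search box. The point is simply that you write the alternatives out yourself instead of hoping they get added for you.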
My takeaway: Always include the terms you want to search; don’t depend
on Google to do it for you!
Your thoughts?
What a fascinating blog! I'm glad to be reading this. I've always loved learning by searching and since I work with a lot of teachers and students, I can see some great applications for this. I'll email you... Nancy R.
Posted by: Nancy Rossiter | 04/04/2010 at 07:38 AM