understanding_search_engines

**This page is about probing understanding of search engines**
In my experience, students seem to have some rather bizarre ideas about what a search engine is. Whilst there is, at a technical level, some 'absolute fact' as to what search engines are and how they work, I'd prefer to take the view that learners have good reason to believe what they say. In other words, they have constructed some knowledge, based on experience in life or in the classroom, that leads them to think in certain ways about search engines (for example). It is therefore interesting to probe what, in fact, they do believe.

This probe started way back in 2000 when I wrote down some of the seemingly bizarre ideas students told me about search engines in a written test. The test followed a range of classroom activities, including a worksheet on exactly how a search engine worked. I had reason to believe that students had not been very conscientious and probably hadn't fully completed the worksheet or studied very much for the test. But rather than writing "I don't know", they wrote down some answers which I interpreted as being a reasonable expression of their knowledge (and possibly unaffected by what I had tried to teach them!): ideas such as "they search the world to find something for you".

I had heard about "primed clinical interviewing" in a paper presented by Prof. Peter Fensham the year before (1999). The context for that paper was the science concepts held by teachers, but I'd wondered then about probing understandings of computer concepts. In the ‘primed clinical interview’ of Fensham & Lui (1999), there are two groups of questions. First of all, there are questions which orientate the student to their ideas about the topic (so, drawing on the test responses that I'd already received, there would be no shortage of material for orientating students to what they already knew). Secondly, students are asked a second group of questions which do not ask for their knowledge of the topic to be described directly, but in terms of analogies; none of the analogies is a precise representation of the subject, but each has at least a grain of truth in it.

It wasn't feasible to try out a probe with the same group of students, so I left the idea for a couple of years, and then set a class of students to work on responses to the following, in groups.


 * **To what extent do you believe that the following are true statements about internet search engines** (orientating questions):
 * They search for data on every page of the Internet
 * They locate key words by searching the world
 * They are just like a library
 * They search a certain part of the Internet for you
 * A search engine searches the pages which are connected to it


 * **Write a few sentences to describe why a search engine might be like** (the probe):
 * a library
 * a library catalogue
 * a librarian
 * a game of Chinese whispers
 * the index page to a book
 * someone who is a speed reader


 * **Can you think of any other analogies for search engines?** (more probing)


 * **Which do you think is the best analogy? Why?** (more probing)

What I learnt from this was that group work didn't really work. Firstly, students tended to defer to the member of the group who had the best reputation for being knowledgeable about computing. Secondly, it didn't present any other interesting analogies.

Now that this second cohort of students has passed, I'd like to try the approach again. I'm a bit concerned that students might not know very much about what a librarian does or what the game of Chinese whispers involves, so the analogies mightn't work. So part of the reason for the third iteration is to gain some insight into this, and to rustle up some more analogies.

Maybe by the fifth or sixth iteration there'll be a tool which might be quite suitable for probing understanding of search engines!

Paul, I suggest you watch the Harvard Computer Science Extension School podcast.

With search engines, I would again take a step back and not think of the internet and/or computers. Let's look at how people search for things. For example, if we were given a pack of jelly beans, how would we look for the grape flavour? We all know the characteristics of a grape: it has either a purple or green skin, and it tastes different from, say, licorice or an apple. With these characteristics in mind, we skim through the pack looking for the purple jelly beans, and then we enjoy the outcome of our 'hunt for the grape flavoured jelly bean'.

Now let's take it forward. We have all heard the famous Jack and Jill story.

Jack and Jill went up a hill to get a pail of water. Jack fell down and broke his crown and Jill came tumbling after.

Pretty simple, isn't it? Let's imagine we are looking for the word 'broke' and we would like to display the sentence which contains that word. We know we have 2 sentences, so what do we do? We go through each word, starting from the 1st sentence, until we 'find' broke. We meet the word 'broke' partway through sentence 2, and therefore we know that the word we are looking for is in sentence 2.
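The word-by-word hunt described above could be tried out with a small program. This is only a minimal sketch in Python; the function name `find_sentence` and the choice to strip full stops are my own assumptions for illustration, not anything from a real search engine.

```python
# The two sentences of the rhyme, numbered from 1 as in the text.
sentences = [
    "Jack and Jill went up a hill to get a pail of water.",
    "Jack fell down and broke his crown and Jill came tumbling after.",
]


def find_sentence(word, sentences):
    """Return (sentence number, word position) of the first match, or None.

    Hypothetical helper: it checks every word in order, one at a time.
    """
    for sentence_number, sentence in enumerate(sentences, start=1):
        # Strip the full stop so that "water." compares as "water".
        words = sentence.lower().replace(".", "").split()
        for position, candidate in enumerate(words, start=1):
            if candidate == word.lower():
                return sentence_number, position
    return None


print(find_sentence("broke", sentences))  # (2, 5): sentence 2, 5th word
print(find_sentence("grape", sentences))  # None: every word was checked
```

Students could predict how many words get compared before 'broke' turns up, and then what happens when the word isn't there at all.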

But that is tedious. What if you had 100 sentences and were looking for just one specific word? The answer will be in the next posting. : )

To resolve the Jack and Jill example, wouldn't you have to show students a little program that loops through and finds the word broke, so they can experience the thing working?

The best Learning in Science project examples included little experiments the students could carry out to test various hypotheses, eg. that electric current requires a circuit. It's easy to get students to predict and test out a variety of situations in this way.

I have tried shock tactics, asking: How can Google search 25 billion plus web pages in less than a second? The speed of search is an important consideration. How do they achieve this? I think it's important to add the speed of the process to the probe. That would tend to sideline the Chinese whispers and speed reader options (?)

Here is an article by Matt Cutts, [|How does Google collect and rank results?], which includes a couple of exercises for students.

Here is my summary of the process:
 * 1) Spiders crawl the web and retrieve pages
 * 2) Pages are numbered for future reference
 * 3) An index is built
 * 4) The index is distributed over hundreds or thousands of computers
 * 5) When a search is made it is processed by hundreds of computers to speed things up
 * 6) Results are ranked by relevance
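Steps 2, 3 and 6 of the summary above can be tried on a toy scale. The sketch below is in Python and rests on several assumptions of mine: the three "pages" are made up, the names `build_index` and `search` are hypothetical, and counting how often a word occurs is only a crude stand-in for relevance, not how Google actually ranks.

```python
from collections import defaultdict

# Hypothetical "crawled" pages; step 2 numbers them for later reference.
pages = {
    1: "Jack and Jill went up a hill to get a pail of water",
    2: "Jack fell down and broke his crown",
    3: "Jill came tumbling after Jack and Jack laughed",
}


def build_index(pages):
    """Step 3: map each word to {page number: times the word occurs there}."""
    index = defaultdict(dict)
    for page_number, text in pages.items():
        for word in text.lower().split():
            index[word][page_number] = index[word].get(page_number, 0) + 1
    return index


def search(index, word):
    """Step 6: look the word up, then rank pages by occurrence count."""
    hits = index.get(word.lower(), {})
    # Most occurrences first; ties are broken by page number.
    return sorted(hits, key=lambda page: (-hits[page], page))


index = build_index(pages)
print(search(index, "jack"))   # [3, 1, 2]: page 3 mentions Jack twice
print(search(index, "broke"))  # [2]
```

The point for students is that the pages are read once, when the index is built; a search afterwards is a single lookup rather than a fresh read of every page.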

One good metaphor from the Matt Cutts article: to speed up the process of searching a book index, you could rip out the index and give one page to each person to search (Bill 2 Sept)
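The ripped-out index metaphor can also be sketched in Python. Everything here is assumed for illustration: the three shards and their page numbers are invented, `search_shard` and `search_all` are hypothetical names, and Python threads only stand in for the separate computers (they will not make this tiny example faster).

```python
from concurrent.futures import ThreadPoolExecutor

# Each "person" holds one torn-out piece of the index: word -> page numbers.
shards = [
    {"jack": [1, 2], "hill": [1]},
    {"jack": [7], "crown": [8]},
    {"water": [12], "jack": [15]},
]


def search_shard(shard, word):
    """One person checks only their own piece of the index."""
    return shard.get(word, [])


def search_all(shards, word):
    """Ask everyone at the same time, then merge their answers."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partial_results = pool.map(lambda shard: search_shard(shard, word), shards)
        return sorted(page for result in partial_results for page in result)


print(search_all(shards, "jack"))  # [1, 2, 7, 15]
```

Each person only has a short piece to check, so the whole search takes about as long as the slowest person plus the time to gather the answers.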