Transcript of Jim Davies's talk in March 2010 at TEDxCarletonU.
This is the script that I'd planned to say; it differs slightly from what I ended up saying.
[Importance of Imagination] What's imagination? Broadly speaking, when we use the word we are talking about one of two things. First, we mean creativity. This is what we refer to when we say "Julie Taymor has a great imagination" or when someone says "I have no imagination." Another sense of imagination is the act of picturing something in your mind, and this kind of imagination happens all the time. It happens when we plan, reason hypothetically, think about the past, think about how things might have been, think about the future, fantasize, design, dream when we're asleep, and daydream, which we spend about 20% of our lives doing. In other words, imagination is incredibly important. [stars start] In fact, researchers have found that visualizing yourself playing a sport makes you better at actually playing it.
[lab] My goal is to understand imagination. I want to get under the hood and figure out how it works. When you read a book, or hear a story, and you picture what is happening, where do those pictures come from? How does our mind know what stuff goes in and where to put it? So when I became a professor I set up the [logo animation] Science of Imagination Laboratory. According to my theory, what we imagine is constrained by three classes of things. First, the environment we are in when we imagine [current environment], including, for example, someone asking you to imagine something, or the book you're reading. Second, what we know about the world and how it works, what we call our model of the world [world understanding]. And finally, the history of our visual memories [visual memory], that is, everything we've ever seen. That's the part my work is focused on: how does our perceptual history affect our imagination? That might sound a little strange. How can you possibly use science to study the imagination? It seems to be so elusive and mysterious, one of the things least accessible to objective measure. Well, today I'm going to talk to you about how it's possible.
How people have done it, how I've done it, and what the future holds for a science of imagination.
[science] Here are some ways that imagination can be studied scientifically. Each of them has weaknesses, so we use all of them and see if their results converge. First, you can have people draw what they imagine. You might ask someone to imagine a triangle, and then draw what they imagine. We've done this in my lab and found that people tend to imagine triangles and other shapes with a flat side down. Our hypothesis is that this reflects stability in the real world. [show falling shapes] You might be thinking that maybe we only imagine shapes with a flat side down because we see them like that all the time in our environment. I think that might be right too. But it doesn't answer the question of why the shapes we see in the world also have this remarkable consistency. I hypothesize that we see shapes in our environment with a flat side down for the same reason we imagine them that way: because it represents stability in a world with gravity. Sometimes cultures are the way they are because of historical circumstance, but sometimes there are deeper reasons. Over at McGill, Gosselin and Schyns did a clever experiment where they got to see pretty directly what was in the imaginations of people. They had people in the lab look at random dot patterns, about twenty thousand of them. The participants were told that some of these had the letter 's' hidden in them, and others did not. Of course, none of them really did. It was all noise. But by averaging all of the patterns people said "yes" to (because if you look hard enough you'll see something; we call that top-down processing), you can see the ghostly image of the 's' that they thought they saw. [ghostly s] Each person even had a slightly different font! [show crisper 's'] Look at that! This is the most direct window into someone's actual imagination I've ever seen.
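To make the averaging step in the random-dot experiment concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the procedure from the original study: the function name `classification_image`, the four-pixel "images," and the simulated observer who says "yes" whenever pixels 0 and 2 happen to be brighter than pixels 1 and 3 are all hypothetical.

```python
import random


def classification_image(patterns, said_yes):
    """Average, pixel by pixel, the noise patterns that got a "yes"."""
    chosen = [p for p, yes in zip(patterns, said_yes) if yes]
    if not chosen:
        raise ValueError("no 'yes' responses to average")
    # zip(*chosen) groups the values of each pixel position together.
    return [sum(pixels) / len(chosen) for pixels in zip(*chosen)]


# Hypothetical simulated observer: expects pixels 0 and 2 to be bright,
# and answers "yes" when pure noise happens to look that way.
rng = random.Random(0)
patterns = [[rng.random() for _ in range(4)] for _ in range(5000)]
said_yes = [p[0] + p[2] > p[1] + p[3] + 0.5 for p in patterns]

image = classification_image(patterns, said_yes)
print([round(v, 2) for v in image])
```

Although every pattern is pure noise, the average of the "yes" patterns comes out brighter at the pixels the simulated observer was expecting, which is the same logic that lets the ghostly 's' emerge.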
There's also a field called embodied cognition that sheds light on how we imagine things. We imagine good motion as being left to right in our visual field. That's in our culture. In places where people write right to left, it's the opposite. And you'll notice, if you watch the Matrix movies, every time Neo gets into a fight, he's running left to right on the screen and his enemy is running at him from right to left. Also, good seems to have an up direction and bad a down direction, and this is reflected in our language: "I'm feeling down today." "Things are looking up." "She came out on top." Psychologists have found that verbs have direction. When given choices like these [show slide from Richardson and Spivey], people will choose the same direction for a verb pretty consistently: respect is up, giving is left to right, destruction is down. So when we imagine these activities, we are likely to imagine them happening in a certain direction. For example, when we picture someone giving her mother a gift, we would probably imagine the gift moving from left to right across our visual field.
[Visuo] Now, my lab runs psychological experiments, but we also [blank] try to simulate human imagination on computers. That is, we try to make a computer program imagine the same way the average person would. And one thing that we can do very easily is imagine things of different sizes. Can you imagine a tiny Cheshire cat? [cat's smile, followed by the whole cat] Sure you can, even though you've probably never seen anything in your life that someone has labeled as a "tiny Cheshire cat." One question we ask is this: how do you know exactly how tiny to make that cat? [blank] One of the students in my lab, Jonathan Gagne, created a program called Visuo that predicts imagined heights, sizes, anything that can be described with a number. It can imagine a tree, or a long tree, even if it has never experienced anything that was labeled "long tree." How can it do this?
[start crows flying] Well, let's suppose you've seen many crows. According to our theory, in your memory you store a distribution of their sizes. [all crows distribution] Here we have the distribution of all of the crows you've seen. [crows flying into distribution] But some of the crows are labeled as, say, "large." [crows flying through all crows into large and small] These large crows add to the crow distribution, but also add to a special distribution for "large crows." Let's say you've never seen a raven that was labeled as large. You've just seen a bunch of ravens. You've also seen buildings, ocelots, and improv groups. [show a few distributions along top] Now, you are called upon to imagine a "large raven." How do you know how big to make it? [question mark] According to Visuo's theory, you choose the thing in memory that is the most related to raven in meaning. In this case, crow is more related than building or ocelot. [other choices disappear] Then you find out how to transform the distribution of all crows into the distribution of large crows [indicate transformation], and then apply that transformation to the distribution of ravens. [indicate generation] What you end up with is an imagined distribution of large ravens, from which you can imagine one. [have a large raven fly out of the new large raven distribution, takes center of screen and flies] Here we have a program that imagines realistic sizes of things it's never seen before, another step toward computer creativity, and, hopefully, a step toward understanding creativity in human beings.
[Peekaboom mining] If imagination reflects what we've seen, [blank] we need a database of what people have seen. Well, we don't have that, exactly, but we have a proxy for it: we take images from the web and associate them with labels.
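Returning to the "large raven" example for a moment, the three steps (pick the most related concept, learn its transformation, apply it) can be sketched in Python. This is a toy illustration and not Visuo's actual implementation: the function `imagine_sizes`, the made-up sizes, the hand-written relatedness scores, and the choice to model the transformation as a simple ratio of means are all assumptions made for the example.

```python
from statistics import mean


def imagine_sizes(target, modifier, memory, relatedness):
    """Return imagined sizes for `modifier target`, e.g. "large raven"."""
    # Step 1: among concepts we HAVE seen with this modifier, pick the one
    # most related in meaning to the target.
    candidates = [c for c in memory if c != target and modifier in memory[c]]
    if not candidates:
        raise ValueError(f"nothing in memory was ever labeled {modifier!r}")
    donor = max(candidates, key=lambda c: relatedness.get((target, c), 0.0))
    # Step 2: learn the transformation from all donors to modified donors.
    # Assumption: a single scale factor, the ratio of the two means.
    scale = mean(memory[donor][modifier]) / mean(memory[donor]["all"])
    # Step 3: apply the same transformation to the target's distribution.
    return [size * scale for size in memory[target]["all"]]


# Made-up sizes (in centimeters) standing in for remembered examples.
memory = {
    "crow": {"all": [40, 45, 50, 55, 60], "large": [55, 60]},
    "building": {"all": [1000, 2000, 3000], "large": [3000]},
    "raven": {"all": [56, 60, 64, 68]},
}
# Hypothetical semantic relatedness scores.
relatedness = {("raven", "crow"): 0.9, ("raven", "building"): 0.05}

large_ravens = imagine_sizes("raven", "large", memory, relatedness)
print([round(s, 1) for s in large_ravens])
```

Because crow is the most related concept, the imagined large ravens are scaled by how much bigger large crows are than crows in general, not by how much bigger large buildings are than buildings.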
[image showing what peekaboom has] We have a database collected from a game called Peekaboom, invented by Luis von Ahn, the same guy who invented CAPTCHAs. This database has about fifty thousand images with labeled regions, so the computer can know not only what's in the image, but where it is. We are mining this for spatial relationships so the computer can automatically learn that roads are typically below cars, and that skies are typically above cars, all without anyone having to tell the computer these facts explicitly. It can just learn them for itself. With this we can create a model that can imagine for us. So if you ask it to imagine a large bird above a house, we can use the database to know what else is in the image, and where: not just the bird and the house, but other stuff as well. We might also assume the image will have ground, and grass, and sky.
[Oracle] By looking at what labels co-occur in images, we can predict what other things will be in the image and where they will be. For example, [blank] another student of mine, Cesar Astudillo, has created a website we call the image oracle. When we type "computer" into it, the system tells us which other objects should also be in the image, and how probable it would be to find each of them:
screen 0.279102
window 0.183731
man 0.129032
windows 0.102384
keyboard 0.0911641
monitor 0.0827489
laptop 0.0687237
woman 0.0603086
desk 0.056101
people 0.0434783
Not only that, but because we have the spatial information, we know where those objects are likely to be in relation to the others. [blank] So you might ask the computer program to imagine a big tree. Visuo will be able to determine how big the tree should be, we will know what other things should appear in the image with the tree, and we will know where they should go in the image. Then the system can go into the database and pull out those pixels to create an entirely new, imagined image.
Then it's just a matter of stitching them all together, which is a problem the graphics community has made great progress on. We're on our way to creatively generating new 2D images based on a small user input. And then we'll do it in 3D.
[computer vision] Now, one of the great applications of all of this is in computer vision. [blank] Computer vision is getting computer programs to see things in pictures and videos. It's got countless applications, from driving cars for us to recognizing disease in medical imaging. One kind of computer vision is object recognition, which is just trying to find out what objects are in a picture. So take a look at this picture. What do you think is missing? A sky? Or a monitor? What if I told you that the missing piece looked like the sky? You might suggest that maybe it's a picture of the sky on a monitor. The way object recognition works is that it passes a window over an image and tries to detect the objects in it. [pass over, detect face] What it doesn't do is use the other objects in the image to constrain its choices. [blank] But with knowledge of what kinds of objects tend to co-occur in an image, and their spatial relations to one another, we can help object recognition systems make better guesses about what's in the image. The image oracle I mentioned earlier can tell you how likely it is that two things appear in the same image. So if we type in "computer" and ask it for the probability of seeing the "sky," we can see that the probability is quite low: less than one percent. The system can suggest that it's probably a screen, which has a much higher probability:
screen 0.279102
window 0.183731
man 0.129032
windows 0.102384
keyboard 0.0911641
monitor 0.0827489
laptop 0.0687237
woman 0.0603086
desk 0.056101
people 0.0434783
[detectors] We've also built an understanding of spatial relationships into computer programs.
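Before the spatial examples, the co-occurrence lookup that favored "screen" over "sky" can be sketched in a few lines of Python. This is a simplified illustration, not the image oracle's actual code: the function `cooccurrence`, the six tiny labeled "images," and the estimate used (the fraction of images containing the query label that also contain the other label) are assumptions for the example.

```python
from collections import Counter


def cooccurrence(images, query):
    """Estimate P(label appears | query appears) from labeled images."""
    with_query = [labels for labels in images if query in labels]
    if not with_query:
        return {}
    # Count, for every other label, how many of those images contain it.
    counts = Counter(
        label for labels in with_query for label in set(labels) if label != query
    )
    return {label: n / len(with_query) for label, n in counts.items()}


# Made-up label sets standing in for a database of labeled images.
images = [
    {"computer", "screen", "keyboard"},
    {"computer", "screen", "desk"},
    {"computer", "man"},
    {"computer", "screen", "window"},
    {"sky", "tree", "grass"},
    {"sky", "house"},
]

probs = cooccurrence(images, "computer")
# An ambiguous patch could be "sky" or "screen"; context breaks the tie.
best = max(["sky", "screen"], key=lambda label: probs.get(label, 0.0))
print(best, probs.get("screen", 0.0), probs.get("sky", 0.0))
```

In this toy database "sky" never appears alongside "computer," so a detector that is torn between the two labels would be nudged toward "screen."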
If you ask our system to give an image of a cat over a tree, or a car close to a sign, the system can return images with those relations in them. So when we ask the program to give us an image with a cat below a tree, it returns this: [show picture with caption "cat below tree 0.99"] And a hand occluding a book returns this: [show picture with caption "hand is-occluding book 0.75"] Here's an image returned with a spoon close to a fork: [show picture with caption "spoon close-to fork 1.0"]
[creativity] <-show Sometimes, when I talk about this stuff, people say that they don't think of imagining a cat and a tree as real imagination. It's not creative. They are getting back to the first definition of imagination I talked about at the beginning. Well, even if you think of some of the most creative things you've ever seen, you'll notice that they still have a pretty firm basis in reality. There is some kind of physics, and often characters that act like people. In fact, created worlds often differ from the real world in only a couple of interesting ways. What this means is that making a computer program that imagines creative new worlds requires a thorough understanding of what the real world is like. From there, we can teach it to tweak that world in certain ways to get the kind of imaginative things we find in movies and other arts. You have to know what's real to create compelling fantasy.
[individual differences] <-show People often wonder why different people imagine things differently. We're obsessed with how people differ from one another. But again, we take for granted the similarities we all have in our imaginations. When I describe a scene in a kitchen, most people imagine the kitchen containing the same kinds of things: sinks, a fridge. It's only in particularly creative contexts that people put unusual things in there, like when trying to imagine what an alien kitchen would look like.
And even then, we need to understand how the average person imagines things before we can make the system be really creative.
[conclusion] <-show To conclude, imagination is all around us. We use imagination all the time, and there are scientific tools out there for understanding it, [stars] and even automating it. This can help us do automatic illustration, design sets for film, and create diagrams and intuitive visualizations for education and scientific publications. In the far future, perhaps these systems will create entire movies for us. Call me back for a TED talk in 20 years, and I'll let you know how that progress is going. And what will future computer systems come up with? [logo] Right now, we can only imagine. Thank you.