
The Science of Imagination

Transcript of Jim Davies's talk in March 2010 at TEDxCarletonU.

This is the script that I'd planned to say; it differs slightly from what I ended up saying.


[Importance of Imagination]
What's imagination?
    Broadly speaking, when we use the word we are talking about one of two things:
        First, we mean creativity.
            This is what we refer to when we say 
                "Julie Taymor has a great imagination"
            or when someone says 
                "I have no imagination."
        Another sense of imagination is 
            the act of picturing something in your mind. 
        And this kind of imagination happens all the time. 
            It happens when we 
                reason hypothetically,
                think about the past,
                think about how things might have been,
                think about the future,
                dream when we're asleep,
                and daydream, 
                    which we spend about 20% of our lives doing.
        In other words,
            imagination is incredibly important. [stars start]
        In fact, they've found that visualizing doing sports 
            makes you better at actually doing that sport.


My goal is to understand imagination. 
    I want to get under the hood
            and figure out how it works.
    When you read a book, or hear a story,
        and you picture what is happening,
        where do those pictures come from?
        How does our mind know what stuff goes in 
            and where to put it?
So when I became a professor I set up the     [logo animation]
    Science of Imagination Laboratory.
According to my theory, 
    what we imagine is constrained by three classes of things:
        First, the environment we are in when we imagine [current environment],
            including, for example, someone asking you to imagine something. 
                or the book you're reading
        second, what we know about the world and how it works,
            what we call our model of the world [world understanding]
        and finally, the history of our visual memories. [visual memory]
            that is, everything we've ever seen. 
                That's the part my work is focused on. 
                    How does our perceptual history affect our imagination?
Which might sound a little strange. 
    How can you possibly use science to study the imagination?
    It seems to be so elusive and mysterious,
        one of the things least accessible to objective measure. 
Well today I'm going to talk to you about how it's possible. 
    How people have done it, 
    How I've done it, 
    And what the future holds for a science of imagination.


Here are some ways that imagination can be studied scientifically.
        Each of them has weaknesses, 
            so we use all of them and 
                see if their results converge.
    First, you can have people draw what they imagine.
        You might ask someone to imagine a triangle,
            and then draw what they imagine. 
                We've done this in my lab and found that 
                    people tend to imagine triangles and other shapes with a flat side down,
                        Our hypothesis is that this reflects the stability in the real world. [show falling shapes]
                You might be thinking that
                    maybe we only imagine shapes with a flat side down
                        because we see them like that all the time 
                            in our environment. 
                    I think that might be right too. 
                        But it doesn't answer the question of why
                            the shapes we see in the world 
                                also have this remarkable consistency.
                        I hypothesize that we see shapes in our environment
                            with a flat side down 
                                for the same reason
                            we imagine them that way:
                                because it represents stability in a world with gravity. 
                        Sometimes cultures are the way they are 
                            because of historical circumstance,
                                but sometimes there are deeper reasons. 
    Over at McGill,
        Gosselin and Schyns did a clever experiment where they got to see 
            pretty directly 
        what was in the imaginations of people.
            They had people in the lab look at random dot patterns,
                    about twenty thousand of them.
                They were told that some of these had the letter 's' hidden in them, 
                    and others did not. 
                        of course, none of them really did. 
                            It was all noise.      
        but by averaging all of the ones people said "yes" to
            (because if you look hard enough you'll see something-- we call that top-down processing)
        You can see the ghostly image of the 's' that they thought they saw. [ghostly s]
            Each person even had a slightly different font! [show crisper 's']
                Look at that! 
                    This is the most direct window 
                        into someone's actual imagination 
                            I've ever seen. 
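
The averaging step in that experiment can be sketched in a few lines. This is a minimal illustration, not the researchers' actual code: the four-pixel "images", the simulated viewer, and the name classification_image are all assumptions made for the example.

```python
import random

def classification_image(noise_images, said_yes):
    """Average, pixel by pixel, only the noise images that got a "yes"."""
    chosen = [img for img, yes in zip(noise_images, said_yes) if yes]
    if not chosen:
        raise ValueError("no 'yes' responses to average")
    return [sum(pixels) / len(chosen) for pixels in zip(*chosen)]

# Hypothetical simulated viewer: four-pixel "images" of pure noise, and a
# viewer who says "yes" whenever pixels 0 and 2 happen to be bright.
rng = random.Random(0)
images = [[rng.random() for _ in range(4)] for _ in range(20000)]
answers = [img[0] + img[2] > 1.2 for img in images]

ghost = classification_image(images, answers)
print([round(p, 2) for p in ghost])  # pixels 0 and 2 come out brighter than 1 and 3
```

No single image contains the pattern, but the average of the "yes" images does, because the viewer's expectation decides which noise gets selected.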

There's also a field called Embodied Cognition
    That sheds light on how we imagine things.
    We imagine good motion as being left to right 
        in our visual field.
     That's in our culture.
        In places where people write right to left,
            it's the opposite.
        and you'll notice, if you watch the Matrix movies, 
            every time Neo gets into a fight,
                he's running left to right on the screen
                and his enemy is running at him 
                    from right to left. 
    Good seems to have an up direction,
        bad is down,
            and this is reflected in our language.
        I'm feeling down today, 
            things are looking up. 
            She came out on top. 
    Psychologists have found that verbs have direction-- 
        When given choices like these [show slide from Richardson and Spivey]
            People will choose the same direction for a verb
                pretty consistently 
        respect is up, 
        giving is left to right
        destruction is down.
    So when we imagine these activities, 
        we are likely to imagine them happening in a certain direction.
        So, for example, when we picture someone
            giving her mother a gift, 
                we would probably imagine the gift moving from left to right,
                    across our visual field. 


    Now, my lab runs psychological experiments, but we also [blank]
        try to simulate human imagination on computers. 
            That is, we try to make a computer program to imagine
                the same way the average person would. 
    And one thing that we can do very easily is imagine things of different sizes. 
    Can you imagine a tiny Cheshire cat?      [cat's smile, followed by the whole cat]
        Sure you can, even though you've probably never seen anything in your life that someone has labeled 
                as a "tiny Cheshire cat."
    One question we ask is this: how do you know 
        exactly how tiny 
            to make that cat? [blank]
    One of the students in my lab, 
        Jonathan Gagne, created a program called Visuo that predicts
            heights, sizes, 
                anything that can be described with a number.
        It can imagine a tree, or a long tree, 
            even if it has never experienced anything 
                that was labeled "long tree."
    How can it do this? [start crows flying]
    Well, let's suppose you've seen many crows, 
        According to our theory, in your memory
            you store a distribution of the sizes of each one. [all crows distribution]
                Here we have the distribution 
                        of all of the crows you've seen [crows flying into distribution]
        But some of the crows are labeled as, say, "large."  [crows flying through all crows into large and small]
            These large crows add to the crow distribution, 
                but also add to a special distribution for "large crows."
        Let's say you've never seen a raven that was labeled as large.
            You've just seen a bunch of ravens
        You've also seen buildings, ocelots, and improv groups.    [show a few distributions along top]
        Now, you are called upon to imagine a "large raven." 
            How do you know how big to make it? [question mark]*
        According to Visuo's theory,
            you choose the thing in memory that is the most related to raven
                    in meaning.
                in this case, crow is more related than building or ocelot.    [other choices disappear]
            Then you find out how to transform the distribution of all crows 
                to the distribution of large crows      [indicate transformation]
            then apply it to the distribution of "ravens."      [indicate generation]
            What you end up with is an imagined distribution of large ravens,
                from which you can imagine one. [have a large raven fly out of the new large raven distribution, takes center of screen and flies]
            Here we have a program that imagines realistic sizes of 
                things it's never seen before,
                    another step toward computer creativity,    
                        and, hopefully, a step toward understanding it in human beings. 
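
The crow-to-raven transfer described above can be sketched as follows. This is not Visuo's actual code: it assumes each distribution can be summarized by a mean and a spread, that the "large" transformation is a pair of ratios, and that relatedness in meaning is available as a score. All sizes, scores, and function names are made up for illustration.

```python
import random
from statistics import mean, stdev

# Made-up remembered sizes in centimetres; every number here is illustrative.
memory = {
    "crow": {"all": [40, 42, 45, 47, 50, 52, 55], "large": [50, 52, 55]},
    "building": {"all": [800, 1200, 3000, 5000], "large": [3000, 5000]},
    "ocelot": {"all": [70, 80, 90, 100]},
    "raven": {"all": [56, 60, 63, 66, 70]},  # never seen one labeled "large"
}
# Hypothetical relatedness-in-meaning scores for "raven".
relatedness_to_raven = {"crow": 0.9, "ocelot": 0.2, "building": 0.05}

def learn_transform(all_sizes, labeled_sizes):
    """How a label shifts a known category: ratio of means, ratio of spreads."""
    return (mean(labeled_sizes) / mean(all_sizes),
            stdev(labeled_sizes) / stdev(all_sizes))

def imagine(target, label, memory, relatedness):
    # Pick the most related concept that has actually been seen with this label.
    candidates = [c for c in relatedness if label in memory[c]]
    source = max(candidates, key=relatedness.get)
    mean_ratio, spread_ratio = learn_transform(memory[source]["all"],
                                               memory[source][label])
    target_sizes = memory[target]["all"]
    return source, mean(target_sizes) * mean_ratio, stdev(target_sizes) * spread_ratio

source, big_mean, big_sd = imagine("raven", "large", memory, relatedness_to_raven)
# Draw one imagined large raven from the new distribution (assumed Gaussian).
one_large_raven = random.Random(1).gauss(big_mean, big_sd)
print(source, round(big_mean, 1))
```

The point of the design is that nothing about "large raven" is stored: the transformation is learned from crows and reused on ravens.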
    [Peekaboom mining]

    If imagination reflects what we've seen, [blank] 
        we need a database of what people have seen.
           Well, we don't have that, exactly, but
             we have a proxy for that: 
                we take images from the web, 
                    and associate them with labels.        [image showing what peekaboom has]
        We have a database collected from a game called Peekaboom,
                invented by Luis von Ahn, 
                    the same guy who invented CAPTCHAs,
            and this database has about fifty thousand images, 
                with labeled regions. 
                So the computer can know not only what's in the image, 
                    but where it is.
     we are mining this for spatial relationships
            so the computer can automatically learn that 
                roads are typically below cars, and that 
                skies are typically above cars, 
                    all without having to tell the computer these facts explicitly.
                        It can just learn them for itself. 
       With this we can create a model 
            that can imagine for us.
             so if you ask it to imagine a large bird above a house, 
                We can use the database to know what else is in the image, and where.
                not just the bird and the house, but other stuff as well. 
                   we might also assume the image will have ground, and grass, and sky.
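
Mining "above" and "below" from labeled regions can be sketched like this. The region format (label, top, bottom), the toy images, and the rule of comparing vertical centers are assumptions for the example, not the lab's actual pipeline or the Peekaboom data format.

```python
from collections import Counter
from itertools import permutations

# Hypothetical labeled regions: (label, top_y, bottom_y), with y growing downward.
images = [
    [("sky", 0, 40), ("car", 50, 70), ("road", 65, 100)],
    [("sky", 0, 30), ("car", 45, 60), ("road", 55, 100)],
    [("car", 20, 50), ("road", 45, 100)],
]

def center_y(region):
    _, top, bottom = region
    return (top + bottom) / 2

def mine_above(images):
    """For each ordered label pair, the fraction of co-occurrences where the first is above."""
    above, together = Counter(), Counter()
    for regions in images:
        for a, b in permutations(regions, 2):
            together[(a[0], b[0])] += 1
            if center_y(a) < center_y(b):
                above[(a[0], b[0])] += 1
    return {pair: above[pair] / n for pair, n in together.items()}

p_above = mine_above(images)
print(p_above[("sky", "car")], p_above[("road", "car")])
```

Nobody tells the program that skies go above cars; the regularity falls out of the counts.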


    By looking at what labels co-occur in images, 
         we can predict what other things will be in the image and where they will be. 
    For example, [blank]
        another student of mine, Cesar Astudillo, 
        has created a website we call the image oracle
        we type  "computer" into it, 
        the system tells us there should also be the following objects, 
            and how probable it would be to find them.
screen 0.279102 
window 0.183731 
man 0.129032 
windows 0.102384 
keyboard 0.0911641 
monitor 0.0827489 
laptop 0.0687237 
woman 0.0603086 
desk 0.056101 
people 0.0434783
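
A list like this can be produced by counting labels that share an image with the query. The sketch below is an assumption about the general approach, since the exact formula behind the image oracle's numbers isn't stated here; the label sets are invented.

```python
from collections import Counter

# Hypothetical label sets, one per image.
image_labels = [
    {"computer", "screen", "desk"},
    {"computer", "screen"},
    {"computer", "keyboard"},
    {"computer", "screen", "keyboard"},
    {"tree", "sky"},
]

def cooccurrence(query, image_labels):
    """Fraction of images containing `query` that also contain each other label."""
    with_query = [labels for labels in image_labels if query in labels]
    if not with_query:
        return {}
    counts = Counter(l for labels in with_query for l in labels if l != query)
    return {label: n / len(with_query) for label, n in counts.most_common()}

probs = cooccurrence("computer", image_labels)
for label, p in probs.items():
    print(label, p)
```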

    Not only that, 
        but because we have the spatial information,
        we know where those objects are likely to be in relation to the others. [blank]

    So you might ask the computer program to imagine a big tree.
        Visuo will be able to determine how big the tree should be, 
            we will know what other things should appear in the image with the tree,
             we will know where they should go in the image,
            Then the system can go into the database and pull out those pixels 
                to create an entirely new, imagined image.
        and then it's just a matter of stitching them all together,
            which is a problem 
                the graphics community has made great progress on. 
    We're on our way to creatively generating new 2D images based on a small user input.

    And then we'll do it in 3D.
    [computer vision]

    now, one of the great applications of all of this is in computer vision. [blank]
    computer vision is getting computer programs to see things in pictures and videos.
    It's got countless applications, 
            from driving cars for us
                to recognizing disease in medical imaging.
        One kind of computer vision is object recognition,
                which is just trying to find out what objects are in a picture. 
        So take a look at this picture. What do you think is missing?
            A sky? 
                    Or a monitor? 
                        What if I told you that the missing piece looked like the sky?
                            you might suggest that maybe it's a picture of the sky on a monitor. 
        The way object recognition works is that 
            it passes a window over an image 
                and tries to detect the objects in it. [pass over, detect face]
            What it doesn't do is use 
                the other objects in the image to constrain its choices. [blank]
        But with knowledge of what kinds of objects tend to co-occur in an image, 
            and their spatial relations to one another, 
            we can help object recognition systems make better guesses about what's in the image. 
        The image oracle I mentioned earlier
            can tell you how likely it is that two things
                appear in the same image.
        So if we type in "computer,"
            and ask it for the probability of seeing the "sky,"
            we can see that the probability is quite low: less than one percent.
        The system can suggest that it's probably a screen,
            which has a much higher probability 

screen 0.279102 
window 0.183731 
man 0.129032 
windows 0.102384 
keyboard 0.0911641 
monitor 0.0827489 
laptop 0.0687237 
woman 0.0603086 
desk 0.056101 
people 0.0434783
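
One way to let context constrain a recognizer is to weight each candidate label by how likely it is to appear alongside what has already been found. This is a sketch of that idea, not the lab's system: the detector scores are invented, the "sky" value is a placeholder standing in for "less than one percent", and rerank is a hypothetical name.

```python
def rerank(detector_scores, context_prior, floor=0.001):
    """Multiply each detector score by its context probability, then normalize."""
    weighted = {label: score * context_prior.get(label, floor)
                for label, score in detector_scores.items()}
    total = sum(weighted.values())
    return {label: w / total for label, w in weighted.items()}

# Looking only at the pixels, the detector slightly prefers "sky" (made-up scores).
detector_scores = {"sky": 0.6, "screen": 0.4}
# Co-occurrence with "computer": screen from the list above, sky a placeholder under 1%.
given_computer = {"screen": 0.279102, "sky": 0.005}

posterior = rerank(detector_scores, given_computer)
best = max(posterior, key=posterior.get)
print(best)
```

Even though the patch looks more like sky, knowing there is a computer in the picture tips the guess toward a screen.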


    We've also built an understanding of spatial relationships into computer programs.
        If you ask our system to give an image of a cat over a tree,
            or a car close to a sign, 
                the system can return images with those relations in them.
                    So when we ask the program to give us an image with 
                        a cat below a tree, it returns this
                            [show picture with caption "cat below tree 0.99"] 
                    and hand occluding book returns
                            [show picture with caption "hand is-occluding book 0.75"]
                    here's an image returned with a spoon close to a fork:
                            [show picture with caption "spoon close-to fork 1.0"]
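
The scores in those captions suggest that relations are graded rather than all-or-nothing. A minimal sketch of a graded "close-to" search follows, assuming bounding boxes as (x0, y0, x1, y1) and a score that falls off with distance between centers. The formula, file names, and boxes are all hypothetical.

```python
import math

def center(box):
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)

def close_to(box_a, box_b, image_size):
    """1.0 when the centers coincide, falling to 0.0 at one image diagonal apart."""
    (ax, ay), (bx, by) = center(box_a), center(box_b)
    diagonal = math.hypot(*image_size)
    return max(0.0, 1.0 - math.hypot(ax - bx, ay - by) / diagonal)

def search(images, label_a, relation, label_b):
    """Score every image containing both labels, best match first."""
    results = [(relation(boxes[label_a], boxes[label_b], size), name)
               for name, size, boxes in images
               if label_a in boxes and label_b in boxes]
    return sorted(results, reverse=True)

# Hypothetical database entries: (file name, (width, height), {label: box}).
images = [
    ("kitchen.jpg", (300, 400), {"spoon": (0, 0, 20, 20), "fork": (280, 380, 300, 400)}),
    ("table.jpg", (300, 400), {"spoon": (100, 100, 120, 200), "fork": (140, 100, 160, 200)}),
    ("garden.jpg", (300, 400), {"cat": (10, 300, 60, 350), "tree": (0, 0, 100, 250)}),
]

results = search(images, "spoon", close_to, "fork")
print(results[0][1], round(results[0][0], 2))
```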

    [creativity] <-show
    Sometimes, when I talk about this stuff,
        people say that imagining a cat and a tree isn't real imagination.
        it's not creative.
        they are getting back to the first definition of imagination
            I talked about at the beginning. 
        Well, even if you think of some of the most creative things you've ever seen,
            you'll notice that they still have a pretty firm basis in reality.
        There is some kind of physics, 
            often characters that act like people.
        In fact, created worlds often only differ from the real world 
            in a couple of interesting ways.
        What this means is that making a computer program
            that imagines creative new worlds
            requires a thorough understanding of what the real world is like.
        From there,
            we can teach it to tweak it in certain ways to get 
                the kind of imaginative things we find in movies and other arts.
        You have to know what's real 
            to create compelling fantasy.

    [individual differences] <-show
    People often wonder about why different people imagine things differently.
    We're obsessed with how people differ from one another. 
    But again, 
        we take for granted the similarities we all have 
            in our imaginations.
    When I describe a scene in a kitchen, 
        most people imagine the kitchen 
            containing the same kinds of things.
                sinks, a fridge.
    It's only in particularly creative contexts
        that people put unusual things in there.
    like trying to imagine what an alien kitchen would look like. 
    And even then, 
        we need to understand how the average person imagines things
            before we can make the system be really creative. 

    [conclusion] <-show

    To conclude, 
        imagination is all around us;
            we use imagination all the time
        and there are scientific tools out there for understanding it, [stars]
            and even automating it. 
        This can help us 
            do automatic illustration,
            design sets for film,
                create diagrams and intuitive visualizations
                    for education and scientific publications.
            In the far future, 
                perhaps these systems will create entire movies for us. 
                    Call me back for a TED talk in 20 years, 
                        I'll let you know how that progress is doing.

        And what will future computer systems come up with? [logo]
            right now, 
                we can only imagine. 
                thank you. 

Jim Davies ( jim@jimdavies.org )