Lytro's fancy and focus-free camera explained

At The Economist, Glenn Fleishman reports on Lytro's first-to-market implementation of computational photography. The result: you can refocus the shot after taking it.
A novel approach to photographic imaging is making its way into cameras and smartphones. Computational photography, a subdiscipline of computer graphics, conjures up images rather than simply capturing them. More computer animation than pinhole camera, in other words, though using real light refracted through a lens rather than the virtual sort. The basic premise is to use multiple exposures, and even multiple lenses, to capture information from which photographs may be derived. These data contain a raft of potential pictures which software then converts into what, at first blush, looks like a conventional photo.
I still don't quite get the talk about ray tracing. The part that makes sense to me, however, seems to explain it all: the camera has a wide-open aperture and an infinite depth of field on the main optics, but a bubble-wrap-like plane of different lenses in front of the sensor, which thereby ends up capturing a fly-eye myriad of differently focused fragments of the same scene. The software assembles a final composite depending on which of these you later focus on in post. It improves on established focus-stacking techniques because every image is taken simultaneously as a single exposure, at the cost of dividing the sensor's megapixelage among them. Something like that, anyway. I'm going to play Minecraft. Previously: Lytro promises focus-free shooting
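To make that resolution trade-off concrete, here's a back-of-the-envelope sketch in Python. The sensor and microlens numbers are illustrative guesses, not Lytro's actual specs:

```python
def plenoptic_resolution(sensor_pixels, samples_per_microlens):
    """Effective output pixels = sensor pixels / angular samples per microlens."""
    return sensor_pixels // samples_per_microlens

# e.g. an 11-megapixel sensor behind microlenses that each cover a
# 10x10 patch of pixels (100 angular samples per lens) yields only
# ~0.11 MP per final image -- the megapixelage is spent on direction.
print(plenoptic_resolution(11_000_000, 10 * 10))  # -> 110000
```

The division is the whole story: every directional sample you keep is a spatial pixel you give up.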


  1. What I’m wondering is: if the main reason you want a fast lens is the narrow depth of field at wide apertures, would this technology make that a non-issue?

    1. I’m thinking no, because this would still let you select _which_ area you want to selectively focus.

      Also, the other major benefit of wide aperture is being able to get usable shots in low light.

      A win-win, if they can turn this into an actual usable product.

    2. It might, but I think anyone sophisticated enough to think about those tradeoffs would probably find the Lytro inadequate to their needs. The first version will be low res, and very snapshot oriented. I love the notion that you simply don’t have to adjust anything. This could be the Flip of still cameras.

  2. Their demo only shows two really obvious focus points. Background and foreground. Nothing about having it all in focus. (Landscape photography, for example)

    1. Here’s the magic of reporting! I spoke to the company at length, read an enormous amount of background information, consulted some photographers, talked to one of the professors who founded this particular field of research.

      The Web site demo shows selective narrow depth-of-field focusing, but you can watch video of Ren Ng demoing other things. The same formulae that allow ray tracing to create a different focal point also allow them to fiddle with the theoretical aperture and change the depth of field. In practice, it should go from f/2.2 to f/40 or even f/56 (depends on some particulars the company hasn’t revealed yet).

      But because all the light is captured at f/2.2 (or a similar wide aperture), you can simulate f/40 with a paper-thin depth of field (as in the demos) without suffering from noise from low-light conditions with pushed ASA, or from motion blur because you did a long exposure to reduce CCD noise.

      Neat, huh?
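The refocusing described above is often implemented as "shift-and-add" over the 4D light field: each sub-aperture view is shifted in proportion to its position in the aperture, then the views are averaged. Here's a minimal sketch (my own simplification using integer shifts, not Lytro's actual algorithm):

```python
import numpy as np

def refocus(lightfield, alpha):
    """Shift-and-add refocusing of a 4D light field L[u, v, s, t].

    (u, v) indexes the position in the aperture; (s, t) is the image.
    Each sub-aperture image is shifted in proportion to its aperture
    position and alpha (the virtual focal depth), then all shifted
    images are averaged. alpha = 0 reproduces the as-shot focus.
    """
    U, V, S, T = lightfield.shape
    out = np.zeros((S, T))
    for u in range(U):
        for v in range(V):
            # integer shift proportional to aperture position; a real
            # implementation would interpolate fractional shifts
            du = int(round(alpha * (u - U // 2)))
            dv = int(round(alpha * (v - V // 2)))
            out += np.roll(lightfield[u, v], (du, dv), axis=(0, 1))
    return out / (U * V)
```

Simulating a smaller aperture (the f/40 trick) is even simpler: just average fewer (u, v) views, closer to the center.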

    2. I’m sure it can stitch multiple layers together if you want infinite focus. That is, if it doesn’t already provide an infinite focus version of the image.

  3. Wonder if we could gut a camera and make a digital pinhole Brownie? That’s about what most folks can handle anyway.

    1. Absolutely. (By the way, the Canon G12 offers this as a camera effect with a single lens, and it’s scarily good in the right circumstances.)

      You should also be able to do foveated imaging.

      The real limit on computational photography is getting a programmable camera, which doesn’t really exist. Smartphones are the closest thing, and their cameras and APIs limit what’s possible. Dr Levoy’s Frankencamera is pretty interesting, because he’s pushing both a standard API and the notion of a fully programmable camera, whether a smartphone, consumer, or SLR.

  4. Rob, I am no expert, so I may be wrong, but here’s my mental model of why the bit about raytracing makes sense.

    Imagine a standard, classical raytracer: you start with an image of X by Y pixels, which represents the sensor area of a camera in a scene graph of some sort. For each pixel, your raytracer engine draws a ray straight forward, calculating reflections from objects in the scene graph until it either hits a light source (in which case you get a light value of some kind, and thus a lit pixel) or nothing (in which case you get a black pixel); the tracer in Graham’s “ANSI Common Lisp” is a good example of this style. If you add colour values to the objects in your scene graph and to your lights, and update your reflection code accordingly, you get a coloured image instead of grayscale.

    But what about this kind of scene: you’re standing at the corner of a building at night, and you can see the light from a street lamp right around that corner, but you have no direct or reflected line of sight back to the source of that light. The simpleminded tracer described above would render it unrealistically. Now you know why so much sleep has been lost to considerations of occlusion, global illumination, indirect lighting, specular reflections, subsurface scattering, and so on: instead of being able to care just about the straight-line rays which make up your pixels, you have to worry about _all_ the rays in your scene going in all directions. Your scene computation has gone from hugely simple to indescribably complex.

    Most importantly, though, you can no longer assume that the rays arrive at right angles to the surface of your sensor area.

    The Lytro camera works by measuring the angle at which various bits of light hit its compound sensor, or so it seems from some of the more specific articles popping up about this device. The reason ray tracing factors into the equation is that the camera must then use the results of its colour, intensity, and direction sampling to compose something which we recognize as an image; and to do so, it is performing some variation on the more complicated ray tracing described above. The neat bag of Lytro tricks all stems from the extra directional information gathered by the sensor: in the same way that a software ray tracer can figure out what a scene looks like from a different position by just recalculating some rays (and a _smart_ raytracer can figure out how to get away with doing as little work as possible, via a variety of elegant and filthy techniques), the Lytro can mine the set of rays it has acquired to compose a somewhat different set of images.

    This family of devices is going to be _fascinating_.
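The "simpleminded tracer" described above fits in a few dozen lines. This toy Python version traces one straight ray per pixel against a single hard-coded sphere; every scene value here is made up purely for illustration:

```python
import math

def ray_sphere(origin, direction, center, radius):
    """Return the distance to the nearest ray-sphere intersection, or None."""
    oc = [o - c for o, c in zip(origin, center)]
    b = 2 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4 * c  # direction is unit length, so the quadratic's a == 1
    if disc < 0:
        return None
    t = (-b - math.sqrt(disc)) / 2
    return t if t > 0 else None

def render(width, height):
    """Trace one straight ray per pixel toward a sphere sitting at z = 3."""
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            # map the pixel to a direction through a virtual image plane at z = 1
            dx = (x + 0.5) / width - 0.5
            dy = (y + 0.5) / height - 0.5
            norm = math.sqrt(dx * dx + dy * dy + 1)
            d = (dx / norm, dy / norm, 1 / norm)
            hit = ray_sphere((0, 0, 0), d, (0, 0, 3), 1.0)
            row.append(1 if hit else 0)  # lit pixel vs. black pixel
        image.append(row)
    return image
```

The Lytro's job is this process run in reverse: instead of generating rays from a scene model, it starts from measured rays and composes the image.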

    1. This was my thinking too: the camera captures light ray direction as well as position. This allows you to compute the following: What *would* the image have looked like (where would the rays of light have hit the sensor) if it were taken with the lens in a different position?

      All we need now is a good way to record phase information, and we’ll have a holographic camera.

      1. AnthonyC, you will be delighted to know that the Lytro folks (in earlier research) have already been able to make a decent hologram from the current technology. A 3D camera is in the future, already on their road map.

  5. Ok, so a lens bends light coming from all directions so that it converges at a single point, right? And traditional cameras put the sensor at that point. To focus at different distances you move the lens back and forth, causing the light from different objects to converge at the sensor. What’s in focus depends on how the light is bent through the lens. Imagine if the lens itself was a sensor, capturing the light before it was bent. A computer then calculates how that light needs to be bent in order to focus at specific distances. So, in other words, it calculates how each ray of light must be bent through the lens the same way ray-tracing software calculates how each ray of light travels through a simulated environment.

    So, no. It’s not just multiple exposures at different focal lengths that are compiled into a single photo. It’s raw, directional light data that is interpreted using physical laws to recreate a scene.
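The lens-bending picture above boils down to the thin-lens equation. A quick sketch (with illustrative numbers) shows why different subject distances want the image plane in different places, which is exactly the adjustment the Lytro computes after the fact instead of moving glass:

```python
def image_distance(focal_length, object_distance):
    """Thin-lens equation, 1/f = 1/d_o + 1/d_i, solved for d_i (same units)."""
    return 1 / (1 / focal_length - 1 / object_distance)

# A 50mm lens imaging a subject 2m away converges ~51.3mm behind the
# lens; a subject at 5m converges at ~50.5mm. A conventional camera
# physically moves the lens to put the sensor at the right plane.
print(image_distance(50, 2000))  # -> ~51.28 (mm)
print(image_distance(50, 5000))  # -> ~50.51 (mm)
```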

  6. When I first saw this story a couple of days ago, I thought this was similar to an Adobe project from a few years back, which essentially took 19 simultaneous pictures at different focal lengths and combined them into a single picture. I don’t remember the project name, but here’s CNET’s article from back then.

    But now that I’m reading this Economist article, this is clearly different. This camera has a single linear array of lenses like a traditional camera, but stuck in between the focal plane and the lens is a separate planar array of these “microlenses” which are kind of misnamed, because they really sound like microsensors (that also allow a portion of light to pass through them).

    So, it’s basically taking two pictures, one picture at the microlenses, and another picture at the normal focal plane. Then software is comparing the two images together to discover how the pixels match between them and creating the virtual light vectors that would have been required to cause both images. Once they have the vectors, they can trace them backwards through the lens array to identify where each beam of light existed when it struck the camera and the direction that beam was oriented in.

    This allows them to make a separate virtual lens with any focal length and ray trace the virtual beams of light through it to focus on a different depth within the scene. It sounds like this adds an additional level of uncertainty to the end result, since it’s reconstructing the image based on calculated light paths at a resolution much worse than the natural resolution of photons (except for the image at the default focal length, which would be real). However, the uncertainty is probably not terribly noticeable at macro-scales, though I could see it having difficulties imaging crystals, as can be seen in the image on their page that has some mirrors and glass overlooking a street intersection. Though, I’m surprised at how well it already handles reflections. The default fast shutter speed will probably work well with overcoming motion blur.

    Increasing resolution would then have a dual dependency on traditional physical limitations (fitting in more microlenses) as well as on processing power to reconstruct the virtual light rays per pixel.

    Bottom line, I would be willing to be an early adopter for the current camera (example pictures at 0.25 MPx) somewhere in the $500-$1000 price range, but I’m an imaging buff, and they’d probably need to get to 1 MPx at $399 or less to see these things fly off the shelves.

    1. Hey, PlaneShaper, that’s a great set of reasoning, but it’s really microlenses, not microsensors. They use a conventional CCD sensor array and interpose some kind of etched (possibly etched in a silicon fab, they’re not saying, but it’s possible) microlens “sheet”. The only image data is captured on the main sensor array.

      But because the microlenses are fixed, the software knows their x,y position in the 2D array, and knows which sensors have light falling on them that originates from the main lens through which microlens. That allows the creation of four-dimensional values which allow approximating the ray’s path through an ideal space (which is close enough to work). (It’s actually more than 4D if you add light intensity and wavelength.)

      The rest of your explanation is absolutely dead on. Very good inferred insight.
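Because the microlens geometry is fixed, the sensor-to-4D mapping is almost bookkeeping. A sketch, assuming each microlens covers a square patch of sensor pixels (the patch size here is a made-up number, not Lytro's):

```python
def sensor_to_lightfield(px, py, patch=10):
    """Map a sensor pixel to 4D light-field coordinates (illustrative only).

    (s, t) indexes the microlens, i.e. spatial position on the image;
    (u, v) is the pixel's position under that lens, i.e. the incoming
    ray direction. Assumes each microlens covers a patch x patch block.
    """
    s, u = divmod(px, patch)
    t, v = divmod(py, patch)
    return (s, t), (u, v)

print(sensor_to_lightfield(1234, 567))  # -> ((123, 56), (4, 7))
```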

  7. Ah – I’d wondered how they got multiple focal planes. Using a fly-eye style lens on the image sensor makes sense. I wonder how useful it will be in the real world, however – the amount of knowledge to understand how to manipulate aperture to get the depth of field right kind of minimizes the market share.

    All in all, it’s a neat technology, and I would love to see them license it to RED so we can do some absurd things with 4K video, but the low pixel count it seems this would need would preclude using it for professional still image work. I hope it doesn’t turn into an expensive toy / flash in the pan.

  8. When light moves through a small enough aperture, it moves in parallel. Light hitting a point on the sensor came from the same point or area on the other side of the aperture. This is why pinhole cameras work. Each grain of silver, or single-pixel sensor, catches the light from one area.

    Now imagine that you have a second pinhole letting light shine on the same point. If one is bright and the other dark, you get a medium pixel, but you couldn’t tell which pinhole was bright and which was dark. Maybe they were both medium. When the aperture is big, every pinhole that makes it up could be shining on the same grain or pixel. This ambiguity is what makes a big hole not create an image like a pinhole does.

    The new sensor works by adding angle to the intensity measurements. With microlenses and associated sensors, they can determine which pinholes shone on which sensors in which intensities. Using similar math to raytracing, they can say “this light came from there, and this light came from there…”

    Etc. I’m leaving how lenses work as an exercise for the reader.
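The ambiguity described above is easy to see in a toy model (the numbers are entirely made up): two different scenes give the same reading on a plain sensor pixel, but different readings once direction is recorded:

```python
# Two "scenes" seen through a wide aperture modeled as two pinholes.
scene_a = {"left_pinhole": 1.0, "right_pinhole": 0.0}  # bright + dark
scene_b = {"left_pinhole": 0.5, "right_pinhole": 0.5}  # medium + medium

def plain_pixel(scene):
    """A bare sensor pixel just sums all incoming light, losing direction."""
    return sum(scene.values())

def directional_pixel(scene):
    """A plenoptic pixel keeps one reading per incoming direction."""
    return tuple(scene[k] for k in sorted(scene))

print(plain_pixel(scene_a) == plain_pixel(scene_b))              # -> True
print(directional_pixel(scene_a) == directional_pixel(scene_b))  # -> False
```

The plain pixel reads 1.0 for both scenes, so a big hole produces no image; the directional readings distinguish them, which is what the microlenses buy you.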

  9. So it’s basically just doing the equivalent of taking countless photos at all possible focal lengths and then compositing them as desired on demand. Except it does it in a smarter way than that. Neat.

  10. I don’t understand any of what you guys are on about but that guy on the photo is *not* Chuck Norris.

  11. Depending on price I certainly see a market for it.

    Cam pics. Yeah, I went there. But good grief, you have no idea how many out-of-focus self shots I see on 4chan daily. It’s not like we are back in the mid-’90s, when 640×480 was great for a digital camera… I’m not saying everything needs to be taken with a D90, but being able to have a camera actually find the correct focus after the fact would make a lot of internet self-pic’ers happy.

  12. it’s just math guys. learn some math.

    depth of field is related to aperture size, where the aperture limits the angle of incoming light waves. the focal plane is where all of the spherical incoming waves converge to a point.

    by allowing for multiple combinations of incoming spherical waves, you allow for multiple focal planes.

    boom. no big. the idea is as old as optical calculations. this is just a guy spinning his thesis work into a product.
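The aperture/depth-of-field relationship is indeed textbook math: for a subject well inside the hyperfocal distance it reduces to roughly DoF ≈ 2·N·c·s²/f². A sketch with illustrative numbers shows why simulating f/40 from an f/2.2 capture is worth the trouble:

```python
def depth_of_field(f_number, focal_length, subject_distance, coc=0.03):
    """Approximate depth of field (same units as the inputs) for a
    subject well inside the hyperfocal distance: 2 * N * c * s^2 / f^2,
    where N is the f-number and c the circle of confusion."""
    return 2 * f_number * coc * subject_distance ** 2 / focal_length ** 2

# 50mm lens, subject at 2m (2000mm), 0.03mm circle of confusion:
print(depth_of_field(2.2, 50, 2000))  # -> ~211 mm in focus at f/2.2
print(depth_of_field(40, 50, 2000))   # -> ~3840 mm (~3.8 m) at f/40
```

Same captured rays, two very different virtual apertures.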

  13. So then, does this new development suggest that insects with segmented eyes such as dragonflies might actually have the ability to selectively focus on objects using brain-processing? Perhaps insect eyes are much more functional and advanced than previously believed?

    1. Great inference. I had this precise conversation with Ren Ng, Lytro’s founder. They tested insect-lens cameras, but found focus was too poor to reconstruct. Without a main lens doing primary lightgathering and broad focus, they couldn’t get the microlenses to work. But great insight here.

  14. Doesn’t the use of raytracing imply that a depth model has been made and then painted? Or, is this some other obscure use of the word?

    1. Same process: the Lytro records 4D data about rays, then uses that to ray trace back to an arbitrary image plane from which a focal plane is derived, allowing ranges of depth of field and points of focus. So it’s as if you feed animation data into the camera to get it to “fake” a scene, except the inputs are real light.
