Making light work of matrix multiplication

Meeting the massive energy demands of the AI boom may require an outside-the-box solution. Some researchers believe the answer lies in light.

The idea of computing with light rather than electricity has been around since the 1960s, but physicists and engineers have recently renewed their efforts in this area, hoping to decrease the energy required to execute machine learning programs. Building a light-based or “optical” computer requires precise engineering that has taken decades to develop, but the basic principle behind such devices is something you’ve seen in grade school.

Think of that time you put a pencil in a cup of water and the pencil looked broken. It looked broken because air and water interact with light differently. Put in scientific terms, the air and water have different refractive indices, and any time a wave of light encounters a change in refractive index, it will bend. Therefore, one can precisely manipulate the path of light through a material by controlling the refractive index. In particular, if one could change the refractive index of a material independently at specific locations, an “input” wave of light would bend differently at each location, creating a completely new “output” wave.

The bending of an “input” wave into a distinct “output” wave is one example of a light-based computation. It is, in fact, the easiest type of computation to perform with light. But the reason light-based computing has gotten so much attention over the past few years is that, mathematically, this bending can represent the most important computation for machine learning: matrix-vector multiplication.

A matrix is a table of numbers, and a vector is a column of numbers. Multiplying a matrix by a vector means taking each row of the table, multiplying its numbers one by one with the corresponding numbers in the column, and adding up the products. Each row yields one number, and together those numbers make a new vector. The vectors that ChatGPT uses are thousands of numbers long, and to produce every new word it writes, it must perform hundreds of matrix-vector multiplications. With millions of users asking the chatbot to write thousands of words per day, you can quickly see how increasing the energetic efficiency of matrix-vector multiplication is critical.
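For readers who like to see the arithmetic, here is a minimal sketch of matrix-vector multiplication in plain Python. The function name `matvec` and the example numbers are illustrative choices, not taken from any particular AI system.

```python
def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector (a list of numbers)."""
    if any(len(row) != len(vector) for row in matrix):
        raise ValueError("each row must have as many entries as the vector")
    # Pair each row entry-by-entry with the vector, multiply, and sum.
    # Every row contributes exactly one number to the output vector.
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


matrix = [[1, 2, 3],
          [4, 5, 6]]
vector = [1, 0, -1]

result = matvec(matrix, vector)
print(result)  # [-2, -2]
```

A table with as many rows as columns needs one multiplication per cell, so the work grows with the square of the vector length. For vectors thousands of numbers long, that is millions of multiplications per matrix.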

The light-based computing device I described above is one way to implement matrix-vector multiplication with light, and it’s roughly equivalent to one that Professor Peter McMahon and his colleagues designed last year. The team, based at Cornell University, published their findings in Nature Physics last December. In their device, the matrix is physically encoded as a grid in a slab of special material, where each cell has a different refractive index. The grid is created by a temporary and programmable process, allowing the same material to represent any matrix.

One way to understand the computational ability of their device is through a geometric interpretation of matrices and vectors. You can consider a vector not as a column of numbers but as an arrow pointing in some direction. Multiplying by a matrix rotates and stretches that arrow to form a new arrow with a different size and direction. Similarly, the wave of light traveling across the device can be viewed as an arrow in a high-dimensional space. The changes in refractive index across the material bend the wave of light at every location and essentially act as one big, complicated rotation. By tuning the grid of refractive indices, one can replicate any type of rotation and therefore any matrix.
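The arrow picture is easiest to check in two dimensions. The sketch below is only an illustration with made-up numbers, not a model of the Cornell device: it builds a matrix that rotates by a quarter turn and doubles the length, then applies it to an arrow pointing along the horizontal axis.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector (a list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


angle = math.pi / 2   # illustrative choice: a quarter turn
stretch = 2.0         # illustrative choice: double the length

# A standard 2D rotation matrix, with every entry scaled by the stretch factor.
matrix = [[stretch * math.cos(angle), -stretch * math.sin(angle)],
          [stretch * math.sin(angle),  stretch * math.cos(angle)]]

arrow = [1.0, 0.0]    # length 1, pointing along the horizontal axis
new_arrow = matvec(matrix, arrow)

length = math.hypot(new_arrow[0], new_arrow[1])
direction = math.degrees(math.atan2(new_arrow[1], new_arrow[0]))
print(f"length {length:.1f}, direction {direction:.0f} degrees")
```

The new arrow is twice as long and points straight up. Vectors in AI programs live in thousands of dimensions rather than two, but the same picture of turning and stretching applies.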

The geometric interpretation of matrix-vector multiplication also reveals why a light-based approach can be more energetically efficient than an electronic approach. The bending of light in response to a change in refractive index is a natural phenomenon. Once the device is made, additional energy input is not required for rotation. In contrast, a traditional electronic computer would simply multiply and add a series of numbers. Each numerical operation requires charging and discharging a set of wires, carrying with it an intrinsic energy cost.

Professor McMahon explains that the propagation of light through a material “can be, in principle, completely lossless,” meaning that no energy is lost during the computation. Some energy will be dissipated in an optical matrix-vector multiplication from the light being absorbed by the device, but it’s negligible compared to the energetic cost of electronic computations. It’s mainly for this reason that 30 years ago there was already a major shift away from electronics and towards optics in the field of communication. Today, almost all long-distance communication is done with optical signals (https://www.noaa.gov/submarine-cables) because much less energy is lost when light travels through a fiber-optic cable than when electricity travels through a wire. Computing, however, is more complicated than communication. Several technological challenges could prevent optical computing from proliferating as quickly as optical communication once did.

One major concern is that it requires a significant amount of energy to convert light into an electrical signal and vice versa. This would be necessary if the optical computing device performed only matrix-vector multiplication within a larger program, which many researchers view as the most feasible near-term use for optical computing. Even with this conversion cost, optical computing could still be more energetically efficient than electronic computing, but crucially only at a certain size of input vector. For smaller vectors, the conversion cost outweighs the benefits. Luckily, modern AI algorithms do use massive vectors and matrices—large enough to almost surely dwarf the conversion cost. But that brings us to our next practical challenge: size.
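The break-even point for the conversion cost can be sketched with a toy energy model. Everything here is an assumption made for illustration: the two energy constants are invented, the matrix is taken to be square, and the light propagation itself is treated as free. The point is only the scaling. An electronic multiplication costs energy for every cell of the matrix, while the optical version pays only to convert each number of the input and output vectors.

```python
# Hypothetical energy costs in arbitrary units; not measured values.
E_MAC = 1.0       # one electronic multiply-and-add
E_CONVERT = 50.0  # converting one number between light and electricity


def electronic_energy(n):
    """Toy model: an n-by-n matrix needs n * n multiply-and-add steps."""
    return E_MAC * n * n


def optical_energy(n):
    """Toy model: convert n inputs to light and n outputs back again."""
    return E_CONVERT * 2 * n


def break_even_size():
    """Smallest vector length at which the optical route uses less energy."""
    n = 1
    while optical_energy(n) >= electronic_energy(n):
        n += 1
    return n


print(break_even_size())  # 101
```

With these invented constants the optical route wins only for vectors longer than 100 numbers. Real devices will have different constants, but any model in which conversion grows with the vector length and computation grows with its square produces a threshold of this kind.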

A typical wavelength of light used in optical computing is more than 100 times larger than a modern transistor. That wavelength sets the scale for the size of the optical device, meaning it is difficult to make optical processors as small as their electronic counterparts. On the millimeter scale, McMahon’s team demonstrated that their device could process input vectors of length 49—the largest demonstration in a device of this kind. Some researchers are optimistic that millimeter-scale devices could one day handle input vectors near the size used in modern AI programs, but it has yet to be shown experimentally.

To make large-scale optical devices feasible, researchers utilize another key advantage of optics: the high clock rate. Clock rate is the term computer scientists use to describe the number of cycles, and so roughly the number of basic operations, a processor steps through per second. The charging and discharging of wires in a digital computer not only dissipates heat but also puts a limit on the clock rate. In fact, clock rate and heat dissipation are linked: the faster you charge and discharge the wires in an electronic computer, the more energy it will use. Optical computing, however, doesn’t have this problem. One can easily achieve a clock rate 10–100 times faster than a digital computer without dissipating significant energy.

This clock rate advantage is the key idea behind the startup Opticore, founded by UC Berkeley professors Zaijun Chen and Mengjie Yu along with MIT research scientist Ryan Hamerly. They believe that optical processors will be used in large-scale AI platforms in the near future. Their technology relies on using time as one of the dimensions of the matrix. For example, in one of Opticore’s devices, the part of the processor that represents the matrix changes over time. Thus, they essentially perform a large matrix-vector multiplication by breaking it up into several smaller multiplications, all executed in extremely quick succession.
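The idea of trading space for time can be sketched in a few lines of Python. This is not Opticore's actual scheme, whose details the article does not cover. It assumes, for illustration, that the device can show only a few rows of the matrix at once and that the rows are swapped out at each tick of the clock. The names `time_multiplexed_matvec` and `rows_per_step` are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector (a list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def time_multiplexed_matvec(matrix, vector, rows_per_step):
    """Compute a large product as a sequence of small ones, one per time step."""
    output = []
    for start in range(0, len(matrix), rows_per_step):
        # The block of the matrix that the small device represents right now.
        block = matrix[start:start + rows_per_step]
        # Each small multiplication fills in the next few output entries.
        output.extend(matvec(block, vector))
    return output


matrix = [[1, 0, 2],
          [0, 1, 0],
          [3, 0, 1],
          [1, 1, 1]]
vector = [1, 2, 3]

# A device that holds two rows at a time needs two time steps here.
print(time_multiplexed_matvec(matrix, vector, rows_per_step=2))  # [7, 2, 6, 6]
```

The answer matches what a single large device would produce. The cost is extra time steps, which is why a fast clock rate matters for this approach.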

Professor Chen predicts that optical processors will be used in large-scale applications “within five to ten years.” He notes that major companies like Nvidia are already beginning to integrate optics into their computer chips to transport data. He feels that this will pave the way for light not only to move the data but to manipulate it. Other researchers in the field, however, are more cautious. Martin Stein, a postdoctoral researcher working on optical computing with Professor McMahon at Cornell University, points out that “many of the smartest and best-paid people in the world are trying to make machine learning more energy efficient with the hardware that’s currently available, and they are advancing with an incredible pace that is very difficult for us to keep step with.” His concerns are rooted in the fact that no one has yet demonstrated an optical matrix-vector multiplier large enough to be practical. And even if one were built, it would still be extremely costly to alter existing architectures to integrate it.

If widely implemented in AI, optical computing could cause a substantial paradigm shift in technology, perhaps most akin to the widespread adoption of the optical cables that now power the internet (https://www.submarinecablemap.com). But, as with the early stages of any new technology, there are both doubters and true believers in the optical computing field. It is likely too early to tell who will ultimately be right, but one can only hope that the specter of skyrocketing energy costs may be enough incentive for both researchers and industrial leaders to make some kind of creative computing solution work.

This article is part of the Spring 2026 issue.
