HDR EyeGlass: From Cyborg Welding Helmets to Wearable Computing in Everyday Life

by SteveMann


[Image: cyborgweldinghelmet3views_lowres2048.jpg]

This Instructable is not a lesson on how to use existing HDR (High Dynamic Range) software. Instead, it gives you a DIY (Do-It-Yourself) approach to writing your own HDR software and creating your own systems that can potentially go beyond what's already out there, or at the very least give you a sense of personal fulfillment that can't be had simply by using an existing product. Long live DIY and power to the people!

Seen through the Glass, Darkly

My grandfather taught me how to weld when I was 4 years old, and it was a wonderful, fun experience, exhilarating, but in some ways a bit terrifying, because you need to wear a helmet with a darkglass (a single pane of very dark glass through which both eyes see the world). The whole world seems almost completely black except for a little pinprick of blinding bright light. So from an early age I gave a great deal of thought to how we can see, hear, and more generally sense and make sense of the world. I spent a great deal of my childhood inventing, designing, and building wearable computers to mediate my senses, through something I called "mediated reality".

Mediated Reality allows us to augment some aspects of the world, diminish other aspects, and more generally, modify the view of the world.

So I wanted to build a digital eye glass that could do these three things:

  1. Augment where appropriate: e.g. see in complete darkness plus annotate with virtual markers or make visible otherwise invisible fields and sensory information;
  2. Diminish where appropriate: e.g. tame down the bright areas of the scene such as the electric arc of my TIG welder, the glare of light bulbs, the sun, glints of specular objects, etc.; and
  3. Modify: help people see by modifying the visual field, not merely adding to it. This included things like sensory substitution, e.g. seeing into the infrared to be able to see where a workpiece was heating up, as well as being able to see radio waves, and the like.

More generally I envisioned the eye glass in conjunction with a general-purpose wearable computer for multisensory integration (HDR audio and video), as well as for synthetic synesthesia (sensory substitution) including the addition of new senses (e.g. adding a "Sixth Sense").

This is what led me toward the invention of HDR (High Dynamic Range) sensing for audio, video, radar, and metasensing.

Understanding HDR (High Dynamic Range) Sensing

There are many really great Instructables out there that teach you how to do HDR with existing tools.

So what I'm going to provide here is an understanding of HDR itself. This deeper insight will help you take and make better HDR pictures, videos, and "audios" (HDR is great for sound recording too, making the HDR video experience complete!). It will also allow you to apply HDR to many other sensing tasks such as radar, sonar, and even metasensing (the sensing of sensors and the sensing of their capacity to sense).

Rather than providing or suggesting specific HDR software (I'm actually involved with a number of companies making software, hardware, and related systems), the purpose of this Instructable is to inspire you to create your own!

Try to think beyond the confines of existing SDKs and APIs, and come up with something unique, original, and fun.

Collect a Plurality of Differently Exposed Records, Sort Them by Exposure, and Compute Comparagrams

[Images: mannfamilycomposite.jpg, mannfamily_dark.jpg, mannfamily_light.jpg]

A simple example is a set of differently exposed pictures of the same subject matter. But you can apply this philosophy to just about any kind of recording.

For example, in my childhood, in the early days of audio recording, I remember that most audio devices were monophonic. In our household, the record player had just one speaker in it. So did the radio and television receiver. But as an audio hobbyist I had a stereo tape recorder, and I even rigged up a wearable computer to record stereo sound in the late 1970s. When recording monophonic material such as my own voice, I connected the two stereo channels in parallel (both fed from the same microphone), and set the left channel very quiet and the right channel very loud. Thus when I was speaking the left channel never saturated, but the right channel did. But when others far from me were speaking, the left channel was too quiet (lost in background noise), while the right channel was just perfect. Later I could combine these two recordings to get a single recording having a massive dynamic range, way beyond what any sound recording device of that day could produce.

I'd discovered something new: a way to combine differently exposed recordings of the same subject matter to obtain extended dynamic range. I also applied this method to photography and video, e.g. to combine underexposed and overexposed video recordings.

I was also fascinated by Charles Wyckoff's pictures of nuclear explosions featured on the cover of Life Magazine. Wyckoff was at MIT, so I applied there and was accepted; I became good friends with Wyckoff and showed him my HDR audiovisual work.

Dynamic range versus Dynamage range:

In addition to audiovisual work, consider other kinds of sensing or metasensing. The main principle here applies whenever sensors can be overexposed without damage, e.g. whenever their dynamic range is greater than their dynamage range. For example, HDR video was not possible back in the old days when video cameras were easily damaged by exposure to excessive light.

Try to find a situation where a sensor saturates and provides poor readings, in the presence of overexposure, but is not damaged by the overexposure.

Modern cameras are like this, as are many microphones, antennas, and other sensors.

At the top of this page is an example picture that I made from two differently exposed pictures of the same subject matter. The two pictures appear below it. The one on the left is taken with an exposure suitable for the bright background behind the people in the picture. The one on the right is taken with an exposure suitable for the architectural details of the building.

Saturation and cutoff

In the leftmost image, many of the details are cut off in the shadow areas.

In the rightmost image, many other details are saturated in the highlight areas.

Try some of the different ways of combining differently exposed pictures, in order to combine these two images to get an image like the one at the top of this page.

Try capturing some of your own datasets in which there are differently exposed records.

Try to understand the mathematical relationship between these differently exposed records.

Let v1 be the first record (without loss of generality we can sort the records according to exposure, so let's say v1 is the record with less exposure). Let v2 be a record of greater exposure, by a factor of some constant, k. There is an underlying quantity, q, which we are trying to measure, through some sensor response function, f. So we have v1=f(q(x,y)), let's say (e.g. if it is a picture or image as a function of (x,y) pixel coordinates), and v2=f(kq(x,y)).
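To make this concrete, here is a toy simulation of the setup in Octave (a sketch only: the response function f, the exposure ratio k, and the image size below are placeholders I made up, not values from any real camera):

q  = rand(240, 320) * 10;                          % an underlying quantity q(x,y)
f  = @(q) round(255 * min(q / 10, 1) .^ (1/2.2));  % a placeholder response function
k  = 4;                                            % assumed exposure ratio
v1 = f(q);                                         % the record with less exposure
v2 = f(k * q);                                     % more exposure: saturates wherever k*q exceeds the range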

Now we want to try and understand the relationship between v1 and v2, two differently exposed images. The fundamental way of doing this is through something called the comparagram, which is a powerful yet simple (fundamental) mathematical tool for comparing differently exposed recordings of the same subject matter.

Understanding the Comparagram Is the Key to Understanding HDR

[Images: conhal11.jpg, curves.png, conhal12.jpg, comparagram.png]

Once you understand the comparagram, you're well on your way to understanding the fundamental concept behind HDR, and behind comparametric sensing in general. Comparagrams are the key to understanding the relationship between differently exposed recordings through a sensing apparatus of any arbitrarily-shaped response curve.

The first thing to try and understand is the relationship between two images, v1 and v2, and then later between more than two images, by considering them pairwise. We get this understanding by computing the comparagram between pairs of images. A comparagram is a joint histogram of two records that differ only in exposure. Compute the comparagram of the two images above. You can write your own program to do this, or use one from our VideoOrbits toolkit, http://wearcam.org/orbits/v1.23.tgz

Alternatively, assuming you're using GNU Linux (like most sane Do-It-Yourselfers), you can compute a comparagram in Octave as follows:

Cg = full(sparse(v1+1,v2+1,1,256,256));
assuming you have two greyscale images that each have 256 greyscale values.
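Here is a slightly more complete sketch going from two image files to a comparagram (it assumes the Octave image package for rgb2gray, and the filenames are placeholders for your own differently exposed shots; skip the rgb2gray step if your images are already greyscale):

pkg load image                                   % provides rgb2gray
v1 = double(rgb2gray(imread("dark.jpg")));       % lesser exposure, values 0..255
v2 = double(rgb2gray(imread("light.jpg")));      % greater exposure, values 0..255
Cg = full(sparse(v1(:) + 1, v2(:) + 1, 1, 256, 256));
imagesc(log(1 + Cg)); axis xy;                   % log scale makes the ridge easier to see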

A more computationally efficient way of doing the computation is to use an external C++ file, compiled and called from Octave. Here is a simple source file that computes comparagrams more quickly: http://wearcam.org/comparagram.cc

Compile it as follows:

$ mkoctfile comparagram.cc

You may need to install liboctave-dev if you get the following message:

The program 'mkoctfile' is currently not installed. To run 'mkoctfile' please ask your administrator to install the package 'liboctave-dev'

If you are your own administrator (as most GNU Linux DIY enthusiasts are), then install it:

$ sudo apt-get install liboctave-dev

Here, in our case, since the images are color (RGB) you will get 3 channels of comparagram data, one that compares the red channel of v1 with the red channel of v2, the next channel comparing the green channels, and the third channel comparing the blue channels, thus making the comparagram itself an RGB entity. You can also convert the images to greyscale and the comparagram will thus only have one channel, since the response function of the camera is roughly the same for each of the three channels.
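If you want to keep all three channels, one way to organize the computation (just my suggestion, with v1 and v2 as the two RGB images) is:

Cg = zeros(256, 256, 3);                % one comparagram plane per colour channel
for c = 1:3
  a = double(v1(:, :, c));              % channel c of the lesser-exposed image
  b = double(v2(:, :, c));              % channel c of the greater-exposed image
  Cg(:, :, c) = full(sparse(a(:) + 1, b(:) + 1, 1, 256, 256));
end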

Here is a textbook definition of the comparagram:

"The comparagram between two images is a matrix of size M by N, where M is the number of gray levels in the first image and N is the number of gray levels in the second image. The comparagram, which is assumed to be taken over differently exposed pictures of the same subject matter, is a generalization of the concept of a histogram to a joint histogram bin count of corresponding pixels in each of the two images. The convention is to have each pixel value from the first image plotted on the first (i.e., “x” ) axis, against the second corresponding pixel (e.g., at the same coordinates) of the second image being plotted on the second axis (i.e., the “y” axis). Since the number of gray levels in both images is usually the same (i.e., 256), the comparagram is usually a square matrix (i.e., of dimensions 256 by 256)." [Intelligent Image Processing, S. Mann, 2001].

If you're going to write your own comparagram program (which I suggest you do, so you learn about it better, and also in the true DIY spirit), here is a nice very simple example to help get you started:

Consider two pictures that are each 3 pixels high and 4 pixels wide:
v1=[
1 3 2 3;
3 2 1 2;
0 0 2 0
]

and

v2=[
2 3 3 3;
2 2 2 2;
0 1 3 0
].

The comparagram is a two-dimensional array of size M by N, where M is the number of grey values in the first image and N is the number of grey values in the second image; entry C[m, n] is a count of how many times a pixel in image 1 has greyvalue m while the corresponding pixel in image 2 has greyvalue n. In this case both images have 4 grey values, so the comparagram is a 4 by 4 matrix, given by:

Cg=full(sparse(v1+1,v2+1,1,4,4))   % typed in Octave or Matlab

Cg=[
2 1 0 0;
0 0 2 0;
0 0 2 2;
0 0 1 2
].

Summing across rows of the comparagram gives the histogram of the first image: h1 = [3 2 4 3], and summing down columns of the comparagram gives the histogram of the second image: h2 = [2 1 5 4]. Summing all the entries in the comparagram gives 12, which is the total number of pixels.
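If you'd rather not lean on sparse(), a minimal loop-based version is easy to write yourself (my_comparagram is just a name I'm using here; save it as my_comparagram.m; it trades speed for clarity):

function Cg = my_comparagram(v1, v2, M, N)
  % Joint histogram of two same-sized integer images with values 0..M-1 and 0..N-1
  Cg = zeros(M, N);
  for i = 1:numel(v1)
    m = v1(i) + 1;              % +1 because Octave arrays are 1-indexed
    n = v2(i) + 1;
    Cg(m, n) = Cg(m, n) + 1;    % count this (v1, v2) pixel pair
  end
end

Running Cg = my_comparagram(v1, v2, 4, 4) on the two 3-by-4 images above should reproduce the matrix shown, and sum(Cg, 2)' and sum(Cg, 1) should give back h1 and h2.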

A simple exercise to help you understand comparagrams, what they can do, and how to use them:

Here is a simple exercise that will help you understand comparagrams. Do this simple exercise and I can promise a new world of insight, a kind of "aha" moment that for many of my students has marked the beginning of a new way of looking at the world of comparametric sensing.

Take any image, like the one at the top of this page.

Go into an image editor like GIMP (I prefer Open Source GNU Linux) and select "Curves" from the "Colors" menu.

This lets you "Adjust Color Curves".

Since the image is greyscale (I suggest starting with a greyscale image) you're simply adjusting greylevels.

Create whatever shape you want, on the curve.

Now save the result under a new file name.

Compute the comparagram of this new image against the original image.

What you get (see above) is the curve that you created.

In other words, the comparagram extracts (recovers) the curve from the image data.

Note that in the literature the "X-axis" (first axis) runs from left to right and the "Y-axis" runs from bottom to top, whereas computer arrays (e.g. in the C programming language) are indexed with the first axis going top to bottom and the second axis going left to right, so you may have to rotate the comparagram 90 degrees to get it to line up with the Curves.

Try this a couple of times with a couple of differently shaped Curves.
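You can also simulate the whole Curves exercise without leaving Octave, which is handy for checking your own comparagram code (the filename and the gamma-style curve below are placeholders, not anything GIMP-specific):

v1 = double(imread("original.png"));            % a greyscale 8-bit image, values 0..255
curve = round(255 * ((0:255) / 255) .^ 0.5);    % an example tone curve (like dragging Curves upward)
v2 = curve(v1 + 1);                             % apply the curve as a lookup table
Cg = full(sparse(v1(:) + 1, v2(:) + 1, 1, 256, 256));
imagesc(log(1 + Cg)); axis xy;                  % the ridge traces out the curve you applied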

Now you can understand the comparagram as a fundamental tool for understanding the relationship between images of identical subject matter that differ only in tonality or greyscale. Differently exposed images exhibit changes in tonality, which become evident in the comparagram. The comparagram captures the essence of two things simultaneously:

  1. A camera's response function;
  2. The difference in photographic exposure across multiple images.

Align (register) the Records

[Image: Process_nocomparam.png]

You have 2 choices at this step:

  • use a tripod (or surveillance camera) which is fixed; or
  • use a wearable camera and align the images.

Once aligned, you are ready to use the comparagram as a way of tonally aligning multiple differently exposed pictures of the same subject matter.

The comparagram is a record of how one of the pictures relates to the other. In a sense it is a recipe (lookup table) that allows you to convert one image to the other. If you can find the ridge along the comparagram, that gives you a lookup table to convert one way, along rows of the comparagram, or the other way, along columns.
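Here is one rough way to pull that lookup table out of a comparagram. Taking the mean along each row is my own simplification of "finding the ridge", so treat it as a sketch rather than the definitive method (Cg is assumed to be a 256 by 256 comparagram with v1 grey levels along rows and v2 along columns):

lut = zeros(256, 1);
for m = 1:256
  row = Cg(m, :);
  if sum(row) > 0
    lut(m) = round(sum((0:255) .* row) / sum(row));  % mean v2 level observed for this v1 level
  else
    lut(m) = m - 1;                                  % no data at this level; fall back to identity
  end
end
v1_mapped = lut(v1 + 1);   % tonally aligns the lesser-exposed image with the greater-exposed one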

With HDR photography we often use a tripod or otherwise mount the camera securely in the environment. Likewise with surveillance video the camera is often affixed to a building. But with wearable cameras the camera is moving around, as part of an EyeTap or other Digital Eye Glass, for example. In this case, our alignment problem is a problem of spatial as well as tonal registration. The comparagram gives us the tonal alignment, but more generally we also need spatial alignment, e.g. as is done in panoramic imaging (http://wearcam.org/orbits/). Here is a research paper on that topic: http://www.eyetap.org/papers/docs/icip1996.pdf

Orbits are 360-degree maps, such as spherical or other projections, that allow the eyeglass wearer to look around and see things from every angle.

Start simple: try an example with just two images.

Different exposures can arise naturally with automatic gain control (AGC) or automatic exposure, e.g. when you point a camera at something really bright like a light source, or, in the example above, an open doorway to a bright outdoor scene, the camera will "dim down" revealing highlight detail. As you swing the camera away from the light source (e.g. to the right in the example above), the camera will "brighten up", revealing shadow detail.

It is in the areas of overlap where interesting things happen. Here we get differently exposed images of the same subject matter, as the camera moves around.

Combine the Aligned Images

[Images: comparam.png, ccrf.png]

An important result in comparametric image analysis is being able to "stitch together" multiple images of the same scene to make sense of the world.

Once you have images that are aligned, you essentially have multiple measurements of the same quantity.

Think of it like a voltmeter where you measure voltages on the different settings of the meter, and then you want to combine them all together into a single reading, at each pixel.

It's kind of like voting, where each image gets to "vote" on what the pixel value should be at a particular location. But not all votes are equal. In the dark areas, we want the brighter (more exposed) images to have a stronger vote, because they have a better rendition of those areas. In the bright areas, we want the darker (less exposed) images to have a stronger vote. In the midtones, we want the medium exposures to get the strongest vote, but still with a good contribution from the light and dark images.

So we have a weighted sum, as indicated in the formula above. See http://wearcam.org/comparam.pdf
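As a starting point, here is a heavily simplified certainty-weighted combination in Octave. The inverse response f_inv, the weighting function w, and the exposure ratio k below are placeholders of my own, not the actual formula from comparam.pdf, so treat this as a sketch of the idea rather than the paper's method:

% v1 (lesser exposure) and v2 (greater exposure) are aligned greyscale doubles, 0..255
f_inv = @(v) (v / 255) .^ 2.2;                              % placeholder inverse response function
w     = @(v) exp(-((v - 127.5) .^ 2) / (2 * 60 ^ 2));       % trust midtones most, extremes least
k     = 4;                                                  % assumed exposure ratio between v1 and v2
q1 = f_inv(v1);                                             % estimate of q from the darker image
q2 = f_inv(v2) / k;                                         % estimate of q from the lighter image
q  = (w(v1) .* q1 + w(v2) .* q2) ./ (w(v1) + w(v2) + eps);  % each pixel is a weighted "vote"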

Alternatively, a much better and much faster way of combining the images is to use a method developed by Mir Adnan Ali and me, in which a very simple LUT (Look-Up Table) is used. The LUT is much like a comparagram (same dimensions and same axes as a comparagram). In this way each pair of images is combined almost instantly (by quick and simple computation: merely look up the result), so it can run at video frame rates. This method runs about 5,000 times faster than any other HDR algorithm currently in use, and can also be implemented in an FPGA.
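To give a flavour of why this is so fast, here is what the lookup step might look like once such a table exists (ccrf_lut here is assumed to be an already-built 256 by 256 array, with v1 levels indexing rows and v2 levels indexing columns; how to build it is beyond this sketch):

idx      = sub2ind(size(ccrf_lut), v1(:) + 1, v2(:) + 1);   % one row/column index per pixel pair
combined = reshape(ccrf_lut(idx), size(v1));                % combining is a single table lookup per pixel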

If the result is being used for computer vision, machine learning, or the like (e.g. face recognition by computer) we're done: just present the recovered q(x,y) to the algorithm. If we want to print or display the result we'll want some spatiotonal mapping (filtering, such as sharpening) to compress the dynamic range down to the print or display medium. There are lots of filtering programs and software available, but in the true spirit of DIY, try to implement or write something of your own. That way you'll learn a lot more, and have more fun, regardless of how well your result turns out.

DIY: Have Fun and Learn by Inventing Something New

[Images: VeillanceFlux.png, VeillanceFluxPlot.png]

By doing it yourself, you have the potential to have a lot more fun and also to learn a lot more.

Moreover, you'll be better prepared to envision and solve totally new problems or dream up completely new ways of applying these concepts. For example, we can apply HDR to scientific sensing and visualization. Above is an example of work undertaken jointly with Ryan Janzen on biological visual metasensing. Metasensing is the sensing of sensing (see my previous Instructable on Metasensing). Here we're visualizing vision. This is like a visual acuity test, and visual acuity varies widely over our field of view. So it is the perfect candidate for HDR metasensing. Here we use a pseudocolor scale to visualize a massive dynamic range in vision. With HDR sensing and metasensing, we can finally see and understand many physical phenomena beyond what was previously possible.

Click on "I Made It!" If You Got As Far As the Comparagram

Even if you don't make it all the way through this Instructable, if you got as far as generating a comparagram, please share your results.

Try one or more of the following:

  1. Capture a plurality of differently exposed recordings, such as some differently exposed images;
  2. Construct a comparagram from an image with itself (this will help you understand comparagrams, and the result should be the histogram of the image along the diagonal; see the sketch after this list);
  3. Construct a comparagram from two different images that have the same exposure. Yes, take the same picture twice with exactly the same camera settings, and then compute the comparagram of the two images. You should see some off-diagonal elements due to statistical noise. The "diagonal" line has fattened!;
  4. Construct a comparagram from two differently exposed images, and upload the two images and the resulting comparagram under "I made it!";
  5. Use the comparagram to combine the images;
  6. Generate a CCRF from two images. This will have the same dimensions as the comparagram and the axes have the same meaning. The CCRF is a close relative of the comparagram and is an efficient way of representing and computing comparametric information;
  7. Use the CCRF to combine two differently exposed images of the same subject matter. Compare the computational efficiency and image quality with (5) above. You should find that the images look better and are computed thousands of times faster.
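For item 2, here is a quick self-check in Octave (the filename is a placeholder, and rgb2gray again assumes the image package): the comparagram of an image with itself should put the image's histogram along the main diagonal and zeros everywhere else.

v  = double(rgb2gray(imread("test.jpg")));            % placeholder filename
Cg = full(sparse(v(:) + 1, v(:) + 1, 1, 256, 256));   % comparagram of the image with itself
h  = hist(v(:), 0:255);                               % the image's ordinary histogram
isequal(diag(Cg), h')                                 % should print ans = 1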