
[–]Gelsamel 80 points81 points  (21 children)

I wonder if this would be useful in high speed photography where getting enough light is usually an issue.

[–]FionHS 99 points100 points  (16 children)

It seems to me that the way this works extrapolates from incomplete data and fills in the blanks, introducing "best guess" image information to the picture. If that assumption is correct, I think technology like this will be most interesting to consumers and amateurs, since the point of high speed photography or professional equipment is to get the clearest possible image, without what amounts to guesswork. (Of course, I could be completely misunderstanding their process.)

[–]no_new_content 29 points30 points  (10 children)

I think you're right. Look how smooth all the surfaces are after the processing is applied, definitely some kind of interpolation. Also does it mention process time (maybe I missed it)? It would only be practical if it can be run in real time.

[–]gerryn 10 points11 points  (2 children)

I would have liked to see an actual video in the end, I'm not convinced this technique can be used on actual video without looking weird/jittery whatever.

[–]theblacklounge 14 points15 points  (0 children)

It can. This version probably can't, but within a few months you'll see papers which adapt this model with well known temporal cohesion techniques.

[–]Trevski 0 points1 point  (0 children)

could be neat in a music video or something though.

[–]theblacklounge 4 points5 points  (1 child)

You're not wrong, it's a fully convolutional network. By definition a deep neural network is interpolation layer upon interpolation layer.

But it's not much like traditional interpolation methods. The reason it's so smooth is that there is no other information. The smoothness is an achievement, not a problem: most ANN approaches up till now use deconvolutions, which produce weird repeating patchwork-like artifacts. Compared to more traditional approaches it's also a big improvement; they will often produce smoothed noise instead, which looks terrible.
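To make that concrete, here's a toy sketch (my own illustration, not the authors' code) of a fully convolutional image-to-image net that upsamples with nearest-neighbour interpolation plus a convolution instead of a transposed convolution ("deconvolution"), which is one common way to dodge those checkerboard/patchwork artifacts:

```python
import torch
import torch.nn as nn

class TinyFCN(nn.Module):
    """Toy fully convolutional image-to-image network (illustrative only)."""
    def __init__(self, channels=3, width=32):
        super().__init__()
        self.down = nn.Sequential(
            nn.Conv2d(channels, width, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
        )
        # Upsample + conv instead of ConvTranspose2d ("deconvolution"),
        # which avoids the repeating checkerboard artifacts mentioned above.
        self.up = nn.Sequential(
            nn.Upsample(scale_factor=2, mode='nearest'),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, channels, 3, padding=1),
        )

    def forward(self, x):
        return self.up(self.down(x))

net = TinyFCN()
print(net(torch.rand(1, 3, 64, 64)).shape)  # torch.Size([1, 3, 64, 64])
```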

[–]kwizzles 0 points1 point  (0 children)

This sounds right

[–]khoggatt 4 points5 points  (3 children)

The processing time should only be an issue when the algorithm is done in software. If the image pipeline is implemented in an ASIC, it should be negligible.

[–]t0b4cc02 3 points4 points  (1 child)

I want to add that, with the technology they use (a neural network), training is usually the time-intensive part; running the trained network should require orders of magnitude less time.

[–]mimomusic 0 points1 point  (0 children)

Yup. The current worst cases for inference ("feeding a trained network data and getting the results back") are probably time-dependent synthesis, i.e. producing audio or video that needs a sample at fixed intervals. Even so, we're still talking orders of magnitude less time than training, and those numbers are rapidly going down on top of that.
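If anyone wants a feel for how you'd check that, here's how you'd time a single inference pass. The model below is a tiny stand-in, not the paper's network, so the absolute figure is meaningless; the point is that inference is just a forward pass, not a training run:

```python
import time
import torch
import torch.nn as nn

# Tiny stand-in for an already-trained image-to-image model (hypothetical).
net = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
).eval()
frame = torch.rand(1, 3, 512, 512)   # one fake input frame

with torch.no_grad():                # inference only: no gradients, no weight updates
    net(frame)                       # warm-up pass
    t0 = time.perf_counter()
    net(frame)
    print(f"one forward pass: {time.perf_counter() - t0:.4f} s")
```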

[–]tdgros 1 point2 points  (0 children)

Replacing a large GPU with a large ASIC is possible, but processing time is not negligible at all. It can be similar to what we get today on GPUs, but with much less power consumed...

[–]mimomusic 0 points1 point  (0 children)

Well, sort of. The cool thing about these techniques is how we get increasingly believable scene features: reflections are captured and preserved (in fact, light transport, in general, seems to work exceedingly well), textures are synthesized from basically nothing and larger semantic entities are guessed to significantly boost visual fidelity. This is basically a denoiser, as described in the paper - it's just that it's really good.

https://data.vision.ee.ethz.ch/aeirikur/extremecompression/

As for speed:

Another opportunity for future work is runtime optimization. The presented pipeline takes 0.38 and 0.66 seconds to process full-resolution Sony and Fuji images, respectively; this is not fast enough for real-time processing at full resolution, although a low-resolution preview can be produced in real time.

We expect future work to yield further improvements in image quality, for example by systematically optimizing the network architecture and training procedure. We hope that the SID dataset and our experimental findings can stimulate and support such systematic investigation

Definitely not bad and bound to get better quickly (duh). You could set an indie horror film in this specific time niche, where the protagonist's phone takes ten seconds per picture to infer, and it would be believable. They are talking about pretty big images, though, so that should be adjustable as well. I can't say too much about the actual "consumer device high-speed photography" discussion, but I'd be very surprised if it didn't pan out one way or another; right now there are few areas that aren't making sizeable strides.

This is just the ridiculous beginning of something much larger. Creative work in particular is going to get (and has already been getting) great boosts everywhere, from ADR to conventional VFX. Just imagine the consequences of tedious roto work no longer being a thing, or at least only needing superficial supervision.

[–]frizbplaya 4 points5 points  (1 child)

I was thinking the same thing. There will almost certainly be a loss of detail in this process but maybe it can be so low that we don't perceive it.

[–]MartinSchou 0 points1 point  (0 children)

While there's a loss of detail compared to real life, compared to the other methods it adds detail. Use the photo with the Digestive biscuits and the Fiber-something package as a reference (starts at 2:00): before this method, I certainly couldn't tell that it said Fiber (I could only make out the "iber"), or even that there was a package of Digestives in the picture.

[–]artifex0 4 points5 points  (0 children)

As a graphic designer, this looks incredibly useful for my work as well.

I mean, ideally I'm only working with photographs taken by experienced professionals, but in practice I'm sometimes doing things like taking a frame from a video with terrible lighting and spending hours struggling to make it look halfway decent on a poster.

[–]Ominusx 0 points1 point  (0 children)

It's not quite like you'd think. It's a convolutional neural network, likely trained with the low-light picture as the input and the same image taken at a higher light level as the intended output; the connections between the neurons then have their weights and biases shifted by a gradient descent algorithm.
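A stripped-down sketch of that training loop (purely illustrative; the real network, data, and optimizer details will differ):

```python
import torch
import torch.nn as nn

# Fake paired data: dark short-exposure inputs and the same scenes taken
# with a long exposure as targets (hypothetical stand-in for a real dataset).
dark = torch.rand(8, 3, 64, 64)
bright = torch.rand(8, 3, 64, 64)

net = nn.Sequential(                     # stand-in for the actual CNN
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)
opt = torch.optim.SGD(net.parameters(), lr=1e-3)  # plain gradient descent
loss_fn = nn.L1Loss()                             # distance to the bright image

for step in range(100):
    opt.zero_grad()
    loss = loss_fn(net(dark), bright)  # how far the output is from the target
    loss.backward()                    # gradients w.r.t. every weight and bias
    opt.step()                         # shift weights/biases down the gradient
```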

[–]ReadMyHistoryBitch 0 points1 point  (0 children)

It hurts reading “ideas” from non-technical people.

[–]extracoffeeplease 0 points1 point  (0 children)

I'm sure some photographers will have an issue with it, as it's (at least in essence) like applying graphics effects to your photo.

[–]Tango_Mike_Mike 0 points1 point  (0 children)

It eventually will be; technology like this will end up in something like a Sony A7S IV or V.

[–][deleted] 0 points1 point  (0 children)

I recently worked on a live sports broadcast in which I had to install a SHITLOAD of lighting. The answer to my obvious question: slow mo.

[–]MadHatter69[S] 65 points66 points  (10 children)

[–]t0f0b0 22 points23 points  (7 children)

The first guy listed... Is his first name Chen and his last name Chen, or is his first name Chen and his last name Chen?

[–]ashisme 37 points38 points  (1 child)

Yes.

[–]t0f0b0 9 points10 points  (0 children)

Thank you.

[–]bubbrubb22 17 points18 points  (1 child)

Please, Chen is my father's name. Call me Chen.

[–][deleted] 2 points3 points  (0 children)

Reminds me of my Asian friend John, who married his American wife Joan. His family in China were very confused about why he married someone with the exact same name. Hilarious.

[–]OldMcFart 1 point2 points  (1 child)

The comma between the names would indicate his first name is Chen.

[–]t0f0b0 0 points1 point  (0 children)

You're right. I wasn't thinking.

[–]ukkoylijumala 1 point2 points  (1 child)

[–]qeadwrsf -2 points-1 points  (0 children)

for me 2.7 is just better because you can write:

print 'hello'

instead of

print('hello')

[–]Aliforansoco 88 points89 points  (8 children)

Fantastic work! This is a breakthrough with lots of real-world applications!

[–]root88 6 points7 points  (3 children)

I have wanted to upgrade from a crop to a full-frame camera for a long, long time now. I always assumed that the second I did it, a graphene sensor would come out. This could definitely hold me over.

[–]merryman150 4 points5 points  (2 children)

Is there anything graphene can't revolutionize? I didn't even know about this application.

[–]tryfap 3 points4 points  (0 children)

Is there anything graphene can't revolutionize?

Yes, it can't revolutionize getting out of the lab.

[–]nadmaximus 0 points1 point  (0 children)

All I can think about are all the wonderful applications of asbestos. It's so useful!

[–]Praesumo 0 points1 point  (0 children)

Does it work live, or is it only for images, and each image takes 5 minutes of processing?

[–]AvailableConcern 32 points33 points  (5 children)

An important caveat is that it needs to be trained for the particular sensors.

[–]londons_explorer 4 points5 points  (1 child)

particular type of sensor, or particular sensor itself?

E.g. is it learning defects in the leakage current of the sensor and nonlinearities in the electrical properties of each pixel?

[–]AvailableConcern 10 points11 points  (0 children)

Particular sensor itself*

Check out their paper given in the video description

[–]naughty_ottsel 0 points1 point  (0 children)

Reading the issues on the repo, they used the Sony dataset for the iPhone results. This could be because the sensors are similar (can't be certain), but for many people this sort of model on smartphones would produce good-enough results.

[–]IndiaNgineer 26 points27 points  (12 children)

It's essentially a denoiser, denoising in a non-linear space instead of in a linear space like BM3D. I'd be interested to see what parameters were used for BM3D, and how their net was trained.

Edit: The smoothness you see is likely because it's a fully convolutional net. The processing time is negligible, but the training time will be substantial. They're using an L1 loss, so it's likely learning a sparse combination of kernels.

Their paper says: 'During training, the input to the network is the raw data of the short-exposed image and the ground truth is the corresponding long-exposure image in sRGB space (processed by libraw, a raw image processing library). We train one network for each camera.' So it is essentially a denoiser, with a different denoiser for each camera.
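For anyone curious what that libraw step looks like in practice, a minimal sketch using the rawpy bindings (the exact settings and file name are my guess, not necessarily the authors'):

```python
import numpy as np
import rawpy  # Python bindings for LibRaw

# Turn a long-exposure raw file into an sRGB ground-truth image.
with rawpy.imread('long_exposure.ARW') as raw:      # hypothetical file name
    srgb = raw.postprocess(use_camera_wb=True,
                           no_auto_bright=True,
                           output_bps=16)           # demosaic -> sRGB, 16-bit
target = np.float32(srgb) / 65535.0                 # normalize to [0, 1]
print(target.shape, target.min(), target.max())
```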

[–]Hydropos 8 points9 points  (5 children)

The process time is negligible

If this is the case, I wonder if it could be used for a sort of night-vision sight/goggles. It wouldn't work in pitch darkness, but most of the time there is faint ambient light.

[–]titanmaster4 12 points13 points  (4 children)

Negligible relative to training time does not mean the processing time is actually fast; it could take 3 minutes, and that would still be 'negligible' compared to the 3 weeks (probably more) it took to train the model.

[–]DarkTussin 0 points1 point  (3 children)

Neural nets work pretty quickly, right?

Basically a series of complex if statements at the compiler level.

[–]cheekyyucker 3 points4 points  (0 children)

yea, and the universe is basically gravity and magnets, sure!

[–]titanmaster4 4 points5 points  (0 children)

When the output space is a single classification, yes. In this case I doubt they can transform the entire image within the fraction of a second necessary for a real-time display. It'd be cool if I'm wrong tho

[–]serg06 0 points1 point  (0 children)

http://cv-tricks.com/wp-content/uploads/2017/03/xalexnet_small-1.png.pagespeed.ic.EZPeJF1qqb.webp

That was a breakthrough neural network in 2012. 62.3 million parameters. And that's with every image downsized to 224x224 on input, losing a ton of info which is probably required for this application. Could easily be a billion parameters that need to be used in multiple calculations. (I guess that's not too long, but the number of parameters grows extremely quickly with each layer, and with resolution.)
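For scale, the parameter count of a conv layer is just (kernel_h * kernel_w * in_channels + 1) * out_channels, so the totals blow up fast once the channel counts grow:

```python
def conv_params(k, c_in, c_out):
    """Weights plus one bias per output channel for a k x k convolution."""
    return (k * k * c_in + 1) * c_out

# First AlexNet conv layer: 96 filters of 11x11 over 3 input channels.
print(conv_params(11, 3, 96))     # 34,944
# A mid-network 3x3 layer with 384 channels in and out:
print(conv_params(3, 384, 384))   # 1,327,488
```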

[–]nate6259 0 points1 point  (3 children)

Dumb question time: Is this at all similar to what my Google Pixel (or similar smartphones) does with the HDR processing mode? I often notice a similar noise level while it is processing, and then it looks much clearer a few seconds later. Maybe not as extreme as in these examples, but definitely a noticeable improvement.

[–]iainmf 0 points1 point  (0 children)

IIRC, Google's HDR processing uses multiple frames and combines them in a smart way to reduce noise, whereas OP's technique uses just one frame.
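Roughly, the multi-frame trick is align-and-average so the noise cancels out; a naive sketch with synthetic data (ignoring the alignment and robust merging that HDR+ actually does):

```python
import numpy as np

rng = np.random.default_rng(0)
clean = rng.random((1, 64, 64, 3))                                  # pretend "true" scene
burst = np.clip(clean + rng.normal(0, 0.1, (8, 64, 64, 3)), 0, 1)   # 8 noisy shots

merged = burst.mean(axis=0)   # noise std drops by roughly sqrt(8)
print("single frame error:", np.abs(burst[0] - clean).mean())
print("merged frame error:", np.abs(merged - clean).mean())
```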

[–]IndiaNgineer 0 points1 point  (0 children)

No, HDR is a different system. In this system, they're essentially replacing the enhancement and subsequent denoising with one non-linear process that does both simultaneously.

[–]iainmf 0 points1 point  (1 child)

What do you think about the comparison between using BM3D at the end of a conventional image pipeline vs. this method of replacing the whole pipeline? I suspect BM3D would get better results if it were earlier in the pipeline (e.g. immediately after demosaicing).

[–]IndiaNgineer 0 points1 point  (0 children)

That's a good point. It's possible, but it depends on the type of noise. The SSIM results are more interesting: if you look, the CAN set has a higher SSIM, which is more significant. An increase in SSIM matters more since SSIM is more correlated with visual perception; PSNR/MMSE is insensitive to a lot of these variations.
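If anyone wants to poke at the metrics themselves, both are one-liners in scikit-image (dummy images below; the paper's numbers come from their own test set):

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
reference = rng.random((128, 128))                                   # stand-in ground truth
restored = np.clip(reference + rng.normal(0, 0.05, reference.shape), 0, 1)

print("PSNR:", peak_signal_noise_ratio(reference, restored, data_range=1.0))
print("SSIM:", structural_similarity(reference, restored, data_range=1.0))
```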

[–]mymomdressesmefunny 33 points34 points  (3 children)

Enhance!

[–]nate6259 1 point2 points  (0 children)

Those wipes were so satisfying. Giving us that pause in the middle to keep up the suspense.

[–]LeFuneh 0 points1 point  (0 children)

Furreal. That's what this is going to be, one day.

[–]midnight-souls 6 points7 points  (0 children)

Having worked a lot with noisy photos, I can't even describe how incredible this technology is. Amazing! I wonder how it can be applied to astrophotography too. I would love to see more of this, and how versatile this method is. How much training is required I wonder?

[–]cybaritic 22 points23 points  (5 children)

"This image should look like this, I can tell from some of the pixels and from seeing quite a few images in my time"

Machine learning in a nutshell

[–]cench 1 point2 points  (1 child)

"This image should look like this, I can tell from some of the neurotransmissions and from seeing quite a few images in my time"

Human brain in a boneshell

[–][deleted] 0 points1 point  (0 children)

woa

[–]Steelkatanas 2 points3 points  (2 children)

They don't think it be like it is, but it do

[–][deleted] 0 points1 point  (1 child)

old-ass meme reference comment

[–]Javlington 0 points1 point  (0 children)

And also old-ass original comment!

[–]bmcnns 4 points5 points  (1 child)

Wow! The difference is night and day.

[–]Jugg3rnaut1 1 point2 points  (0 children)

Damn it dude. Just 1 time I'd like to be the first to the comment section with my joke

[–]Rifta21 4 points5 points  (0 children)

Okay, while this is undeniably awesome, you are using the A7S II wrong. No way it's that noisy if you know what you are doing. I get that the pictures were taken on a tripod but you didn't want to take advantage of long exposure, but still. The A7S II still looks amazing even at ISO 12,000, not even taking into account that it can go up to 409,000.

[–]lactoseracism 12 points13 points  (1 child)

Maybe this can make DC movies less dark.

[–]Juicy_Brucesky 1 point2 points  (0 children)

DC movies being dark is the least of their concerns

MARTHA!!!!!!!!!!!!

[–]Armanlex 5 points6 points  (0 children)

Holy shit this is mindblowing. Wow.

[–]swordo 2 points3 points  (0 children)

Hoping that someday ML will understand what is going on in the scene and then redraw it based on information it knows, not just what it sees at that moment. For example: a car in a dark room may have a low SNR, but the model will know what a car looks like under ideal lighting conditions and can use that information to augment the scene.

[–]seprock 4 points5 points  (3 children)

As a photographer, I find this so cool and amazing!

[–]HelloTosh 1 point2 points  (0 children)

Teaching the Terminators to see in the dark without infrared?

[–]Hrolfgard 1 point2 points  (0 children)

Please please PLEASE someone apply this algorithm to grainy photos of Bigfoot.

It's really hard to find high-quality images of gorilla costumes in practical use.

[–]lenovosucks 1 point2 points  (0 children)

Damn, the end result almost looks like CGI. That's insane.

[–]Uwe_Tuco 2 points3 points  (0 children)

That's really impressive!

[–]wizardeyejoe 1 point2 points  (0 children)

Someone show this to whoever did the lighting on The Handmaid's Tale.

[–]ChalkyTannins 1 point2 points  (4 children)

Marc Levoy posted an excellent slide deck years ago on how the pipeline works.

http://graphics.stanford.edu/talks/seeinthedark-public-15sep16.key.pdf

And a video https://www.youtube.com/watch?v=S7lbnMd56Ys

The technology has been in your phone for years; I'm not sure what's different about this approach.

[–]Armanlex 6 points7 points  (1 child)

He said, "it accumulates frames". This neural network, I think, only needs one image. The one you linked takes multiple pictures and simulates the long exposure technique. Still really neat, but both approaches have different ups and downs.

[–]ChalkyTannins 0 points1 point  (0 children)

Google's camera does both, although it is 'tuned' more specifically to HDR (removing unwanted artifacts/ghosts in dynamic scenes) and high-frequency edge preservation than exclusively to low light.

[–]Flying-Artichoke 0 points1 point  (1 child)

These are two very different technologies. Both are really great, and Google is applying more ML in their pipeline, but not quite to this magnitude yet. They mostly just use it for image segmentation and scene recognition.

[–]ChalkyTannins 0 points1 point  (0 children)

https://groups.csail.mit.edu/graphics/hdrnet/

and compared to posted:

http://web.engr.illinois.edu/~cchen156/SID/examples/30_hdr.html

Tuned more towards preserving detail and edge enhancement than noise reduction and low light, and also VERY light on resources.

[–]t0f0b0 0 points1 point  (0 children)

Cool!

[–]Warlaw 0 points1 point  (0 children)

I want to take a bunch of night pictures now.

[–]aManPerson 0 points1 point  (0 children)

So this can be done with old photos too? Yay :).

From the video, it sounds like they may have been starting with RAW-format pictures.

[–]shwoople 0 points1 point  (0 children)

Pretty impressive! What are the practical implications of this? How could this be produced as a product? Would today's flagship phones have enough processing power to run the algorithms necessary to produce such a result? And if so, how long would it take to process one photo?

[–]vxxed 0 points1 point  (0 children)

This certainly looks like magic

[–]maheshvara_ 0 points1 point  (1 child)

Wonder if this is in use in the Pixel series of phones?

[–]DonMahallem 0 points1 point  (0 children)

As far as I can tell they do use multiple shots for the same effect: https://ai.googleblog.com/2018/02/introducing-hdr-burst-photography.html

[–]Mentioned_Videos (Approved Bot) 0 points1 point  (0 children)

Other videos in this thread: Watch Playlist ▶

VIDEO - COMMENT
enhance! sleep mode! (+4) - Enhance!
Aretha Franklin Chain of Fools (+1) - Chen Chen Chen
Darth Vader NO! (+1) - Required python (version 2.7) Noooooooooooooooooooooooooooooooooooooooooooooooo.....
SeeInTheDark (+1) - Marc Levoy posted an excellent slide deck years ago on how the pipeline works. And a video. The technology has been in your phones for years, not sure what's different about this approach.

I'm a bot working hard to help Redditors find related videos to watch. I'll keep this updated as long as I can.


Play All | Info | Get me on Chrome / Firefox

[–]benzokenzobar 0 points1 point  (0 children)

This could be a huge breakthrough for indie filmmakers, for whom low-light conditions make filming a challenge.

[–]spoons2380 0 points1 point  (0 children)

I've been using the Neat Video denoiser for years to achieve similar results; you can use it for stills as well. It seems to me the denoiser used in the comparison example is poor quality and isn't really a good example of what's already out there.

[–]chaosfire235 0 points1 point  (0 children)

Ha! You have no more power over us, spooky-ghost-in-a-grainy-corner-of-the-frame!

[–]GregIsUgly 0 points1 point  (0 children)

The way the noise seamlessly disappears is amazing to see

[–]simonmales 0 points1 point  (0 children)

Enhance!

[–]kwizzles 0 points1 point  (2 children)

Take my money!

[–]MadHatter69[S] 0 points1 point  (1 child)

and shut up?

[–]kwizzles 0 points1 point  (0 children)

And deliver

[–]rudyv8 0 points1 point  (0 children)

This would be really useful if applied to a smoky environment for fire trucks.

[–]inuit7 0 points1 point  (0 children)

I would pay a lot of money for this.

[–]IAM_Deafharp_AMA 0 points1 point  (0 children)

Is there a subreddit that shows off cool tech demos like this?

[–]pwillia7 0 points1 point  (0 children)

I'd like to see this applied with high speed cameras.

[–]WingerRules 0 points1 point  (0 children)

Even the blur it causes has a nice quality to it.

[–]rush22 0 points1 point  (0 children)

Can't you just use more sensitive film?

[–]ILOIVEI 0 points1 point  (0 children)

So many aliens are about to get revealed y'all

[–]readmond 0 points1 point  (0 children)

My guess is that even minor camera shake would f this up. Maybe it could be improved by more training. Anyhow, this is impressive.

[–]Chaosrains 0 points1 point  (0 children)

fork handles

[–][deleted] 0 points1 point  (0 children)

So much information there if you find it.

[–]Javlington 0 points1 point  (0 children)

Well, that was uninformative! I learnt nothing :(

[–]guernica88 0 points1 point  (6 children)

in b4 google buys this

[–]ChalkyTannins 7 points8 points  (5 children)

Google's had this for years thanks to Marc Levoy

https://www.youtube.com/watch?v=S7lbnMd56Ys

[–]Armanlex 4 points5 points  (0 children)

Isn't this more like long exposure, but in software? Not the same way OP's video does it, but still pretty impressive!

[–]IMPORTANT_FROG 0 points1 point  (3 children)

Then why don't they implement this in Google Photos?

[–]ChalkyTannins 5 points6 points  (0 children)

It is, to an extent, with HDR+.

To see farther into the dark, like in these demos, Google has probably deemed the (consistency of) image quality, processing time, and human interface usability not up to an acceptable level to ship on their flagship devices.

It'll get there eventually.

[–]PublicMoralityPolice 0 points1 point  (0 children)

Because this specific algorithm requires RAW images (i.e. direct Bayer sensor readings), which most phone and handheld cameras don't provide.

[–]alecs_stan -1 points0 points  (0 children)

Where do you place the wizard?

[–]deepmaus -1 points0 points  (0 children)

The last time I was impressed by one of these was waifu2x.

[–]cfryant -1 points0 points  (0 children)

"Curse them! They make the night brighter than the day!"

[–]whozurdaddy -1 points0 points  (0 children)

Let me know when the IPO is filed. Insta-millionaires. Absolutely incredible.