“Learning to See: Gloomy Sunday” by Memo Akten (2018)

“A deep neural network making predictions on live camera input, trying to make sense of what it sees, in context of what it’s seen before. It can see only what it already knows, just like us.

“(not ‘style transfer’!)

“memo.tv/learning-to-see-you-are-what-you-see/

“Music: Diamanda Galás – ‘Gloomy Sunday’

“code based on (but a more evolved version of)
github.com/memo/webcam-pix2pix-tensorflow

“model (i.e. training + inference) based on
github.com/affinelayer/pix2pix-tensorflow

“In turn based on
phillipi.github.io/pix2pix/
arxiv.org/abs/1611.07004

“In turn based on
arxiv.org/abs/1406.2661

“and
github.com/Newmu/dcgan_code
arxiv.org/abs/1511.06434

“In turn based on
github.com/goodfeli/adversarial
arxiv.org/abs/1406.2661

“In turn based on
people.idsia.ch/~juergen/deep-learning-overview.html”
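
For readers following the chain of references: the “adversarial” paper cited twice above (arxiv.org/abs/1406.2661) trains a generator G against a discriminator D. Reproduced here for context, the minimax objective from that paper is

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]

where D learns to tell real images from generated ones and G learns to fool it. pix2pix builds on this by conditioning both networks on an input image rather than on noise alone, which is what makes image-to-image translation (and live camera input) possible.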
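
And for the curious, here is a minimal sketch of the kind of live-inference loop the description above implies: grab a webcam frame, run it through a trained pix2pix generator, and show the result. This is not Memo Akten’s actual code (his repo above is from the TF1 era); it assumes a generator exported as a TensorFlow 2 SavedModel, and the path “pix2pix_generator”, the 256x256 input size, the [-1, 1] scaling, and the channel order are all assumptions to be matched to whatever model you actually trained.

import cv2
import numpy as np
import tensorflow as tf

# Load a generator exported as a SavedModel (path is hypothetical).
model = tf.saved_model.load("pix2pix_generator")
infer = model.signatures["serving_default"]  # default serving signature

cap = cv2.VideoCapture(0)  # default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # pix2pix generators are commonly trained on 256x256 inputs in [-1, 1];
    # adjust size, scaling, and BGR/RGB order to match your trained model.
    small = cv2.resize(frame, (256, 256))
    x = small.astype(np.float32) / 127.5 - 1.0
    y = infer(tf.constant(x[np.newaxis, ...]))   # add batch dim, run generator
    out = list(y.values())[0].numpy()[0]         # first (only) output tensor
    out = ((out + 1.0) * 127.5).clip(0, 255).astype(np.uint8)
    cv2.imshow("learning to see (sketch)", out)
    if cv2.waitKey(1) & 0xFF == ord("q"):        # quit on 'q'
        break

cap.release()
cv2.destroyAllWindows()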

How Does Your Phone Know This Is A Dog?

Last year, we (a couple of people who knew nothing about how voice search works) set out to make a video about the research that’s gone into teaching computers to recognize speech and understand language.

Making the video was eye-opening and brain-opening. It introduced us to concepts we’d never heard of – like machine learning and artificial neural networks – and ever since, we’ve been kind of fascinated by them. Machine learning, in particular, is a very active area of computer science research, with far-ranging applications beyond voice search – like machine translation, image recognition and description, and Google Voice transcription.

So… still curious to know more (and having just started this project), we found Google researchers Greg Corrado and Christopher Olah and ambushed them with our machine learning questions.

More Here
