Applications
When we are first introduced to deep learning, we tend to see it as just a better machine learning classifier. Alternatively, we may subscribe to the hype that it is 'brain-like' neuro-computing. In the former case, we grossly underestimate the kinds of applications we can build with it. In the latter case, we grossly overestimate its capabilities and, as a consequence, overlook applications that are not general artificial intelligence but are more realistic and pragmatic.
It is best to look at applications of deep learning from the perspective of improving human-computer interaction. This is perhaps the most natural approach. Deep learning systems do appear to have capabilities that approximate those of biological brains. As such, they are most effectively used to augment tasks that humans or even animals have been employed to perform. It is important to remember that deep learning systems are very different from traditional symbolic computing platforms. Just as humans think very differently from how a computer computes, deep learning systems are similarly different.
However, deep learning systems are built on traditional computational technology, so the tireless mechanistic efficiency inherent in computers is also present in deep learning. Computers are much more capable than humans at performing accurate symbolic computation and inference. Deep learning systems are not yet capable of performing complex symbolic computation themselves, but they are by default already linked to this capability.
Applications built using deep learning seem to be straight out of science fiction. Here is a partial sample of some of the incredible applications that have been developed so far:
Photo Captioning for the Blind
Facebook has developed a mobile app that is able to describe a photograph to people who are blind. http://www.wired.com/2015/10/facebook-artificial-intelligence-describes-photo-captions-for-blind-people/
Realtime Speech Translation
Microsoft Skype is able to translate voice into different languages in realtime, much like the universal translator in Star Trek. http://blogs.skype.com/2014/12/15/skype-translator-how-it-works/
Automated Email Replies
Google Mail is able to automatically respond to email on your behalf. http://www.wired.com/2015/11/google-is-using-ai-to-create-automatic-replies-in-gmail/
Object Identification
Moodstocks (acquired by Google) is able to identify common objects using your mobile phone. http://www.slideshare.net/CdricDeltheil1/moodstocks-mobile-image-recognition-paris-tech-talks-6
Location Identification from Photographs
Google is able to identify the location where a photograph was taken just by analyzing the scene. https://www.technologyreview.com/s/600889/google-unveils-neural-network-with-superhuman-ability-to-determine-the-location-of-almost/
Organizing Collections of Photographs
Google Photos is able to automatically organize your photographs into collections with shared themes. https://www.youtube.com/watch?v=JuFtW1PSYAU
Classifying Photographs
Yelp is able to automatically classify photographs into different business relevant categories. http://engineeringblog.yelp.com/2015/10/how-we-use-deep-learning-to-classify-business-photos-at-yelp.html
Self-Driving Cars
A hobbyist is able to teach his car to self-drive in a few hours. http://www.bloomberg.com/features/2015-george-hotz-self-driving-car/
https://arxiv.org/pdf/1604.07316v1.pdf End to End Learning for Self-Driving Cars
We have empirically demonstrated that CNNs are able to learn the entire task of lane and road following without manual decomposition into road or lane marking detection, semantic abstraction, path planning, and control. A small amount of training data from less than a hundred hours of driving was sufficient to train the car to operate in diverse conditions, on highways, local and residential roads in sunny, cloudy, and rainy conditions. The CNN is able to learn meaningful road features from a very sparse training signal (steering alone). The system learns for example to detect the outline of a road without the need of explicit labels during training.
Music Composition
Music can be composed based on different composer styles. http://web.mit.edu/felixsun/www/neural-music.html
Painting Based on Artists' Styles
Paintings can be created in the styles of famous artists. https://nucl.ai/blog/neural-doodles/
Discovery of New Materials
New materials are discovered with the help of deep learning. http://www.nature.com/articles/srep02810
Playing Video Games
Google DeepMind is able to create video game playing systems that learn how to play well by just watching the game. http://www.wired.co.uk/article/google-deepmind-atari
Playing Championship Level Go
Google DeepMind has created a Go playing system that is able to learn new strategies by playing against itself. http://www.scientificamerican.com/article/how-the-computer-beat-the-go-master/
Face Identification
Face recognition is so common that it is no longer surprising.
https://cmusatyalab.github.io/openface/
http://gitxiv.com/posts/fDJ7nHHou57aLEjBQ/the-megaface-benchmark-1-million-faces-for-recognition-at
Click-bait Headline Generation
An RNN is trained to generate click-bait headlines.
https://larseidnes.com/2015/10/13/auto-generating-clickbait-with-recurrent-neural-networks/
Colorization of Black and White Photographs
A system is trained to convert black and white photographs into color. http://richzhang.github.io/colorization/
http://demos.algorithmia.com/colorize-photos/ is a service that lets you try this out on your own photos!
Realtime Translation of Images of Text
Google has a mobile app that translates the text found in a photo into text that you can understand.
https://research.googleblog.com/2015/07/how-google-translate-squeezes-deep.html
Predictive Keyboards
Swiftkey is building keyboards for mobile phones that make it easier and faster for you to type. http://www.slashgear.com/swiftkey-neural-alpha-predicts-what-youll-type-08408912/
Predict the Future
Well, that's the claim by these folks at MIT: http://news.mit.edu/2016/teaching-machines-to-predict-the-future-0621
3D Object Classification
http://3dshapenets.cs.princeton.edu/
Gesture Recognition
Learning the meaning of different hand gestures is likely going to be how we interact with devices that don't have screens.
https://engineering.purdue.edu/cdesign/wp/deephand-robust-hand-pose-estimation/ https://atap.google.com/soli/
Deep Learning for Electromyographic Hand Gesture Signal Classification by Leveraging Transfer Learning
https://arxiv.org/abs/1801.07756
Converting Photos of People to Make Them Smile
SmileVector is able to take an image of a person and transform it into an image of the person smiling.
https://www.engadget.com/2016/06/27/twitter-bot-plasters-creepy-smiles-on-celebrities-faces/
Human Like Conversation
Google has created a messaging application that has more natural conversational capabilities. https://research.googleblog.com/2016/05/chat-smarter-with-allo.html
Augmented Reality - Face Tracking
Baidu created a mobile app that is able to track faces using Deep Learning. The app overlays a 3D image over one's face.
http://research.baidu.com/happy-halloween-baidu-research-introduces-faceyou/ https://www.technologyreview.com/s/602091/baidu-is-bringing-intelligent-ar-to-the-masses/
Warehouse Optimization
A Deep Learning system is trained to learn an optimal way of picking and placing items in a warehouse. This system is faster than the more traditional operations research optimization approach.
https://devblogs.nvidia.com/parallelforall/optimizing-warehouse-operations-machine-learning-gpus/
Sketch to Search
Sketch an image as a query to a visual search.
https://news.developer.nvidia.com/using-sketches-to-search-for-products-online
Prostheses Control
http://arxiv.org/pdf/1602.05702v3.pdf
EEG-informed attended speaker extraction from recorded speech mixtures with application in neuro-steered hearing prostheses.
Accelerating Fluid Simulation
http://cims.nyu.edu/~schlacht/CNNFluids.htm
Leveraging convolution networks to create fast and highly realistic fluid simulations.
Personalization
Amazon drives its personalization capabilities using Deep Learning. http://blogs.aws.amazon.com/bigdata/post/TxGEL8IJ0CAXTK/Generating-Recommendations-at-Amazon-Scale-with-Apache-Spark-and-Amazon-DSSTNE
Brain Tumor Detection
Results reported on the 2013 BRATS test dataset reveal that the 802,368 parameter network improves over published state-of-the-art and is over 30 times faster.
https://arxiv.org/abs/1505.03540
Reducing your Electric Bill
Google is using technology from the DeepMind artificial intelligence subsidiary for big savings on the power consumed by its data centers.
Stocking Shelves
Amazon-sponsored researchers used deep learning to analyze 3D scans of objects that their robot had to pick and replace.
http://www.theverge.com/2016/7/5/12095788/amazon-picking-robot-challenge-2016
Mapping Streets
Facebook is using Deep Learning to create more accurate and current maps from satellite imagery.
http://forum.openstreetmap.org/viewtopic.php?id=55220
Voice Printing
Identifying people through their voice.
https://www.technologyreview.com/s/537101/deep-learning-machine-solves-the-cocktail-party-problem/
Infrared Colorization
Users may more quickly and accurately comprehend infrared images that have been colorized.
http://arxiv.org/abs/1604.02245v3
3D Design
A system takes a 3D voxel representation of a shape and a semantic deformation intention (e.g., make more sporty) as input and generates a deformation flow as output.
Sketch to Generate Realistic Photos
Convert face sketches to synthesize photorealistic face images.
https://arxiv.org/pdf/1606.03073v1.pdf
Predicting Clinical Events
An RNN is trained on time-stamped EHR data from 260 thousand patients and 14,805 physicians over 8 years. The network is able to make multilabel predictions (one label for each diagnosis or medication category). The system can perform differential diagnosis with up to 79% recall, significantly higher than several baselines.
http://arxiv.org/pdf/1511.05942v9.pdf
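To make "multilabel predictions" and "recall" concrete, here is a minimal sketch in plain Python. It assumes the network emits one raw score (logit) per diagnosis or medication category and that a label is predicted when its sigmoid score reaches a threshold. The function names, the threshold of 0.5, and the example numbers are all hypothetical and for illustration only; the paper's exact evaluation protocol is not reproduced here.

```python
import math

def sigmoid(x):
    """Squash a raw score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def predict_labels(logits, threshold=0.5):
    """Return the set of label indices whose sigmoid score meets the threshold.

    Each label is decided independently, which is what makes this multilabel
    rather than single-class prediction. The threshold is an assumption.
    """
    return {i for i, z in enumerate(logits) if sigmoid(z) >= threshold}

def recall(predicted, actual):
    """Fraction of the true labels that appear among the predicted labels."""
    if not actual:
        return 0.0
    return len(predicted & actual) / len(actual)

# Hypothetical scores for five categories at a patient's next visit.
logits = [2.0, -1.0, 0.5, -3.0, 1.2]
actual = {0, 2, 3}            # categories that were actually recorded

predicted = predict_labels(logits)
score = recall(predicted, actual)
print(sorted(predicted), round(score, 3))
```

In this toy case two of the three true categories are recovered, so recall is two thirds. A higher recall means fewer true diagnoses or medications are missed, at the possible cost of more false alarms.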
Skin Evaluation and Recommendation
http://www.glossy.co/making-it-personal/olay-built-a-skin-evaluation-tool-to-help-drugstore-shoppers
Using Deep Learning to determine a customer’s “skin age,” identify problem areas and offer a regimen of products meant to address those issues.
Bioinformatics
http://www.mdpi.com/1422-0067/17/8/1313/htm
Drug design, virtual screening (VS), Quantitative Structure–Activity Relationship (QSAR) research, protein structure prediction and genomics (and other omics) data mining.
Art
http://iq.intel.com/getting-creative-ai-and-machine-learning/
Reducing Risk in Agriculture due to Climate Change
http://www.slideshare.net/ErikAndrejko/deep-learninginagriculture
Mapping Poverty using Satellite Data
https://news.developer.nvidia.com/deep-learning-and-satellite-data-helping-map-poverty
Discover New Compression Algorithms
http://arxiv.org/abs/1608.05148 Full Resolution Image Compression with Recurrent Neural Networks
This is the first neural network architecture that is able to outperform JPEG at image compression across most bitrates on the rate-distortion curve on the Kodak dataset images, with and without the aid of entropy coding.
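A point on a rate-distortion curve pairs a bitrate with a measure of reconstruction quality. The sketch below, in plain Python, shows how one such point could be computed. It uses bits per pixel for the rate and PSNR as a simple stand-in for distortion; the paper may report different quality metrics, and the function names, file size and pixel values are hypothetical.

```python
import math

def bits_per_pixel(compressed_bytes, width, height):
    """Rate: size of the compressed file in bits divided by the pixel count."""
    return compressed_bytes * 8 / (width * height)

def psnr(original, reconstructed, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    Higher is better; identical inputs give infinity.
    """
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)

# Hypothetical example: a 768x512 image compressed to 49,152 bytes.
rate = bits_per_pixel(49152, 768, 512)

# Hypothetical pixel values before and after compression.
original = [52, 55, 61, 59]
reconstructed = [50, 56, 60, 62]
quality = psnr(original, reconstructed)

print(rate, quality)
```

Repeating this measurement at several compression settings gives the points of the curve. One codec outperforms another at a given bitrate when it achieves better quality at the same bits per pixel.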
Writing a Pop Song
http://www.theverge.com/2016/9/26/13055938/ai-pop-song-daddys-car-sony
Transfiguring Portraits
Place your face into another portrait.
http://homes.cs.washington.edu/~kemelmi/Transfiguring_Portraits_Kemelmacher_SIGGRAPH2016.pdf
Speech Synthesis
https://deepmind.com/blog/wavenet-generative-model-raw-audio/
Blur Out Background in Photographs
http://www.theverge.com/2016/9/8/12839838/apple-iphone-7-plus-ai-machine-learning-bokeh-photography
Predicting Corporate Bankruptcies
http://onlinelibrary.wiley.com/doi/10.1111/jbfa.12218/full
YouTube Recommendations
http://research.google.com/pubs/pub45530.html
Sorting Cucumbers
Reducing Traffic
Reverse Engineering Biological Processes
http://phys.org/news/2015-06-planarian-regeneration-artificial-intelligence.html
Realtime Facial Transfer
Research at Stanford shows how you can transfer your expressions onto someone else's face in realtime. This is not a deep learning application, but I would not be surprised if a deep learning system could do something similar. http://graphics.stanford.edu/~niessner/thies2015realtime.html
A deep learning approach that is not realtime: https://arxiv.org/pdf/1610.05586v1.pdf
Fast Face-swap Using Convolutional Neural Networks
https://arxiv.org/abs/1611.09577
Swap the face of Nicolas Cage or Taylor Swift onto another person's face.
Virtual Assistant
https://x.ai/a-peek-at-x-ais-data-science-architecture
Analysis of Disaster Damage
Realtime Conversational Assistance
http://www.huffingtonpost.com/adi-gaskell/machine-learning-and-the-_b_12652122.html
Detect Fashionable Clothing
http://qz.com/821512/artificial-intelligence-for-fashion/
Baby Sleep Monitor
https://blogs.nvidia.com/blog/2016/10/30/babbycam-baby-monitor-deep-learning/
Voice Conversion
https://arxiv.org/abs/1610.08927v1 Voice Conversion using Convolutional Neural Networks
Music Genre Classification
Photorealistic Facial Texture Inference
Music Classification
https://arxiv.org/abs/1611.09827v1
DeepHealth
http://www.nature.com/articles/srep26094
Image Editing
https://www.youtube.com/watch?v=KXmZ39brkzE
https://arxiv.org/pdf/1702.06683.pdf Using Deep Learning and Google Street View to Estimate the Demographic Makeup of the US
Story Points (Task Estimation)
https://arxiv.org/abs/1609.00489
Other Vision Applications https://github.com/kjw0612/awesome-deep-vision
Scene Text Erase
https://arxiv.org/abs/1705.02772v1
Visual Product Discovery
https://arxiv.org/abs/1702.04680
Spatial-Temporal Recurrent Neural Network for Emotion Recognition
https://arxiv.org/abs/1705.04515
Facial Animation
https://arstechnica.com/gaming/2017/08/nvidia-remedy-neural-network-facial-animation/
Crowdturfing
https://arxiv.org/pdf/1708.08151.pdf Automated Crowdturfing Attacks and Defenses in Online Review Systems
Watermark Removal
https://watermark-cvpr17.github.io/
The Conditional Analogy GAN: Swapping Fashion Articles on People Images
https://arxiv.org/pdf/1709.04695v1.pdf
Inspection
See Behind Walls
Neural network identification of people hidden from view with a single-pixel, single-photon detector
https://arxiv.org/abs/1709.07244
Chemical Synthesis
Learning to Plan Chemical Syntheses https://arxiv.org/pdf/1708.04202.pdf
Smart Mirror Makeup
https://arxiv.org/pdf/1709.07566.pdf
https://www.linkedin.com/pulse/your-expertise-longer-needed-sincerely-deep-ben-taylor-ai-hacker
Reading Text in the Wild http://www.robots.ox.ac.uk/~vgg/research/text/#sec-models
http://linkis.com/www.nextplatform.com/inFee
https://estranhosidade.wordpress.com/2016/02/20/the-automation-of-the-technical-part-of-art-the-use-of-artificial-intelligence-in-the-artistic-creation/ THE AUTOMATION OF THE “TECHNICAL” PART OF ART: THE USE OF ARTIFICIAL INTELLIGENCE IN THE ARTISTIC CREATION
https://news.ycombinator.com/item?id=13159908
http://www.yaronhadad.com/deep-learning-most-amazing-applications
http://www.cim.mcgill.ca/~mrl/pubs/saul/egsr04.pdf Sketch Interpretation and Refinement Using Statistical Models
https://arxiv.org/abs/1709.05424v1 NIMA: Neural Image Assessment
https://arxiv.org/abs/1802.02511v1 DeepHeart: Semi-Supervised Sequence Learning for Cardiovascular Risk Prediction
https://www.biorxiv.org/content/early/2018/02/14/265231.full.pdf+html END-TO-END DIFFERENTIABLE LEARNING OF PROTEIN STRUCTURE
https://arxiv.org/abs/1802.06006 Voice cloning
http://www.cs.columbia.edu/cg/fontcode/
https://users.cg.tuwien.ac.at/zsolnai/gfx/gaussian-material-synthesis/
Watermark removal
https://arxiv.org/pdf/1803.04189.pdf
https://arxiv.org/abs/1811.08009v1 Logo Detection
Caricature Drawing https://ai.stanford.edu/~kaidicao/carigan.pdf
Lung cancer
https://github.com/ncoudray/DeepPATH
Comixify https://arxiv.org/abs/1812.03473
Comment Generation https://www.twosixlabs.com/automatically-generating-comments-for-arbitrary-source-code/
Brain to Speech https://www.sciencemag.org/news/2019/01/artificial-intelligence-turns-brain-activity-speech
Generative Creativity FontCode: Embedding Information in Text Documents using Glyph Perturbation