Collective Learning Patterns
Collective Learning describes patterns that employ more than one deep learning network to achieve a goal.
A deployed neural network is equivalent to a stateless function. The training process, of course, is not stateless: the state is the model representation being learned. Once deployed, however, the model remains static, and the network behaves as a stateless function.
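The statelessness claim can be illustrated with a minimal sketch. The tiny hand-picked weight matrix and `forward` function below are hypothetical, standing in for a trained network whose weights are frozen at deployment time: the forward pass reads no mutable state, so identical inputs always produce identical outputs.

```python
import math

# Hypothetical frozen weights, fixed once training is complete.
W = [[0.5, -0.2], [0.1, 0.3]]  # one row per output unit
b = [0.0, 0.1]

def forward(x):
    """Deployed model: a pure function of its input; no state is read or mutated."""
    return [math.tanh(sum(w * xj for w, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

x = [1.0, 2.0]
# Same input, same output, every time.
assert forward(x) == forward(x)
```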
Computer algorithms have three fundamental constructs: assignment, selection, and iteration. Neural networks traditionally have only the former two; adding iteration into the mix takes the cognitive ability of these networks to a whole new level. This chapter covers patterns for using neural networks in combination with iterative algorithms.
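The idea of wrapping a network in iteration can be sketched as follows. Here `step` is a hypothetical stand-in for one forward pass of a fixed, trained network; the surrounding loop repeatedly feeds the network's output back in as input until the state stops changing, which is the extra construct a plain feed-forward pass lacks.

```python
def step(state):
    # Stand-in for one forward pass of a trained network; this toy
    # contraction has a fixed point at 2.0.
    return 0.5 * state + 1.0

def iterate(state, tol=1e-9, max_iters=100):
    """Apply the network repeatedly to its own output until convergence."""
    for _ in range(max_iters):
        new_state = step(state)
        if abs(new_state - state) < tol:
            break
        state = new_state
    return state

print(round(iterate(0.0), 6))  # converges to the fixed point 2.0
```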
Note that Fitness and Similarity are relevant in this context, as are the evolutionary notions of inheritance and fitness.
Adversarial Training (Dueling Networks)
Generative Model (moved to Explanatory)
Attention - Should be moved to Memory Patterns
Beam Search - Should be moved to Learning
Deep Clustering - Should be moved to Learning
Value-Policy Reinforcement Learning (Value and Policy Nets)
Combinatorial Feature Selection
Hyper-Parameter Tuning - Should be moved to Learning
Graph Based Semi-Supervised Learning
One-Shot Learning (Semi-supervised Learning) - Should be moved to Memory Patterns
Imitation Learning - Not sure if this is a pattern.
Wide and Deep Learning (Cooperative Regularization)
In-Layer Regularization (Layerwise Regularization, Hidden Layer Regularization)
Recurrent Reinforcement Learning
References
http://biorxiv.org/content/biorxiv/early/2016/06/13/058545.full.pdf Towards an integration of deep learning and neuroscience
We hypothesize that (1) the brain optimizes cost functions, (2) these cost functions are diverse and differ across brain locations and over development, and (3) optimization operates within a pre-structured architecture matched to the computational problems posed by behavior. Such a heterogeneously optimized system, enabled by a series of interacting cost functions, serves to make learning data-efficient and precisely targeted to the needs of the organism.
https://arxiv.org/abs/1708.02556 Multi-Generator Generative Adversarial Nets
The training procedure is formulated as a minimax game among many generators, a classifier, and a discriminator. Generators produce data to fool the discriminator while staying within the decision boundary defined by the classifier as much as possible; classifier estimates the probability that a sample came from each of the generators; and discriminator estimates the probability that a sample came from the training data rather than from all generators. We develop theoretical analysis to show that at equilibrium of this system, the Jensen-Shannon divergence between the equally weighted mixture of all generators' distributions and the real data distribution is minimal while the Jensen-Shannon divergence among generators' distributions is maximal. Generators can be trained efficiently by utilizing parameter sharing, thus adding minimal cost to the basic GAN model. We conduct extensive experiments on synthetic and real-world large scale data sets (CIFAR-10 and STL-10) to evaluate the effectiveness of our proposed method. Experimental results demonstrate the superior performance of our approach in generating diverse and visually appealing samples over the latest state-of-the-art GAN's variants.
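The equilibrium claim above (mixture-vs-data Jensen-Shannon divergence minimal, between-generator divergence maximal) can be checked on a toy example. The discrete distributions below are hypothetical, not from the paper: two generators that each cover a disjoint half of the support, whose equally weighted mixture matches the data exactly. JSD is computed in nats (natural log), so its maximum is ln 2.

```python
import math

def kl(p, q):
    """KL divergence between discrete distributions (natural log)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js(p, q):
    """Jensen-Shannon divergence: average KL to the midpoint distribution."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical discrete distributions over four outcomes:
data = [0.25, 0.25, 0.25, 0.25]   # real data distribution
g1   = [0.5, 0.5, 0.0, 0.0]       # generator 1 covers one half of the support
g2   = [0.0, 0.0, 0.5, 0.5]       # generator 2 covers the other half
mixture = [(a + b) / 2 for a, b in zip(g1, g2)]

print(js(mixture, data))  # 0.0: mixture matches the data
print(js(g1, g2))         # ln 2: the generators are maximally diverse
```

This mirrors the stated equilibrium: the mixture reproduces the data distribution while the individual generators remain as different from each other as possible.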