Introduction to Weakly Supervised Learning
Supervised machine learning relies on labelled data consisting of pairs of inputs and expected outputs; for example, an image of a dog that is labelled "dog". ...
In this post I would like to explain why zero training loss in neural networks is harmful and how we can mitigate it with a technique called flooding. Note that this is not exactly the same problem as overfitting.
Deep neural networks (DNNs) are powerful function approximators that can capture interactions between variables in ways earlier models could not. However, given a network that is powerful enough, or simply larger than the task requires, it can memorize anything, literally anything: as shown by Zhang et al. (2017) in "Understanding deep learning requires rethinking generalization", neural networks are capable of fitting even completely random labels.

Most deep learning problems are solved with gradient-based optimization, but not every gradient-based optimization problem is a deep learning problem. In deep learning we train neural networks to learn the patterns underlying a problem, rather than just to find optimal parameters for it; in a cats-vs-dogs classifier, for example, we want the model to learn to recognize dogs by their ears and faces. The ability of a DNN to classify data outside the training data is called generalization, and it is evaluated by measuring how accurately the DNN performs on data it has not seen during optimization. Generalizing well requires the DNN to achieve a very low error on the training data (very close to zero), but both the training error and the test error have to be low.

Several techniques, known as regularizers, have mitigated this gap well, such as dropout, L1/L2 regularization, and data augmentation. Here I would like to elaborate on another regularizer, one that specifically helps prevent zero training error. It is quite common for DNNs to be very large compared to the dataset (a regime called overparameterization), which at times allows them to attain exactly zero training error. But isn't zero training error good? Not quite: it can lead to extremely high-confidence predictions, which is troublesome from a privacy perspective, since overconfident models can allow an attacker to reconstruct training inputs via black-box attacks at inference time.
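Since flooding is the technique this post is about, here is a minimal PyTorch sketch of it (Ishida et al., 2020, "Do We Need Zero Training Loss After Achieving Zero Training Error?"). The idea is to replace the usual training objective J(θ) with |J(θ) - b| + b for a small flood level b > 0, so the optimizer descends while the loss is above b and ascends once it falls below it. The model, optimizer, and the value b = 0.05 below are placeholders; b is a hyperparameter you would tune on a validation set.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

def flooding_step(model, optimizer, inputs, targets, b=0.05):
    """One training step with the flooding objective |J(theta) - b| + b.

    While the ordinary loss J is above the flood level b, this is
    identical to a normal training step; once J drops below b, the
    gradient direction flips and the step performs gradient ascent,
    keeping the training loss floating around b instead of hitting zero.
    """
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)  # ordinary training loss J
    flooded = (loss - b).abs() + b            # flooding loss, never below b
    flooded.backward()
    optimizer.step()
    return loss.item()
```

Note that |J - b| + b has exactly the same gradients as J whenever J > b, so flooding leaves training untouched until the network approaches zero loss; that is what makes it such a lightweight regularizer.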