What frameworks are commonly used in deep learning?
Common deep learning frameworks include TensorFlow, PyTorch, and MXNet, along with Keras, a high-level API that runs on top of backends such as TensorFlow.
How do you handle overfitting in deep learning models?
Overfitting can be handled by using techniques such as dropout, data augmentation, L1/L2 regularization, and early stopping.
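As a rough illustration of early stopping, the sketch below scans a list of per-epoch validation losses and stops once the loss has failed to improve for a set number of consecutive epochs. The function name `early_stopping_epoch`, the `patience` parameter, and the loss values are hypothetical and chosen only for illustration.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops at the first epoch where the loss has not improved
    for `patience` consecutive epochs. Epochs are 0-indexed.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # Patience was never exhausted: training ran for every epoch.
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses: improving at first, then rising (overfitting).
losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.70]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=2)
```

In a real training loop, the weights saved at `best_epoch` would typically be restored after stopping.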
Can you explain the differences between CNNs and RNNs?
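CNNs (Convolutional Neural Networks) apply shared convolutional filters to grid-like data such as images, capturing local spatial patterns and combining them into hierarchies, whereas RNNs (Recurrent Neural Networks) process sequential data one step at a time, carrying a hidden state that captures temporal dependencies.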
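The contrast can be sketched in a few lines of plain Python, using a 1D signal instead of an image for brevity. The helper names `conv1d` and `rnn_final_state`, the scalar weights, and the inputs are illustrative assumptions, not part of any framework.

```python
import math


def conv1d(signal, kernel):
    """Valid 1D cross-correlation: slide the same kernel over every position."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def rnn_final_state(sequence, w_input, w_hidden):
    """Scalar recurrent cell: h_t = tanh(w_input * x_t + w_hidden * h_{t-1})."""
    h = 0.0
    for x in sequence:
        h = math.tanh(w_input * x + w_hidden * h)
    return h


# The kernel [-1, 1] responds to a local step between neighbours, wherever it occurs.
edges = conv1d([0, 0, 1, 1, 0], [-1, 1])

# The recurrent state depends on order: same values, different sequence.
h_forward = rnn_final_state([1.0, 0.0, 0.0], w_input=1.0, w_hidden=0.5)
h_reversed = rnn_final_state([0.0, 0.0, 1.0], w_input=1.0, w_hidden=0.5)
```

The convolution reuses one small set of weights at every position, while the recurrent cell reuses one set of weights at every time step and passes information forward through `h`.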
How do you decide the architecture of a neural network?
The architecture is chosen based on the type of data, the complexity of the task, and the available computational resources, and is then refined through experimentation, for example by varying the depth, layer widths, and other hyperparameters.
What is transfer learning in deep learning?
Transfer learning involves reusing a model pre-trained on one problem as the starting point for a new but related problem, which can reduce the amount of training data and time required to train the model.
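What role do activation functions play in a neural network?
Activation functions introduce non-linearity into the network, which lets it learn complex patterns; without them, a stack of layers would collapse into a single linear transformation. Each function determines how strongly a neuron's output responds to its weighted input.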
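A minimal sketch of why the non-linearity matters, using a scalar two-layer network. The `two_layer` helper and the weights 2.0 and 3.0 are hypothetical values picked for illustration.

```python
import math


def relu(x):
    """Rectified linear unit: pass positive inputs, zero out negative ones."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic sigmoid: squash any real input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def two_layer(x, w1, w2, activation=None):
    """Scalar two-layer network: w2 * activation(w1 * x)."""
    hidden = w1 * x
    if activation is not None:
        hidden = activation(hidden)
    return w2 * hidden


# Without an activation, the two layers collapse into one linear map (w2 * w1) * x.
linear_out = [two_layer(x, 2.0, 3.0) for x in (-1.0, 1.0)]

# With ReLU, negative pre-activations are zeroed, so the map is no longer linear.
relu_out = [two_layer(x, 2.0, 3.0, activation=relu) for x in (-1.0, 1.0)]
```

Here `linear_out` is simply 6 times the input, whereas `relu_out` treats negative and positive inputs differently, which no single linear layer can do.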
How do you implement model evaluation and validation?
Model evaluation and validation involve splitting the dataset into training, validation, and test sets, measuring performance with metrics such as accuracy, precision, and recall, and using cross-validation to obtain a more reliable estimate of how well the model generalizes.
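The metrics and the cross-validation split can be sketched as follows for a binary classifier. The function names, the labels, and the predictions are made up for illustration; a library such as scikit-learn would normally provide these utilities.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, val
        start += size


# Hypothetical labels and predictions: 2 TP, 2 TN, 1 FP, 1 FN.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
accuracy, precision, recall = classification_metrics(y_true, y_pred)

# Each sample lands in the validation fold exactly once.
folds = list(k_fold_indices(6, 3))
```

In practice the data is usually shuffled before folding, and the test set is held out entirely until the final evaluation.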
Why is batch normalization used in deep learning?
Batch normalization stabilizes training by normalizing the inputs to each layer over every mini-batch, then applying a learned scale and shift, which accelerates convergence and sometimes improves model accuracy.
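A minimal sketch of the training-time computation for a single feature across one mini-batch. The `batch_norm` function, its default `gamma`, `beta`, and `eps` values, and the batch itself are illustrative assumptions; in a real network `gamma` and `beta` are learned.

```python
import math


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # biased (population) variance
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


# Hypothetical activations of one neuron for a mini-batch of four samples.
batch = [2.0, 4.0, 6.0, 8.0]
normalized = batch_norm(batch)

# After normalization the batch has roughly zero mean and unit variance.
out_mean = sum(normalized) / len(normalized)
out_var = sum((v - out_mean) ** 2 for v in normalized) / len(normalized)
```

At inference time, frameworks typically replace the per-batch statistics with running averages collected during training.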
What is a hyperparameter in deep learning?
Hyperparameters are configuration settings, such as the learning rate, batch size, and number of epochs, that are set before training rather than learned from the data.
How is gradient descent used in training a neural network?
Gradient descent iteratively updates the weights of the neural network in the direction opposite to the gradient of the loss function, with the gradients computed by backpropagation, so that the loss decreases and the model learns from the data.
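The update rule can be sketched on the smallest possible model, a line y = w * x + b fitted by minimizing mean squared error. The `fit_line` function, the learning rate, the epoch count, and the data are assumptions chosen for illustration; a neural network applies the same rule to many more weights.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every sample under the current weights.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient to reduce the loss.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1
w, b = fit_line(xs, ys)
```

With these settings the weights converge close to the values used to generate the data; a learning rate that is too large would make the updates diverge instead.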