I have been following Andrew Ng's videos on neural networks. In these videos, he doesn't associate a bias with each and every neuron. Instead, he adds a bias unit at the head of every layer after its activations have been computed, and uses this bias along with the activations to compute the activations of the next layer (forward propagation).
However, in some other blogs and videos on machine learning, like this one, a bias is associated with each individual neuron instead. What is the reason for this difference, and what are its implications?
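
To make the comparison concrete, here is a minimal NumPy sketch of how I understand the two conventions (the variable names and layer sizes are just made up for illustration). As far as I can tell, they produce the same pre-activations, so I'm unsure why the presentations differ:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(3)        # activations of the previous layer (3 neurons)
W = rng.standard_normal((2, 3))   # weights of the current layer (2 neurons)
b = rng.standard_normal(2)        # one bias per neuron in the current layer

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Convention 1: a separate bias term added explicitly for each neuron
z_per_neuron = W @ a + b

# Convention 2 ("bias unit" style): prepend a constant 1 to the activations
# and fold the biases into an extra first column of the weight matrix
a_with_bias_unit = np.concatenate(([1.0], a))   # shape (4,)
W_aug = np.hstack((b[:, None], W))              # shape (2, 4)
z_bias_unit = W_aug @ a_with_bias_unit

print(np.allclose(z_per_neuron, z_bias_unit))   # True: same pre-activations
print(sigmoid(z_per_neuron))                    # activations of the next layer
```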