I found that in NLCG and L-BFGS the initial step-length guess for the line search is determined by
"old_old_fval = old_fval + np.linalg.norm(gfk) / 2" (line 1621 in https://github.com/scipy/scipy/blob/master/scipy/optimize/_optimize.py).
The accompanying comment reads "Sets the initial step guess to dx ~ 1".
What is the principle behind it? Could you give me some guidance?
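
To make the question concrete, here is a small sketch of one possible reading, which I would like to have confirmed or corrected. It assumes two things that are not stated in the comment itself: that the line search picks its first trial step by the interpolation rule alpha0 = min(1, 1.01 * 2 * (f_k - f_{k-1}) / phi'(0)), and that the first search direction is steepest descent, pk = -gfk. The helper name initial_step_guess and the numbers are hypothetical and only for illustration.

```python
import numpy as np


def initial_step_guess(phi0, old_phi0, derphi0):
    """First trial step from the previous decrease in f, capped at 1.

    Assumed form of the rule (not copied from the library):
    alpha0 = min(1, 1.01 * 2 * (phi0 - old_phi0) / derphi0).
    Falls back to 1.0 when the rule cannot be applied.
    """
    if old_phi0 is not None and derphi0 != 0:
        alpha0 = min(1.0, 1.01 * 2 * (phi0 - old_phi0) / derphi0)
        if alpha0 > 0:
            return alpha0
    return 1.0


# Hypothetical starting point: gradient with norm 50, arbitrary f value.
gfk = np.array([30.0, -40.0])
old_fval = 7.0

# The line from the question: pretend the previous f value was
# larger than the current one by ||gfk|| / 2.
old_old_fval = old_fval + np.linalg.norm(gfk) / 2

# Assumed first direction: steepest descent.
pk = -gfk
derphi0 = float(np.dot(gfk, pk))  # directional derivative, -||gfk||^2

alpha0 = initial_step_guess(old_fval, old_old_fval, derphi0)
dx = alpha0 * np.linalg.norm(pk)  # length of the first trial displacement

print(alpha0, dx)
```

Under these assumptions the fake decrease of ||gfk|| / 2 gives alpha0 of about 1.01 / ||gfk||, so the first trial displacement alpha0 * ||pk|| has length of about 1 regardless of how large the gradient is. Is that the intended meaning of "dx ~ 1"?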