Start from
P(T)=P(T+1)/P(1)*h(T)
Here h is the so-called perturbation function, which depends only on the time to maturity; its starting time is always 1. In other words, h(T) is really h(1,T). (You can also think of it as starting from 0: since P(0)=1, that factor is simply dropped.)
In this case h(T) can never go forward. For example, we cannot write P(T)=P(T+2)/P(2)*h(T), because there h(T) would have to start from 2.
Instead we can do the following:
P(T+1)=P(T+2)/P(1)*h(T+1)
P(1)=P(2)/P(1)*h(1) (forgive my messy notation: P(1) on the left is not the same as P(1) on the right. They start at different nodes, but I think you know what I mean. P(1) on the left is actually the forward 1 yr discount factor, while the terms on the right are all spot.)
So substitute these expressions for P(T+1) and P(1) into P(T)=P(T+1)/P(1)*h(T), and you get the result. To keep the expressions neat, I omitted all the i+1 subscripts. I took a screenshot from a book, which may be much clearer:
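Separately from the book's presentation, the substitution can be checked numerically. Doing the algebra in the same shorthand, the one-step terms P(1) cancel and what remains is P(T) = P(T+2)/P(2) * h(T+1)*h(T)/h(1), where the left side sits two steps later and everything on the right is spot. The sketch below is only an illustration under my own assumptions: the flat 5% curve, the values of PI and DELTA, and the helper names (P0, step_up, P2_closed) are all hypothetical, and the Ho-Lee form of h is assumed here although the identity itself holds for any positive h.

```python
# Hypothetical inputs, chosen only for illustration.
PI = 0.5      # assumed risk-neutral probability of an up move
DELTA = 0.99  # assumed spread parameter in the Ho-Lee form of h


def h(T):
    """Assumed Ho-Lee up-state perturbation function; note h(0) = 1."""
    return 1.0 / (PI + (1.0 - PI) * DELTA ** T)


def P0(T):
    """Assumed spot discount curve today: flat 5% per period."""
    return 1.05 ** (-T)


def step_up(P):
    """Apply one step of the recursion: new_P(T) = P(T+1) / P(1) * h(T)."""
    return lambda T: P(T + 1) / P(1) * h(T)


# Two applications of the one-step rule, each time with h starting from 1.
P1 = step_up(P0)
P2 = step_up(P1)


def P2_closed(T):
    """Result of the substitution, written in spot quantities only."""
    return P0(T + 2) / P0(2) * h(T + 1) * h(T) / h(1)


# The recursive and substituted forms agree up to floating-point error.
for T in range(1, 6):
    assert abs(P2(T) - P2_closed(T)) < 1e-12
```

The point of the check is that the two-step expression is not P(T+2)/P(2)*h(T); the extra factor h(T+1)/h(1) is exactly what comes out of applying the one-step rule twice.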
It is very good to go through Shreve's book and read the original papers. But the models in textbooks are all quite old and leave little room for improvement. What I suggest is that you find some very new models, implement them, and look for their drawbacks. Then you can make some improvements. The ultimate goal of a PhD, I think, is that one day you can build your own models.
best,