Compute gradients of different part of model in Tensorflow

In Tensorflow, we could use Optimizer to train model:

But sometimes, model need to be split to two parts and trained separately, so we need to compute gradients and apply them by two steps:

Then how could we delivery gradients from first part to second part? Here is the equation to answer:

\frac{\partial Loss} {\partial W_{second-part}} = \frac{\partial Loss} {\partial IV} \cdot \frac{\partial IV} {\partial W_{second-part}}

The IV means ‘intermediate vector’, which is the interface vector between first-part and second-part and it is belong to both first-part and second-part. The W_{second-part} is the weights of second part of model. Therefore we could use tf.gradients() to connect gradients of two parts:

Leave a Reply

Your email address will not be published. Required fields are marked *


This site uses Akismet to reduce spam. Learn how your comment data is processed.