Sharing and Reusing TensorFlow Variables


I have been struggling with an issue for a few days now and I cannot understand what is going on. I have developed a seq2seq model; in one function I create some TensorFlow operations and variables and return them to the caller. I would like that function to reuse all of the variables, but no matter what I do with scopes I cannot seem to get it right. Below is the function:


def create_complete_cell(rnn_size, num_layers, encoder_outputs_tr, batch_size, encoder_state, beam_width):

    with tf.variable_scope("InnerScope", reuse=tf.AUTO_REUSE):
        encoder_outputs_tr = tf.contrib.seq2seq.tile_batch(encoder_outputs_tr, multiplier=beam_width)
        encoder_state = tf.contrib.seq2seq.tile_batch(encoder_state, multiplier=beam_width)
        batch_size = batch_size * beam_width
        dec_cell = tf.contrib.rnn.MultiRNNCell([create_cell(rnn_size) for _ in range(num_layers)])

        attention_mechanism = tf.contrib.seq2seq.BahdanauAttention(num_units=rnn_size, memory=encoder_outputs_tr)

        attn_cell = tf.contrib.seq2seq.AttentionWrapper(dec_cell, attention_mechanism, attention_layer_size=rnn_size, output_attention=False)
        attn_zero = attn_cell.zero_state(batch_size, tf.float32)
        attn_zero = attn_zero.clone(cell_state=encoder_state)
        return attn_zero, attn_cell



And below is the code that calls the above function:


with tf.variable_scope('scope'):
    intial_train_state, train_cell = create_complete_cell(rnn_size, num_layers, encoder_outputs_tr, batch_size, encoder_state, 1)

with tf.variable_scope('scope', reuse=True):
    intial_infer_state, infer_cell = create_complete_cell(rnn_size, num_layers, encoder_outputs_tr, batch_size, encoder_state, beam_width)

print("intial_train_state", intial_train_state)
print("intial_infer_state", intial_infer_state)



The first print statement outputs:


('intial_train_state', AttentionWrapperState(cell_state=(LSTMStateTuple(c=<tf.Tensor 'scope/InnerScope/tile_batch_1/Reshape:0' shape=(?, 512) dtype=float32>, h=<tf.Tensor 'scope/InnerScope/tile_batch_1/Reshape_1:0' shape=(?, 512) dtype=float32>), LSTMStateTuple(c=<tf.Tensor 'scope/InnerScope/tile_batch_1/Reshape_2:0' shape=(?, 512) dtype=float32>, h=<tf.Tensor 'scope/InnerScope/tile_batch_1/Reshape_3:0' shape=(?, 512) dtype=float32>), LSTMStateTuple(c=<tf.Tensor 'scope/InnerScope/tile_batch_1/Reshape_4:0' shape=(?, 512) dtype=float32>, h=<tf.Tensor 'scope/InnerScope/tile_batch_1/Reshape_5:0' shape=(?, 512) dtype=float32>), LSTMStateTuple(c=<tf.Tensor 'scope/InnerScope/tile_batch_1/Reshape_6:0' shape=(?, 512) dtype=float32>, h=<tf.Tensor 'scope/InnerScope/tile_batch_1/Reshape_7:0' shape=(?, 512) dtype=float32>)), attention=<tf.Tensor 'scope/InnerScope/AttentionWrapperZeroState/zeros_1:0' shape=(100, 512) dtype=float32>, time=<tf.Tensor 'scope/InnerScope/AttentionWrapperZeroState/zeros:0' shape=() dtype=int32>, alignments=<tf.Tensor 'scope/InnerScope/AttentionWrapperZeroState/zeros_2:0' shape=(100, ?) dtype=float32>, alignment_history=()))



and the second print statement outputs:


('intial_infer_state', AttentionWrapperState(cell_state=(LSTMStateTuple(c=<tf.Tensor 'scope_1/InnerScope/tile_batch_1/Reshape:0' shape=(?, 512) dtype=float32>, h=<tf.Tensor 'scope_1/InnerScope/tile_batch_1/Reshape_1:0' shape=(?, 512) dtype=float32>), LSTMStateTuple(c=<tf.Tensor 'scope_1/InnerScope/tile_batch_1/Reshape_2:0' shape=(?, 512) dtype=float32>, h=<tf.Tensor 'scope_1/InnerScope/tile_batch_1/Reshape_3:0' shape=(?, 512) dtype=float32>), LSTMStateTuple(c=<tf.Tensor 'scope_1/InnerScope/tile_batch_1/Reshape_4:0' shape=(?, 512) dtype=float32>, h=<tf.Tensor 'scope_1/InnerScope/tile_batch_1/Reshape_5:0' shape=(?, 512) dtype=float32>), LSTMStateTuple(c=<tf.Tensor 'scope_1/InnerScope/tile_batch_1/Reshape_6:0' shape=(?, 512) dtype=float32>, h=<tf.Tensor 'scope_1/InnerScope/tile_batch_1/Reshape_7:0' shape=(?, 512) dtype=float32>)), attention=<tf.Tensor 'scope_1/InnerScope/AttentionWrapperZeroState/zeros_1:0' shape=(300, 512) dtype=float32>, time=<tf.Tensor 'scope_1/InnerScope/AttentionWrapperZeroState/zeros:0' shape=() dtype=int32>, alignments=<tf.Tensor 'scope_1/InnerScope/AttentionWrapperZeroState/zeros_2:0' shape=(300, ?) dtype=float32>, alignment_history=()))



I was expecting both outputs to be the same, since I am reusing the variables, but as you can see the first result contains, for example,
scope/InnerScope/tile_batch_1/Reshape_1:0



while the second contains



scope_1/InnerScope/tile_batch_1/Reshape_1:0



I do not know why _1 is appended to scope in the second call, and I am confused about whether the variables are actually being shared. If they are not, what should I do so that the function returns the same (shared) variables?



Thank you.




1 Answer



I noticed that this question was left unanswered, so I am reposting the answer I received on the same subject on the TensorFlow GitHub issue tracker.



Source: https://github.com/tensorflow/tensorflow/issues/12916



To summarize, the names
scope_1/InnerScope/tile_batch_1/Reshape_1:0
scope/InnerScope/tile_batch_1/Reshape_1:0



are not variable names; they are just the names of the nodes that TensorFlow created for the reshape operations. Every operation in a TensorFlow graph has its own unique name, so when the same code runs a second time under a name scope that already exists, the ops' name scope is uniquified to scope_1, even though the variable scope used for variables is still scope.
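Here is a minimal sketch (assuming TensorFlow 1.x graph mode) of that naming behaviour: entering a variable scope with an existing name a second time keeps the scope name for variables, but the op name scope is uniquified, so plain ops such as reshape end up under scope_1.


import tensorflow as tf

with tf.variable_scope('scope'):
    a = tf.reshape(tf.zeros([2, 2]), [4])   # op created under name scope 'scope'

with tf.variable_scope('scope', reuse=True):
    b = tf.reshape(tf.zeros([2, 2]), [4])   # op created under name scope 'scope_1'

print(a.name)  # scope/Reshape:0
print(b.name)  # scope_1/Reshape:0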



This does not mean that the variables are not shared.
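If you want to confirm the sharing yourself, one simple check (not part of the original GitHub answer) is to list the graph's trainable variables after both calls. If reuse worked, every variable created inside create_complete_cell, such as the MultiRNNCell and attention weights, appears exactly once under scope/InnerScope/..., with no duplicate copies under scope_1/...:


for v in tf.trainable_variables():
    print(v.name)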





