Sharing and Reusing TensorFlow Variables


I have been struggling with an issue for a few days now and I cannot understand what is going on. I have developed a seq2seq model; in one function I create some TensorFlow operations and variables and return them to the caller. I would like that function to reuse all of the variables, but no matter what I do with scopes I cannot seem to get it right. Below is the function:


def create_complete_cell(rnn_size, num_layers, encoder_outputs_tr, batch_size, encoder_state, beam_width):

    with tf.variable_scope("InnerScope", reuse=tf.AUTO_REUSE):
        encoder_outputs_tr = tf.contrib.seq2seq.tile_batch(encoder_outputs_tr, multiplier=beam_width)
        encoder_state = tf.contrib.seq2seq.tile_batch(encoder_state, multiplier=beam_width)
        batch_size = batch_size * beam_width
        dec_cell = tf.contrib.rnn.MultiRNNCell([create_cell(rnn_size) for _ in range(num_layers)])

        attention_mechanism = tf.contrib.seq2seq.BahdanauAttention(num_units=rnn_size, memory=encoder_outputs_tr)

        attn_cell = tf.contrib.seq2seq.AttentionWrapper(dec_cell, attention_mechanism, attention_layer_size=rnn_size, output_attention=False)
        attn_zero = attn_cell.zero_state(batch_size, tf.float32)
        attn_zero = attn_zero.clone(cell_state=encoder_state)
        return attn_zero, attn_cell



And below is the code that calls the above function:


with tf.variable_scope('scope'):
    intial_train_state, train_cell = create_complete_cell(rnn_size, num_layers, encoder_outputs_tr, batch_size, encoder_state, 1)

with tf.variable_scope('scope', reuse=True):
    intial_infer_state, infer_cell = create_complete_cell(rnn_size, num_layers, encoder_outputs_tr, batch_size, encoder_state, beam_width)

print("intial_train_state", intial_train_state)
print("intial_infer_state", intial_infer_state)



The first print statement outputs:


('intial_train_state', AttentionWrapperState(cell_state=(LSTMStateTuple(c=<tf.Tensor 'scope/InnerScope/tile_batch_1/Reshape:0' shape=(?, 512) dtype=float32>, h=<tf.Tensor 'scope/InnerScope/tile_batch_1/Reshape_1:0' shape=(?, 512) dtype=float32>), LSTMStateTuple(c=<tf.Tensor 'scope/InnerScope/tile_batch_1/Reshape_2:0' shape=(?, 512) dtype=float32>, h=<tf.Tensor 'scope/InnerScope/tile_batch_1/Reshape_3:0' shape=(?, 512) dtype=float32>), LSTMStateTuple(c=<tf.Tensor 'scope/InnerScope/tile_batch_1/Reshape_4:0' shape=(?, 512) dtype=float32>, h=<tf.Tensor 'scope/InnerScope/tile_batch_1/Reshape_5:0' shape=(?, 512) dtype=float32>), LSTMStateTuple(c=<tf.Tensor 'scope/InnerScope/tile_batch_1/Reshape_6:0' shape=(?, 512) dtype=float32>, h=<tf.Tensor 'scope/InnerScope/tile_batch_1/Reshape_7:0' shape=(?, 512) dtype=float32>)), attention=<tf.Tensor 'scope/InnerScope/AttentionWrapperZeroState/zeros_1:0' shape=(100, 512) dtype=float32>, time=<tf.Tensor 'scope/InnerScope/AttentionWrapperZeroState/zeros:0' shape=() dtype=int32>, alignments=<tf.Tensor 'scope/InnerScope/AttentionWrapperZeroState/zeros_2:0' shape=(100, ?) dtype=float32>, alignment_history=()))



and the second print statement outputs:


('intial_infer_state', AttentionWrapperState(cell_state=(LSTMStateTuple(c=<tf.Tensor 'scope_1/InnerScope/tile_batch_1/Reshape:0' shape=(?, 512) dtype=float32>, h=<tf.Tensor 'scope_1/InnerScope/tile_batch_1/Reshape_1:0' shape=(?, 512) dtype=float32>), LSTMStateTuple(c=<tf.Tensor 'scope_1/InnerScope/tile_batch_1/Reshape_2:0' shape=(?, 512) dtype=float32>, h=<tf.Tensor 'scope_1/InnerScope/tile_batch_1/Reshape_3:0' shape=(?, 512) dtype=float32>), LSTMStateTuple(c=<tf.Tensor 'scope_1/InnerScope/tile_batch_1/Reshape_4:0' shape=(?, 512) dtype=float32>, h=<tf.Tensor 'scope_1/InnerScope/tile_batch_1/Reshape_5:0' shape=(?, 512) dtype=float32>), LSTMStateTuple(c=<tf.Tensor 'scope_1/InnerScope/tile_batch_1/Reshape_6:0' shape=(?, 512) dtype=float32>, h=<tf.Tensor 'scope_1/InnerScope/tile_batch_1/Reshape_7:0' shape=(?, 512) dtype=float32>)), attention=<tf.Tensor 'scope_1/InnerScope/AttentionWrapperZeroState/zeros_1:0' shape=(300, 512) dtype=float32>, time=<tf.Tensor 'scope_1/InnerScope/AttentionWrapperZeroState/zeros:0' shape=() dtype=int32>, alignments=<tf.Tensor 'scope_1/InnerScope/AttentionWrapperZeroState/zeros_2:0' shape=(300, ?) dtype=float32>, alignment_history=()))



I was expecting both outputs to be the same, since I am reusing the variables, but as you can see the first result contains, for example,
scope/InnerScope/tile_batch_1/Reshape_1:0



while the second contains



scope_1/InnerScope/tile_batch_1/Reshape_1:0



I do not know why _1 is appended to scope in the second call, and I am confused about whether the variables are actually being shared. If they are not, what should I do so that the function returns the same (shared) variables?



Thank you.




1 Answer



I noticed that this question was left unanswered, so I am reposting the answer I received on the same subject on the TensorFlow GitHub issue tracker.



Source: https://github.com/tensorflow/tensorflow/issues/12916



To summarize, the names
scope_1/InnerScope/tile_batch_1/Reshape_1:0
scope/InnerScope/tile_batch_1/Reshape_1:0



are not variable names; they are just the names of the nodes that TensorFlow created for the reshape operations. Every operation in a TensorFlow graph has its own unique name, so when the same code runs a second time under a name scope that already exists, the ops' name scope is uniquified to scope_1, even though the variable scope used for variables is still scope.
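Here is a minimal sketch (assuming TensorFlow 1.x graph mode) of that naming behaviour: entering a variable scope with an existing name a second time keeps the scope name for variables, but the op name scope is uniquified, so plain ops such as reshape end up under scope_1.


import tensorflow as tf

with tf.variable_scope('scope'):
    a = tf.reshape(tf.zeros([2, 2]), [4])   # op created under name scope 'scope'

with tf.variable_scope('scope', reuse=True):
    b = tf.reshape(tf.zeros([2, 2]), [4])   # op created under name scope 'scope_1'

print(a.name)  # scope/Reshape:0
print(b.name)  # scope_1/Reshape:0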



This does not mean that the variables are not shared.
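If you want to confirm the sharing yourself, one simple check (not part of the original GitHub answer) is to list the graph's trainable variables after both calls. If reuse worked, every variable created inside create_complete_cell, such as the MultiRNNCell and attention weights, appears exactly once under scope/InnerScope/..., with no duplicate copies under scope_1/...:


for v in tf.trainable_variables():
    print(v.name)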





