yield n files from disk


I am trying to read files from disk and then split them into features and labels:


import os

def generator(data_path):
    x_text = []
    counter = 0
    _y = []
    for root, dirs, files in os.walk(data_path):
        for _file in files:
            if _file.endswith(".txt"):
                # read the whole file and strip each line
                with open(os.path.join(root, _file), "r", encoding="UTF8", errors="ignore") as f:
                    _contents = [s.strip() for s in f.readlines()]
                x_text = x_text + _contents

                # one-hot label per file; assumes exactly three .txt files (one class each)
                y_examples = [0, 0, 0]
                y_examples[counter] = 1
                y_labels = [y_examples for s in _contents]
                counter += 1

                _y = _y + y_labels

    return [x_text, _y]



I have 3.5 GB of data on disk and I can't read it all into memory at once. How can I modify this code to yield n files at a time for processing?


for X_batch, y_batch in generator(data_path):
    feed_dict = {X: X_batch, y: y_batch}
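One way to do this is to turn generator into a true Python generator that yields after every n files instead of accumulating the whole dataset. A minimal sketch, assuming three classes as in the code above; the n parameter and the batch-flushing logic are my additions:

import os

def generator(data_path, n=1):
    x_text, _y = [], []
    counter = 0   # class index for the current file
    pending = 0   # files accumulated since the last yield
    for root, dirs, files in os.walk(data_path):
        for _file in files:
            if not _file.endswith(".txt"):
                continue
            with open(os.path.join(root, _file), "r", encoding="UTF8", errors="ignore") as f:
                contents = [s.strip() for s in f]
            # one-hot label per line; assumes three classes, as in the question
            y_example = [0, 0, 0]
            y_example[counter % 3] = 1
            x_text.extend(contents)
            _y.extend([y_example] * len(contents))
            counter += 1
            pending += 1
            if pending == n:
                yield x_text, _y      # hand one batch of n files to the caller
                x_text, _y = [], []
                pending = 0
    if x_text:                        # flush the final, possibly smaller batch
        yield x_text, _y

With this version the feed_dict loop above works unchanged, while only n files' worth of lines are ever held in memory at a time.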



Is there a more efficient way to read this huge dataset in TensorFlow?
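TensorFlow's usual tool for this is the tf.data API: tf.data.TextLineDataset reads text files lazily, line by line, so the full 3.5 GB never has to fit in memory. A rough sketch, assuming one file per class; the file names are placeholders:

import tensorflow as tf

# Placeholder file names: one .txt file per class, labels 0, 1, 2.
file_names = ["class0.txt", "class1.txt", "class2.txt"]

dataset = None
for label, name in enumerate(file_names):
    lines = tf.data.TextLineDataset(name)   # lazy, line-by-line reads from disk
    labels = tf.data.Dataset.from_tensors(tf.one_hot(label, 3)).repeat()
    pairs = tf.data.Dataset.zip((lines, labels))
    dataset = pairs if dataset is None else dataset.concatenate(pairs)

dataset = dataset.shuffle(buffer_size=10000).batch(32)

for X_batch, y_batch in dataset:   # eager iteration (TF 2.x); use an iterator in TF 1.x
    pass                           # feed each (text, one-hot label) batch to the model

Shuffling and batching then happen inside the input pipeline itself, which is generally more efficient than feeding batches through feed_dict.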





stackoverflow.com/questions/6475328/… – halfelf, Aug 6 at 1:21





The question shouldn't be about efficiency. It's about whether the TensorFlow model can work with the data in pieces, or must it have all the data in memory at once. – hpaulj, Aug 6 at 2:32





@hpaulj Can you please guide me on how to approach this problem (perhaps a tutorial)? I am new to TensorFlow. – Rohit, Aug 6 at 4:47








