How to use SlowFS? #30
Comments
This is related to "Can we open files by "FloatTensor" instead of "io.popen" when testing "test_StreamedDataset.lua"?" #33. Thank you. |
Hi, Can you describe what you are trying to do? There's no default [...] It doesn't make sense to hook it up to a tensor. Can you elaborate on what [...] Thanks, On Tuesday, May 17, 2016, Chien-Lin Huang 黃建霖 wrote:
|
Hi, SlowFS is meant to be used for a slow and remote file system, like Hadoop [...] I think what you want is more along the lines of the IndexDirectory setup. The big difference is that IndexDirectory does not currently support files [...] A simple fix would be to store all your data in a directory, where each [...] Hope this helps. On Wed, May 18, 2016 at 10:35 AM, Chien-Lin Huang 黃建霖 wrote:
|
Hi, I have about one thousand hours of speech data. After feature extraction, I have four hundred million vectors, each a floating-point vector of dimension about 500. I would like to use these vectors to train NNs. In the mnist.lua and cifar10.lua examples, I can point to a small data file and process it. Because the data is now huge, I need to split it into several thousand chunk files that can be loaded dynamically as the training/sampler progresses through them and that return FloatTensor values. SlowFS should be a solution, but I do not know how to use it. There is an example at 'test/test_StreamedDataset.lua', but it returns the binary content instead of FloatTensor values. By modifying 'test/test_StreamedDataset.lua', 'IndexSlowFS.lua' and 'Reader.lua', the thousands of chunk files can now be loaded dynamically and return FloatTensor values.
Getters.lua
IndexSlowFS.lua
Reader.lua
Although it works now, this does not seem like a proper solution. Do you have any thoughts on these changes? On the small dataset, training with 6 GPUs was 4 times faster than with a single GPU. However, it became much slower when using SlowFS on the big dataset. Is that expected? Thank you, |
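The chunked loading described in the comment above can be illustrated independently of the dataset library. The following is a minimal sketch in Python, not Torch code and not the library's API: the chunk format (raw native-endian float32 values), the tiny `DIM`, and the helper names `write_chunk` and `iter_vectors` are all assumptions made for illustration. It shows the idea of holding only one chunk file in memory at a time while yielding decoded float vectors.

```python
import array
import os
import tempfile

DIM = 4  # hypothetical feature dimension (the thread mentions about 500)


def write_chunk(path, vectors):
    """Write vectors as raw native-endian float32 values (assumed chunk format)."""
    flat = array.array("f", [x for vec in vectors for x in vec])
    with open(path, "wb") as f:
        flat.tofile(f)


def iter_vectors(chunk_paths, dim=DIM):
    """Yield one decoded vector at a time, keeping only one chunk in memory."""
    for path in chunk_paths:
        flat = array.array("f")
        with open(path, "rb") as f:
            flat.frombytes(f.read())
        if len(flat) % dim != 0:
            raise ValueError("chunk %s is not a whole number of vectors" % path)
        for start in range(0, len(flat), dim):
            yield list(flat[start:start + dim])


def demo():
    """Create three small chunk files and stream every vector back."""
    tmp = tempfile.mkdtemp()
    paths = []
    for c in range(3):
        path = os.path.join(tmp, "chunk_%d.bin" % c)
        write_chunk(path, [[float(c * 10 + r)] * DIM for r in range(2)])
        paths.append(path)
    return list(iter_vectors(paths))
```

Because `iter_vectors` is a generator, a sampler can consume it without the whole dataset being resident; whether this matches how SlowFS or IndexDirectory schedule their reads is not stated in the thread.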
Thank you Zak :D It works well using IndexDirectory.lua. However, I needed to change line 77 of Reader.lua to res[i] = torch.load(item.url) to make it work. I modified line 77 because, with IndexDirectory.lua, the return value is the binary content instead of a Torch FloatTensor. How can I get the same kind of return value as with IndexTensor.lua when using IndexDirectory.lua? Thank you, |
Thank you, the discussion and answer can be found at #34 |
Hi,
I would like to use SlowFS but don't know how to use it. Originally, we got the data using the following script: we specified a data file and read it into memory.
I am trying to make the files load dynamically as the training/sampler progresses through them. The script was changed as follows for that purpose.
The problem is that 'res' should be a 'torch.*Tensor', but it is now a 'string'. Do you have any idea?
Thank you.
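The mismatch in the question (a reader hands back a string of bytes, while the training code expects numeric vectors) comes down to a missing decode step. In this thread the Torch-side workaround was to call torch.load(item.url) in Reader.lua. The sketch below only illustrates the same idea in Python with the standard library; the little-endian float32 layout and the names `encode_vector` and `decode_vector` are assumptions, not part of the dataset library.

```python
import struct


def encode_vector(values):
    """Pack floats as little-endian float32 (assumed on-disk layout)."""
    return struct.pack("<%df" % len(values), *values)


def decode_vector(raw):
    """Turn the raw bytes a file reader returns into a list of floats."""
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack("<%df" % count, raw))


raw = encode_vector([0.5, -1.25, 3.0])  # what a byte-oriented reader hands back
vec = decode_vector(raw)                # what the training code needs
```

Here `raw` plays the role of the 'string' result and `vec` the role of the tensor; the decode has to happen somewhere between the reader and the sampler, which is what the one-line change to Reader.lua did.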