memory allocation error if strings too long #138
Comments
How large is this file, in bytes? |
Previous comment was a little short on lines; I've updated it to read 10 million (1e7 lines * 100 chars ~= 1GB). You may need to vary the file size a bit to reproduce the error, but
|
OK, thanks for the notice. That's why |
No pressure, but is this still on the TODO list? |
Sure, but I'm quite busy right now... |
Actually, it's
> x <- charToRaw(stringi::stri_dup("a", 2**29))
> y <- stringi::stri_encode(x, NULL, "utf-8")
Error in stringi::stri_encode(x, NULL, "utf-8") :
memory allocation or access error |
The same with
> x <- stringi::stri_rand_strings(1, 2**29)
Error in stringi::stri_rand_strings(1, 2^29) :
memory allocation or access error |
Fix on the way; changing buf size type to
> x <- stringi::stri_rand_strings(1, 2**31-1)
> stringi::stri_length(x)
[1] 2147483647
|
Well... I'll give it another thought tomorrow 🤔
x <- paste0(sample(letters, 100, replace = TRUE), collapse = "")
lines <- rep.int(x, 1e7)
writeLines(lines, "bigfile.txt")
lines <- stringi::stri_read_lines("bigfile.txt")
##Error in stri_encode(txt, encoding, "UTF-8") :
## Start of codes indicating failure. (U_ILLEGAL_ARGUMENT_ERROR) |
After fixing a few bugs, encoding conversion for in-memory data of size 672 MB works fine. A few MB more and we get the following error (which, at least, is now more informative):
I would have to rewrite the whole
A workaround is to open a file connection and read the data in mini-batches (say, 0.5 GB each). |
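For anyone who needs the workaround right now, here is a minimal sketch of the mini-batch idea. It is not the package's implementation: the helper name `read_lines_batched` and its `batch_lines` parameter are made up for illustration, and batches are counted in lines rather than bytes, so `batch_lines` has to be picked so that one batch stays well below the size that triggers the error. Each batch is passed through `stringi::stri_encode` separately when stringi is installed.

```r
# Hypothetical helper (not part of stringi): read a text file through a
# connection in batches of lines, so that no single encoding-conversion call
# sees more than a bounded amount of data.
read_lines_batched <- function(path, batch_lines = 1e6, encoding = "UTF-8") {
  con <- file(path, open = "r")
  on.exit(close(con), add = TRUE)

  chunks <- list()
  repeat {
    # readLines() continues from the current position of an open connection
    batch <- readLines(con, n = batch_lines, warn = FALSE)
    if (length(batch) == 0L) break  # end of file reached

    # Convert each batch on its own; if stringi is not installed the lines
    # are kept exactly as read (assumption: input is already valid UTF-8)
    if (requireNamespace("stringi", quietly = TRUE)) {
      batch <- stringi::stri_encode(batch, encoding, "UTF-8")
    }
    chunks[[length(chunks) + 1L]] <- batch
  }

  if (length(chunks) == 0L) return(character(0))
  unlist(chunks, use.names = FALSE)
}
```

With the file from the report, something like `lines <- read_lines_batched("bigfile.txt", batch_lines = 1e6)` would convert roughly 100 MB per call. The final `unlist()` still needs enough memory to hold all lines at once; only the per-call conversion size is reduced.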
Here's a large text file with 10 million lines, each containing 100 characters.
If I try to read it (tested on a modern desktop PC with 16GB RAM under Windows 7), R crashes with the error:
For comparison, I can read the file using base-R's readLines and data.table's fread.