Add progress fields to rsync-s parser output #612
Comments
Thanks for reporting this! Could you simplify the rsync output by removing some options like https://github.com/kellyjonbrazil/jc/blob/master/tests/test_rsync.py I can probably extend the output support if you could send me a larger sample. Thanks! |
Sure, no problem! These are some of the options I tried it with. All of these have
The option I'd REALLY like to see working is primarily:
I believe the |
This one seems to work as designed:
Instead of using the |
Sorry if I wasn't clear! I'm not opposed to using ANY or none of the flags. They're meaningless to me. My primary complaint is that the JSON output is incomplete. Ideally, what we would be able to get are progress markers. I'd like to see these:
Hope that clarifies the issue. My example, although incomplete, at least gives you the current file being worked on and the progress of that file. |
Gotcha - I'm not sure what the use case is here, though. Progress info is typically for human consumption, right? JSON is typically to hand off to another process. Why would you want to encode the progress into the serialized JSON? What would you do with that information in an automated fashion? |
Perhaps I misunderstood the purpose of this tool. My understanding was that For me, I'd then use that JSON to present to the user in a webpage UI. As it stands, I'm doing all of the processing without the use of this tool. I'd prefer to use the right tool for the job - especially when tooling can output in various different ways depending on the platform. Here's an example of how I'm currently using
let lastItem
child.stdout.on('data', (dataObj) => {
const data = dataObj.toString()
showDebug && console.log(`stdout: ${data}`)
if (typeof data !== 'string')
return
// FIXME:
/* rsync: [Receiver] failed to connect to ... (...): Network unreachable (101)
rsync error: error in socket IO (code 10) at clientserver.c(138) [Receiver=3.2.4]
*/
if (data.includes('error:')) {
showDebug && console.log(`error: ${data}`)
return this.error(data)
}
// https://regex101.com/
// https://javascript.info/regexp-groups
// Match rsync --progress output. Example: '\r 22,347,776 2% 193.51kB/s 1:27:09 '
const regexMatch = /\s+(([0-9,]?)+\s+)(?<progress>.+?(?=%))/
const matches = data.match(regexMatch)
// TODO: Grab ETA download time, even if we can't use it, another UI might be able to. /stats
// TODO: Grab speed also
if (!matches) {
lastItem = data
return this.emit('log', data)
}
if (matches && matches.groups) {
this.downloading = true
this.emit('progress', { item: lastItem, value: Number(matches.groups.progress) })
}
})
What I'd ideally be able to do is pipe `jc` and refactor my code into something like this:
child.stdout.on('data', (jcOutput) => {
const jc = JSON.parse(jcOutput)
// Return progress result to user:
event.emit('progress', {
file: jc.filename,
progress: jc.progress,
totalFiles: jc.totalChunks,
totalProgress: jc.totalProgress,
})
}) |
Standard It is possible to do with the Streaming parsers, like |
You would be correct. If you reference my initial post, you'll see I was attempting to use |
The point of the streaming parsers is to reduce memory consumption. If you are rsyncing millions of files, bundling those all up in a single JSON array could take Gigs of RAM. By streaming, the conversion process only takes Kilobytes, no matter how many files are synced. The end-result would be preserved as a record or saved to a database, etc. More of a batch or automated/non-interactive process. This is an interesting new use-case. Seems like it wouldn't be too difficult to add the progress fields into the |
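As a rough illustration of the streaming approach described above, here is a minimal Node.js sketch that consumes line-delimited JSON from a child process one record at a time. It assumes the upstream process writes exactly one JSON document per line; `createJsonLineParser` and the sample records are hypothetical names and data made up for this example, not part of `jc`. Stream chunks do not necessarily end on a line boundary, so the sketch buffers the trailing partial line rather than calling `JSON.parse` on each raw chunk.

```javascript
// Sketch: buffer stream chunks and hand one parsed object per complete line
// to a callback. Assumes one JSON document per line (an assumption here).
function createJsonLineParser(onRecord) {
  let buffer = ''
  return function feed(chunk) {
    buffer += chunk.toString()
    const lines = buffer.split('\n')
    // The last element is either '' or an incomplete line; keep it for later.
    buffer = lines.pop()
    for (const line of lines) {
      if (line.trim() === '') continue // skip blank lines
      onRecord(JSON.parse(line))
    }
  }
}

// Usage with made-up records, split at an awkward chunk boundary.
const records = []
const feed = createJsonLineParser((record) => records.push(record))
feed('{"n": 1}\n{"n"')
feed(': 2}\n')
console.log(records)
```

In a real pipeline the `feed` function would be passed to `child.stdout.on('data', feed)`, so memory use stays proportional to one line rather than to the whole transfer.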
I believe there's been a misunderstanding. All I'm wanting are progress markers. Not an entire file list. It appears your suggestion to add "progress fields" to the parser is what I'm asking for. Edit:
I hope that helps! |
I was looking into how https://www.youtube.com/watch?v=-eELDwqIyGg It looks like I can parse the progress info from |
Unfortunately I can see from this video that even the Some of the data will be useless, like % copied (will always be 100%) and ETA (will always be 0:00:00), but it looks like I could grab the transferred bytes, throughput speed, xfer#, and to-check data after the linefeed character is output. |
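To make the carriage-return versus linefeed distinction concrete, here is a small Node.js sketch that splits a raw chunk into individual progress updates. It assumes in-place updates are separated by `\r` and the final update for a file ends with `\n`, which matches the `'\r 22,347,776 2% ...'` example quoted earlier in this thread. `splitUpdates` is a hypothetical helper and the sample chunk is made up for illustration.

```javascript
// Sketch: split a raw stdout chunk into separate progress updates.
// Assumes updates are delimited by carriage returns and/or linefeeds.
function splitUpdates(chunk) {
  return chunk
    .toString()
    .split(/[\r\n]+/)
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
}

// Made-up sample: two in-place updates followed by a linefeed.
const sample = '\r 32,768 0% 0.00kB/s 0:00:00 \r 22,347,776 2% 193.51kB/s 1:27:09 \n'
for (const update of splitUpdates(sample)) {
  console.log(JSON.stringify(update))
}
```

A reader that only splits on `\n` would see both updates as a single line, which may explain why a line-oriented parser only sees the final state.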
Out of curiosity, are you sure it's not "line by line"? I think the terminal is presenting it as the same line - probably using some sort of ASCII control character. In Node.js (JavaScript) land, Example:
// For rsync --progress --partial --append
child.stdout.on('data', (dataObj) => {
const data = dataObj.toString()
// Output: 14,581,760 0% 2.20MB/s 0:13:22
})
This regex:
const regexMatch = /\s+(([0-9,]?)+\s+)(?<progress>.+?(?=%))/
// Will correctly retrieve the progress percentage.
I can play with the regex to retrieve the other information if you'd like some help with regex matching. Looking at this stack question, I can see that they're piping |
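For the other fields mentioned in the TODO comments (speed and ETA), here is a hedged sketch of a regex with named groups that extracts everything from a single progress update. It assumes the layout `<bytes> <percent>% <rate> <h:mm:ss>` seen in the sample lines in this thread; other rsync versions or options may format the line differently. `PROGRESS_RE` and `parseProgress` are hypothetical names for this example.

```javascript
// Sketch: parse one rsync --progress update into named fields.
// Assumes the "<bytes> <percent>% <rate> <h:mm:ss>" layout quoted in this thread.
const PROGRESS_RE =
  /^\s*(?<bytes>[\d,]+)\s+(?<percent>\d{1,3})%\s+(?<rate>[\d.]+\s?[kMGT]?B\/s)\s+(?<eta>\d+:\d{2}:\d{2})/

function parseProgress(line) {
  const match = PROGRESS_RE.exec(line)
  if (!match) return null // not a progress line (file name, stats, error, ...)
  const { bytes, percent, rate, eta } = match.groups
  return {
    bytes: Number(bytes.replace(/,/g, '')), // strip thousands separators
    percent: Number(percent),
    rate, // kept as text, e.g. "193.51kB/s"
    eta, // kept as text, e.g. "1:27:09"
  }
}

console.log(parseProgress('\r 22,347,776 2% 193.51kB/s 1:27:09 '))
```

Returning `null` for non-matching lines lets the caller treat them as log output or as the current file name, the same way the handler earlier in the thread uses `lastItem`.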
Trying to use `jc` to parse the output from the following command and the command just hangs with no output. Not sure what's going on.
Command:
rsync --timeout=60 --progress --bwlimit=2000 --recursive --mkpath --partial --append -i --stats --rsh="/usr/bin/sshpass -fsecrets ssh -o StrictHostKeyChecking=no" sshuser@sshhost:/remotePath /localPath/ | jc --rsync-s
I've tried both `rsync` and `rsync-s` with no luck. I added `--itemize-changes` and `--stats`, with the same result. Simulating the output from `rsync`, there is no output from `jc`.
This is the regex I was using in javascript to parse the progress.
rsync version 3.3.0 protocol version 31
jc version: 1.25.3
Edit: After trying many things, there was ONE time I got this: