Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression with command substitution when compiled with SHOPT_SPAWN #104

Closed
McDutchie opened this issue Aug 1, 2020 · 14 comments
Closed
Labels
blocker This had better be fixed before releasing bug Something is not working

Comments

@McDutchie
Copy link

After fixing #79 there is another regression that shows up when ksh is compiled with SHOPT_SPAWN to use sh_ntfork() which uses AST spawnveg(3) which uses posix_spawn(3). A minimal reproducer for the regression involves a combination of command substitution, a here-document, and an external command:

output=$(
cat <<EOF
$(ls /dev/null)
EOF
)
echo "$output"

Actual output: (empty)
Expected output: /dev/null

I can reproduce this regression on a 93u+ 2012-08-01 compiled to use SHOPT_SPAWN, so 93u+m did not introduce this bug.

If ksh is compiled with -DSHOPT_SPAWN=0 the bug does not occur.

If the command substitution is forked, e.g. with ulimit -t unlimited, the bug does not occur:

output=$(
ulimit -t unlimited
cat <<EOF
$(ls /dev/null)
EOF
)
echo "$output"

Output as expected: /dev/null

So this is a bug that is, in part, related to non-forking subshells. There must be a bug somewhere in sh_ntfork(), or some function called directly or indirectly by it, that causes I/O not to be set up correctly.

@McDutchie McDutchie added bug Something is not working blocker This had better be fixed before releasing labels Aug 1, 2020
@McDutchie McDutchie changed the title Regression when compiled with SHOPT_SPAWN Regression with command substitution when compiled with SHOPT_SPAWN Aug 1, 2020
@McDutchie
Copy link
Author

When disabling the use of posix_spawn(3) and vfork(3) in src/lib/libast/comp/spawnveg.c, leaving only the fork(2) version, the bug still occurs. So the bug has nothing to do with that.

@JohnoKing
Copy link

JohnoKing commented Aug 2, 2020

This bug was fixed in ksh93v- 2013-10-10-alpha. Red Hat has a patch that backports the bugfix:
https://git.centos.org/rpms/ksh/blob/c8/f/SOURCES/ksh-20120801-macro.patch

There is a different bug with command substitution that is affected by that patch. The following command has incorrect output in both ksh93u+ and ksh93v-, although it differs in each version due to the bugfix (this was reported in att#1479):

# Correct output (tested with dash, bash, yash, zsh and mksh)
$ bash -c $'echo "<$(sh -c \'(sleep 1; echo after) &\'; echo now)>"'
<now
after>
# Incorrect output, 'after' appears before 'now' (occurs in ksh93u+ and ksh93u+m)
$ ksh93u -c $'echo "<$(sh -c \'(sleep 1; echo after) &\'; echo now)>"'
<after
now>
# Also incorrect output, 'after' never appears (occurs in ksh93n-, ksh93v-, ksh2020
# and when ksh-20120801-macro.patch is applied to ksh93u+)
$ ksh93v -c $'echo "<$(sh -c \'(sleep 1; echo after) &\'; echo now)>"'
<now>

There is also another bug that affects ksh93u+ and ksh2020:

# Correct output
$ bash -c 'echo "<$( ( (sleep 1; echo after) &);  echo now)>"'
<now
after>

# Incorrect output
$ ksh -c 'echo "<$( ( (sleep 1; echo after) &);  echo now)>"'
<now>
$ <after>

@McDutchie
Copy link
Author

McDutchie commented Aug 2, 2020

Thanks for the pointer to the patch. It certainly improves things, so I'll apply it. We should file another issue for the other command substitution bug.

@McDutchie
Copy link
Author

Actually, on further testing, the Red Hat patch causes two regressions in signal.sh:

test signal begins at 2020-08-02+22:29:29
	signal.sh[273]: subshell ignoring signal does not send signal to parent
	signal.sh[276]: subshell catching signal does not send signal to parent
test signal failed at 2020-08-02+22:29:37 with exit code 2 [ 48 tests 2 errors ]

Diff below. Any ideas?

Author: Martijn Dekker <[email protected]>
Date:   Sun Aug 2 19:50:38 2020 +0100

    Apply Red Hat fix for cmd substitutions invoked from here-documents
    
    If ksh was compiled with SHOPT_SPAWN, non-forked command
    substitutions invoked from here-documents were broken: the output
    of any external command they ran was not captured. Reproducer:
            output=$(
                    cat <<-EOF
                            $(ls /dev/null)
                    EOF
            )
            echo "$output"
    Expected output: /dev/null
    Actual output: (empty)
    
    src/cmd/ksh93/include/io.h,
    src/cmd/ksh93/sh/xec.c,
    src/cmd/ksh93/sh/subshell.c:
    - Apply ksh-20120801-macro.patch from Red Hat which backported a
      fix for this bug from the abandoned ksh 93v beta. Source:
      https://git.centos.org/rpms/ksh/blob/c8/f/SOURCES/ksh-20120801-macro.patch
    - Rename iounpipe() to sh_iounpipe() for consistency with the other
      externs, as was also done in ksh 93v.
    
    src/cmd/ksh93/tests/subshell.sh:
    - Test running an external command in a command substitution in a
      here-document. Test $() and `` variants of command substitution.
    
    Fixes: https://github.com/ksh93/ksh/issues/104

diff --git a/src/cmd/ksh93/include/io.h b/src/cmd/ksh93/include/io.h
index f82c5ca..5011a03 100644
--- a/src/cmd/ksh93/include/io.h
+++ b/src/cmd/ksh93/include/io.h
@@ -81,6 +81,7 @@ extern void 	sh_iosave(Shell_t *, int,int,char*);
 extern int 	sh_iovalidfd(Shell_t*, int);
 extern int 	sh_inuse(Shell_t*, int);
 extern void 	sh_iounsave(Shell_t*);
+extern void	sh_iounpipe(Shell_t*);
 extern int	sh_chkopen(const char*);
 extern int	sh_ioaccess(int,int);
 extern int	sh_devtofd(const char*);
diff --git a/src/cmd/ksh93/sh/subshell.c b/src/cmd/ksh93/sh/subshell.c
index 5b12f92..4b6295e 100644
--- a/src/cmd/ksh93/sh/subshell.c
+++ b/src/cmd/ksh93/sh/subshell.c
@@ -176,7 +176,7 @@ void sh_subfork(void)
 {
 	register struct subshell *sp = subshell_data;
 	Shell_t	*shp = sp->shp;
-	int	curenv = shp->curenv;
+	int	curenv = shp->curenv, comsub=shp->comsub;
 	pid_t pid;
 	char *trap = shp->st.trapcom[0];
 	if(trap)
@@ -208,7 +208,7 @@ void sh_subfork(void)
 		shp->subshell = 0;
 		shp->comsub = 0;
 		sp->subpid=0;
-		shp->st.trapcom[0] = trap;
+		shp->st.trapcom[0] = (comsub==2?NULL:trap);
 		shp->savesig = 0;
 		/* sh_fork() increases ${.sh.subshell} but we forked an existing virtual subshell, so undo */
 		SH_SUBSHELLNOD->nvalue.s--;
@@ -750,7 +750,6 @@ Sfio_t *sh_subshell(Shell_t *shp,Shnode_t *t, volatile int flags, int comsub)
 		shp->coutpipe = sp->coutpipe;
 	}
 	shp->subshare = sp->subshare;
-	shp->comsub = sp->comsub;
 	shp->subdup = sp->subdup;
 	if(shp->subshell)
 	{
@@ -779,7 +778,12 @@ Sfio_t *sh_subshell(Shell_t *shp,Shnode_t *t, volatile int flags, int comsub)
 	if(nsig>0)
 		kill(getpid(),nsig);
 	if(sp->subpid)
+	{
 		job_wait(sp->subpid);
+		if(comsub>1)
+			sh_iounpipe(shp);
+	}
+	shp->comsub = sp->comsub;
 	if(comsub && iop && sp->pipefd<0)
 		sfseek(iop,(off_t)0,SEEK_SET);
 	if(shp->trapnote)
diff --git a/src/cmd/ksh93/sh/xec.c b/src/cmd/ksh93/sh/xec.c
index baa8d6d..e68cb34 100644
--- a/src/cmd/ksh93/sh/xec.c
+++ b/src/cmd/ksh93/sh/xec.c
@@ -151,26 +151,31 @@ static /*inline*/ double timeval_to_double(struct timeval tv)
  * temp file.
  */
 static int      subpipe[3],subdup,tsetio,usepipe;
-static void iounpipe(Shell_t*);
+void		sh_iounpipe(Shell_t*);
 
-static int iousepipe(Shell_t *shp)
+int iousepipe(Shell_t *shp)
 {
-	int i;
+	int fd=sffileno(sfstdout),i,err=errno;
 	if(usepipe)
 	{
 		usepipe++;
-		iounpipe(shp);
+		sh_iounpipe(shp);
 	}
 	if(sh_rpipe(subpipe) < 0)
 		return(0);
 	usepipe++;
-	fcntl(subpipe[0],F_SETFD,FD_CLOEXEC);
-	subpipe[2] = sh_fcntl(1,F_DUPFD,10);
-	fcntl(subpipe[2],F_SETFD,FD_CLOEXEC);
+	if(shp->comsub!=1)
+	{
+		subpipe[2] = sh_fcntl(subpipe[1],F_DUPFD,10);
+		sh_close(subpipe[1]);
+		return(1);
+	}
+	subpipe[2] = sh_fcntl(fd,F_dupfd_cloexec,10);
 	shp->fdstatus[subpipe[2]] = shp->fdstatus[1];
-	close(1);
-	fcntl(subpipe[1],F_DUPFD,1);
-	shp->fdstatus[1] = shp->fdstatus[subpipe[1]];
+	while(close(fd)<0 && errno==EINTR)
+		errno = err;
+	fcntl(subpipe[1],F_DUPFD,fd);
+	shp->fdstatus[1] = shp->fdstatus[subpipe[1]]&~IOCLEX;
 	sh_close(subpipe[1]);
 	if(subdup=shp->subdup) for(i=0; i < 10; i++)
 	{
@@ -184,14 +189,23 @@ static int iousepipe(Shell_t *shp)
 	return(1);
 }
 
-static void iounpipe(Shell_t *shp)
+void sh_iounpipe(Shell_t *shp)
 {
-	int n;
+	int fd=sffileno(sfstdout),n,err=errno;
 	char buff[SF_BUFSIZE];
-	close(1);
-	fcntl(subpipe[2], F_DUPFD, 1);
-	shp->fdstatus[1] = shp->fdstatus[subpipe[2]];
+	if(!usepipe)
+		return;
 	--usepipe;
+	if(shp->comsub>1)
+	{
+		sh_close(subpipe[2]);
+		while(read(subpipe[0],buff,sizeof(buff))>0);
+		goto done;
+	}
+	while(close(fd)<0 && errno==EINTR)
+		errno = err;
+	fcntl(subpipe[2], F_DUPFD, fd);
+	shp->fdstatus[1] = shp->fdstatus[subpipe[2]];
 	if(subdup) for(n=0; n < 10; n++)
 	{
 		if(subdup&(1<<n))
@@ -223,6 +237,7 @@ static void iounpipe(Shell_t *shp)
 		else if(errno!=EINTR)
 			break;
 	}
+done:
 	sh_close(subpipe[0]);
 	subpipe[0] = -1;
 	tsetio = 0;
@@ -1528,10 +1543,14 @@ int sh_exec(register const Shnode_t *t, int flags)
 			if(shp->subshell)
 			{
 				sh_subtmpfile(shp);
-				if(shp->comsub==1 && !(shp->fdstatus[1]&IONOSEEK))
-					unpipe=iousepipe(shp);
 				if((type&(FAMP|TFORK))==(FAMP|TFORK))
-					sh_subfork();
+				{
+					if(shp->comsub && !(shp->fdstatus[1]&IONOSEEK))
+					{
+						unpipe = iousepipe(shp);
+						sh_subfork();
+					}
+				}
 			}
 			no_fork = !ntflag && !(type&(FAMP|FPOU)) && !shp->subshell &&
 			    !(shp->st.trapcom[SIGINT] && *shp->st.trapcom[SIGINT]) &&
@@ -1586,7 +1605,7 @@ int sh_exec(register const Shnode_t *t, int flags)
 				if(parent<0)
 				{
 					if(shp->comsub==1 && usepipe && unpipe)
-						iounpipe(shp);
+						sh_iounpipe(shp);
 					break;
 				}
 #else
@@ -1623,7 +1642,7 @@ int sh_exec(register const Shnode_t *t, int flags)
 					if(shp->topfd > topfd)
 						sh_iorestore(shp,topfd,0);
 					if(usepipe && tsetio &&  subdup && unpipe)
-						iounpipe(shp);
+						sh_iounpipe(shp);
 					if(!sh_isoption(SH_MONITOR))
 					{
 						shp->trapnote &= ~SH_SIGIGNORE;
@@ -1823,7 +1842,7 @@ int sh_exec(register const Shnode_t *t, int flags)
 						shp->exitval = type;
 				}
 				if(shp->comsub==1 && usepipe)
-					iounpipe(shp);
+					sh_iounpipe(shp);
 				shp->pipepid = 0;
 				shp->st.ioset = 0;
 				if(simple && was_errexit)
@@ -2871,7 +2890,7 @@ pid_t _sh_fork(Shell_t *shp,register pid_t parent,int flags,int *jobid)
 			{
 				if(shp->topfd > restorefd)
 					sh_iorestore(shp,restorefd,0);
-				iounpipe(shp);
+				sh_iounpipe(shp);
 			}
 		}
 		return(parent);
@@ -3183,8 +3202,7 @@ static void sh_funct(Shell_t *shp,Namval_t *np,int argn, char *argv[],struct arg
 	struct funenv fun;
 	char *fname = nv_getval(SH_FUNNAMENOD);
 	struct Level	*lp =(struct Level*)(SH_LEVELNOD->nvfun);
-	int		level, pipepid=shp->pipepid, comsub=shp->comsub;
-	shp->comsub = 0;
+	int		level, pipepid=shp->pipepid;
 	shp->pipepid = 0;
 	sh_stats(STAT_FUNCT);
 	if(!lp->hdr.disc)
@@ -3227,7 +3245,6 @@ static void sh_funct(Shell_t *shp,Namval_t *np,int argn, char *argv[],struct arg
 	lp->maxlevel = level;
 	SH_LEVELNOD->nvalue.s = lp->maxlevel;
 	shp->last_root = nv_dict(DOTSHNOD);
-	shp->comsub = comsub;
 #if 0
 	nv_putval(SH_FUNNAMENOD,shp->st.funname,NV_NOFREE);
 #else
diff --git a/src/cmd/ksh93/tests/subshell.sh b/src/cmd/ksh93/tests/subshell.sh
index 71ff9af..b6a4544 100755
--- a/src/cmd/ksh93/tests/subshell.sh
+++ b/src/cmd/ksh93/tests/subshell.sh
@@ -757,5 +757,18 @@ SHELL=$SHELL "$SHELL" -c '
 ' | awk '/^DEBUG/ { pid[NR] = $2; }  END { exit !(pid[1] == pid[2] && pid[2] == pid[3]); }' \
 || err_exit "setting PATH to readonly in subshell triggers an erroneous fork"
 
+# ======
+# Test command substitution with external command in here-document
+# Ref.: https://github.com/ksh93/ksh/issues/104
+expect=$'/dev/null\n/dev/null'
+actual=$(
+	cat <<-EOF
+		$(ls /dev/null)
+		`ls /dev/null`
+	EOF
+)
+[[ $actual == "$expect" ]] || err_exit 'Command substitution in here-document fails' \
+	"(expected $(printf %q "$expect"), got $(printf %q "$actual"))"
+
 # ======
 exit $((Errors<125?Errors:125))

@hyenias
Copy link

hyenias commented Aug 3, 2020

I went clicking around the referenced CentOS patches for ksh and saw other patches making modifications to the files above and more. As a shot in the dark, what happens if you use their ksh93 build with all of their supplied patches? Does their build resolve the issue? It may be the case that the referred to patch builds on top of others.

@McDutchie
Copy link
Author

McDutchie commented Sep 21, 2020

I've now gone through all the remotely plausible Red Hat patches to try if they fix that signals.sh regression, and none of them fixed it. Instead, after a lot of detective work, I've found a fix myself. This block in sh_exec() in xec.c:

 			if(shp->subshell)
 			{
 				sh_subtmpfile(shp);
 				if((type&(FAMP|TFORK))==(FAMP|TFORK))
				{
					if(shp->comsub && !(shp->fdstatus[1]&IONOSEEK))
					{
						unpipe = iousepipe(shp);
						sh_subfork();
					}
				}
 			}

needs to be changed to:

 			if(shp->subshell)
 			{
 				sh_subtmpfile(shp);
 				if((type&(FAMP|TFORK))==(FAMP|TFORK))
				{
					if(shp->comsub && !(shp->fdstatus[1]&IONOSEEK))
						unpipe = iousepipe(shp);
					sh_subfork();
				}
 			}

IOW, if the lexical type flag indicates something that should fork the current subshell, we should always fork it and not just if we're in a command substitution that is connected to an unseekable file. The original 93u+ code always forked in this circumstance, so I've changed it back to doing that. This change fixes the signals.sh regression without introducing other regressions (at least as far as the ksh and modernish regression tests show).

@McDutchie
Copy link
Author

Now that I have access to some of the relevant private Red Hat bugs, I was able to find that this Red Hat patch introduces another bug as well. The following fails after the patch:

cat /dev/null | /bin/echo `ls /dev/null`

(expected: /dev/null; actual: no output)

However, for this one I've found another Red Hat patch that fixes it – it's ksh-20120801-fd2lost.patch which was the fix for rhbz#994251 and rhbz#1036802 (those bug reports are still closed to the public as of this writing, but I was given access). I'll apply that patch with this one in combination.

@avih
Copy link

avih commented Sep 21, 2020

IOW, if the lexical type flag indicates something that should fork the current subshell, we should always fork it and not just if we're in a command substitution that is connected to an unseekable file

I'm completely unfamiliar with the ksh code base, but I do recall that ksh has optimizations to avoid fork on some cases of subshells, and the change you're describing sounds like it might unoptimize one of those cases.

Do you also run performance regression tests? Even if not, it might be worth testing this change for performance.

The original 93u+ code always forked in this circumstance

I thought this project is based on ksh93u+ ? To quote README.md: "This repository is used to develop bugfixes to the last stable release (93u+ 2012-08-01) of ksh93", so where was the bug introduced? during some earlier bugfix of this project after ksh93u+ ?

@McDutchie
Copy link
Author

McDutchie commented Sep 21, 2020

Correct, it is based on ksh93u+. Which is probably why that change broke things.

I do run performance regression test using shbench.

The bug was introduced in ksh 93u+ or before. The 93v- beta fixes it. The upcoming commit backports the fix.

@avih
Copy link

avih commented Sep 21, 2020

Not sure I follow.

As far as I understand the fix is to "always fork, instead of only sometimes fork" in this case, but you also say that ksh93u+ always forked in this case, and also that it already has the bug.

So how can it be that it always forked (i.e. should not be buggy) but you also say that it was still buggy?

@McDutchie
Copy link
Author

What you're missing is that we're talking about two different bugs.

The bug reported in this issue (#104) was always present. The upcoming commit will fix it.

The other bug, which my change to the patch fixes, was introduced by the original patch. It will never make it into this repo because I have fixed it before committing anything.

@avih
Copy link

avih commented Sep 21, 2020

Thanks. I'll take some time to look closer at the code and changes.

I'm assuming you're right of course, and it's only my understanding for now which is at fault.

@McDutchie
Copy link
Author

Good luck. Took me about half a year of working intensively with this code to begin to feel like I understand some of it.

@avih
Copy link

avih commented Sep 21, 2020

Got lost trying to trace it with the multiple references and patch sources and recent changes. I'll take your word ;)

Thanks again for the patience.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blocker This had better be fixed before releasing bug Something is not working
Projects
None yet
Development

No branches or pull requests

4 participants