Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse returns error instead of terminating with log.Fatalf #19

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

its-luca
Copy link
Contributor

@its-luca its-luca commented Aug 4, 2022

This is my attempt at solving #18 . Feedback is welcome.
(I used the golang.org/x/tools/cmd/goyacc version of goyacc)

bibtex.y Outdated
Comment on lines 89 to 95
select {
case err := <-l.Errors:
return nil,err
default :
return nil, err

}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get it, why do we need to do this?

Copy link
Owner

@nickng nickng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really cool, thank you - I didn't know it's possible to inject a yacc error like this.
Just one question on the nested select - why? And is there an example where that the error is more friendly?

| longstring tPOUND tIDENT { $$ = NewBibComposite($1); $$.(*BibComposite).Append(NewBibConst($3))}
| longstring tPOUND tBAREIDENT { $$ = NewBibComposite($1); $$.(*BibComposite).Append(bib.GetStringVar($3)) }
| longstring tPOUND tBAREIDENT { $$ = NewBibComposite($1); v,err := bib.GetStringVar($3); if err != nil {bibtexlex.Error(fmt.Sprintf("%v",err))} else {$$.(*BibComposite).Append(v)} }
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above

Suggested change
| longstring tPOUND tBAREIDENT { $$ = NewBibComposite($1); v,err := bib.GetStringVar($3); if err != nil {bibtexlex.Error(fmt.Sprintf("%v",err))} else {$$.(*BibComposite).Append(v)} }
| longstring tPOUND tBAREIDENT { $$ = NewBibComposite($1); v, err := bib.GetStringVar($3); if err != nil { bibtexlex.Error(err.Error()) } else {$$.(*BibComposite).Append(v)} }

@@ -60,9 +61,9 @@ preambleentry : tATSIGN tPREAMBLE tLBRACE longstring tRBRACE { $$ = $4 }
;

longstring : tIDENT { $$ = NewBibConst($1) }
| tBAREIDENT { $$ = bib.GetStringVar($1) }
| tBAREIDENT { v,err := bib.GetStringVar($1); if err != nil { bibtexlex.Error(fmt.Sprintf("%v",err)) } else {$$ = v} }
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whitespaces and I'd probably use the Error() to convert to error to string than fmt it.

Suggested change
| tBAREIDENT { v,err := bib.GetStringVar($1); if err != nil { bibtexlex.Error(fmt.Sprintf("%v",err)) } else {$$ = v} }
| tBAREIDENT { v, err := bib.GetStringVar($1); if err != nil { bibtexlex.Error(err.Error()) } else {$$ = v} }

bibtex.y Outdated
@@ -82,8 +83,16 @@ func Parse(r io.Reader) (*BibTex, error) {
select {
case err := <-l.Errors: // Non-yacc errors
return nil, err
case err := <-l.ParseErrors:
return nil, err
case err := <-l.ParseErrors: //yacc errors
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't technically true I think? The error returned here now can be yacc (syntax) error or variable not found

@its-luca
Copy link
Contributor Author

Sorry for being unresponsive, I was on vacation.

I Introduced the nested select because I noticed that the TestUnexpectedAtSign would sometimes fail as both channels emitted an error. In that case, we get racy behaviour regarding the returned error message. To my understanding, the errors in l.Errors are all handcrafted and thus more user friendly.

While using this PR in a project, I noticed two remaining issues that need to be addressed before this can get merged

  1. Sometimes more than one Error is added to either l.ParseErrors leading to a deadlock, as the channel only has a capacity of one. Is there any reason why we use a channel instead of, i.e. a slice in the first place? Otherwise we could just use a slice and either return all errors or only the first one ( Sth. like this https://gist.github.com/its-luca/0f90ea071e8ce74c2f1005205c71fbda )
func TestParser_NoDeadlockAnMultipleErrors(t *testing.T) {
	//this entry raises multiple errors that lead to a deadlock on capacity one channels
	badEntry := `@article{badEntry,
	  title={Bad Entry Title},
	  journal=should Be Quoted,
	}`

	_, err := Parse(strings.NewReader(badEntry))
	if err == nil {
		t.Fatal("Expected parser error got none")
	}
}
  1. This is the more serious problem. Somehow, there is persisted state between two parse runs. Parsing a malformed entry can lead a second parse run to fail. E.g. this test fails:
// TestParser_ParseGoodEntryAfterBadEntry checks that parsing a bad entry does not influence further calls to parse
func TestParser_ParseGoodEntryAfterBadEntry(t *testing.T) {
	goodEntry := `@article{goodEntry,
  title={Good Entry Title},
}`
	badEntry := `@article{badEntry,
	  title={Bad Entry Title},
	  journal=should Be Quoted,
	}`

	_, err := Parse(strings.NewReader(badEntry))
	if err == nil {
		t.Fatal("Expected parser error got none")
	}

	t.Logf("Done with badEntry")
	_, err = Parse(strings.NewReader(goodEntry))
	if err != nil {
		t.Fatalf("Did not expect parser error but got %v", err)
	}
}

(Note that this requires you to either increase the ParseErrors and Errors channel capacity in lexer.go or use the patch referenced in the gist. Otherwise you run into the deadlock error.

@nickng
Copy link
Owner

nickng commented Aug 18, 2022

Is there any reason why we use a channel instead of, i.e. a slice in the first place

The yacc generated parser don't return the idiomatic (token, error), which was why we set up a separate channel for error(s) for the lexer to return the error in a back-channel. I'm not against putting it in a slice though, it makes sense.

@nickng
Copy link
Owner

nickng commented Aug 18, 2022

I've updated the error reporting and handling -- if you wouldn't mind merging those changes here and see if that works with your change to return error 🙏

@its-luca its-luca force-pushed the featureErrorHandlingInParse branch from 84ede81 to 26afbc3 Compare August 23, 2022 06:27
@its-luca
Copy link
Contributor Author

Thanks. I merged your changes. I still need to figure out why the error state persists between parser runs though

@nickng
Copy link
Owner

nickng commented Aug 24, 2022

Thanks. I merged your changes. I still need to figure out why the error state persists between parser runs though

I don't necessarily think it's state persisting between parser runs. Even if you ignore the errors like v, _ := bib.GetStringVar($3) you still get the same issue of failing to parse a valid bibtex. I think the function signature returning (v, error) is interfering with yacc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants