Improving Developer Experience: C# Source Generator #330

Open
davidngjy opened this issue Jan 2, 2025 · 5 comments
Labels
enhancement New feature or request

Comments

@davidngjy

Hi,

I’d like to suggest a way to improve the developer experience when converting schemas to C# models. As far as I'm aware, there is currently a CLI tool that generates the C# model, and its output can be checked in manually or produced automatically in a pipeline.
(If there are any other approaches I'm unaware of, please let me know!)

With the advent of C# source generators and more recently incremental source generators, I see an opportunity for the C# model to be generated at compile time from the .avsc file (as long as it's discoverable by the project – for example, by including it as part of the build).

This approach could greatly enhance the developer experience, as the C# model would be automatically available once the .avsc file is discovered by the source generator, with the proper annotations in place as well (making it foolproof).

The Idea:

  1. A NuGet package that contains .avsc files, using AdditionalFiles to include them as part of the build process.
  2. A NuGet package that discovers any available .avsc files and generates the corresponding C# models.
  3. A host (web, console, etc.) project that references both NuGet packages, making the generated C# models available.
    Note: It’s possible to skip step (1) if the service isn’t sharing schemas and has them included as part of the host project.
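
For step (1), or for the host project when step (1) is skipped, a minimal sketch of how schemas could be exposed to the compiler. `AdditionalFiles` is the standard MSBuild item that Roslyn analyzers and source generators read; the `schemas` folder name and the glob are assumptions made for illustration.

```xml
<!-- Hypothetical project fragment: the "schemas" folder name is an assumption. -->
<ItemGroup>
  <!-- Each matching .avsc file is passed to source generators as an AdditionalText. -->
  <AdditionalFiles Include="schemas/**/*.avsc" />
</ItemGroup>
```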

Does this feature align with the vision of the library? I'd be happy to contribute! I have a small proof of concept that works, though it has limited support for types. If you're interested, I’d be glad to share it for the discussion.

@dstelljes
Member

Hey David, we're definitely open to this and would love to see the proof of concept. The CLI tool fronts the Chr.Avro.Codegen library, and I'd imagine much of that would be reusable.

@dstelljes dstelljes added the enhancement New feature or request label Jan 7, 2025
@davidngjy
Author

Hey @dstelljes,

Here's the POC that currently works with boolean and string.
Given a schema that is included as AdditionalFiles, it generates the equivalent C# model (and updates it as the schema changes).
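
For readers following along, this is roughly what an incremental generator that picks up `.avsc` files from `AdditionalFiles` can look like. It is a sketch, not the POC's actual code: the class name `AvroSchemaGenerator` and the `GenerateModel` helper are hypothetical, and the helper is a placeholder for whatever turns a schema into C# source.

```csharp
using System;
using System.IO;
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Text;

[Generator]
public sealed class AvroSchemaGenerator : IIncrementalGenerator
{
    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        // Keep only the additional files that look like Avro schemas and read their text.
        var schemas = context.AdditionalTextsProvider
            .Where(static file => file.Path.EndsWith(".avsc", StringComparison.OrdinalIgnoreCase))
            .Select(static (file, cancellationToken) => (
                Name: Path.GetFileNameWithoutExtension(file.Path),
                Json: file.GetText(cancellationToken)?.ToString() ?? string.Empty));

        // Emit one generated file per schema; this re-runs only when a schema changes.
        context.RegisterSourceOutput(schemas, static (output, schema) =>
        {
            var source = GenerateModel(schema.Json);
            output.AddSource($"{schema.Name}.g.cs", SourceText.From(source, Encoding.UTF8));
        });
    }

    // Hypothetical placeholder: a real implementation would parse the schema
    // and produce the model source here.
    private static string GenerateModel(string schemaJson) =>
        "// generated from a schema of " + schemaJson.Length + " characters";
}
```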

The example in the Console app generates the following C# class (as a record type):
[Image: screenshot of the generated C# record]
It uses some newer features (required, nullable, init) that are only supported on .NET 7 and above.
We could look into changing it if we need to support lower frameworks.

I've tried to use the Chr.Avro.Codegen library you pointed out, since we could leverage its generated output, but I'm getting an analyzer warning when I reference it from the new source generator project.
[Image: screenshot of the analyzer warning]
This is my first time dabbling with Roslyn. I assume Workspaces isn't compatible with source generators?

@dstelljes
Member

Nice, thanks for sharing!

> It uses some newer features (required, nullable, init) that are only supported on .NET 7 and above.
> We could look into changing it if we need to support lower frameworks.

I'd like to make this type of thing configurable if we can. There are some other codegen enhancement ideas (particularly #261 and #149) where it makes sense to support additional configuration, and there are folks who might prefer records with primary constructors rather than classes with required init-only properties, for example.
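
To make the two shapes concrete, here is a small sketch of what each option could look like for the same model. The type and property names are hypothetical and neither shape is confirmed output of the POC or of Chr.Avro.Codegen; `required` needs C# 11, which ships with .NET 7.

```csharp
#nullable enable

// Shape A: a class with required init-only properties.
public sealed class OrderCreatedClass
{
    // Must be set in the object initializer; the compiler enforces it.
    public required string Id { get; init; }

    // Optional, so it is nullable and not marked required.
    public string? Note { get; init; }
}

// Shape B: a positional record with a primary constructor.
public sealed record OrderCreatedRecord(string Id, string? Note = null);
```

Shape A is constructed with `new OrderCreatedClass { Id = "42" }` and Shape B with `new OrderCreatedRecord("42")`, which is the kind of difference a configuration switch would have to cover.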

Big fan of the Pascal-case name conversion you're doing; that's a longstanding feature request here as well (#167).

> This is my first time dabbling with Roslyn. I assume Workspaces isn't compatible with source generators?

This is just a guess, but we could probably get rid of the Microsoft.CodeAnalysis.Workspaces dependency. At present, the codegen library only relies on an AdhocWorkspace for formatting. To get this working, we could move formatting to the CLI and limit the codegen dependency to Microsoft.CodeAnalysis.CSharp.

@davidngjy
Author

davidngjy commented Jan 18, 2025

I think there is out-of-the-box support for passing configuration/properties to the source generator.
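
A sketch of how that usually works: an MSBuild property is exposed with `CompilerVisibleProperty` and then read in the generator through `AnalyzerConfigOptionsProvider` under the `build_property.` prefix. The property name `AvroGenerator_EmitRecords` and the class name are hypothetical, chosen only to illustrate the class-versus-record switch discussed above.

```csharp
using System;
using System.IO;
using Microsoft.CodeAnalysis;

// The property has to be made visible to the compiler first, for example in the
// generator package's .props file (hypothetical property name):
//   <ItemGroup>
//     <CompilerVisibleProperty Include="AvroGenerator_EmitRecords" />
//   </ItemGroup>
// A consuming project could then set:
//   <AvroGenerator_EmitRecords>true</AvroGenerator_EmitRecords>

[Generator]
public sealed class ConfigurableAvroGenerator : IIncrementalGenerator
{
    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        // Read the MSBuild property; a missing or unparsable value counts as false.
        var emitRecords = context.AnalyzerConfigOptionsProvider.Select(static (options, _) =>
            options.GlobalOptions.TryGetValue("build_property.AvroGenerator_EmitRecords", out var raw)
            && bool.TryParse(raw, out var enabled)
            && enabled);

        var schemas = context.AdditionalTextsProvider
            .Where(static file => file.Path.EndsWith(".avsc", StringComparison.OrdinalIgnoreCase));

        // Pair every schema file with the current option value.
        context.RegisterSourceOutput(schemas.Combine(emitRecords), static (output, pair) =>
        {
            var (schemaFile, useRecords) = pair;
            var keyword = useRecords ? "record" : "class";
            var name = Path.GetFileNameWithoutExtension(schemaFile.Path);

            // Placeholder output: a real generator would emit the model here.
            output.AddSource($"{name}.g.cs", $"// would emit a {keyword} for {name}");
        });
    }
}
```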

I've taken a stab at moving Microsoft.CodeAnalysis.Workspaces into the CLI project and removing it from the Codegen project.
Here's a working POC that takes the Codegen output and writes it to the final source generator output.

This schema

{
  "name": "something_created",
  "namespace": "important.service.domain.contract",
  "type": "record",
  "fields": [
    {
      "name": "nullable_string",
      "type": ["null", "string"],
      "default": null
    },
    {
      "name": "string_with_default",
      "type": "string",
      "default": "abc"
    },
    {
      "name": "string_without_default",
      "type": "string"
    },
    {
      "name": "nullable_boolean",
      "type": ["null", "boolean"],
      "default": null
    },
    {
      "name": "boolean_with_default",
      "type": "boolean",
      "default": true
    },
    {
      "name": "boolean_without_default",
      "type": "boolean"
    }
  ]
}

Generates:
[Image: screenshot of the generated C# model]

Most of the incremental source generator tutorials I've come across generate code using plain strings. I'm not familiar with SyntaxFactory; does it allow more flexibility in terms of the code it can generate?

To have the property names in Pascal case, would it be ideal to set the annotation as well, to avoid any problems with the "fuzzy" matching logic?

@dstelljes
Member

This looks really good! Whenever you're ready, feel free to open a PR.

> Most of the incremental source generator tutorials I've come across generate code using plain strings. I'm not familiar with SyntaxFactory; does it allow more flexibility in terms of the code it can generate?

The big advantage is that we can manipulate the syntax tree after it's generated. For instance, codegen currently spits out fully-qualified type names, then walks the tree afterward to simplify. It'd be possible to do the same thing without Roslyn, but the separation wouldn't be as clean.
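
As an illustration of that generate-then-rewrite pattern, here is a deliberately naive sketch. It is not the Codegen library's actual simplification logic: `NaiveNameSimplifier` and `SyntaxTreeDemo` are hypothetical names, and the rewriter simply drops every namespace qualifier, so the result only compiles if the matching `using` directives are present.

```csharp
#nullable enable
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Replaces each qualified name (System.String) with its right-most part (String).
public sealed class NaiveNameSimplifier : CSharpSyntaxRewriter
{
    public override SyntaxNode? VisitQualifiedName(QualifiedNameSyntax node) =>
        node.Right.WithTriviaFrom(node);
}

public static class SyntaxTreeDemo
{
    public static string Build()
    {
        // Step 1: generate a property using a fully-qualified type name.
        var property = SyntaxFactory
            .PropertyDeclaration(SyntaxFactory.ParseTypeName("System.String"), "NullableString")
            .AddModifiers(SyntaxFactory.Token(SyntaxKind.PublicKeyword))
            .AddAccessorListAccessors(
                Accessor(SyntaxKind.GetAccessorDeclaration),
                Accessor(SyntaxKind.SetAccessorDeclaration));

        // Step 2: walk the finished tree and simplify the names.
        var simplified = new NaiveNameSimplifier().Visit(property)!;

        // NormalizeWhitespace is part of Microsoft.CodeAnalysis itself,
        // so this step does not pull in the Workspaces package.
        return simplified.NormalizeWhitespace().ToFullString();
    }

    // Builds an auto-property accessor such as "get;" or "set;".
    private static AccessorDeclarationSyntax Accessor(SyntaxKind kind) =>
        SyntaxFactory.AccessorDeclaration(kind)
            .WithSemicolonToken(SyntaxFactory.Token(SyntaxKind.SemicolonToken));
}
```

With string-based generation, the same simplification would mean re-parsing or pattern-matching the emitted text, which is the less clean separation mentioned above.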

> To have the property names in Pascal case, would it be ideal to set the annotation as well, to avoid any problems with the "fuzzy" matching logic?

I think that's a good idea. It'd also be good to have a way to disable property name rewriting (especially since codegen currently doesn't do any rewriting and folks may depend on that behavior); when running in that configuration we could omit the annotations.
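
A sketch of how such a switch could behave, using only the base class library. The thread does not name the annotation; this example assumes it would be `System.Runtime.Serialization.DataMemberAttribute`, which is an assumption here, and `FieldNaming`, `ToPascalCase`, and `EmitProperty` are hypothetical helpers rather than existing Codegen APIs.

```csharp
using System;
using System.Linq;

public static class FieldNaming
{
    // Converts a snake_case Avro field name such as "nullable_string" to Pascal case.
    public static string ToPascalCase(string avroName) =>
        string.Concat(avroName
            .Split(new[] { '_' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(part => char.ToUpperInvariant(part[0]) + part.Substring(1)));

    // Emits a property declaration as text. With rewriting disabled, the Avro name is
    // kept verbatim and no annotation is written, matching the current behavior.
    public static string EmitProperty(string avroName, string type, bool rewriteNames)
    {
        if (!rewriteNames)
        {
            return $"public {type} {avroName} {{ get; set; }}";
        }

        // Assumption: DataMember carries the original Avro name so that matching
        // does not depend on fuzzy name comparison.
        return $"[DataMember(Name = \"{avroName}\")]\n"
            + $"public {type} {ToPascalCase(avroName)} {{ get; set; }}";
    }
}
```

Keeping the rewrite and the annotation behind the same flag means existing consumers who rely on the unmodified names see no change in the generated output.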
