Discuss: File format in the .NET Interactive VS Code extension #2736
Replies: 20 comments 1 reply
-
If it's intended to be more widely adopted, is this file format extensible and easily convertible so that other language communities can extend their notebook experience in VSCode?
-
The file format can be extended to the degree that .NET Interactive itself can be extended, whether programmatically within the script or via a NuGet extension.
-
I think that's a good list of technical goals. It's fine to contemplate a new file format but it should be 100% optional until it is an established thing, which roughly means online and client tooling for the format is very widespread (including scripting executors for it), stable and trusted, and the format has been emotionally accepted by most potential users, especially in team or collaborative settings. Also TBH Microsoft have a reputation when it comes to file formats; we have to be aware of that and be cautious here.
Note that acceptance of new file formats is a social/network/human/emotional/trust thing. These are not technical things under our control, though we can make technical decisions that will affect whether effective acceptance is eventually achieved (and I don't see any particular technical problem with the technical decisions above). However, by their nature all new file formats start at "acceptance level of minus 10000", sadly. I'm speaking from experience with
I think this means loading, saving, editing and executing .ipynb is necessary in very many scenarios (potentially by internally converting to this format on load, but that should not be user-visible).
-
The file format is optional. The benefits are related to polyglot functionality and automation (e.g. running in production), neither of which are directly supported by Jupyter. Not all users will care about these goals, but many do, and they're becoming more common as notebooks become popular in new areas. For these cases, this format might provide some convenience.
The file format arose organically because it's just the .NET Interactive submission format: magic commands delineating blocks of code in potentially different languages. Since .NET Interactive is submission-centric rather than cell-centric, the following (one cell):

#!pwsh
< PowerShell code >
#!fsharp
< F# code >

...is equivalent to this (two cells):

#!pwsh
< PowerShell code >

#!fsharp
< F# code >

This is why the file format started here. We're simply concatenating all of the cells.
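The equivalence described above can be illustrated with a minimal sketch. This is not the .NET Interactive parser; `split_cells`, `join_cells`, and the `KERNEL_NAMES` set are hypothetical names chosen for illustration, and the list of kernel selectors is an assumption.

```python
# Hypothetical sketch, not the actual .NET Interactive parser.
# Assumed set of kernel-selector magic commands; the real set depends on
# which kernels are loaded.
KERNEL_NAMES = {"csharp", "fsharp", "pwsh", "powershell", "html", "javascript"}


def split_cells(text):
    """Split one submission into (kernel, code) cells at kernel-selector lines."""
    cells = []
    kernel, lines = None, []
    for line in text.splitlines():
        stripped = line.strip()
        name = None
        if stripped.startswith("#!") and len(stripped) > 2:
            # First token after "#!" is the magic command name.
            name = stripped[2:].split()[0]
        if name in KERNEL_NAMES:
            # A kernel selector closes the previous cell and opens a new one.
            if kernel is not None or lines:
                cells.append((kernel, "\n".join(lines)))
            kernel, lines = name, []
        else:
            # Other magic commands (e.g. "#!share") stay inside the cell body.
            lines.append(line)
    if kernel is not None or lines:
        cells.append((kernel, "\n".join(lines)))
    return cells


def join_cells(cells):
    """Concatenate cells back into a single submission."""
    parts = []
    for kernel, code in cells:
        if kernel is not None:
            parts.append(f"#!{kernel}")
        parts.append(code)
    return "\n".join(parts)
```

Under these assumptions, joining two separate cells produces the same text as the single polyglot cell, which is the sense in which the file is "just" the concatenation of its cells.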
-
I agree entirely. Assuming we can sort out the default experience for a given file extension such that we're not in conflict with the VS Code Python extension's Jupyter support, an intuitive workflow might be to allow you to open and work with either format, and when you save, it saves it back to the same file in the same format. You never have to think about the other file format until or unless you want to explicitly convert between them.
-
I apply this snippet in Jupyter config Then Jupyter automatically saves Python notebook into Isn't this functionality already working in Jupyter? Why do we need another file format? I can only imagine the use case for polyglot notebook. It's hard to decide whether it's |
-
This looks like a good workflow and for Jupyter it's nice and simple. You don't need an alternative file format and it's not my intention to give the impression that we're requiring one.
Yes, the alternative file format helps with polyglot scenarios (which not everyone has). But people might also find benefits when editing with plain text editors or using code review tools. These considerations motivated a similar approach in the VS Code Python extension allowing execution of "cells" within a But this isn't a language-specific tool, so we wanted to explore a format that doesn't bias toward one language. Conceptually, .NET Interactive's polyglot support includes HTML and JavaScript and might grow to other languages like SQL, bash (#465), and so on. These aren't just magic commands but are implemented as in-process subkernels. These are user-extensible. Here's a sample ( |
-
It would be quite nice if VSCode simply prompted in this case. Any idea what actually happens? Also I guess VSCode can't probe into the file to check if an extension claims ownership?
-
@dsyme I just tried it and currently our extension steals the association; there's no prompt and no way to re-associate with the Python extension (short of uninstalling), and since we can really only handle a subset of Jupyter notebooks (i.e., those using the .NET Interactive kernel), we shouldn't take the association outright. There's an open issue at microsoft/vscode#94408 that tracks this. As soon as this is fixed then we can safely associate with
As for probing, for text files an extension can say whether it takes the association outright or if it should simply be added to the consider list, but nothing yet for notebooks.
-
I kinda like the dib approach... This enables me to run multiple languages in one notebook file. The other thing that I like about it is that it doesn't require me to install python/jupyter. Please keep dib as it's a lean approach (we'll convert files to ipynb if needed, e.g. to run in the jupyter ecosystem).
-
I think it might be better to work on how to call the current line or cell magics from within the already established .ipynb extension. That is already well established by Jupyter and their community. I'd rather have the ability to call the different kernels from within one notebook as an option than be tied to only one.
-
@bwilsonms The question of magic commands is orthogonal to the file format, I believe. We do have data sharing across kernels on our roadmap, but not direct invocation, which gets a bit thorny from a dependency standpoint. In effect, the Jupyter frontend (or e.g. VS Code or Azure Data Studio) would need to be aware of the different kernels and route submissions accordingly, so it's a bit out of .NET Interactive's scope to be aware of other kernels. There's also a syntax issue. The reason .NET Interactive's magic commands use the In case I've misunderstood, can you give a more specific example of what you have in mind? |
-
I've come to this as a PowerShell person who wants to author PowerShell notebooks, preferably with VS code.
The current experience is you can Open a notebook and Run some code which generates output; Closing the notebook will prompt to Save changes, but later Re-opening the notebook shows the changes were NOT saved. That's referenced in #500 and it's the difference between an interesting proof of concept and something which can be used in the real world. I'm not clear what the objective for DIB files is.
And that's not an aim I'd normally take issue with. But this is a format for doing notebooks in VSCode so "edit anywhere" is not the priority it might be for other types. JSON, XML etc lend themselves to machine parsing, |
-
@jhoneill If you want the result stored in the output file, I'd recommend using
We explicitly want to be interoperable with the larger ecosystem of tools that support
I think we've only partially solved it, as the magic command approach isn't great for tooling. For example, with this approach, the frontend doesn't know the language of the cell without parsing the code, which impacts things like code coloring and potentially completions (since for example rich language services aren't implemented by IPython and so an external service is often used to provide them). There are numerous other notebook file formats, many of which have different goals not achievable in a single format. You can see a good overview of use cases in the Jupytext docs: https://jupytext.readthedocs.io/en/latest/formats.html. We might not keep |
-
See also: #739.
-
@jonsequitur - thanks for the response: ipynb is established, and using the file format allows sharing between machines with different toolsets but doesn't guarantee the language(s) in the notebook will be supported. Giving the world a new format only makes sense if there is something which can't be done in or added to an existing format. Innovating is a problem because if you add something like polyglot support it's easy enough to add it to the JSON schema, but then everyone else needs to adopt the change, otherwise the file parses properly but doesn't execute as expected. Hence you may be stuck with the partial solution you have now.
-
An improvement to this has been merged and we should have a new version of the extension published today. Looking forward to feedback on it.
Agreed. Regarding the file format aspect, one example is that certain organizations explicitly don't want the |
-
As a fellow powersheller who has been seriously loving the simple and effective DIB format, I see HUGE value in storing the output in the file. Otherwise, I'm going to have to use Start-Transcript or some other method of saving the output somewhere when used in an automated production environment. I only happened upon this discussion because I was lamenting dib not having saving of results and decided to go do some googling about it.
Why not? The dib format is awesome. As a powersheller who dabbles in .net/c#, dib is rockingly cool. Inability to save is just KILLER for my hopes.
Dreaming
Mind if I share a dream? Imagine that instead of having a powershell script in production, you have a dib notebook instead. The script is broken down into cells with documentation living right with the actual code. A finance employee who depends on this script notices that it didn't produce any output and opens the file in VSCode and sees after each code block the last 3 sets of outputs that were generated by that block. This finance user is able to scroll through this notebook and identify exactly which cell didn't produce similar output on its most recent invocation (by comparing with the other ones present). They are able to quickly determine that the cell could not find the file that is expected in a certain folder, resolve the issue (put the file in that folder), and re-run the notebook. They see everything work perfectly like it should before they close it. How cool is that? No IT needed. The business unit MASSIVELY empowered by having historical output + code + documentation all living in a single place.
Implementing Output Preservation Support
In my mind, there are several ways that you could implement such a feature, and I can't find good documentation easily about the dib file format to see if any of this is already present or in a roadmap. One approach that seems attractive to me is to introduce a cell level option that indicates the number of prior executions to preserve. You could call it
I'm imagining something like this:
#!notebook --preserveOutputCount 1
#!powershell
Write-Output "Hello from PowerShell!"
#!fsharp --preserveOutputCount 0
let hello = "Hello from F#"
hello |> Console.WriteLine
#!csharp
#r "nuget:Humanizer,2.8.1"
using Humanizer;
#!share --from fsharp hello
Console.WriteLine(hello.Replace("F#", "C#").Transform(To.TitleCase));
Which would save only the most recent execution for each cell immediately below it (except for the fsharp cell, which would not save any output at all when run). You could even let the default be that when RUNNING a notebook in an automated fashion or interactive fashion that it DOES NOT UPDATE the notebook file after execution. Then, to enable saving the output into the file after execution, add another option to the file level. Something like:
Lastly, the HOW to save output. You don't need to save the STATE, the output doesn't need to be resumable. You can just let it be a plain text field like what markdown is, just without any formatter/markdown support. Maybe call it
Rolling this all together, after running each cell twice (notice that my c# cell errors the 2nd time that I run it from VSCode), if I saved the file, it would look like this in notepad:
#!notebook --preserveOutputCount 2
#!powershell
Write-Output "Hello from PowerShell!"
#!output
Hello from PowerShell!
#!output
Hello from PowerShell!
#!fsharp --preserveOutputCount 0
let hello = "Hello from F#"
hello |> Console.WriteLine
#!csharp
#r "nuget:Humanizer,2.8.1"
using Humanizer;
#!share --from fsharp hello
Console.WriteLine(hello.Replace("F#", "C#").Transform(To.TitleCase));
#!output
Error: Humanizer version 2.8.1 cannot be added because version 2.8.2 was added previously.
#!output
Hello From C#
Honestly, I think it's super cool to have the idea of lightweight notebooks in VSCode that can bring documentation, code, and historical output all together for use in a real world production environment.
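To make the proposal above concrete, here is a rough sketch of how a tool might write a cell followed by its preserved outputs. Everything here is hypothetical: `#!output` and `preserveOutputCount` exist only in this comment's proposal, `render_cell` is an illustrative name, and the newest-first ordering is an assumption read off the sample above.

```python
def render_cell(kernel, code, outputs, preserve_count):
    """Render one cell followed by up to preserve_count '#!output' blocks.

    outputs is ordered oldest to newest. The newest output is written first,
    which is an assumption based on the sample in the comment above.
    """
    lines = [f"#!{kernel}", code]
    # Keep only the most recent preserve_count outputs; 0 keeps none.
    kept = outputs[-preserve_count:] if preserve_count > 0 else []
    for text in reversed(kept):
        lines.append("#!output")
        lines.append(text)
    return "\n".join(lines)


if __name__ == "__main__":
    history = ["run 1", "run 2", "run 3"]
    print(render_cell("powershell", 'Write-Output "Hello"', history, 2))
```

With a count of 0 the cell is written with no output blocks at all, matching the behaviour described for the fsharp cell in the sample.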
-
Thanks for sharing this dream, @Szeraax! I think it would actually be fairly simple to prototype with our existing APIs. There are dreams over here that I think are related:
Area-Automation
We're trying to come up with a compelling automation story that I think will often be orthogonal to file formats. As part of this we're asking questions about what different file formats are useful for and what they provide versus what capabilities should be independent of file formats. Should we support additional formats as we go? How will those fit in? We'll try to make this more concrete in the new year.
-
Ok, I threw down my thoughts on those two that you linked and [I also commented] on the dream of saving outputs from another person: #839 When I wrote my original comment here, I didn't know about the proposal to compile notebooks. Lets assume that you:
Back to the compiled notebooks: obviously those wouldn't be editable for storing output cells. So either drop the capability to store output cells into the notebook when running compiled notebooks, create a 2nd feature that lets you output to another file regardless of method (VSCode, dotnet, compiled), or pick only one of those two features to implement. I would advise against ONLY creating the feature that saves output into the notebook itself. So either only do that 2nd feature to output to an external file, or do both. (Hopefully every time I hit "run all cells" in VSCode, it doesn't have to create a file with all outputs :P)
I would like there to be something richer than just stdout for viewing output cells from the compiled exe. My vision is for business units to increase their programming skills. I WANT them to be seeing code. I want them to be able to figure out root causes on their own (with time, of course!). Rather than my accounting person coming to me and saying, "block 5 failed on this exe, fix it please", I want that accounting person to be able to AT LEAST look at the markdown + code + output and see if they can tell what is wrong before they ever talk to me. Let's raise our sights and get ALL business units doing the tech work instead of heaping it onto IT :)
Honestly, I'd be fine if the only feature added between saving output in the dib file vs saving output + code + markdown to an external file was the external one. As for implementation, what about just creating the #!output cell type that I talked about and throwing all cells out to file? Viewing these files would generally be done in VSCode, so you don't need to do any rich text syntax highlighting. Just do a copy of the script and its #!output cells to a file and bango, you're done.
-
The primary reason for the experimental file format supported by .NET Interactive Notebooks is that there are a few design decisions to be worked out to enable the Microsoft Python extension for Visual Studio Code and the .NET Interactive Notebooks extension to work well together. Until a design is agreed upon, we want to avoid introducing potential compatibility issues into .ipynb.
Here are some of the challenges:
.ipynb file format that also takes into account:
ipynb file, and Python clearly takes precedence.
I'd like to get people's thoughts on the experimental .dib file format (example below) that the .NET Interactive Visual Studio Code extension supports. .ipynb is also supported, as well as conversion between the two.
These are the goals we've had in mind that led to the current state of the design:
It should be trivial to author and read in a simple text editor without special tooling and without concern about escaping, e.g. not JSON, XML, etc.
It should be copy-pasteable between different contexts, e.g. a bare script file, a notebook cell, or another .NET Interactive host (e.g. https://github.com/CodeConversations/CodeConversations)
It should be able to contain multiple languages, including languages not yet supported by .NET Interactive.
It should be amenable to a good tooling experience, including completions and other language services even when multiple languages are present.
It should be able to be opened as a notebook (as currently implemented by the .NET Interactive VS Code extension).
It should be executable as an automation script. In the simple case this need not be a polyglot script, but it should be able to take advantage of the full capabilities of .NET Interactive, e.g. magic commands, NuGet support, extensibility, variable sharing.
When in automated mode, it should be able to advertise and consume command line arguments in a readable, simple-to-author manner.
When in interactive mode (e.g. notebook, REPL), the command line inputs should be able to be collected from the user via a prompt or trivially configured inline the code.
Here's a simple example of the format: