I'm using parameterized pages and data loaders to create a report for each experiment I run. Unfortunately, the simplest approach I've found duplicates each data file. The files can be quite large, so this is fairly wasteful.

In particular, I've got an experimental pipeline that dumps data files for review. To generate a report for each of these results, I have a folder with a data loader like this:

```js
import {parseArgs} from "node:util";
import {createReadStream} from "node:fs";
import {join} from "node:path";
// Parse the --date and --name command-line options.
const {values: {date, name}} = parseArgs({options: {name: {type: "string"}, date: {type: "string"}}});
// Locate the existing results file for this experiment.
const filename = join("src", "data", "results", date, `${name}.parquet`);
// Stream the file unchanged to standard output.
createReadStream(filename).pipe(process.stdout);
```

Effectively, it just streams the existing file to standard output. As I understand it, this means that at compile time Observable will run each of these data loaders and make a copy of every parquet file they reference.

Is there a way to have these data loaders merely link to the already extant data files? Alternatively, is there another way to make this kind of design work? Ideally with good compatibility for DuckDB and Mosaic.

Thanks

Edit: An additional thing came up in my workflow which could make this useful. I often rerun experiments over and over. This updates the modified time of the input files, but not of the copies. Thus, the copies remain cached even after they've expired.
Replies: 2 comments 6 replies
If I understand correctly, you don't want to create these data files, since they already exist (and might even change over time). In this case data loaders aren't useful; instead, you could reference the files directly by their full URL. The sql front-matter supports loading data from URLs.

You'll have to figure out how to serve your parquet files from these URLs, but this is just standard web server config. You might also serve them from a different URL.

TBH I'm not really sure you need parametrized pages either in this case. You could have a single page with a drop-down menu input that lets you choose which experiment to load and display.
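To make the URL idea a bit more concrete, here is a minimal sketch of how the choice from a drop-down could be turned into a file URL and a DuckDB query. The base URL, the `resultUrl` and `resultQuery` helpers, and the experiment list are all hypothetical; only `read_parquet` is a real DuckDB function, and the URL layout simply mirrors the date/name layout from the question.

```js
// Hypothetical base URL where the parquet files are served; replace with your own server.
const BASE_URL = "https://example.com/results/";

// Build the URL of one experiment's parquet file from its date and name.
function resultUrl(date, name, base = BASE_URL) {
  return new URL(`${encodeURIComponent(date)}/${encodeURIComponent(name)}.parquet`, base).href;
}

// Build a DuckDB query that reads the remote file directly.
// Single quotes are doubled so the URL stays a valid SQL string literal.
function resultQuery(date, name) {
  const url = resultUrl(date, name).replaceAll("'", "''");
  return `SELECT * FROM read_parquet('${url}')`;
}

// Hypothetical experiment list, e.g. the options offered by a drop-down menu.
const experiments = [
  {date: "2024-01-01", name: "baseline"},
  {date: "2024-01-02", name: "ablation"}
];

// One query per experiment; a page would run only the selected one.
const queries = experiments.map(({date, name}) => resultQuery(date, name));
```

Nothing is copied at build time with this approach, so rerunning an experiment only requires the server to deliver the updated file.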
Is the issue that you can’t reference a parameter in the SQL front matter, and thus can’t compute the relative path from `src/reports/[date]/report.md` to `src/data/results/[date]/[name].parquet`?

Can you generate the data in `src/reports/[date]/[name].parquet` instead, so you don’t need to copy it? Then you can use `./inputs.parquet` rather than needing a data loader.

Alternatively, you could use JavaScript to initialize DuckDB instead of using the SQL front matter, as you describe in #1846.
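For reference, the relative path in question can be written down once the parameters are known. The sketch below is plain JavaScript, not Framework API: `relativeResultPath` and `resolveFromReport` are hypothetical helpers, and the directory layout is assumed to be exactly the one named above.

```js
// Relative path from src/reports/[date]/report.md to the matching results file.
// The two ".." segments climb out of [date]/ and reports/ back to src/.
function relativeResultPath(date, name) {
  return `../../data/results/${date}/${name}.parquet`;
}

// Sanity check: resolve the relative path against the page's location.
function resolveFromReport(date, name) {
  const page = new URL(`file:///src/reports/${date}/report.md`);
  return new URL(relativeResultPath(date, name), page).pathname;
}

// Prints "/src/data/results/2024-01-01/baseline.parquet"
console.log(resolveFromReport("2024-01-01", "baseline"));
```

The path itself is simple; the difficulty raised here is only that it depends on the page parameters.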