
Repo layout proposal #10089

Open
max-sixty opened this issue Mar 2, 2025 · 16 comments

Comments

@max-sixty
Collaborator

max-sixty commented Mar 2, 2025

What is your issue?

As part of the efforts described in #10039, I added #10088, and noticed the repo layout has arguably not kept up with the code growth over the past decade. This isn't the most pressing issue, but it does lower the returns on refactoring, since we're moving lines from 11K-LOC files to 1K-LOC files, rather than anything smaller.

(Even if you think LLMs aren't that useful / aren't going to get better / etc; these changes would still make the repo easier for people to navigate...)

In particular, about 2/3 of our code is in xarray/core — 66,873 LOC out of the 97,118 LOC in xarray.

I can imagine splitting this up into a few categories:

  • compat — dask_array_*, npcompat, pdcompat, array_api_compat
  • compute / computation — computation, arithmetic, nanops, weighted, the curvefit that's currently in dataset, rolling, rolling_exp, maybe missing
  • reshape / align / merge (need a better name) — merge, alignment, concat

I'd propose having each of those be paths within xarray/. Then there's more freedom to make new files within those paths, compared with the current state, where adding a new file means appending to an already very long list of files in xarray/core.

I'm not sure how much disruption this would cause to existing PRs. I think if we land these as commits which mostly just move the files, then git will mostly handle merges well. We can start slowly and see how it goes...

@max-sixty max-sixty added the needs triage Issue that has not been reviewed by xarray team member label Mar 2, 2025
@dcherian dcherian removed the needs triage Issue that has not been reviewed by xarray team member label Mar 3, 2025
@dcherian
Contributor

dcherian commented Mar 3, 2025

I agree. One specific suggestion: breaking apply_ufunc.py out of computation.py is easy, and shouldn't be disruptive.
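To see why such a move needn't be disruptive internally, here is a self-contained toy of the re-export idea: the old module keeps a one-line alias to the moved function, so imports via the old path still resolve. The module names (`computation.apply_ufunc`, `core.computation`) mirror the discussion but are stand-ins, not the real xarray modules.

```python
import sys
import types

# Stand-in for the new, smaller module after the split (hypothetical name).
apply_ufunc_mod = types.ModuleType("computation.apply_ufunc")

def apply_ufunc(func, *args):
    """Toy stand-in for the real apply_ufunc."""
    return func(*args)

apply_ufunc_mod.apply_ufunc = apply_ufunc
sys.modules["computation.apply_ufunc"] = apply_ufunc_mod

# The old module keeps a one-line re-export, so any existing internal
# imports through the old path continue to work unchanged.
old_computation = types.ModuleType("core.computation")
old_computation.apply_ufunc = apply_ufunc
sys.modules["core.computation"] = old_computation
```

Both paths then expose the same function object, so nothing downstream needs to change immediately.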

@TomNicholas
Member

I also support this.

reshape / align / merge (need a better name)

manipulation? restructuring?

There's another opportunity for refactoring that could split up large files in #9203.

Related is the general issue of scope creep within the main repository. I think at some point we should revisit the idea of splitting out as much non-core functionality as possible into a separate package (there was a very old issue about this that I'm struggling to find right now — proposing "xr-scipy"). Distinguishing between crucial things such as apply_ufunc and other things that currently live in computation.py, such as curvefit, would be an important step towards that.

@benbovy
Member

benbovy commented Mar 11, 2025

(Even if you think LLMs aren't that useful / aren't going to get better / etc; these changes would still make the repo easier for people to navigate...)

This would also improve the performance of real-time developer tools like LSP servers, linters, auto-formatters, tree-sitter, etc. (EDIT: it's not the reworked repo layout that helps here, but the reduced per-file LOC).

One small suggestion: xarray/core/indexes.py could probably be moved into xarray/indexes, which already exists.

max-sixty added a commit to max-sixty/xarray that referenced this issue Mar 15, 2025
Start of pydata#10089

Move compatibility-related modules from xarray.core to new xarray.compat package:
- array_api_compat.py
- dask_array_compat.py
- dask_array_ops.py
- npcompat.py
- pdcompat.py
max-sixty added a commit that referenced this issue Mar 15, 2025
Start of #10089

Move compatibility-related modules from xarray.core to new xarray.compat package:
- array_api_compat.py
- dask_array_compat.py
- dask_array_ops.py
- npcompat.py
- pdcompat.py
@max-sixty
Collaborator Author

To what extent do we need to maintain backward-compat? (I would obv prefer to just move files and not build stub layers, but if there are other libs relying on these file locations, rather than just importing from the top level, then we can add them...)

@TomNicholas
Member

TomNicholas commented Mar 15, 2025

Any library not importing from the top level is using private API, which has no backwards compatibility guarantees. That's what our docs say.

We have actually made significant changes to the locations of code multiple times in the past couple of years (especially as part of the NamedArray refactor, DataTree integration, and adding the ChunkManager entrypoint), with no real complaints.

@max-sixty
Collaborator Author

So far, I've:

  • Moved a bunch of files to xarray/computation
  • Moved a bunch of files to xarray/structure
  • Moved a bunch of files to xarray/compat
  • Moved apply_ufunc to its own file in xarray/computation

I'll probably let that sit for a while and see how it goes.

Some things we could do next:

  • Take some computation code out of dataset.py; likely make a computation/curvefit.py
  • Refine some of the groupings — for example, there's a unify_chunks function in computation/computation.py, but also a core/chunks.py. Maybe we want those in the same place... More generally I'm sure there are other things, either caused by me or previously existing, which could be improved.
  • Start breaking up the test files — this is fairly easy; we just need to make an xarray/tests/{dataset,dataarray} path and create some groupings within the files there

The more difficult thing is going to be breaking up dataset.py; it's currently 10K lines long, and is basically just Dataset now. We can probably thin it by 2K lines by extracting some of the code in methods. But to make real progress, given that classes can't be split across multiple files in Python, we would need to do something like DatasetSel / DatasetReduce, and then have Dataset inherit from them. (Unless there's a cleverer approach?)
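The DatasetSel / DatasetReduce idea could be sketched like this — toy classes with hypothetical names from the paragraph above, not xarray's actual internals; each mixin would live in its own module:

```python
class DatasetSel:
    """Selection methods (would live in e.g. a dataset/sel.py)."""

    def sel(self, *names):
        # Toy selection: pick out the named variables.
        return {k: self._vars[k] for k in names}


class DatasetReduce:
    """Reduction methods (would live in e.g. a dataset/reduce.py)."""

    def mean(self):
        # Toy reduction over all variables.
        vals = list(self._vars.values())
        return sum(vals) / len(vals)


class Dataset(DatasetSel, DatasetReduce):
    """The public class just composes the per-file mixins."""

    def __init__(self, variables):
        self._vars = dict(variables)
```

The trade-off is that method definitions move away from the class they belong to, and the mixins silently depend on attributes (here `_vars`) that only `Dataset` defines.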

The alternative is to do nothing and wait for LLMs to get better with big files (Claude Code can already grab segments of files, for example). Many of the annoyances of working with huge files, like black taking a few seconds on every save, have gone away with tools like ruff, so maybe waiting is OK.

@TomNicholas
Member

That all sounds great @max-sixty . I think you can be relatively aggressive about this, and I'm happy to review things.

The more difficult thing is going to be breaking up dataset.py; currently 10K lines long, and is basically just Dataset now.
...
(Unless there's a cleverer approach?)

I asked Claude for ideas but several of them require doing things dynamically upon import.

@benbovy
Member

benbovy commented Mar 17, 2025

Results from a quick search for `from xarray.core` across all of GitHub, just in case.

@max-sixty
Collaborator Author

great idea @benbovy !

I don't see anything too concerning in the first few results — mostly hits for dataarray & dataset...

@TomNicholas
Member

Wow this is interesting!

So many people importing classes from xarray.core that are available in the top-level xarray namespace 🤦

Quite a few hits for things in the formatting/formatting_html modules.

A lot of the rest seem to be for the purposes of type hinting (e.g. NestedSequence, which is only ever used to type hint a public argument to xr.combine_nested), or importing error classes that they presumably want to catch.

I agree that overall this doesn't look too concerning.

@mathause
Collaborator

  • It would maybe be nice to have a public xr.typing?
  • Slightly related: moving generated files (e.g. _typed_ops.py) into a separate folder could make it more obvious that they are generated.

@TomNicholas
Member

The tests need to be reorganized too. A particular offender is test_backends.py, which is >6600 lines long and tests multiple different backends with different dependencies. It also doesn't include the datatree IO tests, which are in a different file. Refactoring this might require more than just copy-pasting, though.

cc @dcherian

@TomNicholas
Member

TomNicholas commented Mar 20, 2025

Also @max-sixty what do you think of having some kind of data_structures grouping? I would like to have the code that defines the main xarray classes (i.e. Variable, Coordinates, DataArray, Dataset, DataTree) all live together, separate from all the details of how they are indexed or missing data or dtypes or anything like that.

But I'm not sure how to name it - would it be core/data_structures/dataset.py? But in this model core is then not really core at all, it's a collection of ancillary utilities for the real core classes...

@shoyer
Member

shoyer commented Mar 21, 2025

The more difficult thing is going to be breaking up dataset.py; currently 10K lines long, and is basically just Dataset now.
...
(Unless there's a cleverer approach?)

I asked Claude for ideas but several of them require doing things dynamically upon import.

The second approach, which Claude calls "2. Module-level functions with __all__", looks like the way to go to me.

I don't think you need __all__, though. Something like the following should work:

# xarray/core/curvefitting.py
from __future__ import annotations  # keep the forward reference lazy

from typing import TYPE_CHECKING

if TYPE_CHECKING:  # avoid recursive import
    from xarray.core.dataset import Dataset

def curvefit(self) -> Dataset:  # or use typing.Self
    """Curve fitting optimization for arbitrary functions."""
    ...

# xarray/core/dataset.py
from . import curvefitting

class Dataset:
    curvefit = curvefitting.curvefit
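Here is a runnable, self-contained version of the same pattern, collapsed into one file: the method body is a plain function (which would live in its own module), attached to the class as an attribute. The curve-fitting logic is just a stub standing in for the real code.

```python
def curvefit(self):
    """Stub standing in for the real curve-fitting code."""
    return f"fit({self.name})"


class Dataset:
    def __init__(self, name):
        self.name = name

    # A plain function assigned as a class attribute becomes an ordinary
    # method: `self` is bound on attribute access, just as with `def`.
    curvefit = curvefit
```

This keeps `Dataset` itself as the single source of truth for its method list, while the implementations can live anywhere.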

@max-sixty
Collaborator Author

max-sixty commented Mar 24, 2025

Also @max-sixty what do you think of having some kind of data_structures grouping? I would like to have the code that defines the main xarray classes (i.e. Variable, Coordinates, DataArray, Dataset, DataTree) all live together, separate from all the details of how they are indexed or missing data or dtypes or anything like that.

I don't have a strong view on the exact layout

one option — given we don't have many data structures and would like to split up Dataset, I could imagine an xarray/dataset path which contains the various parts of Dataset, assembled into the Dataset class (like the example above) in an xarray/dataset/dataset.py or xarray/dataset/__init__.py file.

then similarly for the other data structures

core would then gradually & happily devolve...

@max-sixty
Collaborator Author

separate from all the details of how they are indexed or missing data or dtypes or anything like that.

what did you mean by this @TomNicholas ? that there is code which we apply to multiple data structures? I would have thought that then lives in xarray/computation, for example?

or is there code which is data structure related but exists across data structures?

max-sixty added a commit to max-sixty/xarray that referenced this issue Mar 25, 2025
max-sixty added a commit that referenced this issue Mar 25, 2025
* Move chunks-related functions to a new file

Part of #10089

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci


---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
max-sixty added a commit to max-sixty/xarray that referenced this issue Mar 25, 2025
max-sixty added a commit that referenced this issue Mar 25, 2025
* Move fit computation code to dedicated new file

Part of #10089

* .

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>