An algorithm for pairing #174

Open
giuliogcantone opened this issue Oct 16, 2023 · 0 comments
Labels
feature a feature request or enhancement

Comments

@giuliogcantone

Consider the following:

library(tidyverse)

obs <- tibble(
  id = 1:8,
  g = rep(c("A", "B"), 4),
  v1 = rnorm(8),
  v2 = rnorm(8),
  v3 = rnorm(8)
)

I'd like a method that produces something that would look as follows:

obs %>%
  # within each group g, assign cluster labels 1 and 2 in random order
  mutate(cluster = sample(1:2, replace = FALSE) %>% rep(2),
         .by = g) %>%
  mutate(pair = str_c(g, cluster)) %>%
  arrange(pair)

As you can see, these pairs are random. Instead, I want the pairs to be formed by minimising (not necessarily optimally) a multivariate distance across the v variables, for example the Mahalanobis distance.

Then I would expect something like this

obs %>% mutate(cluster = f(input_cols = starts_with("v"),dist="mahalanobis"))

Finally, if the number of observations is odd, one element is left unpaired.
Note that there is no supervision and no cl parameter.
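
To make the request concrete, here is a minimal sketch of the behaviour described above, written in base R. It is a greedy heuristic, not an optimal matching, and `greedy_pair()` is a hypothetical name that does not exist in this package or any other. It assumes all input columns are numeric and that their covariance matrix is invertible.

```r
# Hypothetical helper (not an existing API): greedy pairing of rows
# by squared Mahalanobis distance. Returns an integer pair id per row;
# with an odd number of rows, one row is left as NA.
greedy_pair <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  s_inv <- solve(cov(x))  # assumes the covariance matrix is invertible

  # n x n matrix: column i holds the distances from every row to row i
  d <- vapply(
    seq_len(n),
    function(i) mahalanobis(x, x[i, ], s_inv, inverted = TRUE),
    numeric(n)
  )
  d[lower.tri(d, diag = TRUE)] <- Inf  # keep each unordered pair once

  pair <- rep(NA_integer_, n)
  for (k in seq_len(n %/% 2)) {
    idx <- which(d == min(d), arr.ind = TRUE)[1, ]  # closest unpaired rows
    pair[idx] <- k
    d[idx, ] <- Inf  # both rows are now unavailable
    d[, idx] <- Inf
  }
  pair
}

# Demo on 7 simulated observations (odd, so one row stays unpaired)
set.seed(174)
demo <- data.frame(id = 1:7, v1 = rnorm(7), v2 = rnorm(7), v3 = rnorm(7))
demo$pair <- greedy_pair(demo[c("v1", "v2", "v3")])
print(demo)
```

In a dplyr pipeline the same helper could presumably be called as `mutate(pair = greedy_pair(pick(starts_with("v"))))`, assuming dplyr 1.1.0 or later. Because the closest pair is fixed first at every step, the total distance is not guaranteed to be minimal, which matches the "not necessarily optimal" requirement.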

According to ChatGPT, a pairing algorithm like f does not exist, at least not among the options of this package.
I am still doubtful, so I am asking here whether that is true and, if it is, proposing that it be developed.

Note that this is quite similar to how the MatchIt package works, with the difference that MatchIt always pairs ("matches") rows that are discordant on a binary x variable, so that the formula would be x ~ starts_with("v").
That is not the case here, since there is no x variable in obs.

@EmilHvitfeldt EmilHvitfeldt added the feature a feature request or enhancement label Nov 13, 2023