Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

create neccessary "mountpoints" #3462

Merged
merged 5 commits into from
Mar 9, 2025

Conversation

sjanssen2
Copy link
Contributor

Assume a user wants to set-up a fresh Qiita instance. She/he probably changes the BASE_DATA_DIR and eventually executes qiita-env make.

This process will most likely fail, as the expected sub-directories of the qiita.data_directory relation not yet exist. Therefore, this PR adds a function create_mountpoints, which is called within the patch mechanism.

If instance is in test mode, the sub-directories not only need to exist, but also be populated with the test data, which are located in support_files/test_data/.

@antgonza
Copy link
Member

antgonza commented Mar 6, 2025

This raises an interesting discussion; when/whom should create those folders/mounts.

In practice, in general they are created when a new artifact type is created; remember that this is done/defined as a qtp (qiita type plugin) and implemented here. Thus, we have never needed this. Additionally, in the qiita.ucsd.edu those mounts are maintained (created, backup, etc) by external resources so we have never used that code for "real".

Anyway, just to confirm, are you 100% sure that this is required for a new system to work? Would adding those plugins create them?

@sjanssen2
Copy link
Contributor Author

I think you have to execute qiita-env make BEFORE configuring / starting any plugins. But qiita-env make will fail if pointing with BASE_DATA_DIR to a location that does not hold directories which reflect the "mount points", i.e. yes I feel this is necessary.

@sjanssen2
Copy link
Contributor Author

btw: Anna and I are working in dockerizing qiita and all 16S plugins here: https://github.com/jlab/qiita-keycloak/tree/enable_study_creation
This test-bed confirms the necessity of having the sub-dirs within the qiita-env mechanism.

@antgonza
Copy link
Member

antgonza commented Mar 6, 2025

Got it, thank you for the clarification; but just to make sure I'm on the same page, is the issue that the BASE_DATA_DIR or the folders within BASE_DATA_DIR do not exist? If the former, maybe adding the creation of BASE_DATA_DIR without subdirectories is the way to go. What do you think?

@sjanssen2
Copy link
Contributor Author

it's the later. Subdirs are required to exist. However, I have not considered that the BASE_DATA_DIR itself might not be present. This is handled by a different mechanism in my scenario and will probably raise an error when config is loaded, so the user should be at least aware of this problem.

@antgonza
Copy link
Member

antgonza commented Mar 6, 2025

btw: Anna and I are working in dockerizing qiita and all 16S plugins here: https://github.com/jlab/qiita-keycloak/tree/enable_study_creation This test-bed confirms the necessity of having the sub-dirs within the qiita-env mechanism.

I see, not sure if this would be helpful but we actually "feed" studies during CI: https://github.com/qiita-spots/qiita/blob/master/test_data_studies/commands.sh

@sjanssen2
Copy link
Contributor Author

hm. So what if you change BASE_DATA_DIR for a sec to some other path than /home/runner/work/qiita/qiita/qiita_db/support_files/test_data/ and see "feeding" breaks at line if

qiita-env make --add-demo-user --no-load-ontologies
?

@antgonza
Copy link
Member

antgonza commented Mar 6, 2025

FWIW, I don't think BASE_DATA_DIR should change after the environment is set up. I guess you could delete and recreate the environment if you need to change BASE_DATA_DIR; no? I mean, in reality BASE_DATA_DIR is normally a large storage mount where you can have tons of data.

WHERE active = TRUE"""
qdb.sql_connection.TRN.add(sql)
created_subdirs = []
for subdir in qdb.sql_connection.TRN.execute_fetchflatten():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly, it has been a while since I have worked with the qiita mounts, concatenating the "test_data") is actually given by the subdirectory flag, I think this code illustrates that. Maybe worth relying on that method to get the correct expected folder. An idea, could be to change the query in 130 to select all the unique mount points and then use get_mountpoint to get the expected folder and create it. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A very good idea. I changed accordingly

@sjanssen2
Copy link
Contributor Author

I don't think BASE_DATA_DIR should change after the environment is set up

I totally agree. However, in a containerized context, setting up the environment must be automatized as well. Therefore, I'd like to have this functionality in the place where the environment is actually created to be self contained.

retrieve_all=True):
if not exists(join(qiita_config.base_data_dir, subdir)):
if qiita_config.test_environment and \
exists(get_support_file('test_data', subdir)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of curiosity, why do we need this check? In which scenario that folder will not exist?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. But it actually happened to me, that I have accidentally moved these support files instead of coping and then they were missing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see; on that scenario, it should fail, no? and if I understand this correctly, it will not fail here and not copy anything, correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exactly

Copy link
Member

@antgonza antgonza Mar 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I'm a little concerned about that behavior but if you think is the correct one, then fine. My concern is that I think that if there is nothing to copy it should error with the copytree error of file not found. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right: better fail with a traceable error message than to silently do workarounds which might be hard to debug / understand for the user

@antgonza antgonza merged commit 0a29ac2 into qiita-spots:master Mar 9, 2025
4 checks passed
@antgonza
Copy link
Member

antgonza commented Mar 9, 2025

Thank you @sjanssen2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants