Refactor code for optimization and readability. #47
Unused imports: numpy is imported but never used anywhere in the code, so the import can be safely removed.
Redundant function definitions: extract_sentbert and extract_originbert both just call extract. Instead of keeping two separate functions for these two cases, you can define a single function that takes the model as a parameter.
Redundant code: The dev parameter controls whether the embeddings are stored separately for each caption or concatenated and stored once per key. It is only used inside extract, not in extract_sentbert or extract_originbert, so it can be removed from both of those functions.
Progress bar: tqdm displays a progress bar for the loop that iterates over the caption data frame, but the bar is not shown when dev=False in extract. To show it in both cases, move the tqdm initialization to the beginning of extract and use it for both branches.
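
A rough sketch of what the points above could look like once combined. The PR's actual code is not shown here, so everything in it is an assumption: the `encode` callable stands in for whichever model (SentenceBERT or the original BERT) is passed in, the rows are plain `(key, caption)` pairs rather than a data frame, and treating `dev=True` as the "store each caption separately" branch is a guess at the original meaning.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

try:
    from tqdm import tqdm  # progress bar library named in the PR
except ImportError:
    # Fallback so the sketch still runs without tqdm installed.
    def tqdm(iterable, **kwargs):
        return iterable

# Hypothetical stand-in for a model: maps one caption to one embedding.
Encoder = Callable[[str], List[float]]


def extract(
    rows: Sequence[Tuple[str, str]],
    encode: Encoder,
    dev: bool = False,
) -> Dict[Hashable, List[float]]:
    """Embed every caption with the given model.

    The model is a parameter, so no per-model wrapper is needed.
    dev=True  (assumed): one entry per caption, keyed by (key, row index).
    dev=False (assumed): embeddings concatenated into one entry per key.
    """
    result: Dict[Hashable, List[float]] = {}
    # The progress bar is created once, before the branch on dev,
    # so it is shown in both modes.
    for idx, (key, caption) in enumerate(tqdm(rows, desc="extract")):
        vec = encode(caption)
        if dev:
            result[(key, idx)] = vec
        else:
            result.setdefault(key, []).extend(vec)
    return result


def toy_encoder(caption: str) -> List[float]:
    # Placeholder model: a one-dimensional "embedding" (the caption length).
    return [float(len(caption))]


captions = [("img1", "a dog"), ("img1", "a brown dog"), ("img2", "a cat")]
per_key = extract(captions, toy_encoder)
per_caption = extract(captions, toy_encoder, dev=True)
```

With this shape, switching models means passing a different `encode` callable, and `dev` lives only on `extract`.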