Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[OPIK-797] Implement get log API #1079

Closed
wants to merge 28 commits into from

Conversation

thiagohora
Copy link
Contributor

@thiagohora thiagohora commented Jan 20, 2025

Details

  • Implement get log endpoint

This PR also unveiled two issues:

  1. The LLM providers' setup was based on components, not modules, which made it very hard to replace beans to mock them and avoid calling the actual LLM provider server.
  2. The new Redis stream consumer needs to follow the application lifecycle. For some reason, the consumer is not automatically deleted once the application's context is shut down, which caused side effects between tests. The solution was to make the consumer implement the Managed interface, which added the bean to the application lifecycle and allowed us to close the stream during the application shutdown.

Issues

OPIK-797

@thiagohora thiagohora requested a review from a team as a code owner January 20, 2025 11:12
Base automatically changed from thiagohora/OPIK-796_implement_clickhouse_user_logs to main January 20, 2025 16:34
@thiagohora thiagohora force-pushed the thiagohora/OPIK-796_implement_get_log_api branch from 0bd37c9 to 6ba985b Compare January 20, 2025 17:04
@thiagohora thiagohora self-assigned this Jan 20, 2025
Copy link
Contributor

@ldaugusto ldaugusto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good, I'd like some extra logging for easier debugging, but it's a great work and easy to extend no other log types.

evaluatorId, projectId, workspaceId);
var criteria = LogCriteria.builder().workspaceId(workspaceId).entityId(evaluatorId).size(size).build();
LogPage logs = service.getLogs(criteria).block();
log.info("Found {} logs for automated evaluator: id '{}' on project_id '{}'and workspace_id '{}'", logs.size(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trivial: missing whitespace

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, where?

public static synchronized void init(@NonNull UserLogTableFactory userLogTableFactory, int batchSize,
@NonNull Duration flushIntervalDuration) {
public static synchronized ClickHouseAppender init(@NonNull UserLogTableFactory userLogTableFactory, int batchSize,
@NonNull Duration flushIntervalDuration, @NonNull LoggerContext context) {

if (instance == null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe an else log to know its returning the existent instance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This "if" only init the instance; the final line always returns the instance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean it means init(...) was called multiple times. Maybe it can be useful to know it was called again and its not creating an instance but reusing the existent one - unless its happening way to often. Maybe the naming sugests me a 2nd execution would be unexpected. If it was called 'getOrCreate' I would expect multiple calls naturaly.

private BlockingQueue<ILoggingEvent> logQueue;
private ScheduledExecutorService scheduler;
private final BlockingQueue<ILoggingEvent> logQueue = new LinkedBlockingQueue<>();
private ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use final too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scheduler cannot be final due to tests, where a JVM instance the application multiple times. When this happened the scheduler is shutdown after each test, so we need a new one

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make it volatile then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used atomic reference, as sonar complains about using volatile in this case.

Copy link
Collaborator

@andrescrz andrescrz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please reuse the existing DAO.

import static com.comet.opik.api.LogItem.LogPage;

@ImplementedBy(AutomationRuleEvaluatorLogsDAOImpl.class)
public interface AutomationRuleEvaluatorLogsDAO {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please re-use the existing DAO: AutomationRuleEvaluatorLogDAO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

private BlockingQueue<ILoggingEvent> logQueue;
private ScheduledExecutorService scheduler;
private final BlockingQueue<ILoggingEvent> logQueue = new LinkedBlockingQueue<>();
private ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make it volatile then.

@thiagohora thiagohora force-pushed the thiagohora/OPIK-796_implement_get_log_api branch from cfffdc9 to 57f232e Compare January 21, 2025 20:13
@andrescrz
Copy link
Collaborator

Let's break this PR in two parts:

  • The fixes for the tests.
  • The get log API implementation.

This is hard to review and it will hard to identify an issue during deployment and to rollback it.

@thiagohora
Copy link
Contributor Author

Closing to favor the one which splits scope

@thiagohora thiagohora closed this Jan 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants