Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial commit to fix log output from AuthzWorker.cpp #1279

Draft
wants to merge 2 commits into
base: devel
Choose a base branch
from

Conversation

JoshuaSBrown
Copy link
Collaborator

@JoshuaSBrown JoshuaSBrown commented Feb 1, 2025

PR Description

Fixes bug where ofstream was going out of scope and no log output was going to file from cpp AuthzWorker. With this fix the file is being created and we are seeing the following result.

2025-02-01T05:03:57.621858Z WARN /datafed/source/repository/gridftp/globus5/authz/source/AuthzWorker.cpp:processResponse:496 { "thread_name": "authz_check-authz_worker", "message": "AuthWorker.cpp Core service did not respond within timeout." }

Tasks

  • - A description of the PR has been provided, and a diagram included if it is a new feature.
  • - Formatter has been run
  • - CHANGELOG comment has been added
  • - Labels have been assigned to the pr
  • - A reviwer has been added
  • - A user has been assigned to work on the pr
  • - If new feature a unit test has been added

Summary by Sourcery

Bug Fixes:

  • Resolve missing log output from the AuthzWorker by managing the lifetime of the output stream correctly.

Copy link

sourcery-ai bot commented Feb 1, 2025

Reviewer's Guide by Sourcery

This pull request fixes a bug where log output was not being written to a file due to the ofstream going out of scope. The fix ensures that the log file is created and that log messages are written to it. Additionally, the fix ensures that cerr is only added once to the logger.

Sequence diagram for the fixed logging process in AuthzWorker

sequenceDiagram
    participant C as Client
    participant AW as AuthzWorker
    participant L as Logger
    participant F as LogFile

    C->>AW: checkAuthorization()
    activate AW
    AW->>L: addStream(cerr) if not added
    Note over AW,L: Only adds cerr stream once
    AW->>F: open log file
    AW->>L: addStream(log_file)
    Note over AW,L: Stream gets unique ID
    AW->>L: log debug message
    AW->>AW: perform authorization check
    AW->>L: log result/errors
    AW->>F: close log file
    AW->>L: removeStream(stream_id)
    AW-->>C: return result
    deactivate AW
Loading

Class diagram showing Logger modifications

classDiagram
    class Logger {
        -list~StreamEntry~ m_streams
        -mutex m_streams_mutex
        -static uint32_t m_stream_id
        -LogLevel m_log_level
        -bool m_output_to_syslog
        +setLevel(LogLevel)
        +uint32_t addStream(ostream&)
        +removeStream(uint32_t)
        +setSysLog(bool)
        -output(LogLevel, string, string, int, LogContext, string)
    }

    class StreamEntry {
        +reference_wrapper~ostream~ stream
        +uint32_t id
        +unique_ptr~mutex~ mutex
        +StreamEntry(ostream&, uint32_t)
    }

    Logger *-- StreamEntry
    note for Logger "Changed from vector to list for better add/remove performance"
    note for StreamEntry "New class to track streams with unique IDs"
Loading

File-Level Changes

Change Details Files
The ofstream was going out of scope, preventing log output to a file. This has been fixed by ensuring the ofstream remains in scope for the duration of the log output.
  • Added a static boolean to track if the log stream has been added.
  • Added a static boolean to track if the cerr stream has been added.
  • The ofstream is now created within a try block and is closed and removed from the logger in a finally block.
  • Added a check to ensure cerr is only added once to the logger.
repository/gridftp/globus5/authz/source/AuthzWorker.cpp
The logger was modified to allow for the removal of streams.
  • The logger now stores streams in a list instead of a vector.
  • The logger now uses a mutex to protect the list of streams.
  • The logger now has a method to remove a stream by id.
  • The logger now returns a stream id when a stream is added.
common/source/DynaLog.cpp
common/include/common/DynaLog.hpp

Possibly linked issues

  • #0: The PR fixes the issue by changing how the ofstream is managed, ensuring it persists and is removed from the logger when done.

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!
  • Generate a plan of action for an issue: Comment @sourcery-ai plan on
    an issue to generate a plan of action for it.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@JoshuaSBrown JoshuaSBrown self-assigned this Feb 1, 2025
@JoshuaSBrown JoshuaSBrown added the Type: Bug Something isn't working label Feb 1, 2025
Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @JoshuaSBrown - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider replacing the static boolean flags (log_stream_added, cerr_stream_added) with a more robust solution like RAII or a proper singleton pattern for managing logger state. Static flags can be problematic for maintenance and in multi-threaded scenarios.
Here's what I looked at during the review
  • 🟢 General issues: all looks good
  • 🟡 Security: 1 issue found
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines 55 to +58
namespace SDMS {

static bool log_stream_added = false;
static bool cerr_stream_added = false;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 suggestion (security): Consider using std::atomic for the static flags to prevent race conditions in multi-threaded environments

These flags are accessed from multiple threads and could lead to race conditions. std::atomic would provide the necessary thread safety.

Suggested change
namespace SDMS {
static bool log_stream_added = false;
static bool cerr_stream_added = false;
#include <atomic>
namespace SDMS {
static std::atomic<bool> log_stream_added{false};
static std::atomic<bool> cerr_stream_added{false};

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good suggestion.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your feedback, we will generate more comments like this in the future.

@@ -638,40 +641,77 @@ const char *getReleaseVersion() {

// The same
int checkAuthorization(char *client_id, char *object, char *action,
struct Config *config) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (complexity): Consider using RAII with a LoggerGuard class to simplify log stream management and eliminate global state.

The current implementation adds unnecessary complexity with global state and manual stream management. This can be simplified using RAII:

class LoggerGuard {
    std::ofstream log_file;
    uint32_t stream_id;
    bool stream_added;
public:
    LoggerGuard(const std::string& path) : stream_added(false) {
        if (!path.empty()) {
            log_file.open(path + "_authz", std::ios::app);
            if (log_file.is_open()) {
                stream_id = SDMS::global_logger.addStream(log_file);
                stream_added = true;
            }
        }
    }
    ~LoggerGuard() {
        if (stream_added) {
            SDMS::global_logger.removeStream(stream_id);
        }
    }
};

int checkAuthorization(char *client_id, char *object, char *action, struct Config *config) {
    static std::once_flag cerr_flag;
    std::call_once(cerr_flag, []() {
        SDMS::global_logger.addStream(std::cerr);
    });

    // Logger setup
    SDMS::global_logger.setSysLog(!defined(DONT_USE_SYSLOG));
    SDMS::global_logger.setLevel(SDMS::LogLevel::INFO);

    LoggerGuard log_guard(config->log_path);

    // Rest of the authorization logic...
}

This approach:

  1. Eliminates global state variables
  2. Automatically handles stream cleanup
  3. Uses std::call_once for thread-safe cerr initialization
  4. Reduces nesting and improves readability

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also a good suggestion.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So on looking at this closer there remain some problems. for 1 if multiple threads are writing to the same file will it cause problems if I close the file when the checkAuthorization command goes out of scope, but other threads may be attempting to write to the file. It looks like the member log_file is specific to the thread instance, but I don't know if closing a file stream changes the actual behavior of the file as seen from separate threads.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right to be concerned about multiple threads writing to the same file. Closing the log_file stream in the destructor of LoggerGuard while other threads might still be writing to it could lead to issues, particularly if those threads attempt to write to the file after the stream has been closed. This could result in lost data or unexpected behavior.

Here are a few ways to address this:

  1. Thread-specific log files: The simplest solution is to have each thread write to its own log file. You can modify the LoggerGuard constructor to incorporate thread-specific information into the filename (e.g., using std::this_thread::get_id()). This eliminates the contention issue entirely.

  2. Shared mutex for file access: If you need to write to a single shared log file, you'll need to synchronize access to the file stream using a std::mutex. You can encapsulate the std::ofstream and the std::mutex within the LoggerGuard class. The LoggerGuard constructor would acquire the mutex, and the destructor would release it after closing the stream. This ensures that only one thread can access the file stream at a time. However, this introduces a performance bottleneck as threads will have to wait for the mutex.

  3. Asynchronous logging with a dedicated logging thread: For optimal performance, consider using a dedicated logging thread. The checkAuthorization function would push log messages onto a thread-safe queue. The logging thread would then consume messages from the queue and write them to the log file. This approach decouples logging from the main thread and avoids blocking. This is more complex to implement but offers the best performance.

  4. Use a logging library: Consider using a dedicated logging library like spdlog, log4cxx, or Boost.Log. These libraries typically handle thread safety, file rotation, and other logging-related concerns efficiently. This is often the most robust and easiest long-term solution.

Given the potential complexity of multi-threaded logging, I strongly recommend option 4, using a well-established logging library. It will save you time and effort in the long run and provide a more robust solution. If that's not feasible, option 1 (thread-specific log files) is the simplest to implement and avoids the performance overhead of synchronization. Options 2 and 3 offer increasing performance benefits but come with increased implementation complexity. Choose the option that best balances your performance needs and development resources.

@JoshuaSBrown JoshuaSBrown marked this pull request as draft February 3, 2025 12:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Authz, Logs] - Bug logoutput from AuthzWorker is not being shown
1 participant