Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"malloc(): corrupted top size" during json serialization #1587

Closed
anders-wind opened this issue Feb 3, 2025 · 6 comments
Closed

"malloc(): corrupted top size" during json serialization #1587

anders-wind opened this issue Feb 3, 2025 · 6 comments

Comments

@anders-wind
Copy link

anders-wind commented Feb 3, 2025

Hi

After upgrading vcpkg to the new 2025.01.13 release which bumps glaze from 4.0.3 -> 4.2.3, we are getting segfaults and malloc(): corrupted top size for some of our glaze serialization. If I override the version back down to 4.0.3 the error disappears.

The error happens during a resize call to the buffer, however I think the true cause must be due to a write/allocation happening before this.

Building with sanitize produces the following error

$96│ │ std::cerr:
$96│ │ │ =================================================================
$96│ │ │ ==4104148==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x5160000008cd at pc 0x6242e13c39f2 bp 0x7ffe17bc3e00 sp 0x7ffe17bc35c0
$96│ │ │ WRITE of size 17 at 0x5160000008cd thread T0
$96│ │ │ 
$96│ │ std::cerr:
$96│ │ │     #0 0x6242e13c39f1 in __asan_memcpy (/home/anderswind/projects/project/build/test/project_test+0x8da9f1) (BuildId: 988130f535d03c397fd30c03d25871c8d7d27416)
$96│ │ │     #1 0x6242e22ee28b in memcpy /usr/include/x86_64-linux-gnu/bits/string_fortified.h:29:10
$96│ │ │     #2 0x6242e22ee28b in auto void glz::detail::to<10u, twig::project::intraday::public_api::Order>::op<glz::opts{10u, (char)1, (char)0, (char)1, (char)0, (char)1, (char)0, (char)0, (char)32, (unsigned char)3, (char)1, (char)0, (char)1, (char)0, (char)0, (char)0, (char)0, (unsigned char)0, (glz::float_precision)0, (char)0, (char)0, (char)0, (char)0, (char)0, (char)0, (char)1, (char)0, (char)1, (char)1}, twig::project::intraday::public_api::Order const&, twig::project::AppendStringBuffer<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>&, glz::context&, unsigned long&>(twig::project::intraday::public_api::Order const&, glz::context&, twig::project::AppendStringBuffer<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>&, unsigned long&)::'lambda'<unsigned long $N0>()::operator()<1ul>() const /home/anderswind/projects/project/build/vcpkg_installed/sanitize-x64-linux/include/glaze/json/write.hpp:1784:22
... more here

So it seems to point to these lines in write.hpp

static constexpr auto key = glz::get<I>(reflect<T>::keys); // GCC 14 requires auto here
static constexpr auto quoted_key = quoted_key_v<key, Opts.prettify>;
static constexpr auto n = quoted_key.size();
std::memcpy(&b[ix], quoted_key.data(), n);

We use a custom buffer implementation AppendBuffer<std::string> because we need to append to our string it looks like this:

#pragma once

#include <cstddef>
#include <cstdint>

namespace twig::project
{

// glaze writes to the beginning of a buffer, but we want it to append to it
// this class should not be stored since it has a reference field
template<typename InnerBufferT>
struct AppendStringBuffer
{
    InnerBufferT& buffer;  // NOLINT(cppcoreguidelines-avoid-const-or-ref-data-members)
    size_t offset {0};

    constexpr auto operator[](size_t index) -> char&
    {
        return this->buffer[this->offset + index];
    }

    constexpr auto begin()
    {
        return this->buffer.begin() + static_cast<int64_t>(this->offset);
    }
    constexpr auto end()
    {
        return this->buffer.end();
    }

    constexpr auto resize(size_t val) -> void
    {
        this->buffer.resize(this->offset + val);
    }

    constexpr auto size() const -> size_t
    {
        return this->buffer.size() - this->offset;
    }

    [[nodiscard]]
    constexpr auto empty() const -> bool
    {
        return this->size() == 0;
    }
};
}  // namespace twig::project

Perhaps you have an idea of what we are doing wrong? Otherwise let me know if I can provide more information

@anders-wind anders-wind changed the title malloc(): corrupted top size "malloc(): corrupted top size" during json serialization Feb 3, 2025
@stephenberry
Copy link
Owner

Thanks very much for reporting this. The code surrounding this quoted key dumping has been improved and simplified since 4.2.3, so I'd be interested to see if you still get this error if you pulled the latest version (or main). Is it possible for you to check if the error exists on the latest release of Glaze? Otherwise, I can explore the older code more.

It looks like your AppendStringBuffer struct should work without issue.

@stephenberry stephenberry added the bug Something isn't working label Feb 3, 2025
@anders-wind
Copy link
Author

anders-wind commented Feb 3, 2025

Thanks for the quick reply!

I just tried the v4.3.1 version which is available on vcpkg head, my tests succeeds (for some reason), but I still get a std::cerr print with "malloc(): corrupted top size". and the address sanitizer output becomes

$99│ [ RUN      ] `LocalViewRow: to and then from json` @ ./test/source/intraday/public_api/message_tests.cpp:348
$99│ │ std::cerr:
$99│ │ │ =================================================================
$99│ │ │ ==65002==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x5160000005cd at pc 0x5ec1ef7e69f2 bp 0x7ffeddf8d800 sp 0x7ffeddf8cfc0
$99│ │ │ WRITE of size 18 at 0x5160000005cd thread T0
$99│ │ │ 
$99│ │ std::cerr:
$99│ │ │     #0 0x5ec1ef7e69f1 in __asan_memcpy (/home/anderswind/projects/project/build/test/project_test+0x8c99f1) (BuildId: a21fc66ab4455f8ef0ff064d9982eff5063b5ed8)
$99│ │ │     #1 0x5ec1f06effae in memcpy /usr/include/x86_64-linux-gnu/bits/string_fortified.h:29:10
$99│ │ │     #2 0x5ec1f06effae in auto void glz::detail::to<10u, twig::project::intraday::public_api::Order>::op<glz::opts{10u, (char)1, (char)0, (char)1, (char)0, (char)1, (char)0, (char)0, (char)32, (unsigned char)3, (char)1, (char)0, (char)0, (char)1, (char)0, (char)0, (char)0, (char)0, (unsigned char)0, (glz::float_precision)0, (char)0, (char)0, (char)0, (char)0, (char)0, (char)0, (char)1, (char)0, (char)1, (char)1}, twig::project::intraday::public_api::Order const&, twig::project::AppendStringBuffer<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>&, glz::context&, unsigned long&>(twig::project::intraday::public_api::Order const&, glz::context&, twig::project::AppendStringBuffer<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>&, unsigned long&)::'lambda'<unsigned long $N0>()::operator()<1ul>() const /home/anderswind/projects/project/build/vcpkg_installed/sanitize-x64-linux/include/glaze/json/write.hpp:1858:28
$99│ │ │     #3 0x5ec1f06ef80a in void glz::detail::invoke_table<9ul, void glz::detail::to<10u, twig::project::intraday::public_api::Order>::op<glz::opts{10u, (char)1, (char)0, (char)1, (char)0, (char)1, (char)0, (char)0, (char)32, (unsigned char)3, (char)1, (char)0, (char)0, (char)1, (char)0, (char)0, (char)0, (char)0, (unsigned char)0, (glz::float_precision)0, (char)0, (char)0, (char)0, (char)0, (char)0, (char)0, (char)1, (char)0, (char)1, (char)1}, twig::project::intraday::public_api::Order const&, twig::project::AppendStringBuffer<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>&, glz::context&, unsigned long&>(twig::project::intraday::public_api::Order const&, glz::context&, twig::project::AppendStringBuffer<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>&, unsigned long&)::'lambda'<unsigned long $N0>()>(twig::project::intraday::public_api::Order const&) /home/anderswind/projects/project/build/vcpkg_installed/sanitize-x64-linux/include/glaze/util/for_each.hpp:222:7

So the error happens here

                           static constexpr auto quoted_key = join_v<chars<",">, quoted_key_v<key>>;
                           static constexpr auto n = quoted_key.size();
                           std::memcpy(&b[ix], quoted_key.data(), n);
                           ix += n;

@stephenberry
Copy link
Owner

Thanks for testing v4.3.1. I have guess at the problem. The maybe_pad calls make sure there is sufficient padding, but they only actually add padding if the vector_like concept is true. This is so that raw c-style buffers can be used. However, your buffer is resizable, but it doesn't look like it conforms to glz::detail::vector_like.

template <class T>
concept vector_like =
  resizable<std::remove_cvref_t<T>> && accessible<std::remove_cvref_t<T>> && has_data<std::remove_cvref_t<T>>;

Notably, your AppendStringBuffer doesn't have a .data() method. If you added this method I think it would solve the problem and padding would properly be added.

@anders-wind
Copy link
Author

Thank you so much! I also needed using reference = char&; but otherwise it seems to have fixed it!

@stephenberry
Copy link
Owner

Yay! I'll have to think more about how to avoid these kinds of errors in the future. Supporting raw buffers makes it difficult and I think it might be better to instead only support an explicit type like std::string_view for non-allocating buffers, so that we let users know if their buffers are non-conforming.

@stephenberry
Copy link
Owner

I'm closing this because we found a fix, but I opened #1589 to provide better feedback and avoid these issues in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants