18 changes: 18 additions & 0 deletions icu4c/source/i18n/measunit.cpp
@@ -2729,6 +2729,24 @@ StringEnumeration* MeasureUnit::getAvailableTypes(UErrorCode &errorCode) {
return result;
}

bool MeasureUnit::validateAndGet(StringPiece type, StringPiece subtype, MeasureUnit &result) {
// Find the type index using binary search
int32_t typeIdx = binarySearch(gTypes, 0, UPRV_LENGTHOF(gTypes), type);
if (typeIdx == -1) {
return false; // Type not found
}

// Find the subtype within the type's range using binary search
int32_t subtypeIdx = binarySearch(gSubTypes, gOffsets[typeIdx], gOffsets[typeIdx + 1], subtype);
Member

Does gOffsets[] have an entry at max typeIndex + 1?

Member Author

@younies, Nov 24, 2025

Yes, gOffsets[] includes an entry at max typeIndex + 1.

Evidence:

  1. gTypes[] contains 24 elements (indices 0–23), so the maximum valid typeIndex is 23.
  2. gOffsets[] contains 25 elements (indices 0–24), as defined in measunit.cpp (lines 39–65).
  3. gSubTypes[] contains 538 elements (indices 0–537).
  4. The last element, gOffsets[24] = 538, serves as the end boundary (i.e., total size).
  5. The existing code safely accesses gOffsets[typeIndex + 1] at line 2683:
    int32_t len = gOffsets[typeIndex + 1] - gOffsets[typeIndex];
  6. An assertion at line 2753 validates this arrangement:
    U_ASSERT(gOffsets[UPRV_LENGTHOF(gOffsets) - 1] == UPRV_LENGTHOF(gSubTypes));

However, I think it would be beneficial to:
• Add a comment next to the sentinel value 538 to clarify its purpose (e.g., "end boundary").
• Consider using a helper function to retrieve the range for a specific typeIndex, instead of directly using gOffsets[typeIndex] and gOffsets[typeIndex + 1].
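
To make the second suggestion concrete, here is a minimal sketch of a range helper built on an offsets table with a trailing end-boundary entry. Everything in it is an assumption for illustration: `kTypes`, `kSubTypes`, and `kOffsets` are tiny stand-ins for gTypes, gSubTypes, and gOffsets (not the real ICU tables), and `subtypeRange`, `findSorted`, and `lookup` are hypothetical names that do not exist in measunit.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Toy stand-ins for gTypes and gSubTypes (NOT the real ICU data).
// Both are sorted so they can be binary searched.
static constexpr const char* kTypes[] = {"length", "mass", "volume"};
static constexpr const char* kSubTypes[] = {
    "foot", "meter",         // length: [0, 2)
    "gram", "kilogram",      // mass:   [2, 4)
    "cup", "liter", "pint",  // volume: [4, 7)
};
// One entry more than kTypes; the last value is the end boundary,
// i.e. the total number of subtypes.
static constexpr int32_t kOffsets[] = {0, 2, 4, 7 /* end boundary */};

template <typename T, std::size_t N>
constexpr int32_t lengthOf(const T (&)[N]) { return static_cast<int32_t>(N); }

// Compile-time checks for the two invariants the review thread relies on.
static_assert(lengthOf(kOffsets) == lengthOf(kTypes) + 1,
              "kOffsets needs one trailing end-boundary entry");
static_assert(kOffsets[lengthOf(kOffsets) - 1] == lengthOf(kSubTypes),
              "end boundary must equal the subtype count");

// Half-open range [begin, end) of subtype indices for one type.
struct SubtypeRange {
    int32_t begin;
    int32_t end;
};

// Hypothetical helper: the only place that reads kOffsets[typeIdx + 1].
SubtypeRange subtypeRange(int32_t typeIdx) {
    assert(typeIdx >= 0 && typeIdx < lengthOf(kTypes));
    return {kOffsets[typeIdx], kOffsets[typeIdx + 1]};
}

// Binary search for key in table[begin, end); returns the absolute
// index into table, or -1 if key is not present.
int32_t findSorted(const char* const* table, int32_t begin, int32_t end,
                   const char* key) {
    const char* const* first = table + begin;
    const char* const* last = table + end;
    const char* const* it = std::lower_bound(
        first, last, key,
        [](const char* a, const char* b) { return std::strcmp(a, b) < 0; });
    if (it == last || std::strcmp(*it, key) != 0) {
        return -1;
    }
    return static_cast<int32_t>(it - table);
}

// Two-stage lookup: returns the subtype index relative to the start of
// its type, or -1 if the type or subtype is unknown.
int32_t lookup(const char* type, const char* subtype) {
    int32_t typeIdx = findSorted(kTypes, 0, lengthOf(kTypes), type);
    if (typeIdx == -1) {
        return -1;
    }
    SubtypeRange range = subtypeRange(typeIdx);
    int32_t subtypeIdx = findSorted(kSubTypes, range.begin, range.end, subtype);
    if (subtypeIdx == -1) {
        return -1;
    }
    return subtypeIdx - range.begin;
}
```

With these toy tables, `lookup("mass", "kilogram")` yields 1 and `lookup("length", "liter")` yields -1, because "liter" lies outside the length range. Routing every `typeIdx + 1` access through one helper keeps the bounds reasoning in a single place, and the `static_assert`s would catch a table update that drops the end-boundary entry.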

if (subtypeIdx == -1) {
return false; // Subtype not found
}

// Create the MeasureUnit and return it
result.setTo(typeIdx, subtypeIdx - gOffsets[typeIdx]);
return true;
}

bool MeasureUnit::findBySubType(StringPiece subType, MeasureUnit* output) {
// Sanity checking kCurrencyOffset and final entry in gOffsets
U_ASSERT(uprv_strcmp(gTypes[kCurrencyOffset], "currency") == 0);
23 changes: 5 additions & 18 deletions icu4c/source/i18n/number_skeletons.cpp
@@ -1072,25 +1072,12 @@ void blueprint_helpers::parseMeasureUnitOption(const StringSegment& segment, Mac
CharString subType;
SKELETON_UCHAR_TO_CHAR(subType, stemString, firstHyphen + 1, stemString.length(), status);

// Note: the largest type as of this writing (Aug 2020) is "volume", which has 33 units.
static constexpr int32_t CAPACITY = 40;
MeasureUnit units[CAPACITY];
UErrorCode localStatus = U_ZERO_ERROR;
int32_t numUnits = MeasureUnit::getAvailable(type.data(), units, CAPACITY, localStatus);
if (U_FAILURE(localStatus)) {
// More than 30 units in this type?
status = U_INTERNAL_PROGRAM_ERROR;
MeasureUnit unit;
if (MeasureUnit::validateAndGet(type.toStringPiece(), subType.toStringPiece(), unit)) {
macros.unit = unit;
return;
}
for (int32_t i = 0; i < numUnits; i++) {
auto& unit = units[i];
if (uprv_strcmp(subType.data(), unit.getSubtype()) == 0) {
macros.unit = unit;
return;
}
}

// throw new SkeletonSyntaxException("Unknown measure unit", segment);
}

status = U_NUMBER_SKELETON_SYNTAX_ERROR;
}

16 changes: 16 additions & 0 deletions icu4c/source/i18n/unicode/measunit.h
@@ -720,6 +720,22 @@ class U_I18N_API MeasureUnit: public UObject {
*/
static StringEnumeration* getAvailableTypes(UErrorCode &errorCode);

#ifndef U_HIDE_INTERNAL_API
/**
* Validates that a specific type and subtype combination exists and retrieves the unit.
*
* <p> Note: This is more efficient than calling getAvailable() when you only need
* to validate and retrieve a single unit.
*
* @param type the unit type (e.g., "length", "mass", "volume")
* @param subtype the unit subtype (e.g., "meter", "kilogram", "liter")
* @param result if the unit is valid, this will be set to the MeasureUnit
* @return true if the type/subtype combination is valid, false otherwise
* @internal
*/
static bool validateAndGet(StringPiece type, StringPiece subtype, MeasureUnit &result);
#endif /* U_HIDE_INTERNAL_API */

/**
* Return the class ID for this class. This is useful only for comparing to
* a return value from getDynamicClassID(). For example: