[Part 2] [ORC-timezone]: Extract recurring DST rule from JVM timezone APIs#4635
Conversation
Greptile SummaryThis PR adds the Java-side DST rule extractor (
Confidence Score: 5/5Safe to merge — all changes are dead code and production behavior is unchanged. All new code sits behind the unmodified DST gate in convertOrcTimezones and cannot affect runtime behavior. The extractor logic is thoroughly verified by cross-year probing, and the test suite covers every exception branch. No files require special attention for this merge. Important Files Changed
Reviews (13): Last reviewed commit: "Fix synthetic ZoneRules fixtures failing..." | Re-trigger Greptile |
Adds a static DST extraction path to OrcTimezoneInfo that produces a SimpleTimeZone-shaped DstRule (month / day / dayOfWeek / time / mode, plus DST savings) for any IANA zone with active DST. The extraction tries two paths and verifies the result against tz.getOffset() across the anchor years 2060, 2400, 9997: - Path A: derive from ZoneRules.getTransitionRules() (recurring rule list returned for modern IANA zones) - Path B: probe tz.getOffset() hourly across a reference year, then decode the two transitions into the SimpleTimeZone shape The probing path runs first because it captures what java.util.TimeZone.getOffset returns, which is the source of truth the GPU side must match for ORC byte-compatibility. ZoneRules is the fallback for zones whose recurring rule cannot be recovered from probes alone. Verified against tzdata on JDK 17: - America/New_York: 2nd Sun of March -> 1st Sun of November - Europe/London: last Sun of March -> last Sun of October - Australia/Sydney: 1st Sun of October -> 1st Sun of April - Asia/Shanghai, UTC, +05:30: null (no DST) The DST rule is NOT wired into buildRuntimeOrcTimezoneInfo or the GPU dispatch path yet -- the DST gate in GpuTimeZoneDB.convertOrcTimezones still throws UnsupportedOperationException for DST zones, so production behavior is unchanged. extractDstRule is package-private and exercised only by OrcTimezoneInfoTest. The follow-up PR will hand DstRule to the new CUDA kernel. Signed-off-by: Chong Gao <res_life@163.com>
- Document IllegalStateException on extractDstRule with the specific failure modes (unsupported rule count, non-DOW_GE_DOM shape, mismatched savings, zero-delta, verification mismatch). - Guard extractDstRule with rules.isFixedOffset() so the contract is robust against the silent +05:30 -> GMT trap in TimeZone.getTimeZone; mirrors the production guard already used in buildRuntimeOrcTimezoneInfo. - Clamp computeRuleDay mode 2 (DOW_GE_DOM) anchor and result against monthLength. Defends against ZoneOffsetTransitionRule indicators that refer to a day past month end (e.g. Feb 29 in non-leap years per the JDK contract), so an unexpected rule shape returns a clean in-month day instead of leaking DateTimeException out of verifyDstRule. Mirrors SimpleTimeZone's documented within-month clamp for DOW_GE_DOM. No behaviour change for current IANA zones. - Lock startTime / endTime / startTimeMode / endTimeMode for America/New_York in testExtractDstRuleNorthernHemisphere so a regression that flips timeMode (e.g. STANDARD -> WALL) is caught. - Mirror the production fixed-offset guard in the test helper extractDstRuleFor and document the +05:30 -> GMT trap. Signed-off-by: Chong Gao <res_life@163.com>
Replace the magic numbers in DstRule mode/timeMode assignments and the consuming switch/if-else chains with named static-final ints on DstRule itself: MODE_DOM, MODE_DOW_IN_MONTH, MODE_DOW_GE_DOM, MODE_DOW_LE_DOM TIME_MODE_WALL, TIME_MODE_STANDARD, TIME_MODE_UTC Mirrors SimpleTimeZone's internal encoding, so the GPU side consuming these values directly in a follow-up part can re-use the same numeric meaning. Pure rename / constant substitution -- no behaviour change. Signed-off-by: Chong Gao <res_life@163.com>
extractDstRuleByProbing(tz, refYear) walked the entire reference year at hourly step (~8760 tz.getOffset calls per try) even after both DST transitions had been recorded. Capture the year-start standard offset, and once both transitions are observed and the running offset returns to it, exit the loop. The remainder of the year is a no-op tail for any SimpleTimeZone-style two-transition zone -- a later offset change would have correctly triggered the existing "more than one DST-on/off" early-return. Cuts ~700 hourly probes per probing attempt for typical northern-hemisphere zones (DST ends in October/November); larger savings for zones whose DST ends earlier. Signed-off-by: Chong Gao <res_life@163.com>
computeDstOffset used utcMs + rawOffsetMs to pick the year, which returns the standard-time local epoch day. For a southern-hemisphere zone with DST active in late Dec / early Jan (e.g. Australia/Sydney DST starts in October and ends in April), an instant whose actual wall-clock year is Y can land in year Y-1 when only rawOffsetMs is added. computeTransitionUtcMillis would then resolve DST start/end against Y-1's transitions, and the cross-year branch below would compare against the wrong window. Adding dstSavings to the local guess lands utcMs inside DST when DST is active, which is the regime where the year-boundary mis- classification matters. Non-DST instants stay in the same year too, since DST savings are hours not days. Existing tests (testExtractDstRuleSouthernHemisphere) continue to pass; verifyDstRule's anchor-year probes were already exercising Dec/Jan boundary cases that now resolve correctly. Signed-off-by: Chong Gao <res_life@163.com>
OrcTimezoneInfo had grown to ~700 lines and mixed two unrelated responsibilities: building the historical-transition table consumed by ORC rebasing, and recovering a recurring DST rule for the GPU DST path. The DST code shared no read/write state with the historical-transition code, only a couple of small helpers. Lift DstRule, extractDstRule, and all DST helpers (extractDstRuleByProbing, extractDstRuleFromZoneRules, fillDstRuleFromTransitionRule, getTransitionRuleTime*, toCalendarDayOfWeek, decodeTransition, verifyDstRule*, computeDstOffset, computeTransitionUtcMillis, computeRuleDay) plus the DST_RULE_VALIDATION_YEARS constant into a sibling package-private OrcDstRuleExtractor class. Shared helpers utcMillisForDate and binarySearchTransition stay in OrcTimezoneInfo but are promoted from private to package-private so OrcDstRuleExtractor can reuse them. OrcTimezoneInfoTest is updated to reference OrcDstRuleExtractor instead of OrcTimezoneInfo for DstRule and extractDstRule. The runtime behaviour for all tested zones is unchanged (America/New_York, Europe/London, Australia/Sydney extract the same rule; Asia/Shanghai, UTC, +05:30 still return null). Signed-off-by: Chong Gao <res_life@163.com>
e3a0b10 to
90c8863
Compare
computeRuleDay's MODE_DOW_GE_DOM branch already clamps both the LocalDate.of anchor and the returned day into [1, monthLength]. The MODE_DOW_IN_MONTH and MODE_DOW_LE_DOM branches did not, even though they consume the same untrusted day-of-month indicator from ZoneOffsetTransitionRule: - MODE_DOW_IN_MONTH positive: "1 + diff + (ruleDay - 1) * 7" can reach 29-35 for a 5th occurrence in a 28-day February. - MODE_DOW_IN_MONTH negative: "monthLength - diff + (ruleDay + 1) * 7" drops below 1 for sufficiently negative ruleDay. - MODE_DOW_LE_DOM: LocalDate.of(... ruleDay) throws when ruleDay exceeds monthLength, and "ruleDay - diff" drops below 1 when ruleDay < diff. The bad values would then escape as DateTimeException from utcMillisForDate. No real IANA recurring rule triggers these paths today (both extraction paths emit only DOW_GE_DOM), but the constants are package-visible and Part 3+ may exercise them. Mirror the DOW_GE_DOM treatment in the two sibling branches: clamp anchor input and clamp the returned day so the result is always a valid day in the requested month. Signed-off-by: Chong Gao <res_life@163.com>
The extractDstRuleFor helper's Javadoc still pointed
{@link OrcTimezoneInfo#extractDstRule} after extractDstRule moved
to OrcDstRuleExtractor. Repoint the link to the actual method
and include the parameter signature so the reference resolves
unambiguously.
Signed-off-by: Chong Gao <res_life@163.com>
Per spark-rapids-jni Doxygen convention, package-private API methods that already document @throws should also document @param for each parameter and @return for the return value. extractDstRule had only the @throws clause -- add the missing tags so the contract is complete at the API boundary. Signed-off-by: Chong Gao <res_life@163.com>
testExtractDstRuleEuropeLondon and testExtractDstRuleSouthernHemisphere asserted only a subset of the rule fields (dstSavings, months, day, dayOfWeek). Mode, timeMode, and time-of-day were not pinned, so a regression that, for example, encoded a DOW_GE_DOM rule as DOW_IN_MONTH or flipped STANDARD to WALL would pass these tests silently while producing wrong UTC transitions in the GPU dispatch. Add the missing assertions for startMode/endMode (DOW_GE_DOM = 2), startTimeMode/endTimeMode (STANDARD = 1), and startTime/endTime (London: 01:00; Sydney: 02:00). Values verified empirically against the probing-path output on JDK 17 Zulu. Signed-off-by: Chong Gao <res_life@163.com>
extractDstRule documents five distinct IllegalStateException
conditions but no test exercised any of them; only the null-return
paths (non-DST, fixed-offset) were covered. A regression that turned
one of the documented throws into a null return would have passed
silently.
Add a test that hand-crafts a TimeZone whose getOffset is constant
(so the probing path observes no transitions and falls through) and
a ZoneRules with exactly one transition rule. Production code
rejects any rule count outside {0, 2}, so extractDstRuleFromZoneRules
throws the "Unsupported ORC DST rule count" IllegalStateException.
Verified empirically on JDK 17 Zulu: the test triggers the expected
branch with the expected message.
Signed-off-by: Chong Gao <res_life@163.com>
extractDstRule's terminal IllegalStateException at line 128
("Failed to extract ORC DST rule") -- the survivor path when
useDaylightTime()==true but both probing and ZoneRules extraction
return null -- had no test. The neighboring "Unsupported ORC DST
rule count" branch is covered; this commit adds its companion.
Synthesize a TimeZone whose getOffset is constant (probing
observes no transitions) and a ZoneRules whose lastRules list is
empty but whose transitionList contains one historical entry (so
isFixedOffset() returns false and extractDstRule does not
short-circuit at the early guard). Path B returns null because of
the constant offset; Path A returns null because the recurring
rules list is empty. extractDstRule falls through to the terminal
throw. Verified empirically.
Signed-off-by: Chong Gao <res_life@163.com>
verifyDstRule's inner sample loop called computeDstOffset on every probe (~62 calls per refYear × 3 anchor refYears × up to 3 probing attempts = ~558 calls per cold zone). Each call re-derived the wall-clock year from the timestamp and re-ran two computeTransitionUtcMillis calls -- both of which allocate a LocalDate and query lengthOfMonth / getDayOfWeek. The surrounding loop already computed the year's dstStart and dstEnd, so the re-derivation was redundant. Split computeDstOffset into a thin entry point (still derives the year for callers that don't know it) and a bounds-aware variant computeDstOffsetWithBounds that takes dstStart and dstEnd as parameters. verifyDstRule now calls the bounds-aware variant in the hot loop, eliminating ~558 LocalDate allocations per cold zone on the verification path. Public API surface and observable behaviour unchanged; all current zone tests still extract the same DstRule. Signed-off-by: Chong Gao <res_life@163.com>
extractDstRuleFromZoneRules throws IllegalStateException on five distinct shape violations. Only the "Unsupported rule count" and final "Failed to extract" branches were covered. Add tests for the four remaining branches: - "Unsupported zero-delta ORC DST rule" -- second rule whose offsetBefore equals offsetAfter - "Failed to identify ORC DST start/end rules" -- two rules with same-sign delta (both positive) leaving endTransitionRule null - "Mismatched ORC DST savings" -- start rule +1h, end rule -2h - "Unsupported ORC DST transition rule shape" -- first rule with null dayOfWeek (DOM-style fixed-day, rejected by Path A) Each test uses the same synthetic constant-offset TimeZone pattern (extracted into a small helper to avoid five copies of the anonymous-class boilerplate) and a hand-crafted two-rule ZoneRules that targets the specific branch. All four verified empirically to trigger the expected message. Signed-off-by: Chong Gao <res_life@163.com>
The general-purpose computeDstOffset(utcMs, rawOffsetMs, rule) introduced when hoisting bounds out of verifyDstRule had zero callers. Its body was already covered by computeDstOffsetWithBounds plus a five-line year-derivation prologue; keeping the wrapper added 20 lines of dead code (which the project convention treats the same as commented-out code). Drop the wrapper. The bounds-aware variant remains the sole DST classification helper. A general-purpose entry point can be reintroduced in a follow-up part when an actual non-verification caller appears. Signed-off-by: Chong Gao <res_life@163.com>
- Import LocalDateTime and Arrays directly instead of using fully qualified names (java.time.LocalDateTime.of, java.util.Arrays.asList). Consistent with the rest of the file's java.time / java.util imports. - Add testExtractDstRuleThrowsOnNegativeDayIndicator covering the getDayOfMonthIndicator() <= 0 sub-case of fillDstRuleFromTransitionRule's shape guard. The existing testExtractDstRuleThrowsOnUnsupportedRuleShape covers the null-dayOfWeek sub-case; this one covers the negative-indicator sub-case (a DOW_LE_DOM rule shape), so a future split of the || guard cannot silently drop coverage on either half. Signed-off-by: Chong Gao <res_life@163.com>
extractDstRuleFromZoneRules's success path was untested -- all real-zone tests (America/New_York, Europe/London, Australia/Sydney) satisfy the probing path before Path A is ever reached. Consequently, fillDstRuleFromTransitionRule and the TIME_MODE_UTC / TIME_MODE_WALL branches of getTransitionRuleTimeMode were dead code from the test suite's perspective. testExtractDstRuleViaZoneRulesFallback forces Path B to fail by constructing a TimeZone that observes a real +2h DST window but reports getDSTSavings() == +1h. Probing extracts a rule with dstSavings=+1h, whose verifyDstRule prediction disagrees with the observed +2h offset -- verify fails, probing returns null, and extractDstRuleFromZoneRules runs with the synthetic ZoneRules. That path re-derives dstSavings from the rule's offset deltas (+2h, correct), verify passes, and the returned DstRule is asserted to carry dstSavings=+2h (the +1h would have indicated a Path B win). The end rule uses TimeDefinition.UTC, exercising getTransitionRuleTimeMode's TIME_MODE_UTC branch. The extracted DstRule's endTimeMode is asserted to be 2 (UTC). Signed-off-by: Chong Gao <res_life@163.com>
A multi-parameter helper that carries a Javadoc block should spell its parameters out rather than fitting the contract into a single 118-char line without tags. Expand the comment to use the project's standard Doxygen-style tags. Signed-off-by: Chong Gao <res_life@163.com>
OrcDstRuleExtractor mixed two groupings -- "3600_000" (grouped
from the right) and "3_600_000" (grouped from the left, the form
used everywhere in the test file and elsewhere in the project).
Per the project convention ("4+ digit decimal literals: 1'234'567"
translated to Java's underscore separator), thousands grouping is
left-to-right. Normalise the six occurrences in this file.
Signed-off-by: Chong Gao <res_life@163.com>
extractDstRuleFromZoneRules's "ZoneRules ORC DST rule verification failed" throw was the last uncovered exception branch. The test feeds a constant-offset TimeZone to a valid 2-rule synthetic ZoneRules. Probing finds no transitions, Path A parses the rules into a DstRule with dstSavings=+1h, but the constant tz disagrees with the rule's predictions and the verify step rejects -- triggering the targeted throw. Verified empirically. Signed-off-by: Chong Gao <res_life@163.com>
Three of the four switch arms (MODE_DOW_IN_MONTH, MODE_DOW_LE_DOM, MODE_DOM) are kept for forward compatibility with a future caller that constructs a DstRule directly (e.g. from a serialised form). The current Path A / Path B extraction code always emits MODE_DOW_GE_DOM, so the other three branches have no test path until that future caller appears. Add a short comment documenting this intent so the gap is visible at the switch. Signed-off-by: Chong Gao <res_life@163.com>
getTransitionRuleTimeMode's TIME_MODE_WALL else-branch and the matching WALL arm of computeTransitionUtcMillis were untested -- all prior synthetic tests used STANDARD or UTC TimeDefinition. testExtractDstRuleTimeDefinitionWallMode mirrors the existing Path-A-success test but encodes the start rule with TimeDefinition.WALL. The custom TimeZone returns +2h inside the DST window but reports getDSTSavings()==0, so Path B's verify fails (predicted 0 vs observed +2h) and Path A runs. The extracted rule's startTimeMode is asserted to be 0 (TIME_MODE_WALL) and endTimeMode is 1 (TIME_MODE_STANDARD), pinning both branches. Verified empirically on JDK 17 Zulu. Signed-off-by: Chong Gao <res_life@163.com>
OrcTimezoneInfoTest used LocalDate.of and LocalDate.ofEpochDay in nthDayOfWeekUtcMs and the synthetic TimeZone subclasses introduced by recent commits, but the import was missing -- the file did not compile. Add the import. Signed-off-by: Chong Gao <res_life@163.com>
getTransitionRuleTimeMillis selects between 24 * 3600 (when isMidnightEndOfDay()==true) and getLocalTime().toSecondOfDay() based on the ZoneOffsetTransitionRule flag. No existing test set the flag to true, so a mutation swapping the two branches would have shipped silently. testExtractDstRuleMidnightEndOfDayRule passes midnightEndOfDay=true + LocalTime.MIDNIGHT to ZoneOffsetTransitionRule.of and asserts that the extracted rule's startTime equals 24h (not 0). The custom TimeZone fires DST at March 2nd-Sunday + 24h and reports getDSTSavings()==0, forcing Path B verify to fail so Path A runs. Signed-off-by: Chong Gao <res_life@163.com>
decodeTransition recovers the standard wall clock by computing utcMs + rawOffsetMs. The arithmetic only works because OrcTimezoneInfo.binarySearchTransition returns the first ms at which the new offset applies -- so on a fall-back, the input UTC ms is already on the post-transition (standard) side and adding rawOffsetMs gives the right answer. That contract is not currently spelled out anywhere near decodeTransition; a future change that shifts the binary-search return convention by one ms would silently break this. Add the precondition to the Javadoc. Signed-off-by: Chong Gao <res_life@163.com>
The file uses underscore-separated literals (3_600_000) everywhere else seconds-to-milliseconds constants appear. The bare 3600 at line 221 is inconsistent; add the digit separator. Signed-off-by: Chong Gao <res_life@163.com>
testExtractDstRuleThrowsOnBothPositiveDeltaRules covers endTransitionRule == null in the "Failed to identify ORC DST start/end rules" guard. The symmetric startTransitionRule == null case (two negative-delta rules) was untested -- a mutation from the || at line 157 to && would slip through silently. Add testExtractDstRuleThrowsOnBothNegativeDeltaRules with the same shape as the positive variant. Signed-off-by: Chong Gao <res_life@163.com>
Math.min(ruleDay, monthLength) and Math.min(anchorDay + diff, monthLength) in computeRuleDay's MODE_DOW_GE_DOM branch guard against utcMillisForDate throwing DateTimeException when a February rule comes in with dayOfMonthIndicator=29 on a non-leap year. The code comment cites this exact scenario, but no existing test supplies ruleDay > monthLength, so the clamps were dead from the test suite's perspective. testExtractDstRuleFebruaryClampDoesNotThrow constructs a synthetic ZoneOffsetTransitionRule for FEBRUARY with indicator=29, forces Path B verify to fail (custom getDSTSavings()==0), and asserts the extracted startDay lands in [1, 28] without throwing. Signed-off-by: Chong Gao <res_life@163.com>
extractDstRule's two-argument signature requires that tz and rules describe the same zone, but only the Javadoc said so -- a caller that passed mismatched objects would silently get a verification mismatch much later, with no signal pointing at the API misuse. Add an explicit fail-fast guard: compare tz.getRawOffset() against rules.getStandardOffset(ref) at a recent reference instant. The epoch is unsafe because some zones (Europe/London) had a different standard offset in 1970. Note this sits after the existing isFixedOffset/!useDaylightTime short-circuit, so the original silent-GMT trap (e.g. tz built from "+05:30" which returns GMT with useDaylightTime()==false) is still handled by the early return; the new guard catches the broader "DST-having tz mismatches DST-having rules" case. Signed-off-by: Chong Gao <res_life@163.com>
The synthetic TimeZone reported DST starting on the 4th Sunday of February, but Path A's encoding (dayOfMonthIndicator=29 + SUNDAY + MODE_DOW_GE_DOM) collapses through computeRuleDay's clamp to the last day of February (29 leap / 28 non-leap, dropping the day-of- week constraint when ruleDay >= monthLength). For most years the two days differ and verifyDstRuleAcrossReferenceYears throws "verification failed" before the assertion ever runs. The second assertion was also wrong: Path A stores getDayOfMonthIndicator() unchanged so rule.startDay is 29, not in [1, 28]. Align the synthetic tz with computeRuleDay's clamped semantics (report DST starting on the last day of February) and assert rule.startDay == 29 to confirm Path A's verbatim storage. The test still proves the clamp prevents DateTimeException because verify exercises computeRuleDay for the non-leap reference year 9997 with ruleDay=29. Signed-off-by: Chong Gao <res_life@163.com>
The new sanity-check at extractDstRule lines 128-133 had no test coverage. Every other IllegalStateException branch in extractDstRule has a dedicated synthetic-zone test (testExtractDstRuleThrowsOnUnsupportedRuleCount, testExtractDstRuleThrowsOnZeroDeltaRule, ...) -- this guard was the gap. Pair a constant-zero-offset TimeZone with ZoneRules for America/New_York so tz.rawOffset (0) differs from rules.getStandardOffset(ref) (-18_000_000 ms) and the guard fires before either Path A or Path B runs. Signed-off-by: Chong Gao <res_life@163.com>
The previous "last day of February" shortcut was functionally equivalent for indicator=29 (because computeRuleDay's outer Math.min(_, monthLength) clamp collapses the day-of-week constraint when the anchor sits at the month boundary), but it required a reader to derive that collapse independently to see the test agrees with production. Inline the same MODE_DOW_GE_DOM algorithm in the synthetic tz: anchor = min(29, monthLength), step forward to the first SUNDAY, clamp to monthLength. This way the synthetic getOffset and the production computeRuleDay read identically, leaving no room for misreading the equivalence. Signed-off-by: Chong Gao <res_life@163.com>
The helper substitutes TimeZone.getTimeZone("UTC") for fixed-offset
zones to dodge the silent-GMT trap. That UTC placeholder has
rawOffset=0, which does *not* match the rules' actual standard
offset and would now fail the recently-added tz-vs-rules sanity
check inside extractDstRule. It is safe only because the
isFixedOffset() short-circuit at the top of extractDstRule fires
before the sanity check.
Spell out that ordering dependency in a comment so a future
reordering of guards in extractDstRule does not silently turn a
benign helper into a source of "describe different zones"
exceptions.
Signed-off-by: Chong Gao <res_life@163.com>
The early-return at the second DST-on transition in extractDstRuleByProbing was uncovered. Every other Path-A-success test reaches Path A by failing Path B's cross-year verify, leaving the "second DST-on transition seen" guard untouched. Add a synthetic TimeZone that spring-forwards twice per year (March 2nd Sunday and June 1st Sunday) plus a fall-back at November 1st Sunday. ZoneRules supplies one historical transition to escape rules.isFixedOffset() short-circuit but no recurring rules, so Path A also returns null on the empty getTransitionRules early-out -- the terminal "Failed to extract" throw is the expected outcome and confirms probing returned null specifically due to the multiple-DST-on guard. Signed-off-by: Chong Gao <res_life@163.com>
|
build |
12 OrcTimezoneInfoTest cases failed under the maven build (Java 1.8 target) while passing under a JDK 17 compile check. Root cause: ZoneRules.of(base, base, emptyList, emptyList, recurringRules) -- recurring DST rules but no concrete historical transitions -- reports isFixedOffset() == true on JDK 8, so extractDstRule's rules.isFixedOffset() guard short-circuits to null before either extraction path runs. The throws-tests then saw no exception and the Path-A-success tests saw a null rule. JDK 17 reports false for the same construction, which is why the regression was invisible to standalone javac checks during development. Two fixtures (BothPathsFail, MultipleDstOnTransitions) already included a historical transition for exactly this reason; the other 12 did not. Add a shared SYNTHETIC_HISTORICAL_TRANSITION constant (an inert 1900 wall-offset transition that leaves getStandardOffset and the post-2060 probing windows untouched) and route every synthetic rules-only ZoneRules.of through it. Real IANA zones always carry historical transitions, so production is unaffected. Verified: OrcTimezoneInfoTest 26/26 green on both JDK 8 and JDK 17. Signed-off-by: Chong Gao <res_life@163.com>
|
build |
firestarman
left a comment
There was a problem hiding this comment.
LGTM, some nits, but good to merge it as-is.
| if (diff < 0) diff += 7; | ||
| return Math.max(anchorDay - diff, 1); | ||
| } | ||
| case DstRule.MODE_DOM: |
There was a problem hiding this comment.
NIT: Empty case, can be removed ?
There was a problem hiding this comment.
Keep it → documents intent — it makes explicit that MODE_DOM is a recognized, handled mode (mapped to identity), rather than something silently swept into default.
| // to the variant. | ||
| long[] boundaries = {dstStart, dstEnd}; | ||
| for (long boundary : boundaries) { | ||
| long from = boundary - 12 * 3_600_000L; |
There was a problem hiding this comment.
NIT: better name these literal values.
| int month = ldt.getMonthValue() - 1; // 0-based for Calendar compat | ||
| int dayOfMonth = ldt.getDayOfMonth(); | ||
| int dayOfWeek = toCalendarDayOfWeek(ldt.getDayOfWeek().getValue()); | ||
| int timeInDay = ldt.getHour() * 3_600_000 |
There was a problem hiding this comment.
NIT: Better name these literal values.
Context
Part 2 of the split of #4432 (full ORC DST GPU support). Part 1 (#4539, runtime build of
OrcTimezoneInfo) is merged. This PR adds the Java-side machinery to extract a recurring DST rule for a zone, but does not yet wire it into the GPU dispatch path — production behavior is unchanged.What this adds
New file
OrcDstRuleExtractor.java(package-privatefinal class), which hosts:static final class DstRule— SimpleTimeZone-shaped DST descriptor (raw offset, dst savings, start/end month / day / dayOfWeek / time / mode), 0-based month and Calendar-style 1=Sun..7=Sat to match the GPU consumer.extractDstRule(timezoneId, tz, rules)— entry point. Returnsnullfor non-DST zones (rules.isFixedOffset()or!tz.useDaylightTime()). Fail-fast sanity check thattzandrulesdescribe the same zone (guards against the silent-GMT trap whereTimeZone.getTimeZone(id)returns GMT for offset-style ids).extractDstRuleByProbing— probetz.getOffset()hourly across a reference year, find the two transitions, encode them in SimpleTimeZone form. Captures whatjava.util.TimeZone.getOffsetactually returns, which is the source of truth the GPU side must match.extractDstRuleFromZoneRules— derive fromZoneRules.getTransitionRules(). Used for zones whose recurring rule cannot be recovered by probing alone.tz.getOffsetacross anchor years 2060, 2400, 9997 to catch divergence well into the recurring-rule fallback regime.In
OrcTimezoneInfo.java:utcMillisForDateandbinarySearchTransitionfromprivateto package-private soOrcDstRuleExtractorcan reuse the same UTC anchor and bracketed binary search. No behavioral change.What this does not change
GpuTimeZoneDB.convertOrcTimezones— DST gate still throwsUnsupportedOperationExceptionfor DST zones.buildRuntimeOrcTimezoneInfo— does not yet populate anyDstRule.OrcTimezoneInfo's public constructor / fields (the two widened methods are package-private, not public).extractDstRuleis package-private and exercised only byOrcTimezoneInfoTest.Verification
Extracted rules cross-checked against tzdata; the test suite is green on both JDK 8 (the build's
maven.compiler.target) and JDK 17 Zulu:nullnullReal-zone tests assert the exact (
month,day,dayOfWeek,time,mode) shape for each zone. Synthetic-zone tests additionally cover both extraction paths (probing success, ZoneRules fallback), everyIllegalStateExceptionbranch (unsupported rule count / shape, zero-/both-positive-/both-negative-delta, mismatched savings, negative day indicator, cross-year verification failure, tz/rules zone mismatch), and edge cases (isMidnightEndOfDay,TimeDefinition.WALL, February day-of-month clamp, multiple-DST-on probing guard). 26 tests total.Diff size
+1314 / -2across 3 files:OrcDstRuleExtractor.java+529(new file)OrcTimezoneInfo.java+6 / -2(visibility widening)OrcTimezoneInfoTest.java+779No changes outside this slice.
Follow-up parts
DstRuleserialization scaffolding (still all dead-code).convertOrcTimezones, dispatch DST zones through the new kernel, add end-to-end Java tests. This is the only follow-up part that changes runtime behavior.