ORC tz split 2/6: OrcTimezoneInfo DstRule and DST rule extraction #4546
res-life wants to merge 1 commit
Part of the split of #4432.

Replaces the snapshot-driven OrcTimezoneInfo with a runtime-built version that derives metadata from java.time.zone.ZoneRules and java.util.TimeZone.

This PR adds:
- License header, java.time / java.util.concurrent imports.
- Updated OrcTimezoneInfo constructor, initialOffset/rawOffset fields, and the DstRule inner class.
- DST_RULE_VALIDATION_YEARS / MIN_SUPPORTED_ORC_UTC_MILLIS constants and the historical scan step.
- extractDstRule entry point, extractDstRuleFromZoneRules, fillDstRuleFromTransitionRule, getTransitionRuleTimeMillis, getTransitionRuleTimeMode, toCalendarDayOfWeek.
- extractDstRuleByProbing (probing fallback) and the binarySearchTransition / decodeTransition helpers it relies on.

The rest of the rewrite (verifyDstRule, computeDstOffset, transition math, toString, runtime registry, HistoricalTransitions) lands in orc-tz-3-orctzinfo-runtime-build. This intermediate state intentionally truncates the class after decodeTransition.

Signed-off-by: Chong Gao <chongg@nvidia.com>
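For readers unfamiliar with the JDK APIs named above, the following is a minimal sketch of how recurring DST rules can be read from java.time.zone.ZoneRules and how a java.time day of week maps onto java.util.Calendar constants. The class name DstRuleSketch and the helpers recurringRules and toCalendarDayOfWeekSketch are hypothetical illustrations, not the PR's extractDstRuleFromZoneRules or toCalendarDayOfWeek implementations.

```java
import java.time.DayOfWeek;
import java.time.ZoneId;
import java.time.zone.ZoneOffsetTransitionRule;
import java.time.zone.ZoneRules;
import java.util.List;
import java.util.TimeZone;

// Hypothetical sketch; names are illustrative and not taken from the PR.
class DstRuleSketch {

  // Returns the recurring (ongoing) transition rules of a zone.
  // The list is empty for zones that currently observe no DST.
  static List<ZoneOffsetTransitionRule> recurringRules(String zoneId) {
    ZoneRules rules = ZoneId.of(zoneId).getRules();
    return rules.getTransitionRules();
  }

  // java.time.DayOfWeek counts MONDAY=1..SUNDAY=7, while
  // java.util.Calendar counts SUNDAY=1..SATURDAY=7.
  static int toCalendarDayOfWeekSketch(DayOfWeek dow) {
    return dow.getValue() % 7 + 1;
  }

  public static void main(String[] args) {
    String id = "America/Los_Angeles";
    for (ZoneOffsetTransitionRule r : recurringRules(id)) {
      System.out.println(
          "month=" + r.getMonth()
          + " dayOfMonthIndicator=" + r.getDayOfMonthIndicator()
          + " dayOfWeek=" + r.getDayOfWeek()
          + " localTime=" + r.getLocalTime()
          + " timeDefinition=" + r.getTimeDefinition()
          + " offsetBefore=" + r.getOffsetBefore()
          + " offsetAfter=" + r.getOffsetAfter());
    }
    // The standard (non-DST) offset as seen by the legacy TimeZone API.
    System.out.println("rawOffsetMs=" + TimeZone.getTimeZone(id).getRawOffset());
  }
}
```

A zone with ongoing DST typically yields two rules (one entering DST, one leaving it), which is presumably why a probing fallback is still needed for zones where the rule list is empty or does not match what TimeZone reports.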
91450ec to b38def7
c11c7cd to 0d0a2c4
    private static boolean verifyDstRuleAcrossReferenceYears(TimeZone tz, DstRule rule) {
      for (int refYear : DST_RULE_VALIDATION_YEARS) {
        if (!verifyDstRule(tz, rule, refYear)) {
          return false;
        }
      }
      return true;
    }
Missing methods prevent compilation
verifyDstRuleAcrossReferenceYears calls verifyDstRule(tz, rule, refYear) (line 355), and the static field initializer at line 147 calls utcMillisForDate(1, 0, 1), but neither method is defined anywhere in this file or in any other file in the package. The class will fail to compile as-is, meaning any CI job targeting this branch will fail. The PR description explicitly notes this is intentional ("this intermediate state intentionally truncates the class after decodeTransition"), but it's worth confirming that CI is not expected to pass until #orc-tz-3 lands.
    import java.time.DateTimeException;
    import java.time.Instant;
    import java.time.LocalDate;
    import java.time.ZoneId;
    import java.time.zone.ZoneOffsetTransition;
    import java.time.zone.ZoneOffsetTransitionRule;
    import java.time.zone.ZoneRules;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;
    import java.util.TimeZone;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
Numerous forward-declared unused imports
Nine of the eleven new imports are unused in this file: DateTimeException, Instant, LocalDate, ZoneId, ZoneOffsetTransition, ArrayList, Arrays, ConcurrentHashMap, ConcurrentMap. The PR description calls this out ("java.time / java.util.concurrent imports"), so these are intentional pre-declarations for #orc-tz-3. Just worth noting that a strict checkstyle configuration (for example, its UnusedImports check) will flag these, which may also break CI depending on build settings. javac -Xlint:all by itself does not report unused imports.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
    private static int[] decodeTransition(long utcMs, int rawOffsetMs) {
      // Convert UTC ms to standard local time
      long localMs = utcMs + rawOffsetMs;
      java.time.Instant instant = java.time.Instant.ofEpochMilli(localMs);
      java.time.LocalDateTime ldt = java.time.LocalDateTime.ofInstant(
          instant, java.time.ZoneOffset.UTC);
Fully-qualified java.time names inside decodeTransition
java.time.Instant, java.time.LocalDateTime, and java.time.ZoneOffset are all written with their full package prefix here, while the rest of the file uses short imported names. Instant is already imported (albeit unused elsewhere right now), and LocalDateTime/ZoneOffset should be added to the import block alongside the other java.time.* types. Using fully-qualified names in a method body is unusual Java style and makes the code harder to read at a glance.
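As an illustration of the suggestion, here is a minimal sketch of the same conversion written with imported short names. The class LocalTimeSketch and the method standardLocalTime are hypothetical. Only the first lines of decodeTransition are visible in the diff, so this covers the UTC-to-standard-local-time step and nothing after it.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Hypothetical sketch; not the PR's decodeTransition.
class LocalTimeSketch {

  // Shifts a UTC instant by the zone's raw (standard) offset and reads the
  // result as a wall-clock date-time. Interpreting the shifted value in UTC
  // avoids applying any zone rules a second time.
  static LocalDateTime standardLocalTime(long utcMs, int rawOffsetMs) {
    long localMs = utcMs + rawOffsetMs;
    Instant instant = Instant.ofEpochMilli(localMs);
    return LocalDateTime.ofInstant(instant, ZoneOffset.UTC);
  }

  public static void main(String[] args) {
    // Epoch 0 seen from a zone whose standard offset is UTC-8.
    System.out.println(standardLocalTime(0L, -8 * 3600_000));
  }
}
```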
    // calendars agree on the offset at this instant. Kept as a single anchor so
    // the GPU side matches whatever TimeZone.getOffset returns here.
    private static final long MIN_SUPPORTED_ORC_UTC_MILLIS = utcMillisForDate(1, 0, 1);
    private static final long HISTORICAL_TRANSITION_SCAN_STEP_MILLIS = 24L * 3600_000L;
HISTORICAL_TRANSITION_SCAN_STEP_MILLIS declared but never used
This constant is defined here but never referenced in any method of this file. If it belongs to the HistoricalTransitions logic described as landing in #orc-tz-3, it might be cleaner to define it closer to where it is used rather than leaving it as dead code in this intermediate state.
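For context on what a one-day scan step is usually for, the sketch below walks a time range in daily steps and, once the offset reported by TimeZone.getOffset changes, narrows the change down to the millisecond with a binary search. This is an assumption about the general technique only. ProbeSketch, STEP_MS, firstTransition, and this binarySearchTransition are hypothetical and are not the PR's HistoricalTransitions or binarySearchTransition code.

```java
import java.time.Instant;
import java.util.TimeZone;

// Hypothetical sketch of a daily scan plus binary search; not the PR's code.
class ProbeSketch {
  static final long STEP_MS = 24L * 3600_000L; // one day, same value as the constant above

  // Precondition: getOffset(lo) != getOffset(hi), with one change in (lo, hi].
  // Returns the first millisecond at which the offset differs from the one at lo.
  static long binarySearchTransition(TimeZone tz, long lo, long hi) {
    int before = tz.getOffset(lo);
    while (hi - lo > 1) {
      long mid = lo + (hi - lo) / 2;
      if (tz.getOffset(mid) == before) {
        lo = mid; // change happens after mid
      } else {
        hi = mid; // change happens at or before mid
      }
    }
    return hi;
  }

  // Scans from start towards end in daily steps.
  // Returns the first transition instant in UTC millis, or -1 if none is seen.
  static long firstTransition(TimeZone tz, long start, long end) {
    int initial = tz.getOffset(start);
    for (long t = start + STEP_MS; t <= end; t += STEP_MS) {
      if (tz.getOffset(t) != initial) {
        return binarySearchTransition(tz, t - STEP_MS, t);
      }
    }
    return -1L;
  }

  public static void main(String[] args) {
    TimeZone tz = TimeZone.getTimeZone("America/Los_Angeles");
    long start = Instant.parse("2021-01-01T00:00:00Z").toEpochMilli();
    long end = Instant.parse("2021-06-01T00:00:00Z").toEpochMilli();
    System.out.println(Instant.ofEpochMilli(firstTransition(tz, start, end)));
  }
}
```

A daily step can miss two transitions that cancel each other within the same day, which is one reason a sketch like this should not be read as a complete historical scan.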
useless now.
Part of the split of #4432.
Companion: NVIDIA/cudf-spark#14544
Previous: #4545
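Replaces the snapshot-driven OrcTimezoneInfo with a runtime-built version that derives metadata from java.time.zone.ZoneRules and java.util.TimeZone.

This PR adds:
- License header, java.time / java.util.concurrent imports.
- Updated OrcTimezoneInfo constructor, initialOffset/rawOffset fields, and the DstRule inner class.
- DST_RULE_VALIDATION_YEARS / MIN_SUPPORTED_ORC_UTC_MILLIS constants and the historical scan step.
- extractDstRule entry point, extractDstRuleFromZoneRules, fillDstRuleFromTransitionRule, getTransitionRuleTimeMillis, getTransitionRuleTimeMode, toCalendarDayOfWeek.
- extractDstRuleByProbing (probing fallback) and the binarySearchTransition / decodeTransition helpers it relies on.

The rest of the rewrite (verifyDstRule, computeDstOffset, transition math, toString, runtime registry, HistoricalTransitions) lands in #orc-tz-3. This intermediate state intentionally truncates the class after decodeTransition.

Signed-off-by: Chong Gao <chongg@nvidia.com>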