01.-What is a data series? (01:29)
02.-Searching with Time (25:36)
03.-Using time commands (09:29)
04.-Working with time zones (02:43)
Time is one of the most important components to consider when working with data. We will learn:
- How to work with time in searches (`earliest=@d`, `latest=@d`).
- Time-based functions (`now()`, `time()`, `strftime()`, `strptime()`).
- Time commands (`timechart`, `timewrap`).
- How time zones are represented in the data.
What does time mean to Splunk?
When an event is ingested, its timestamp is stored in the `_time` field, which is used to build the event timeline in the Splunk Web interface.
The `_time` field is stored with the event in the index, before search time, along with default fields such as host, source, and source type (and the name of the index in which it is stored).
Timestamps are expressed in Unix or epoch time (_time = 1619736430) and translated to human-readable time (Thu, 29 Apr 2021 22:47:10 GMT) during the search operation process.
All events are sorted by time, thus time is the most efficient filter.
The timestamp displayed in the Time column of each event is adjusted to the user's local time zone, as long as the time zone is set in the account settings.
The timeline is an overall distribution of the events in relation to the specific time range set for the search.
- Zoom in, Zoom Out.
- Hover to view the event count for a specific time.
- Format the x and y-axis.
- Narrow the time range with click and drag (acts as a filter without re-running the search).
- Clear the selection with `Deselect`, or apply it with `Zoom to selection`.
RUNNING A BRAND NEW SEARCH
We can control the overall time range of a search in more detail with the time range picker. The preset options can be customized in the menu Settings > User interface > Time ranges.
Splunk Cloud version
- Real-time search: returns results up to the second, as events come in. The benefit is that we see data as it arrives in Splunk. Real-time searches are resource intensive: each one consumes an entire CPU core and continuously updates its results as events arrive, so multiple real-time searches can impact overall performance. One alternative is to schedule a report.
- Relative time option:
  - earliest time range.
  - latest (now or the beginning of the current hour).
- Advanced: we can use time unit abbreviations, which let us be even more specific about the time range.

Note the absolute times shown below the text fields.
Time unit abbreviations can also be used within the `earliest` and `latest` time modifiers in our search to override the time range picker settings.
The syntax starts with plus `+` or minus `-` to look forward or back in time, continues with a number (1 is implicit) and a time unit, and can optionally round down to a specific time unit using the `@` symbol.
For weekdays we can use `w1` (Monday) through `w6` (Saturday), and `w0` or `w7` (Sunday).
To follow these examples, assume the current time is exactly 9:45 AM on April 1st, 2021.
current time | modifier | meaning |
---|---|---|
9:45 AM on April 1st, 2021 | -30m@h | -30m: 30 minutes before ==> 9:15 AM; @h: rounded down to the hour ==> 9:00 AM |
9:45 AM on April 1st, 2021 | -h@h | -h: one hour before ==> 8:45 AM; @h: rounded down to the hour ==> 8:00 AM |
9:45 AM on April 1st, 2021 | earliest=-mon@mon latest=@mon | -mon: one month before ==> 9:45 AM on March 1st; @mon: rounded down to the month ==> 00:00 on March 1st .. 00:00 on April 1st, 2021 |
9:45 AM on April 1st, 2021 | earliest=-7d@d | -7d: seven days before ==> 9:45 AM on March 25th; @d: rounded down to the day ==> 00:00 on March 25th .. 9:45 AM on April 1st, 2021 |
9:45 AM on April 1st, 2021 | earliest=@d+3h | @d: rounded down to the day ==> 00:00 on April 1st; +3h: plus 3 hours ==> 03:00 on April 1st .. 9:45 AM on April 1st, 2021 |
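The arithmetic in the table can be checked outside Splunk. The following is a minimal Python sketch, not Splunk's implementation: the helper names (`snap_hour`, `snap_day`, `snap_month`, `minus_one_month`) are hypothetical and only mimic the "offset first, then round down" order described above.

```python
import calendar
from datetime import datetime, timedelta

def snap_hour(t):
    """Mimics @h: round down to the start of the hour."""
    return t.replace(minute=0, second=0, microsecond=0)

def snap_day(t):
    """Mimics @d: round down to midnight."""
    return t.replace(hour=0, minute=0, second=0, microsecond=0)

def snap_month(t):
    """Mimics @mon: round down to the first day of the month."""
    return snap_day(t).replace(day=1)

def minus_one_month(t):
    """Mimics -mon: same day and time one calendar month earlier.
    Clamping the day to the month's length is an assumption of this sketch."""
    year, month = (t.year, t.month - 1) if t.month > 1 else (t.year - 1, 12)
    day = min(t.day, calendar.monthrange(year, month)[1])
    return t.replace(year=year, month=month, day=day)

# The reference "current time" assumed in the table.
now = datetime(2021, 4, 1, 9, 45)

examples = {
    "-30m@h": snap_hour(now - timedelta(minutes=30)),
    "-h@h": snap_hour(now - timedelta(hours=1)),
    "-mon@mon": snap_month(minus_one_month(now)),
    "@mon": snap_month(now),
    "-7d@d": snap_day(now - timedelta(days=7)),
    "@d+3h": snap_day(now) + timedelta(hours=3),
}

for modifier, value in examples.items():
    print(f"{modifier:>9} -> {value:%Y-%m-%d %H:%M}")
```

Each computed value matches the corresponding boundary in the table, for example `-7d@d` gives midnight on March 25th.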
In this example, the time range picker was set to `-5h@h`, but the search box restricts the results to the last five minutes with `earliest=-5m@m latest=now`.
When both are specified, the restriction in the search box takes precedence: the `earliest` and `latest` modifiers override the time range picker.
There is a set of fields that is available in most cases for controlling the time of our searches. They all begin with the `date_` prefix followed by a time unit: `hour`, `mday` (day of the month), `minute`, `month`, `second`, `wday` (weekday), `year`, and `zone`. These `date_*` fields are only generated for events whose raw text contains a timestamp that Splunk is able to extract.
Don't all events that Splunk indexes have timestamps assigned to them?
Every single event is assigned a value for `_time`, but not all source events contain an actual timestamp. In that case, the `_time` field is filled with the event's index time.
As this event has a timestamp inside the `_raw` field, Splunk extracts all the `date_*` fields.
In a lot of log files, the first value is a timestamp, and Splunk's default event processing breaks events at the timestamp: whenever it sees another timestamp, it starts a new event.
The difference between the `_raw` timestamp and the time under the `Time` column is due to the time zone preference in the account settings.
When I removed that setting, the `_raw` timestamp and the time under the `Time` column became the same.
In this other set of events, which have no timestamp, the `date_*` fields are not generated, and the time under the `Time` column is the same for all events. That timestamp comes from index time: when Splunk cannot find a timestamp in the event, it falls back to the time at which the events were indexed.
When the default time fields `date_*` are present, we can use them to solve specific use cases in our searches.
`earliest` looks back two days (`-2d`) and rounds down to the beginning of that day (`@d`), and `latest` rounds down to the beginning of the current day.
We are interested in displaying early-morning events, from 2 AM until 5 AM, in Unix time.
The `bin` command helps us bucket our events: it puts numerical values into discrete sets, or bins. The `bin` command has a `span` option, set to an integer or a time scale, that defines the size of each bin.
In this example, I bin the events into 15-minute intervals and compute some statistics over the events within each bin: the count of file names (as indexations), the distinct count of file names (as indexes), the ratio of times each index is indexed, and the sum of the file sizes.
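To make the bucketing concrete, here is a minimal Python sketch of the same idea. It assumes that a 15-minute bin starts at the epoch time rounded down to a multiple of 900 seconds; the events, file names, and sizes are hypothetical sample data, and the field names only mirror the statistics described above.

```python
from collections import defaultdict

SPAN = 15 * 60  # bin size in seconds (15 minutes)

# Hypothetical events: (epoch time, file name, file size in bytes)
events = [
    (1619736430, "a.log", 100),
    (1619736500, "a.log", 250),
    (1619736900, "b.log", 400),
    (1619737400, "a.log", 50),
]

# Assign each event to the bin that starts at its time rounded down to SPAN.
bins = defaultdict(list)
for ts, name, size in events:
    bins[ts - ts % SPAN].append((name, size))

# Compute the per-bin statistics.
summary = {}
for start, rows in sorted(bins.items()):
    names = [name for name, _ in rows]
    summary[start] = {
        "indexations": len(names),                 # count of file names
        "indexes": len(set(names)),                # distinct count of file names
        "ratio": len(names) / len(set(names)),     # indexations per distinct file
        "total_size": sum(size for _, size in rows),
    }

for start, stats in summary.items():
    print(start, stats)
```

The first three events fall into the bin starting at 1619736300 and the last one into the bin starting at 1619737200.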
You can define how time is formatted in the search results using the time functions of the eval command.
The `eval` command calculates an expression and puts the resulting value into a new or existing field that can be reused in the search pipeline. It supports a vast assortment of functions and can also be used as an expression.
- `now()` returns the time at which the search was started.
- `time()` returns the time at which an event was processed by the `eval` command.
- `relative_time(X, Y)` returns an epoch timestamp obtained by applying the relative time specifier (Y) to a supplied time (X).
- `strftime(X, Y)` converts an epoch timestamp (X) into a string representing a time, according to the format string (Y). From epoch to formatted.
- `strptime(X, Y)` converts a string (X) representing a time into a Unix timestamp, based on the format string (Y). From formatted to epoch.
[Documentation link](https://docs.splunk.com/Documentation/SCS/current/Search/Timevariables)
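The two directions of conversion can be illustrated with Python's `datetime`, which uses similar C-style format codes such as `%Y`, `%m`, `%d`, `%H`, `%M`, and `%S`. This is only an analogy under that assumption, not Splunk's own implementation; the epoch value is the one quoted earlier in this section, and UTC is used throughout.

```python
from datetime import datetime, timezone

FORMAT = "%Y-%m-%d %H:%M:%S"
epoch = 1619736430  # the _time value quoted earlier in this section

# Epoch -> formatted string (the direction strftime covers).
formatted = datetime.fromtimestamp(epoch, tz=timezone.utc).strftime(FORMAT)

# Formatted string -> epoch (the direction strptime covers).
# The parsed value carries no time zone, so UTC is attached explicitly.
parsed = datetime.strptime(formatted, FORMAT).replace(tzinfo=timezone.utc)
round_trip = int(parsed.timestamp())

print(formatted)
print(round_trip)
```

Formatting and then parsing with the same format string returns the original epoch value, which is a quick way to check that a format string matches the data.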
`timechart` performs statistical aggregations against time.
It is a transforming command that plots data over time, where `_time` is always the x-axis and the leftmost column of the table.
It supports the same family of functions as the `stats` and `chart` commands.
It has a built-in `span` option that controls the size of the time bins.
The result can be split by another field with the `by` clause.
Timecharts are best visualized as a line or area chart.
This simple command produces a single-series time series.
Using the `by` clause we can get a `NULL` column that counts the events with no value in the field used in the `by` clause.
The `by` clause creates a multi-series time series.
Time range | Default time bucket |
---|---|
last 30 days | 1 day |
last 7 days | 1 day |
last 24 hours | 30 minutes |
last hour | 1 minute |
last 15 minutes | 10 seconds |
The `span` option takes an integer and a time unit to define the bucket size over which the aggregation is made.
By default, the `by` clause shows the top 10 series in a multi-series time series. If there are more, the rest are grouped into a single `OTHER` series.
`limit=1` shows only the most important series. All other values are grouped into `OTHER`, as long as `useother` is not set to false.
If `limit=0` is set, all distinct values are used.
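The grouping behavior of `limit` and `useother` can be sketched in a few lines of Python. This is an illustration only: it assumes series are ranked by their total count, and the function name `limit_series` and the host names are hypothetical.

```python
from collections import Counter

def limit_series(counts, limit=10, useother=True):
    """Keep the top `limit` series and fold the rest into OTHER.
    limit=0 keeps every distinct value."""
    if limit == 0:
        return dict(counts)
    top = dict(Counter(counts).most_common(limit))
    rest = sum(value for key, value in counts.items() if key not in top)
    if useother and rest:
        top["OTHER"] = rest
    return top

# Hypothetical event totals per host.
counts = {"web01": 50, "web02": 30, "db01": 15, "cache01": 5}

print(limit_series(counts, limit=1))
print(limit_series(counts, limit=2, useother=False))
print(limit_series(counts, limit=0))
```

With `limit=1` only the largest series survives and the remaining 50 events are reported under `OTHER`; with `useother=False` those remaining events are simply left out.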
`timewrap` helps to display the result of the `timechart` command in such a way that each time period is a separate series, so we can compare data over a specific time period.
`timewrap` typically follows a `timechart` command.
It transforms a regular timechart like this one into a chart that compares 6-hour periods, where you can see that recent peaks have been under control in the last six hours. Note how the legend changes and adapts its titles to distinguish each series.
Working with time-based data for multiple geographic locations can be tricky.
Here we see how time zones are represented in the data and how to use the `strftime` function to correct time zones in search results.
**FIRST:** Verify that the results return what we are expecting.
If you expect results between 2 and 5 AM, but the results display times between 9 and 11 PM, the `date_*` fields do not reflect your local time. They are the values taken directly from the raw events.
To determine your time zones:
1.- In preferences, set the time zone to Default system time zone.
2.- Run a search over the last 15 minutes.
3.- Read the event time stamps and compare them with your local time.
I did it from Spain (time zone GMT+1): my local time was 20:34 and I got results at 18:24, about two hours earlier.
After changing the time zone again....
The `date_*` fields do not take the Splunk Web time zone into account. We have to either set our time zone to the default system time zone or, as an alternative, use the `strftime` function to normalize our data.
The `%H` format variable makes `strftime` display the hour in the user's time zone preference.
In the previous examples, `my_hour` takes into account the time zone that is set. The `date_hour` field does not adjust the hour based on the Splunk Web time zone.
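The effect is easy to reproduce outside Splunk: the same epoch value yields a different hour depending on the time zone used when formatting it. The Python sketch below assumes a user preference of UTC+2 purely for illustration; the variable name `my_hour` mirrors the field mentioned above, and `hour_utc` is a hypothetical name for the unadjusted hour.

```python
from datetime import datetime, timedelta, timezone

epoch = 1619736430  # 22:47:10 UTC on 29 Apr 2021

# Assumed user time zone preference (UTC+2), for illustration only.
user_tz = timezone(timedelta(hours=2))

# Hour formatted in UTC, with no adjustment to the user's preference.
hour_utc = datetime.fromtimestamp(epoch, tz=timezone.utc).strftime("%H")

# Hour formatted in the user's time zone, as %H would show it to that user.
my_hour = datetime.fromtimestamp(epoch, tz=user_tz).strftime("%H")

print(hour_utc, my_hour)
```

Here the event lands in hour 22 in UTC but in hour 00 of the next day for the UTC+2 user, which is why a filter on the hour must use the value that matches the intended time zone.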
According to what we have learned, we rearrange the previous search to use `my_hour`, calculated with `strftime`, instead of `date_hour`.