Data On-Boarding
Andrew Duca
Sr. Professional Services Consultant, Splunk
Disclaimer
During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only, and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.
About Me
! Senior Professional Services Consultant based in Boston, MA
! 14+ years of worldwide Professional Services consulting, with the last two at Splunk
! Involved in 20+ deployments from 1GB to 5TB
Agenda
! Data
! Splunk Components
! Index Data
! Proper Parsing
! Challenging Data
! Advanced Inputs
Are You in The Right Room?
! You have used Splunk at least once, or at least read about it
! You are interested in Splunk best practices
! You like to use Splunk's default parsing rules
! You just took over a Splunk deployment and you're not sure what to do
! This is not an education class; it's best practice
Data
Splunk is the engine for machine data
! Machine data is more than just logs - it's configuration data, data from APIs and message queues, change events, the output of diagnostic commands and more
! Log types: Application, Web Access and Proxy, Call Detail Records (CDR), Clickstream, Message Queues, Packet, Database audit and tables, File audit, Syslog, WMI, PerfMon
! Manual: Getting Data In http://docs.splunk.com/Documentation/Splunk/latest/Data/WhatSplunkcanmonitor
Splunk Apps
! Look to Splunk Apps first and utilize Technology Add-Ons (TAs)
! A TA applies the Common Information Model (CIM)
! CIM details the standard fields, event type tags, and host tags that Splunk uses when it processes most IT data
! Example TAs: Windows, Unix, Exchange, Active Directory, VMware vCenter, WebSphere
Splunk Distributed Components
Search Head
Deployment Server
Indexer
Forwarder
Test Environment
! Every Splunk deployment should have a test environment
! It can be a laptop, virtual machine or spare server
! It should run the same version of Splunk as production
! It should be accessible to other Splunk developers and administrators
One Shot
! Easiest way to get data into your test environment
! Components of the oneshot:
./splunk add oneshot user_conf.txt -index indexname -sourcetype sourcetype_name
! Where to find more information: http://docs.splunk.com/Documentation/Splunk/latest/Data/MonitorfilesanddirectoriesusingtheCLI
Data - Broken
Props
! Always set these six parameters
# USER CONFERENCE
[user_conf_2014]
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 19
SHOULD_LINEMERGE = False
LINE_BREAKER = ([\n\r]+)\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}
TRUNCATE = 10000
! TIME_PREFIX: defaults to empty
! TIME_FORMAT: strptime-style format
! MAX_TIMESTAMP_LOOKAHEAD: 150 characters by default
! SHOULD_LINEMERGE: set to True by default
! LINE_BREAKER: set to ([\r\n]+) by default; change to positive lookahead
! TRUNCATE: set to 10000 bytes by default; set to 0 to never truncate
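As a rough illustration of what the six props parameters control, the following Python sketch approximates event breaking and timestamp extraction using the values from the [user_conf_2014] stanza. It is not how Splunk implements its parsing pipeline; the sample log lines and the helper names `break_events` and `extract_timestamp` are hypothetical, and truncation is approximated by character count rather than bytes.

```python
import re
from datetime import datetime

# Values copied from the [user_conf_2014] stanza on the slides.
LINE_BREAKER = r"([\n\r]+)\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}"
TIME_PREFIX = r"^"
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"
MAX_TIMESTAMP_LOOKAHEAD = 19
TRUNCATE = 10000


def break_events(raw):
    """Split raw text wherever LINE_BREAKER matches.

    Only the first capture group (the newlines) is discarded; the
    timestamp that follows it becomes the start of the next event.
    """
    events, start = [], 0
    for match in re.finditer(LINE_BREAKER, raw):
        events.append(raw[start:match.start(1)])
        start = match.end(1)
    events.append(raw[start:].rstrip("\r\n"))
    # Assumption: approximate TRUNCATE by characters (0 means no limit).
    return [e[:TRUNCATE] if TRUNCATE else e for e in events if e]


def extract_timestamp(event):
    """Parse at most MAX_TIMESTAMP_LOOKAHEAD characters after TIME_PREFIX."""
    prefix = re.search(TIME_PREFIX, event)
    if prefix is None:
        return None
    window = event[prefix.end():prefix.end() + MAX_TIMESTAMP_LOOKAHEAD]
    return datetime.strptime(window, TIME_FORMAT)


# Hypothetical sample: the second event carries a two-line stack trace.
SAMPLE = (
    "2014-10-06 09:01:00 INFO session started\n"
    "2014-10-06 09:01:05 ERROR request failed\n"
    "  at handler.process(line 42)\n"
    "  at server.run(line 7)\n"
    "2014-10-06 09:01:09 INFO session ended\n"
)

events = break_events(SAMPLE)
timestamps = [extract_timestamp(e) for e in events]
```

Because the break only happens where newlines are followed by a timestamp, the stack trace stays attached to its ERROR event, which is the behavior the stanza aims for with SHOULD_LINEMERGE disabled.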
Data - Fixed
6.2 Splunk Web Data On-Boarding
Why Use Splunk Web to On-board?
Quick and easy way to:
! Easily visualize the data as events rather than lines of text
! Quickly get the data properly broken into events
! Accurately get the timestamp extracted
Splunk Web Data On-Boarding
! Locate the source file on the Splunk Server's file system
! Validate event breaking and timestamp recognition
! Resolve event breaking
! Set timestamp format even if Splunk figures it out automatically
! Copy the props.conf settings and deploy in a custom app
Challenging Data
Limit Indexed Data
! Anonymize data:
[source::.../accounts.log]
SEDCMD-accounts = s/ssn=\d{5}(\d{4})/ssn=xxxxx\1/g s/cc=(\d{4}-){3}(\d{4})/cc=xxxx-xxxx-xxxx-\2/g
! Discard events:
transforms
[setnull]
REGEX = (?i)DEBUG
DEST_KEY = queue
FORMAT = nullQueue
props
[source::/var/log/user_conf.txt]
TRANSFORMS-null = setnull
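To see what the two rules do to an event, the sketch below replays the SEDCMD substitutions and the setnull REGEX with Python's `re` module. It only mimics the effect of the settings; the sample events and the function names `anonymize` and `keep` are hypothetical.

```python
import re

# The two sed-style substitutions from SEDCMD-accounts, as (pattern, replacement).
SED_RULES = [
    (r"ssn=\d{5}(\d{4})", r"ssn=xxxxx\1"),
    (r"cc=(\d{4}-){3}(\d{4})", r"cc=xxxx-xxxx-xxxx-\2"),
]

# REGEX from the [setnull] transform; matching events go to nullQueue.
NULL_QUEUE_REGEX = re.compile(r"(?i)DEBUG")


def anonymize(event):
    """Mask all but the last four digits of ssn= and cc= values."""
    for pattern, replacement in SED_RULES:
        event = re.sub(pattern, replacement, event)
    return event


def keep(event):
    """Return False for events the setnull transform would discard."""
    return NULL_QUEUE_REGEX.search(event) is None


# Hypothetical sample events.
masked = anonymize("ssn=123456789 cc=1234-5678-9012-3456")
kept = [e for e in ["INFO login ok", "debug cache miss"] if keep(e)]
```

Note that `(?i)` makes the discard rule case-insensitive, so any event containing "debug" in any casing is dropped, including events where the word appears inside a message body.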
Limit Indexed Data
6.X or later Windows forwarders
Indexed Extractions
! Provides reliable and consistent indexing of data with headers.
! Set on the forwarder: INDEXED_EXTRACTIONS = {CSV | W3C | TSV | PSV | JSON}
! Supports custom header parsing and easy mode for common formats.
! Extract IIS fields using props.conf on the Windows forwarder:
[iis]
INDEXED_EXTRACTIONS = w3c
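The idea behind header-based extraction can be sketched in a few lines of Python: field names come from the file's own header rather than from per-sourcetype regular expressions. This is only an approximation of the concept for W3C-style logs; the sample log lines and the `parse_w3c` helper are hypothetical, and real IIS logs carry more fields.

```python
def parse_w3c(lines):
    """Yield one dict per data line, keyed by the latest #Fields: header."""
    fields = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("#Fields:"):
            # The header names the columns for every data line that follows.
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#"):
            yield dict(zip(fields, line.split(" ")))


# Hypothetical W3C-format sample.
SAMPLE = [
    "#Software: Microsoft Internet Information Services 7.5",
    "#Fields: date time cs-method cs-uri-stem sc-status",
    "2014-10-06 09:01:00 GET /index.html 200",
    "2014-10-06 09:01:02 POST /login 302",
]

rows = list(parse_w3c(SAMPLE))
```

If the header changes partway through a file, the sketch picks up the new field list from that point on, which is the main reason to read names from the header instead of hard-coding them.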
Multiple Timestamps
12-Sep-2014,09:01:00,12-Sep-2014,09:02:00,-4 INFO title="User Conference" msg="Splunk hosted user conference in Las Vegas."
12-Sep-2014,19:01:00,12-Sep-2014,19:02:00,-5 DEBUG title="User Conference" msg="Getting Data In, Correctly is a solid session."
datetime.xml
<datetime>
<define name="two_tz" extract="day, litmonth, year, hour, minute, second, zone">
<text><![CDATA[^(\d+)-(\w+)-(\d+),(\d+):(\d+):(\d+),(?:[^,]*,){2}([\w\-]*)]]></text>
</define>
<timePatterns>
<use name="two_tz"/>
</timePatterns>
<datePatterns>
<use name="two_tz"/>
</datePatterns>
</datetime>
props.conf
# USER CONF
[user_conf]
DATETIME_CONFIG = /etc/apps/splk_ps_user_conf_props/local/datetime.xml
* Do not set TIME_FORMAT
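To check which parts of the event the datetime.xml pattern captures, the regular expression can be run against the first sample event in Python. This sketch only exercises the regex; treating the final captured field as a whole-hour UTC offset is an assumption, and `parse_event_time` is a hypothetical helper.

```python
import re
from datetime import datetime, timedelta, timezone

# Pattern copied from the <text> element of datetime.xml on the slide.
TWO_TZ = re.compile(
    r"^(\d+)-(\w+)-(\d+),(\d+):(\d+):(\d+),(?:[^,]*,){2}([\w\-]*)"
)


def parse_event_time(event):
    """Use the first date/time pair plus the trailing zone field."""
    match = TWO_TZ.match(event)
    if match is None:
        return None
    day, month, year, hour, minute, second, zone = match.groups()
    naive = datetime.strptime(
        f"{day}-{month}-{year} {hour}:{minute}:{second}",
        "%d-%b-%Y %H:%M:%S",
    )
    # Assumption: the zone field is a whole-hour UTC offset such as -4.
    return naive.replace(tzinfo=timezone(timedelta(hours=int(zone))))


EVENT = (
    '12-Sep-2014,09:01:00,12-Sep-2014,09:02:00,-4 INFO '
    'title="User Conference" msg="Splunk hosted user conference in Las Vegas."'
)
parsed = parse_event_time(EVENT)
```

The non-capturing group `(?:[^,]*,){2}` is what skips the second date and time, so the event time comes from the first timestamp while the zone comes from the fifth comma-separated field.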
Database Connect
! Allows for indexing data from database sources directly
! Allows for adding metadata to events from database sources using lookups
Caveats
! Java required on Splunk server
! Search head pooling requires custom configuration to share the DB connection passwords. Not meant for data input sources
Database Connect Best Practices
! Normalize timestamps natively inside the SQL query
! Filter results down in the SQL query to reduce garbage in the Splunk index
! Repeated DBLookups should be converted to static lookups
! Search head pooling requires encrypted password replication
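The first two best practices can be illustrated with a small query that formats the timestamp and filters unwanted rows before anything reaches the index. The sketch uses Python's built-in `sqlite3` module purely so that it runs anywhere; the `orders` table, its columns, and the filter condition are hypothetical, and date functions differ between database engines.

```python
import sqlite3

# Hypothetical source table with epoch timestamps and some test rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER, created_epoch INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 1410512460, "shipped"),
        (2, 1410512520, "test"),
        (3, 1410548460, "shipped"),
    ],
)

# Normalize the timestamp and drop unwanted rows inside the SQL itself,
# so only clean, consistently formatted events are handed over.
QUERY = """
SELECT strftime('%Y-%m-%d %H:%M:%S', created_epoch, 'unixepoch') AS event_time,
       id,
       status
FROM orders
WHERE status <> 'test'
ORDER BY created_epoch
"""

rows = conn.execute(QUERY).fetchall()
```

With the timestamp already in `%Y-%m-%d %H:%M:%S` form, the same TIME_FORMAT used elsewhere in this deck would apply to the resulting events.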
Modular and Scripted Inputs
Benefits
! Almost any program that can output text can be used to index
! Modular inputs allow for configuration files and configuration settings inside Splunk
Differences
! Scripted inputs require configuration to be done in the script
! Modular inputs can be configured via deployed .conf files and accessed via REST API
! Scripted inputs are specific to the OS they are deployed on, whereas modular inputs can support multiple operating systems
Examples
vmstat, iostat, Checkpoint Opsec, Twitter, Stream, Amazon S3 Online storage and more
Scripted Inputs Example
! Shell script saved in /opt/splunk/bin/scripts/ OR in a specific app
! Allows you to execute any program on a Splunk forwarder and index its STDOUT data.
! Utilizing key-value pairs makes for easier searching.
Shell script calls local system binary programs and can provide configuration options.
Use inputs.conf to define INDEX, SOURCETYPE, and INTERVAL for the scripted input
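The slides describe a shell script; as a language-neutral illustration of the same idea, here is a minimal Python sketch of a scripted input that writes one timestamped line of key-value pairs to STDOUT. The metric names and the `format_event` helper are hypothetical, and the index, sourcetype, and interval would still be set in inputs.conf rather than in the script.

```python
import time


def format_event(fields, now=None):
    """Render one event as a timestamp followed by key="value" pairs."""
    # Timestamp layout matches the %Y-%m-%d %H:%M:%S format used in props.
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now))
    pairs = " ".join(f'{key}="{value}"' for key, value in fields.items())
    return f"{stamp} {pairs}"


if __name__ == "__main__":
    # Hypothetical metrics; a real script would collect these from the
    # local system, for example by calling vmstat or iostat.
    print(format_event({"cpu_pct": 12.5, "mem_free_mb": 2048}))
```

Emitting `key="value"` pairs means the fields are typically searchable by name without writing custom extractions, which is the point the slide makes about easier searching.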
Production Deployment
Production Environment
! Complexity of managing configurations across tens, hundreds, or thousands of SHP forwarders
! Not all indexers and search heads receive the same configurations
! Should think about version control for deployment apps, e.g., GitHub
Deployment Server Terminology
! Deployment Server - A Splunk instance that acts as a centralized configuration manager, grouping together and collectively managing any number of Splunk instances. Any Splunk instance can act as a deployment server, even one that is indexing data locally. Splunk instances that are remotely configured by deployment servers are called deployment clients.
! Deployment Client - A Splunk instance that is remotely configured by a deployment server.
! Server Class - Represents a configuration of Splunk deployment clients. Server classes enable the management of a group of deployment clients as a single unit. A server class can be used to group deployment clients together by application, OS, data type to be indexed, or any other feature of your Splunk deployment.
Deployment App
! A deployment app (configuration bundle) is a set of deployment content (including configuration files) deployed as a unit to clients of a server class
! Located in $SPLUNK_HOME/etc/deployment-apps and pushed to the deployment client's $SPLUNK_HOME/etc/apps folder
! DO NOT store configurations in $SPLUNK_HOME/etc/system/local
! Use deployment apps regardless of your deployment tool
Deployment App - Naming Convention
splk_ps_user_conf_inputs
Deployment Apps
mba13:apps $ ls -la
! SplunkForwarder
! SplunkLightForwarder
! Splunk_for_ActiveDirectory
! Splunk_for_Exchange
! splk_all_deploymentclient
! splk_all_forwarder_outputs
! splk_all_indexer_base
! splk_all_search_base
! splk_ps_user_conf_inputs
! splk_ps_user_conf_props
! splk_ps_user_conf_web
! splunk_app_was
! user-prefs
Collecting Syslog
! Send device logs, e.g., from routers and firewalls, to a syslog collector
! Write files to this directory structure: /sourcetype/host/log.txt
! Monitor the sourcetype level
cisco_asa
my.firewall.name
# CISCO ASA
[monitor:///data/cisco_asa//]
sourcetype = cisco_asa
host_segment = 3
index = firewall
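The host_segment setting picks the host name out of the monitored file's path by position. The following Python sketch shows the counting for the directory layout described above; `host_from_path` is a hypothetical helper, and the full path is assembled from the slide's example names.

```python
def host_from_path(path, host_segment):
    """Return the Nth path segment (1-based), mimicking host_segment."""
    segments = [part for part in path.split("/") if part]
    if not 1 <= host_segment <= len(segments):
        return None
    return segments[host_segment - 1]


# Layout from the slide: /data/<sourcetype>/<host>/log.txt
PATH = "/data/cisco_asa/my.firewall.name/log.txt"
host = host_from_path(PATH, 3)
```

Counting from the root, `data` is segment 1, `cisco_asa` is segment 2, and the host directory is segment 3, which is why the stanza uses host_segment = 3.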
Summary
! Test in a non-production environment
! Always use key props parameters: TIME_PREFIX, TIME_FORMAT, MAX_TIMESTAMP_LOOKAHEAD, SHOULD_LINEMERGE, LINE_BREAKER, TRUNCATE
! Deploy apps to /etc/apps; not /etc/system/local
! Clear predictable naming convention
! When you're stuck, use Answers and re-use apps from Apps.Splunk.com
Resources
! Get educated: http://www.splunk.com/view/education/SP-CAAAAH9
! Download Splunk applications: http://apps.splunk.com/
! Hire Splunk Professional Services: http://www.splunk.com/view/professional-services/SP-CAAABH9
! Watch some videos: http://www.splunk.com/videos
THANK YOU