How to Create a Splunk KV Store State Table or Lookup in 10 Simple Steps

As of Splunk 6.2, there is a Key-Value (KV) store baked into the Splunk Search Head. The KV store uses MongoDB under the covers and, among other things, can be used for lookups and state tables. Better yet, unlike regular Splunk CSV lookups, you can update individual rows in the lookup without rebuilding the entire lookup, which is pretty cool! In this article, we will show you a quick way to use the KV store as a lookup or state table. State tables are extremely useful from an operational or security perspective for keeping track of the last time something occurred. For example, perhaps you want to keep track of user logins or authentications.

For this exercise, we are going to track the last time we saw each Windows Event ID in our Windows Event logs and update any rows in the lookup on a daily basis if a newer event has been detected. To do this, we are going to create a regular CSV lookup, then convert it to a KV store lookup – mainly because it is much simpler to do this and it can *mostly* be performed from the Splunk Search Head GUI.

1. The first step in the process is to think of a name for your lookup. In this case, we are going to use the lookup to track Windows events, so we are going to call the lookup last_windows_events. In $SPLUNK_HOME/etc/system/local (or similar), create a new conf file called collections.conf. In this file, enter the name of your lookup as a stanza, which defines the KV store collection, as follows:

[last_windows_events]

Then save the file and restart Splunk. This is the only part of this exercise where you will be touching back-end conf files. Everything else can be done via the GUI.

2. Now we are going to create a regular lookup that records the last time each Windows event was seen, by LogName, EventCode and Type, using a search like this:

index=main host=* host="DISCOVERED-INTELLIGENCE" sourcetype="WinEventLog*" earliest=-24h 
| stats latest(_time) as LastEvent by LogName EventCode Type
| convert ctime(LastEvent)
| eval LogName=lower(LogName) | eval Type=lower(Type) 
| outputlookup last_windows_events.csv

This search creates our old-school regular CSV lookup. It searches over the past 24 hours and records every Windows event code seen in that period, along with the last time we saw each one.

Note: there is currently an issue with case and KV Store lookups in Splunk, which is why we have converted the text fields to lowercase before writing to the lookup. I will remove this point once the issue is remediated.
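For readers less familiar with SPL, the stats and eval portion of the search above can be sketched in Python. This is only an illustration and not how Splunk implements it: the function name latest_by_event and the sample events are hypothetical, timestamps are kept as epoch seconds, and the convert ctime formatting step is left out.

```python
def latest_by_event(events):
    """Return one row per (LogName, EventCode, Type) with the newest _time seen."""
    latest = {}
    for event in events:
        # Group on the raw field values, as the stats ... by clause does.
        group = (event["LogName"], event["EventCode"], event["Type"])
        # Keep the largest timestamp, which is what latest(_time) amounts to here.
        if group not in latest or event["_time"] > latest[group]:
            latest[group] = event["_time"]
    return [
        {
            # Lowercase the text fields afterwards, mirroring the eval lower() calls.
            "LogName": log_name.lower(),
            "EventCode": code,
            "Type": event_type.lower(),
            "LastEvent": ts,
        }
        for (log_name, code, event_type), ts in sorted(latest.items())
    ]


# Hypothetical sample events (epoch seconds), for illustration only.
sample_events = [
    {"LogName": "Security", "EventCode": 4624, "Type": "Information", "_time": 1420070400},
    {"LogName": "Security", "EventCode": 4624, "Type": "Information", "_time": 1420074000},
    {"LogName": "System", "EventCode": 7036, "Type": "Information", "_time": 1420071000},
]

rows = latest_by_event(sample_events)
```

Each row in rows corresponds to one line of the CSV lookup that outputlookup writes.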

3. Next, let's create the lookup in the GUI. Navigate to Settings –> Lookups –> Lookup definitions. Then click the New button and enter the details, naming the definition last_windows_events (the name used by the searches in the following steps) and selecting the File-based lookup type with the last_windows_events.csv file, as in the diagram below. Hit Save when done. Now, we could select a type of KV Store from the drop-down list, but we are taking the lazy approach and do not want to enter field names etc.

[Figure: Create Lookup in GUI]

4. Your regular lookup is now created and you can test it with the following Splunk search:

| inputlookup last_windows_events

You should see the results being displayed.

5. Now, let's go back to the Lookup definitions screen and edit the lookup we just created. Change the Type to KV Store. You will now see that all the field names in the lookup are automatically populated for you, which is a bonus! For the Collection Name, enter exactly the same name you entered into collections.conf in step 1. In addition, KV Store lookups in Splunk come with a hidden field called _key, which is a unique identifier for each row in the lookup. We are going to use this field to identify which rows we want to update in future runs of our search. We need to add this as a field, as in the illustration below. Hit Save when this step is done.

[Figure: lookup_creation2]

6. Your new KV Store lookup is now created and you can test it with the same Splunk search as before:

| inputlookup last_windows_events

Yes, the results are identical, but trust me – you are now using the KV store for your lookup. You could even delete the old CSV file at this point if you wanted as we are now done with it.

7. Ok, so we now have our shiny new KV lookup working. Let’s now amend our search so it updates only the rows that need updating, save it and schedule it to run daily. To do this, we are actually going to use this newly created lookup in the search to enrich the data with the hidden _key field, then we are going to update the lookup where the _key fields match. Got that? That’s ok if you didn’t – let’s continue and you will probably get it. Modify the search in step 2 as follows:

index=main host=* host="DISCOVERED-INTELLIGENCE" sourcetype="WinEventLog*" earliest=-24h 
| stats latest(_time) as LastEvent by LogName EventCode Type
| convert ctime(LastEvent)
| eval LogName=lower(LogName) | eval Type=lower(Type) 
| lookup last_windows_events EventCode, LogName, Type OUTPUTNEW _key AS viewKey

If you run this search now, you will see a unique key is placed in any rows that have a matching entry in the lookup we created. Of course the LastEvent date will likely be different, but this is what we are going to update in the lookup/state table.
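The enrichment performed by the lookup command in this step can be pictured with a small Python sketch. It is a simplified model only: the function name add_view_key, the sample rows and the key values are hypothetical, and real _key values are generated by Splunk.

```python
def add_view_key(new_rows, kv_rows):
    """Copy the _key of the matching KV store row into each new row as viewKey."""
    match_fields = ("EventCode", "LogName", "Type")
    # Index the existing rows by the three fields named in the lookup command.
    key_index = {tuple(row[f] for f in match_fields): row["_key"] for row in kv_rows}
    enriched = []
    for row in new_rows:
        out = dict(row)
        key = key_index.get(tuple(row[f] for f in match_fields))
        if key is not None:
            out["viewKey"] = key  # rows with no match are left without a viewKey
        enriched.append(out)
    return enriched


# Hypothetical contents of the KV store lookup.
kv_rows = [
    {"_key": "key-a", "EventCode": 4624, "LogName": "security",
     "Type": "information", "LastEvent": "01/01/2015 00:00:00"},
]

# Hypothetical output of the stats portion of the search.
new_rows = [
    {"EventCode": 4624, "LogName": "security", "Type": "information",
     "LastEvent": "01/02/2015 08:30:00"},
    {"EventCode": 7036, "LogName": "system", "Type": "information",
     "LastEvent": "01/02/2015 09:00:00"},
]

enriched = add_view_key(new_rows, kv_rows)
```

Here the first row picks up the key of its existing entry, while the second row, which has no entry in the lookup yet, has no key.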

8. Let's modify the search further to finish it up.

index=main host=* host="DISCOVERED-INTELLIGENCE" sourcetype="WinEventLog*" earliest=-24h 
| stats latest(_time) as LastEvent by LogName EventCode Type
| convert ctime(LastEvent)
| eval LogName=lower(LogName) | eval Type=lower(Type) 
| lookup last_windows_events EventCode, LogName, Type OUTPUTNEW _key AS _key
| outputlookup last_windows_events append=t

Note that we are now outputting the key field from the lookup as the hidden field _key and not viewKey. Additionally, the last line, containing outputlookup, writes the results out to the KV store. Any rows in the existing store whose _key field matches the new data being written will be overwritten/updated. Any new data that does not have an existing entry in the KV Store table will be appended to the table and assigned a unique _key field.
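The update-or-append behaviour described above can be modelled in a few lines of Python. This is a rough sketch of the semantics only, not Splunk's implementation: the upsert function and the sample data are hypothetical, the store is an ordinary dict keyed by _key, and uuid4 merely stands in for however Splunk generates new keys.

```python
import uuid


def upsert(store, rows):
    """Write rows into store (a dict keyed by _key), updating or appending."""
    for row in rows:
        data = {k: v for k, v in row.items() if k != "_key"}
        # A row carrying a _key replaces the record stored under that key;
        # a row without one gets a fresh key and is appended.
        # uuid4 is an assumed stand-in for Splunk's own key generation.
        key = row.get("_key") or uuid.uuid4().hex
        store[key] = data
    return store


# Hypothetical existing state table with a single entry.
store = {
    "key-a": {"EventCode": 4624, "LogName": "security",
              "Type": "information", "LastEvent": "01/01/2015 00:00:00"},
}

# Hypothetical search results: one row matched an existing key, one did not.
results = [
    {"_key": "key-a", "EventCode": 4624, "LogName": "security",
     "Type": "information", "LastEvent": "01/02/2015 08:30:00"},
    {"EventCode": 7036, "LogName": "system",
     "Type": "information", "LastEvent": "01/02/2015 09:00:00"},
]

upsert(store, results)
```

After the call, the entry under key-a holds the newer LastEvent value and the store contains a second entry for the event that was not previously tracked.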

9. Your new KV Store lookup is now updated and you can test it with the same Splunk search as before:

| inputlookup last_windows_events

You should see that the LastEvent date/time has been updated with the latest event date for that particular Windows EventCode, essentially updating its state.

10. Save the search and schedule it to run every 24 hours. Every time the search now runs, the LastEvent field will be updated and any new Windows EventCodes added, together with their respective LastEvent date.

Points to Note

  • The above example is for illustrative purposes; you may well want to increase the frequency of the search.
  • Currently the KV store resides on the Search Head only, which means that the lookups are not passed down to the Indexers. This is a limitation, although it is expected to change in a future version of Splunk. In practice, all data is brought back to the Search Head before a lookup is applied to it. This is unlikely to be much of an issue if you are simply using the KV store as a state table, but if you are using it as a large-scale lookup, expect a performance hit compared with a regular CSV lookup that is pushed down to the indexers.
  • The Splunk DMC (Distributed Management Console) has some basic dashboarding insight into the KV stores that exist on your Search Head.


© Discovered Intelligence Inc., 2015. Unauthorised use and/or duplication of this material without express and written permission from this site’s owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Discovered Intelligence, with appropriate and specific direction (i.e. a linked URL) to this original content.