Good point.
I’d like that too by the way.
Hi Graham,
Splunk references “dynamic setting of URL arguments/HTTP header properties, i.e. a stream cursor” in https://splunkbase.splunk.com/app/1546/#/details. That seems like exactly what you’d need to do.
The API and CSV export work quite differently; generally the API should be very fast, while a CSV export may take some time.
Alex
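For reference, the “stream cursor” approach described above boils down to feeding a token from each response back into the next request’s URL arguments until no token is returned. A minimal, self-contained sketch with a stubbed fetch function standing in for the real HTTP endpoint (the page data and token names here are made up for illustration):

```python
# Cursor-based pagination in miniature: each response may carry an
# "offset" token that gets passed back as a URL argument on the next
# request. fetch_page() is a stand-in for a real HTTP GET.

PAGES = {
    None: {"records": [{"id": "rec1"}, {"id": "rec2"}], "offset": "tok1"},
    "tok1": {"records": [{"id": "rec3"}], "offset": "tok2"},
    "tok2": {"records": [{"id": "rec4"}]},  # last page: no "offset" key
}

def fetch_page(offset=None):
    """Pretend HTTP call; a real client would send offset as a query param."""
    return PAGES[offset]

def fetch_all_records():
    records, offset = [], None
    while True:
        page = fetch_page(offset)
        records.extend(page["records"])
        offset = page.get("offset")  # None once the cursor is exhausted
        if offset is None:
            return records

print([r["id"] for r in fetch_all_records()])
```

A real client would send the cursor as a query parameter (e.g. `requests.get(url, params={"offset": offset})`) rather than indexing a dict, but the loop shape is the same.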
I’m not a Python developer, but looking at their code examples (specifically the Twitter handler), this seems like it might get close:
https://raw.githubusercontent.com/damiendallimore/SplunkModularInputsPythonFramework/master/implementations/rest/bin/responsehandlers.py
class AirTableEventHandler:

    def __init__(self, **args):
        pass

    def __call__(self, response_object, raw_response_output, response_type, req_args, endpoint):
        if response_type == "json":
            output = json.loads(raw_response_output)
            for record in output["records"]:
                print_xml_stream(json.dumps(record))
            if "params" not in req_args:
                req_args["params"] = {}
            if output["offset"] is not None:
                req_args["params"]["offset"] = output["offset"]
            else:
                req_args["params"]["offset"] = None
        else:
            print_xml_stream(raw_response_output)
Hi Chester,
Many thanks for this, I’ve tried it on my test system and it’s 99% working.
It’s now pulling all of the records from my Table, which is great.
The last piece of the puzzle is that the last ‘offset’ now gets stored as one of the parameters for the next time the Splunk input runs (it runs on a polling interval).
This is slightly strange, as my understanding is that in the last iteration of pulling data from Airtable, ‘offset’ wouldn’t be in the response.
I wonder if there is a way to ‘default’ this parameter at the end of the process?
Thanks again though, that’s helped get me further along the path!
Graham.
Ah! I just edited my previous post to add the following two lines, which should “reset” the offset when it reaches the end. I’m not sure whether that means Splunk will re-import the same records, or whether it handles de-duplication on its own.
            else:
                req_args["params"]["offset"] = None
Hi Chester,
Thanks again for taking a look at this. I gave it a go, but it still wasn’t behaving quite right.
But in the meantime, on the Splunk Answers Community site, I had a response and answer from Damien Dallimore, who wrote the Splunk REST API Modular Input.
So he needs to take all the credit for the one below:
class AirTableEventHandler:

    def __init__(self, **args):
        pass

    # Note: assumes json and requests are imported at the top of responsehandlers.py
    def __call__(self, response_object, raw_response_output, response_type, req_args, endpoint):
        if response_type == "json":
            output = json.loads(raw_response_output)
            # Print results from the first response
            for record in output["records"]:
                print_xml_stream(json.dumps(record))
            # get() avoids a KeyError if the first response is also the last page
            offset = output.get("offset")
            # Create the pagination loop
            while offset is not None:
                # Construct and get the next URL
                next_url = response_object.url + '&offset=' + offset
                next_response = requests.get(next_url)
                output = json.loads(next_response.text)
                # Print out results from pagination looping
                for record in output["records"]:
                    print_xml_stream(json.dumps(record))
                # There is no "offset" value in the last JSON response.
                # Need to check for its existence, otherwise get a KeyError.
                # If it doesn't exist, set to None and the while loop will exit.
                if "offset" in output:
                    offset = output["offset"]
                else:
                    offset = None
        else:
            print_xml_stream(raw_response_output)
I’ve included it as it may help others who have a similar pagination query and know a bit about Python.