REST API Modular Input: How to implement a custom response handler to deal with pagination limits pulling from AirTable REST API?


#1

Hello All,

I’ve been trying to pull together a proof of concept using Splunk and a product called Airtable

They have a RESTful API that allows records to be accessed, which is great for use with the REST API Modular Input.

The API structure is:

https://api.airtable.com/v0/[Base]/[Table]

I’m 90% there, but I’m struggling with a ‘pagination’ problem.

The Airtable REST API allows you to pull all of the records in a table, but limits results to a maximum of 100. If there are more results than this, then the response includes an ‘offset’ field in the JSON response. You then need to issue the request again, with this as a parameter.

I know very little about JSON and even less about Python and Splunk Tool, so I don’t have the first clue about how to implement a Custom Response Handler to deal with this.

So far, thanks to previous questions and answers, I’ve got this far with my Response Handler:

class AirTableEventHandler:
 
     def __init__(self,**args):
         pass
 
     def __call__(self, response_object,raw_response_output,response_type,req_args,endpoint):
         if response_type == "json":
             output = json.loads(raw_response_output)
             for record in output["records"]:
                 print_xml_stream(json.dumps(record))
         else:
             print_xml_stream(raw_response_output)

This works for breaking up the results into Events, but doesn’t begin to deal with the pagination question.

I’ve truncated the output to 4 records, but this is the response that you’d actually get back when there are more than 100 records:

{"records":[{"id":"rec4rlLs4jDeqhFRR","fields":{"Outlet":"20-001","Location":["rec5lh6NgFH33y6ax"],"Rack":["recUOCOlAycENRdo5"],"Type":"Wall Socket","Line checked":true,"Found":true,"On-Drawing":true,"Location_Static":"AP.1.004","Rack_Static":"Rack 20","Location Name":["Physiotherapy Area (Male)"]},"createdTime":"2016-08-22T08:27:05.000Z"},{"id":"recrFjR0pyu2RIjp6","fields":{"Outlet":"20-002","Location":["rec5lh6NgFH33y6ax"],"Rack":["recUOCOlAycENRdo5"],"Type":"Wall Socket","Line checked":true,"Found":true,"On-Drawing":true,"Location_Static":"AP.1.004","Rack_Static":"Rack 20","Location Name":["Physiotherapy Area (Male)"]},"createdTime":"2016-08-22T08:27:05.000Z"},{"id":"recQxL1aNe7wdpM4g","fields":{"Outlet":"20-003","Location":["rec5lh6NgFH33y6ax"],"Rack":["recUOCOlAycENRdo5"],"Type":"High Socket","Line checked":true,"Found":true,"On-Drawing":true,"Location_Static":"AP.1.004","Rack_Static":"Rack 20","Location Name":["Physiotherapy Area (Male)"]},"createdTime":"2016-08-22T08:27:05.000Z"},{"id":"recLK0w2w1HTF1Jwu","fields":{"Outlet":"20-004","Location":["rec5lh6NgFH33y6ax"],"Rack":["recUOCOlAycENRdo5"],"Type":"High Socket","Found":true,"On-Drawing":true,"Location_Static":"AP.1.004","Rack_Static":"Rack 20","Location Name":["Physiotherapy Area (Male)"]},"createdTime":"2016-08-22T08:27:05.000Z"}],"offset":"itrATjKH4HwrwifHj/rectLCyqqrnhD0jr9"}

The crucial bit here being the:

 "offset":"itrBTjKH4HwrwifHj/rectLCyqqrnhD0jr9"
This changes with each 100 record response.

So to get the next 100 results, I’d need to do something like:

 https://api.airtable.com/v0/[Base]/[Table]?offset=itrBTjKH4HwrwifHj/rectLCyqqrnhD0jr9
(And then the rest of the "URL Arguments" as defined in the Modular Input)

Would anyone be able to suggest a Custom Response Handler that could iterate over this offset key, to capture all of the results?

In general, Airtable looks to be a very flexible DB solution for some SMB use-cases, so would be great if I could get this working with Splunk.

Thanks in advance,