Nimble Web API
This job triggers an asynchronous Web API scraping job. It uses this endpoint:
For each Web API scraping job you wish to create, you must create a separate Orchestra task. This way, you can use Orchestra to trigger your data ingestion on a manual or event-based schedule. This has a number of advantages:
You can coordinate tasks outside of Nimble. Once the Nimble scraping jobs are completed, you can run any data transformation or cleaning jobs.
When Nimble web scraping jobs run, Nimble credits are consumed. Running these operations on a schedule you set explicitly ensures these costs do not get out of hand.
We aggregate metadata from the Nimble task in the same place as the metadata from other operations in your Pipeline.
With Nimble you can provide a URL that Nimble will scrape using its AI scraper. This will extract the relevant information from the web page and return it to you in a structured format.
As Orchestra cannot accept the data directly from Nimble, you will need to configure a delivery method in Nimble to store the data. This can be an AWS S3 bucket or a GCP storage bucket (see below for more information on delivery methods).
Orchestra uses the asynchronous task flow to make a Web API request to Nimble.
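As a rough illustration of what an asynchronous task flow can look like, the sketch below submits a job, receives an identifier, and then polls until the job reaches a terminal state. It is not Orchestra's or Nimble's actual implementation: the `submit` and `get_status` callables, the state names, and the polling interval are all hypothetical and are injected so that no real endpoint is assumed.

```python
import time
from typing import Callable

# Hypothetical terminal state names; the real names are not given in this page.
TERMINAL_STATES = {"success", "failed"}


def run_async_task(
    submit: Callable[[], str],
    get_status: Callable[[str], str],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Submit a job, then poll its status until it finishes or we give up."""
    task_id = submit()  # the submit call returns immediately with an identifier
    for _ in range(max_polls):
        status = get_status(task_id)
        if status in TERMINAL_STATES:
            return status
        sleep(poll_seconds)  # wait before asking again
    raise TimeoutError(f"Task {task_id} did not finish after {max_polls} polls")
```

Because the scraped data is written to the configured delivery method rather than returned to the caller, a flow like this only needs to track the job status, not the payload.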
Before using Orchestra to trigger a Nimble task, we recommend building the request in the Nimble playground. This lets you test the request and confirm it works as expected. The playground can be found .
Orchestra supports the following delivery methods for Nimble:
AWS S3 bucket
GCP storage
These parameters are required to run the Nimble Web API task.
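| Name | Data type | Restrictions | Example |
| --- | --- | --- | --- |
| Request URL | String | URL | https://www.example.com |
| Storage type | Enum | AWS - S3 or GCP - Google Storage | AWS - S3 |
| Storage URL | String | n/a | s3://bucket/prefix |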
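To make the three required parameters concrete, here is a minimal sketch that validates them before a task is created. The function name, the output field names (`url`, `storage_type`, `storage_url`), the short storage identifiers, and the rule that the Storage URL scheme must match the Storage type are all assumptions made for illustration; check them against your own request from the Nimble playground.

```python
from urllib.parse import urlparse

# Assumed mapping from the Orchestra "Storage type" options to URL schemes.
STORAGE_SCHEMES = {
    "AWS - S3": "s3",
    "GCP - Google Storage": "gs",
}


def build_task_parameters(request_url: str, storage_type: str, storage_url: str) -> dict:
    """Validate the required parameters and return them as a dictionary."""
    parsed = urlparse(request_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Request URL must be an http(s) URL: {request_url!r}")

    if storage_type not in STORAGE_SCHEMES:
        raise ValueError(f"Storage type must be one of {sorted(STORAGE_SCHEMES)}")

    # Assumption: the Storage URL scheme should agree with the Storage type.
    expected_scheme = STORAGE_SCHEMES[storage_type]
    if urlparse(storage_url).scheme != expected_scheme:
        raise ValueError(
            f"Storage URL {storage_url!r} should start with {expected_scheme}://"
        )

    return {
        "url": request_url,
        "storage_type": expected_scheme,
        "storage_url": storage_url,
    }
```

For example, `build_task_parameters("https://www.example.com", "AWS - S3", "s3://bucket/prefix")` passes validation, while pairing `AWS - S3` with a `gs://` Storage URL raises a `ValueError`.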
If we receive the following error codes from Nimble, we'll raise an error and the task will move to a failed state.
| Code | Description | Handling |
| --- | --- | --- |
| 401 | Unauthorised | We will raise an error and parse the raw error message from the Nimble response as the Orchestra message |
| Other error code | | We will raise an error with the HTTP Reason as the Orchestra message |
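The error handling described above can be sketched as a small function. This is an illustration only, not Orchestra's code: the `NimbleTaskError` class, the function name, and the idea that the response body may be JSON with a `message` key are assumptions. If the body is not in that shape, the sketch falls back to the raw text.

```python
import json


class NimbleTaskError(Exception):
    """Hypothetical exception that carries the message shown for a failed task."""


def raise_for_nimble_status(status_code: int, reason: str, body: str) -> None:
    """Raise NimbleTaskError for error responses; do nothing for success codes."""
    if status_code < 400:
        return

    if status_code == 401:
        # Use the raw error message from the response body.
        message = body
        try:
            parsed = json.loads(body)
            # Assumption: a JSON body may carry the text under a "message" key.
            if isinstance(parsed, dict) and "message" in parsed:
                message = str(parsed["message"])
        except json.JSONDecodeError:
            pass  # not JSON, keep the raw body text
        raise NimbleTaskError(message)

    # Any other error code: use the HTTP reason phrase as the message.
    raise NimbleTaskError(reason)
```

Keeping the 401 case separate means a credentials problem surfaces with the provider's own wording, while every other failure is reported with the standard HTTP reason phrase.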
You must configure a delivery method before you can use Nimble in Orchestra. Instructions on how to configure a delivery method can be found . Be sure to configure your storage with the correct permissions. Instructions for this can be found .