Batch Processor Documentation

    Purpose

    Batch processing provides a way to run a file of many address records through the LightBox geocoder in a single job. It is significantly more efficient than sending individual geocoding requests because it greatly reduces the network latency that each separate request incurs. We recommend batch processing for any workload other than interactive geocoding or autocomplete. Multiple files can be uploaded and processed simultaneously, and each file (job) can use its own configuration of the data attributes to retrieve or share a common one. Batch jobs are secure: each job has a private job identifier and is accessible only to the user who created it. Jobs can be started, canceled, and checked for their current status, including an estimated time to complete.

    Note: 
    Batch Processes currently only accept comma-delimited files.

    Features

    • Obtain a secure upload and download link for the CSV files 

    • A secure job token, used to monitor the job with secure access to the job’s files

    • The ability to start or cancel a job

    • Supports very large files > 5GB  

    • Flexible field output

    • Job processing status with estimated time to complete

    The process involves:

    • Obtain a unique, secure upload URL and a job token for secure cloud storage

    • Upload the CSV file using the upload URL tied to that job token

    • Start the geocoding process

    • Use the job token to monitor the job

    • Use the job token to retrieve a secure download link to the completed CSV file

    • Download the CSV file
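
    As a rough orientation, the steps above map onto the endpoints described later in this article. The sketch below only lists the calls in order and sends nothing. The base URL and paths come from this article; the use of HTTP PUT for the signed upload URL is an assumption, and the angle-bracket strings are placeholders for the pre-signed URLs the API returns.

```python
from typing import List, Tuple

BASE_URL = "https://api.lightboxre.com/v1"


def batch_workflow(token: str) -> List[Tuple[str, str]]:
    """Return the (HTTP method, URL) pairs for one batch job, in call order.

    The "<signed ... URL>" entries are placeholders for the pre-signed URLs
    returned by the API. Using PUT for the upload is an assumption.
    """
    return [
        ("POST", f"{BASE_URL}/batch/files/upload"),           # get upload URL and job token
        ("PUT", "<signed upload URL>"),                       # upload the CSV file (assumed PUT)
        ("POST", f"{BASE_URL}/batch/process/{token}"),        # start the geocoding process
        ("GET", f"{BASE_URL}/batch/process/{token}"),         # monitor the job
        ("GET", f"{BASE_URL}/batch/files/download/{token}"),  # get the download link
        ("GET", "<signed download URL>"),                     # download the completed CSV file
    ]


for method, url in batch_workflow("3b4ca555-4d64-46a3-b08b-2d088682f78c"):
    print(method, url)
```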

    Requirements

    The LightBox APIs are hosted in the cloud and therefore have no platform requirements. Application requirements include:
    • A network connection to the LightBox API server
    • Ability to parse JavaScript Object Notation (JSON) API responses
    • Secure HTTPS connection
    • LightBox authentication key

    Connecting your account

    When your LightBox user account is created, a unique API key is also generated. The API key should be kept secret at all times and can only be used for API requests. The key is required in all API calls.

    To retrieve your unique API key:

    • Log in to the LightBox Developer Portal 
    • Select Apps from the menu bar
    • In your approved App, note your API key (under Consumer Key)

    Performing API requests

    All API requests must be made over secure HTTPS connections. Requests made over HTTP will fail.

    The base URL for all API requests is https://api.lightboxre.com/ followed by a version number, for example: https://api.lightboxre.com/v1

    Authentication

    LightBox APIs use token-based authentication, and all requests must be authenticated. Pass the token in an HTTP header with the key 'x-api-key' and the value <Your authentication token>.

    Pass your unique API key in the x-api-key header of every LightBox API call. LightBox uses this information to authenticate your identity and determine whether you have sufficient permissions to complete the operation.

    curl -X GET -H 'x-api-key: (api_key)' https://api.lightboxre.com/
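
    The same header can be attached programmatically. The following is a minimal sketch using only the Python standard library; the helper name build_request and the "YOUR_API_KEY" value are placeholders, not part of the LightBox API. The request is built but not sent.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://api.lightboxre.com/v1"


def build_request(method: str, path: str, api_key: str,
                  body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) an HTTPS request that carries the x-api-key header."""
    headers = {"x-api-key": api_key}
    data = None
    if body is not None:
        # JSON bodies are sent with the Content-Type used in the curl examples.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)


req = build_request("POST", "/batch/files/upload", "YOUR_API_KEY",
                    {"fileName": "file.csv"})
# urllib.request.urlopen(req) would send the request; it is omitted here.
```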

    API Requests

    File Upload < 5GB

    Use this endpoint to retrieve an upload URL for uploading your CSV file to a secure location.

    POST /batch/files/upload

    Example requests

    curl -d '{"fileName": "file.csv"}' -H "Content-Type: application/json" -H "x-api-key: (api_key)" -X POST https://api.lightboxre.com/v1/batch/files/upload

    Request Body

    {
      "fileName": "file.csv"
    }
    Field | Description
    fileName | Name of your CSV file

    Response

    Media type: application/json

    {
      "$ref": "string",
      "signedUrl": {
        "expiry": "60 minutes",
        "token": "3b4ca2ae-4d64-46a3-b08b-2d088682f78c",
        "url": "https://foo.amazonaws.com/uat/files/5555555-4d64-46a3-b08b-2d088682f78c/filename.csv?X-Amz-Algorithm=AWS4-5555555SHA256&X-Amz-Credential=555555555PFKBAAUEOSV3%2F20240731%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240731T181736Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&x-id=PutObject&X-Amz-Signature=55555555555c8f7d61848317ba3a80863eb3d2310c74467edad295928"
      }
    }
    Field | Description
    $ref | URL reference back to this call
    signedUrl.expiry | Expiry of this signed URL. After the expiry time, the URL becomes invalid.
    signedUrl.token | Job token. This token is used to reference this process or job to get job status, clean up files, cancel a job, or retrieve a download link.
    signedUrl.url | Pre-signed URL
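
    A client typically needs two values from this response: the job token and the pre-signed URL. The sketch below extracts both and prepares the file upload without sending it. The function names are hypothetical, the sample URL is a placeholder, and the use of HTTP PUT is an assumption (suggested by the PutObject marker in the sample URL, but not stated in this article).

```python
import json
import urllib.request
from typing import Tuple


def parse_upload_response(raw: str) -> Tuple[str, str]:
    """Return (job token, signed upload URL) from the /batch/files/upload response."""
    signed = json.loads(raw)["signedUrl"]
    return signed["token"], signed["url"]


def build_upload_request(signed_url: str, csv_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) the CSV upload. PUT is an assumption."""
    return urllib.request.Request(signed_url, data=csv_bytes, method="PUT")


# Sample response shaped like the one above; the URL is a placeholder.
sample = json.dumps({
    "$ref": "string",
    "signedUrl": {
        "expiry": "60 minutes",
        "token": "3b4ca2ae-4d64-46a3-b08b-2d088682f78c",
        "url": "https://example.invalid/file.csv?signature=placeholder",
    },
})
token, url = parse_upload_response(sample)
upload = build_upload_request(url, b"address,city,state,zip\n")
```

    Keep the token: every later call (start, status, download, cleanup) is addressed by it.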

    File Upload > 5GB

    Use this endpoint to retrieve multipart upload URLs for CSV files larger than 5GB, so that they can be uploaded in parts to a secure location. The full set of parts is treated as a single batch processing job. Each file must be a well-formed CSV file.

    POST /batch/files/initiate-upload

    Example requests

    curl -d '{"fileName": "file.csv", "numberOfParts": 5}' -H "Content-Type: application/json" -H "x-api-key: (api_key)" -X POST https://api.lightboxre.com/v1/batch/files/initiate-upload

    Request Body

    {
      "fileName": "file.csv",
      "numberOfParts": 5
    }
    Field | Description
    fileName | Name of your CSV file
    numberOfParts | Number of parts that you want to upload. When your CSV file is larger than 5GB, it must be broken into multiple parts.

    Response

    Media type: application/json

    {
      "$ref": "string",
      "signedUrl": {
        "expiry": "60 minutes",
        "token": "3b4ca2ae-4d64-46a3-b08b-2d088682f78c",
        "urls": [
          "https://foo.amazonaws.com/uat/files/5555555-4d64-46a3-b08b-2d088682f78c/filename.csv?X-Amz-Algorithm=AWS4-5555555SHA256&X-Amz-Credential=555555555PFKBAAUEOSV3%2F20240731%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240731T181736Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&x-id=PutObject&X-Amz-Signature=55555555555c8f7d61848317ba3a80863eb3d2310c74467edad295928"
        ],
        "uploadId": "ef.KCNxfpJwo1ZUaDgUsN9W4aiYScrjmv1xNOYpSYaLmgh_UDguAo51N8FeNK4LS95NPZpEVoGtjaF0DMFS7u1goMFDkai3tDaCjUpsmBfRmSjRXSIXRwSB7jaxqHNef"
      }
    }
    Field | Description
    $ref | URL reference back to this call
    signedUrl.expiry | Expiry of this signed URL. After the expiry time, the URL becomes invalid.
    signedUrl.token | Job token. This token is used to reference this process or job to get job status, clean up files, cancel a job, or retrieve a download link.
    signedUrl.urls | Collection of pre-signed URLs, one for each part
    signedUrl.uploadId | Upload ID that identifies the group of parts to be uploaded. Pass it in the complete-upload call to tell the system that all parts of the upload are complete.
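
    How a file should be divided into parts is not specified here. The sketch below takes the conservative route of splitting on row boundaries so that no row is cut in half; that choice, the helper name split_csv_rows, and the tiny in-memory sample are all assumptions for illustration. A real file larger than 5GB would need to be streamed from disk, and rows containing quoted line breaks would need a CSV-aware splitter.

```python
from typing import List


def split_csv_rows(data: bytes, number_of_parts: int) -> List[bytes]:
    """Split CSV bytes into at most number_of_parts chunks that end on row boundaries."""
    if number_of_parts < 1:
        raise ValueError("number_of_parts must be at least 1")
    rows = data.splitlines(keepends=True)
    if not rows:
        return []
    per_part = -(-len(rows) // number_of_parts)  # ceiling division
    return [b"".join(rows[i:i + per_part]) for i in range(0, len(rows), per_part)]


data = b"address,city,state,zip\n1 A St,Town,NY,10001\n2 B St,Town,NY,10002\n"
parts = split_csv_rows(data, 2)

# Body for POST /batch/files/initiate-upload, sized to the parts actually produced.
request_body = {"fileName": "file.csv", "numberOfParts": len(parts)}
```

    Each element of parts would then be uploaded to the pre-signed URL at the same position in signedUrl.urls.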

    Complete Multi-part upload for files > 5GB

    This call is made once all the parts of a large file have been uploaded.

    POST /batch/files/complete-upload

    Example requests

    curl -d '{"parts": [{"partNumber": 1, "eTag": "555"}], "token": "555", "uploadID": "555"}' -H "Content-Type: application/json" -H "x-api-key: (api_key)" -X POST https://api.lightboxre.com/v1/batch/files/complete-upload

    Request Body

    {
      "parts": [
        {
          "partNumber": 2,
          "eTag": "KCNxfpJwo1ZUaDgUsN9W4aiYScrjmv1xNOYpSYa"
        }
      ],
      "token": "3b4ca555-4d64-46a3-b08b-2d088682f78c",
      "uploadID": "ef.555o1ZUaDgUsN9W4aiYScrjmv1xNOYpSYaLmgh_UDguAo51N8FeNK4LS95NPZpEVoGtjaF0DMFS7u1goMFDkai3tDaCjUpsmBfRmSjRXSIXRwSB7jaxqHNef"
    }
    Field | Description
    parts | Collection of file parts
    parts[0].partNumber | Denotes a part
    parts[0].eTag | eTag returned when you upload a part through its signed URL. This value must be sent back with the part.
    token | Job/Process token for this batch process
    uploadID | Upload ID provided by /batch/files/initiate-upload
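
    Assembling this body from the eTags collected during upload can be sketched as follows. The helper name is hypothetical, and numbering parts from 1 in upload order is an assumption based on the curl example above.

```python
from typing import Dict, List


def build_complete_upload_body(token: str, upload_id: str,
                               etags: List[str]) -> Dict[str, object]:
    """Build the /batch/files/complete-upload body; etags[i] belongs to part i + 1."""
    parts = [{"partNumber": number, "eTag": etag}
             for number, etag in enumerate(etags, start=1)]
    return {"parts": parts, "token": token, "uploadID": upload_id}


body = build_complete_upload_body(
    token="3b4ca555-4d64-46a3-b08b-2d088682f78c",
    upload_id="placeholder-upload-id",
    etags=["etag-of-part-1", "etag-of-part-2"],
)
```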

    Response

    HTTP Code 204, Successful, No Content returned

    Download URL 

    This endpoint will provide a download URL for the completed file. 

    GET /batch/files/download/{token}

    Example requests

    curl -X GET -H 'x-api-key: (api_key)' https://api.lightboxre.com/v1/batch/files/download/3b4ca555-4d64-46a3-b08b-2d088682f78c

    https://api.lightboxre.com/v1/batch/files/download/3b4ca555-4d64-46a3-b08b-2d088682f78c

    Parameters

    Parameter | Type | Description | Usage
    token | path | Job/Process token | required

    Response

    Media type: application/json

    {
      "$ref": "string",
      "signedUrl": {
        "expiry": "60 minutes",
        "token": "3b4ca2ae-4d64-46a3-b08b-2d088682f78c",
        "url": "https://foo.amazonaws.com/uat/files/5555555-4d64-46a3-b08b-2d088682f78c/filename.csv?X-Amz-Algorithm=AWS4-5555555SHA256&X-Amz-Credential=555555555PFKBAAUEOSV3%2F20240731%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240731T181736Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&x-id=PutObject&X-Amz-Signature=55555555555c8f7d61848317ba3a80863eb3d2310c74467edad295928"
      }
    }
    Field | Description
    $ref | URL reference back to this call
    signedUrl.expiry | Expiry of this signed URL. After the expiry time, the URL becomes invalid.
    signedUrl.token | Job token. This token is used to reference this process or job to get job status, clean up files, cancel a job, or retrieve a download link.
    signedUrl.url | Pre-signed URL

    Clean up files 

    This endpoint provides the means to remove data related to a job from the system. 

    Note:
    An automatic cleanup process removes all data related to a job or process after an expiry time.

    DELETE /batch/files/{token}

    Allows direct removal of the input and output files associated with the batch processing job.

    Example requests

    curl -X DELETE -H 'x-api-key: (api_key)' https://api.lightboxre.com/v1/batch/files/3b4ca555-4d64-46a3-b08b-2d088682f78c

    https://api.lightboxre.com/v1/batch/files/3b4ca555-4d64-46a3-b08b-2d088682f78c

    Parameters

    Parameter | Type | Description | Usage
    token | path | Job/Process token | required

    Response

    HTTP Code 202, Accepted, No Content returned. This status is returned if the job is currently running; the system first cancels the job, then removes the files.

    HTTP Code 204, Successful, No Content returned

    Start a Job 

    Once the data has been uploaded and you have a job token, this endpoint will add the job to the queue for processing. 

    POST /batch/process/{token}

    Example requests

    curl -d '{see below}' -H "Content-Type: application/json" -H "x-api-key: (api_key)" -X POST https://api.lightboxre.com/v1/batch/process/{token}

    Parameters

    Parameter | Type | Description | Usage
    token | path | Job/Process token | required

    Request Body

    {
      "apis": [
        {
          "name": "geocoding-engine",
          "url": "/v1/search?text={1}+{2}+{3}+{4}",
          "outputColumns": [
            {
              "source": "features.0.geometry.coordinates.0",
              "target": "latitude"
            }
          ]
        }
      ]
    }
    Field | Description
    apis | Collection of APIs to be used in the process. Currently, a single API is supported.
    apis[0].name | Name of the API to be used. This is not used by the processor but allows you to name the API in your JSON object for clarity.
    apis[0].url | URL of this API. Note the {1}, {2}, etc.: these are references to the CSV columns that hold the address parts. In this example, column 1 holds the address, column 2 the city, column 3 the state, and column 4 the zip code.
    apis[0].outputColumns | Collection of columns mapped from the geocode response to your output file.
    apis[0].outputColumns[0].source | Source column, taken from the geocode response. In the example above, features.0 refers to the features array, with 0 as the index into that array.
    apis[0].outputColumns[0].target | Target field name. This field is added to the output CSV file with this name and the value from the source mapping.
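
    A request body of this shape can be generated from a list of address columns and a source-to-target mapping. The following is a sketch only: the helper name build_job_description is hypothetical, and the two mappings are taken from the mapping options table in this article.

```python
import json
from typing import Dict, List


def build_job_description(address_columns: List[int],
                          output_columns: Dict[str, str]) -> dict:
    """Build the body for POST /batch/process/{token}.

    address_columns: CSV column numbers that hold the address parts, in order.
    output_columns:  geocode response path -> name of the output CSV column.
    """
    text = "+".join("{%d}" % column for column in address_columns)
    return {
        "apis": [{
            "name": "geocoding-engine",
            "url": "/v1/search?text=" + text,
            "outputColumns": [
                {"source": source, "target": target}
                for source, target in output_columns.items()
            ],
        }]
    }


body = build_job_description(
    [1, 2, 3, 4],
    {
        "features.0.properties.label": "AddressLabel",
        "features.0.properties.score": "GeocodeScore",
    },
)
print(json.dumps(body, indent=2))
```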

    Response

    Media type: application/json

    {
      "$ref": "string",
      "status": {
        "fileName": "file.csv",
        "jobDescription": {
          "apis": [
            {
              "name": "geocoding-engine",
              "url": "/v1/search?text={1}+{2}+{3}+{4}",
              "outputColumns": [
                {
                  "source": "features.0.geometry.coordinates.0",
                  "target": "latitude"
                }
              ]
            }
          ],
          "rowCount": 5003
        },
        "performance": {
          "averageSpeed": 3.6,
          "currentSpeed": 3.6,
          "etc": "01 23:59:59",
          "processedPercentage": 0.5,
          "processedRecords": 3456,
          "startedAt": "2024-07-24T10:23:54.430884195-04:00",
          "updatedAt": "2024-07-24T10:23:54.430884195-04:00",
          "totalRecords": 3456
        },
        "status": "STARTING"
      }
    }
    Field | Description
    $ref | URL reference back to this call
    status.fileName | Name of the input file
    status.jobDescription | Description object that was passed to POST /batch/process/{token}
    status.performance | Performance metrics for the current process. These indicate how fast the process is running and when it might complete.
    status.status | Current status of the process. Possible values are:
    • STARTING
    • UPLOADING
    • PROCESSING
    • CANCELLED
    • COMPLETED

    Return status on a Job/Process 

    This endpoint will provide status on a job or process. 

    GET /batch/process/{token}

    Example requests

    curl -X GET -H 'x-api-key: (api_key)' https://api.lightboxre.com/v1/batch/process/3b4ca555-4d64-46a3-b08b-2d088682f78c

    https://api.lightboxre.com/v1/batch/process/3b4ca555-4d64-46a3-b08b-2d088682f78c

    Parameters

    Parameter | Type | Description | Usage
    token | path | Job/Process token | required

    Response

    Media type: application/json

    {
      "$ref": "string",
      "status": {
        "fileName": "file.csv",
        "jobDescription": {
          "apis": [
            {
              "name": "geocoding-engine",
              "url": "/v1/search?text={1}+{2}+{3}+{4}",
              "outputColumns": [
                {
                  "source": "features.0.geometry.coordinates.0",
                  "target": "latitude"
                }
              ]
            }
          ],
          "rowCount": 5003
        },
        "performance": {
          "averageSpeed": 3.6,
          "currentSpeed": 3.6,
          "etc": "01 23:59:59",
          "processedPercentage": 0.5,
          "processedRecords": 3456,
          "startedAt": "2024-07-24T10:23:54.430884195-04:00",
          "updatedAt": "2024-07-24T10:23:54.430884195-04:00",
          "totalRecords": 3456
        },
        "status": "STARTING"
      }
    }
    Field | Description
    $ref | URL reference back to this call
    status.fileName | Name of the input file
    status.jobDescription | Description object that was passed to POST /batch/process/{token}
    status.performance | Performance metrics for the current process. These indicate how fast the process is running and when it might complete.
    status.status | Current status of the process. Possible values are:
    • STARTING
    • UPLOADING
    • PROCESSING
    • CANCELLED
    • COMPLETED
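
    Monitoring a job usually means calling this endpoint repeatedly until the status stops changing. The sketch below assumes that CANCELLED and COMPLETED are the only terminal statuses in the list above; fetch_status is a caller-supplied function (hypothetical) that returns the parsed JSON of GET /batch/process/{token}, and the polling interval and limit are arbitrary defaults.

```python
import time
from typing import Callable

# Assumption: these are the statuses after which a job no longer changes.
TERMINAL_STATUSES = {"CANCELLED", "COMPLETED"}


def wait_for_job(fetch_status: Callable[[], dict],
                 interval_seconds: float = 30.0,
                 max_polls: int = 1000,
                 sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the job reaches a terminal status and return that status."""
    for _ in range(max_polls):
        status = fetch_status()["status"]["status"]
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval_seconds)
    raise TimeoutError("job did not finish within max_polls status checks")


# Usage with canned responses in place of real API calls.
canned = iter({"status": {"status": s}}
              for s in ["STARTING", "PROCESSING", "COMPLETED"])
final_status = wait_for_job(lambda: next(canned), interval_seconds=0.0)
```

    The sleep parameter exists so the loop can be exercised without real delays; in normal use the default time.sleep applies.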

    Cancel a Job/Process 

    This endpoint provides you with the means to cancel a job/process. 

    DELETE /batch/process/{token}

    Example requests

    curl -X DELETE -H 'x-api-key: (api_key)' https://api.lightboxre.com/v1/batch/process/3b4ca555-4d64-46a3-b08b-2d088682f78c

    https://api.lightboxre.com/v1/batch/process/3b4ca555-4d64-46a3-b08b-2d088682f78c

    Parameters

    Parameter | Type | Description | Usage
    token | path | Job/Process token | required

    Response

    HTTP Code 202, Accepted, No Content returned

    Geocoding Mapping Options (Source Output Columns)

    This table outlines the mapping options from source to target and what each field represents. We recommend carrying the fields marked with a '*' in the Best Practice column as these fields are the most important. 





    Source | Best Practice | Target | Description
    geocoding.query.text |  | AddressInput | Address input text
    geocoding.query.parsedText.streetNumber |  | ParsedStreetNumber | Street number as parsed from the input text
    geocoding.query.parsedText.streetName |  | ParsedStreetName | Street name as parsed from the input text
    geocoding.query.parsedText.locality |  | ParsedCity | City name as parsed from the input text
    geocoding.query.parsedText.region |  | ParsedState | State as parsed from the input text
    geocoding.query.parsedText.postalCode |  | ParsedZipCode | Zip code as parsed from the input text
    features.0.geometry.coordinates.0 | * | Latitude | Latitude portion of the location of the address
    features.0.geometry.coordinates.1 | * | Longitude | Longitude portion of the location of the address
    features.0.properties.label | * | AddressLabel | Matched address label
    features.0.properties.streetNumber | * | MatchedStreetNumber | Matched address street number
    features.0.properties.streetName | * | MatchedStreetName | Matched address street name
    features.0.properties.postalCode | * | MatchedZipCode | Matched address zip code
    features.0.properties.localAdmin |  | MatchedLocalAdmin | Matched address local admin
    features.0.properties.locality | * | MatchedCity | Matched address city name
    features.0.properties.county | * | MatchedCounty | Matched address county name
    features.0.properties.regionCode | * | MatchedState | Matched address state
    features.0.properties.countryCode |  | MatchedCountry | Matched address country
    features.0.properties.continent |  | MatchedContinent | Matched address continent
    features.0.properties.score | * | GeocodeScore | Overall address match score
    features.0.properties.scoreComponents.streetNumber |  | GeocodeStreetNumberScore | Street number score
    features.0.properties.scoreComponents.streetName |  | GeocodeStreetNameScore | Street name score
    features.0.properties.scoreComponents.locality |  | GeocodeCityScore | City score
    features.0.properties.scoreComponents.postalCode |  | GeocodeZipCodeScore | Zip code score
    features.0.properties.changeFlag.streetType |  | ChangeFlagStreetType | Flag that denotes a change from the input value to the matched value. Possible values: Fuzzy, Added, Changed, Removed, Alias
    features.0.properties.changeFlag.streetName |  | ChangeFlagStreetName | Flag that denotes a change from the input value to the matched value. Possible values: Fuzzy, Added, Changed, Removed, Alias
    features.0.properties.changeFlag.locality |  | ChangeFlagCity | Flag that denotes a change from the input value to the matched value. Possible values: Fuzzy, Added, Changed, Removed, Alias
    features.0.properties.changeFlag.postalCode |  | ChangeFlagZipCode | Flag that denotes a change from the input value to the matched value. Possible values: Fuzzy, Added, Changed, Removed, Alias
    features.0.properties.attributes.fipsCode |  | FIPSCode | County FIPS code
    features.0.properties.attributes.address_lid | * | ADDRESS_LID | Unique LightBox ID for this address
    features.0.properties.attributes.parcel_lid.0 | * | PARCEL_LID | Parcel record related to this address
    features.0.properties.attributes.assessment_lid.0 | * | ASSESSMENT_LID | Assessment record related to this address
    features.0.properties.attributes.building_lid.0 | * | STRUCTURE_LID | Structure record related to this address
    features.0.properties.attributes.precisionCode | * | PrecisionCode | See Geocode Precision Code for details
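
    To check a mapping before submitting a job, it can help to resolve a source path against a saved geocode response. The sketch below follows the reading given earlier in this article, where numeric segments are array indexes; it illustrates that reading and is not the batch processor's actual implementation. The function name and the sample values are made up.

```python
from typing import Any


def resolve_source(response: Any, source: str, default: Any = None) -> Any:
    """Follow a dotted source path; numeric segments index into lists."""
    current = response
    for segment in source.split("."):
        if isinstance(current, list):
            # A list can only be entered through a valid numeric index.
            if not segment.isdigit() or int(segment) >= len(current):
                return default
            current = current[int(segment)]
        elif isinstance(current, dict) and segment in current:
            current = current[segment]
        else:
            return default
    return current


# Made-up response fragment with the same nesting as the source paths above.
sample = {
    "features": [{
        "properties": {
            "label": "25 Main St",
            "score": 0.98,
            "attributes": {"parcel_lid": ["0123"]},
        }
    }]
}
label = resolve_source(sample, "features.0.properties.label")
parcel = resolve_source(sample, "features.0.properties.attributes.parcel_lid.0")
missing = resolve_source(sample, "features.1.properties.label")
```

    Returning a default for a missing path, rather than raising, is a design choice for this sketch: an address with no match simply yields an empty value.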

    HTTP Error Codes

    HTTP Response status codes along with a brief summary of their commonly accepted usage. These status codes are returned by LightBox APIs for each request:
    Code | Description
    200 | The request succeeded.
    201 | The object was created successfully.
    202 | Accepted, no content.
    204 | Successful, no content. The server has fulfilled the request and there is no additional content to send in the response payload body. Typically returned on a DELETE.
    400 | One or more of the request parameters were invalid.
    401 | The client must authenticate itself to get the requested response. Note: this can also occur when your trial key has expired.
    404 | The server cannot find the requested resource. This can also mean that the endpoint is valid but the resource itself does not exist.
    429 | Too many requests were made in a short period of time, or you have exceeded your request-lot pool.
    500 | The server has encountered an error it does not know how to handle.
    503 | Service unavailable.
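
    Client code often groups these codes into outcomes. The sketch below treats 429 and 503 as transient and pairs them with exponential backoff; that grouping and the delay values are assumptions, not LightBox guidance.

```python
from typing import List

# Assumption: only these codes are worth retrying automatically.
RETRYABLE_CODES = {429, 503}


def classify(status_code: int) -> str:
    """Map an HTTP status code to 'success', 'retry', or 'fail'."""
    if 200 <= status_code < 300:
        return "success"
    if status_code in RETRYABLE_CODES:
        return "retry"
    return "fail"


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0) -> List[float]:
    """Exponential backoff delays in seconds: base, 2*base, 4*base, ..., capped at cap."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


print(classify(204), classify(429), classify(401))
print(backoff_delays(4))
```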




