<@U04SE163WC8> I am making a custom connector usin...
# ask-ai
j
@kapa.ai I am making a custom connector using the Airbyte GUI. The connector uses pagination, but if you go past the max page number, the API just keeps repeating the previous data points. How can I get Airbyte to stop paginating when next page = null?
```json
"page": {
  "prev": null,
  "next": null,
  "current": 1,
  "total": 1
},
```
u
In your case, you can use the "stop condition" feature of Airbyte's pagination methods to stop requesting more pages when the `next` field is `null`. In the Airbyte connector builder, you can use the "custom" option for pagination and specify the "stop condition" using a Jinja placeholder. For example, if your API response includes a `next` field in the `page` object to indicate the next page, the stop condition should be `{{ response.page.next is none }}`. This condition evaluates to `true` when there is no next page, and the connector then stops requesting more pages. Here is an example of how you can configure this in the connector builder:
```yaml
paginator:
  type: "DefaultPaginator"
  pagination_strategy:
    type: "CursorPagination"
    cursor_value: "{{ response.page.next }}"
    stop_condition: "{{ response.page.next is none }}"
```
In this configuration, `cursor_value` is used to obtain the next page cursor from the API response, and `stop_condition` is used to determine when to stop requesting more pages. Please note that this is a simplified example and you may need to adjust it based on the specific structure of your API response. For more information, you can refer to the [Airbyte documentation on pagination](https://docs.airbyte.com/connector-development/connector-builder-ui/pagination#cursor-pagination).
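To make the stop condition concrete, here is a minimal Python sketch of the loop that a cursor paginator with this configuration effectively runs. It is not Airbyte's implementation: `FAKE_PAGES`, `fetch_page`, and `read_all_pages` are hypothetical names, and the in-memory API only mimics the `page` object from the question, including the behavior of repeating the last page when asked for a page that does not exist.

```python
from typing import Any, Optional

# Hypothetical in-memory stand-in for the API: two pages, keyed by page number.
FAKE_PAGES: dict[int, dict[str, Any]] = {
    1: {"page": {"prev": None, "next": 2, "current": 1, "total": 2},
        "list": [{"id": 1}, {"id": 2}]},
    2: {"page": {"prev": 1, "next": None, "current": 2, "total": 2},
        "list": [{"id": 3}]},
}


def fetch_page(page: int) -> dict[str, Any]:
    """Return the requested page; past the last page, repeat the last one."""
    return FAKE_PAGES[min(page, max(FAKE_PAGES))]


def read_all_pages(start_page: int = 1) -> list[dict[str, Any]]:
    """Follow page.next until it is None, collecting every record on the way."""
    records: list[dict[str, Any]] = []
    cursor: Optional[int] = start_page
    while cursor is not None:
        response = fetch_page(cursor)
        records.extend(response["list"])
        # Plays the role of cursor_value "{{ response.page.next }}" and
        # stop_condition "{{ response.page.next is none }}": None ends the loop.
        cursor = response["page"]["next"]
    return records


print(len(read_all_pages()))  # prints 3
```

Because the loop is driven by the response content rather than by a counter, it never requests a page beyond the last one, so the repeated data never shows up.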
j
@kapa.ai It was set to PageIncrement before. Does it need to be CursorPagination for this to work?
u
Yes, for the stop condition to work as described, you need to use `CursorPagination`. The `PageIncrement` method is used when the API supports pagination by incrementing a page number in the request. However, it doesn't support a stop condition based on the response content. `CursorPagination` is more flexible and allows you to specify a `stop_condition` based on the content of the API response. This makes it possible to stop pagination when the `next` field is `null`, as in your case. Here's how you can set it up:
```yaml
paginator:
  type: "DefaultPaginator"
  pagination_strategy:
    type: "CursorPagination"
    cursor_value: "{{ response.page.next }}"
    stop_condition: "{{ response.page.next is none }}"
```
In this configuration, `cursor_value` is used to obtain the next page cursor from the API response, and `stop_condition` is used to determine when to stop requesting more pages. For more information, you can refer to the [Airbyte documentation on Cursor Pagination](https://docs.airbyte.com/connector-development/connector-builder-ui/pagination#cursor-pagination).
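For contrast, the following Python sketch shows why a page-increment loop can keep going against an API like the one in the question. It assumes the paginator stops only when a page comes back with fewer records than the page size (or with none); that rule is an assumption about how page increment behaves, not something stated in this thread. `fetch_page`, `PAGE_SIZE`, `LAST_PAGE`, and `read_with_page_increment` are hypothetical names.

```python
from typing import Any

PAGE_SIZE = 2
LAST_PAGE = 2


def fetch_page(page: int) -> list[dict[str, Any]]:
    """Hypothetical API: every page is full; pages past the last repeat the last page."""
    page = min(page, LAST_PAGE)
    first_id = (page - 1) * PAGE_SIZE + 1
    return [{"id": first_id + i} for i in range(PAGE_SIZE)]


def read_with_page_increment(max_requests: int) -> list[int]:
    """Increment the page number until a page comes back short or empty."""
    ids: list[int] = []
    page = 1
    for _ in range(max_requests):  # safety cap so the sketch always terminates
        records = fetch_page(page)
        ids.extend(r["id"] for r in records)
        if len(records) < PAGE_SIZE:
            break  # the only stop signal a counter-based loop has
        page += 1
    return ids


print(read_with_page_increment(max_requests=4))
# prints [1, 2, 3, 4, 3, 4, 3, 4]
```

The last page is always full, so the record count never signals the end and the loop only stops at the safety cap. Reading `page.next` from the response avoids this.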
j
@kapa.ai I am making a custom connector using the GUI. If I do:

```yaml
type: RecordSelector
extractor:
  type: DpathExtractor
  field_path:
    - reviews
```

I get 1 record response:

```json
{
  "store": "as",
  "page": { "current": 1, "total": 1 },
  "total": 2,
  "ext_id": "2312312",
  "list": [
    { "store": "as", "review_id": 123123123, "answer_text": "Thanks for your feedback" },
    { "store": "as", "review_id": 321912312312, "answer_text": "See ya" }
  ]
}
```

Obviously it still needs further unnesting. However, if I set field_path to `reviews, list`, I get 0 records. Why?
u
Based on the information provided, it seems like you're trying to extract the `list` array from your API response. However, the field path you're using (`reviews, list`) doesn't match the structure of your API response. In your API response, the `list` array is at the top level, not nested under a `reviews` field. Therefore, you should set the Field Path to `list` to select the array with the actual records. Here's how you can set it up:
```yaml
type: RecordSelector
extractor:
  type: DpathExtractor
  field_path:
    - list
```
This configuration will treat each object in the `list` array as a separate record. For more information, you can refer to the [Airbyte documentation on Record Selection](https://docs.airbyte.com/connector-development/connector-builder-ui/record-processing#record-selection).
j
@kapa.ai Maybe I was unclear. This is the raw response:

```json
{
  "status": 200,
  "body": {
    "reviews": {
      "list": [
        {
```

What I showed you was the record response.
u
I see, thank you for the clarification. If the `list` array is nested under the `reviews` field in the `body` of your API response, you should set the Field Path to `body, reviews, list` to select the array with the actual records. Here's how you can set it up:
```yaml
type: RecordSelector
extractor:
  type: DpathExtractor
  field_path:
    - body
    - reviews
    - list
```
This configuration will treat each object in the `list` array as a separate record. For more information, you can refer to the [Airbyte documentation on Record Selection](https://docs.airbyte.com/connector-development/connector-builder-ui/record-processing#record-selection).
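As a rough illustration of how a field path maps onto a nested response, here is a small Python sketch. `extract_records` is a hypothetical helper, not the DpathExtractor itself, and the rules it applies (a list yields one record per item, a single object yields one record, a missing key yields none) are assumptions about how such an extractor behaves. The sample mirrors the shape of the raw response above; its list items are copied from the record shown earlier in the thread, and the rest of the real body is unknown.

```python
from typing import Any


def extract_records(response: dict[str, Any], field_path: list[str]) -> list[dict[str, Any]]:
    """Follow field_path key by key, then turn whatever is found into records."""
    node: Any = response
    for key in field_path:
        if not isinstance(node, dict) or key not in node:
            return []  # the path does not exist in this response
        node = node[key]
    if isinstance(node, list):
        return [item for item in node if isinstance(item, dict)]
    if isinstance(node, dict):
        return [node]  # a single object becomes a single record
    return []


# Assumed sample; only the nesting (status, body, reviews, list) comes from the thread.
raw = {
    "status": 200,
    "body": {
        "reviews": {
            "list": [
                {"store": "as", "review_id": 123123123, "answer_text": "Thanks for your feedback"},
                {"store": "as", "review_id": 321912312312, "answer_text": "See ya"},
            ]
        }
    },
}

print(len(extract_records(raw, ["body", "reviews", "list"])))  # prints 2
print(len(extract_records(raw, ["body", "reviews"])))          # prints 1
print(len(extract_records(raw, ["reviews", "list"])))          # prints 0
```

A path that stops at an object gives exactly one record, which matches the "1 record" result reported for `reviews`, and a path that does not line up with the document the extractor actually receives gives none.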
j
@kapa.ai The point of my original question was that setting the field path to `reviews` works and gives me the following record response:

```json
{
  "store": "as",
  "page": { "current": 1, "total": 1 },
  "total": 2,
  "ext_id": "2312312",
  "list": [
    { "store": "as", "review_id": 123123123, "answer_text": "Thanks for your feedback" },
    { "store": "as", "review_id": 321912312312, "answer_text": "See ya" }
  ]
}
```

But doing `reviews, list` returns nothing.
u
I see, thank you for the clarification. It seems like the `list` array is nested under the `reviews` field in your API response, and you're trying to extract the items of the `list` array as separate records. The issue you're facing might be due to how the DpathExtractor works. The DpathExtractor uses the [dpath](https://pypi.org/project/dpath/) library to extract nested fields from a dictionary. It's possible that the DpathExtractor is not able to correctly extract the `list` array when it's specified in the field path. As a workaround, you could try using the Field Path setting to first extract the `reviews` field, and then process the `list` array in a separate step. Here's an example of how you can do this:
```yaml
type: RecordSelector
extractor:
  type: DpathExtractor
  field_path:
    - reviews
```
This will give you a record that includes the `list` array. You can then process this record in a separate step to extract the individual items in the `list` array. If this doesn't work, it might be a bug in the DpathExtractor or the connector builder UI. In that case, I would recommend [creating an issue](https://github.com/airbytehq/airbyte/issues) in the Airbyte GitHub repository to report this problem. For more information, you can refer to the [Airbyte documentation on Record Selection](https://docs.airbyte.com/connector-development/connector-builder-ui/record-processing#record-selection).
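The answer does not say what the "separate step" would be. One possible reading, sketched in plain Python outside of Airbyte, is to unnest the `list` array after the sync has produced the single `reviews` record. `explode_list` is a hypothetical helper, and the record below is the one quoted in the thread.

```python
from typing import Any, Iterator


def explode_list(record: dict[str, Any], key: str = "list") -> Iterator[dict[str, Any]]:
    """Yield one record per item stored under record[key] (nothing if the key is absent)."""
    for item in record.get(key, []):
        yield dict(item)  # copy so callers can modify items without touching the source


# The single record returned with field_path = ["reviews"], as quoted in the thread.
record = {
    "store": "as",
    "page": {"current": 1, "total": 1},
    "total": 2,
    "ext_id": "2312312",
    "list": [
        {"store": "as", "review_id": 123123123, "answer_text": "Thanks for your feedback"},
        {"store": "as", "review_id": 321912312312, "answer_text": "See ya"},
    ],
}

reviews = list(explode_list(record))
print([r["review_id"] for r in reviews])  # prints [123123123, 321912312312]
```

This keeps the connector configuration unchanged and moves the unnesting downstream, at the cost of syncing one wrapper record per page instead of one record per review.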