<@U04SE163WC8> in which order will the http_method...
# ask-ai
y
@kapa.ai in which order will the http_methods be called?
u
Based on the provided knowledge sources, I don't have specific information about the order in which HTTP methods are called in Airbyte. The documents don't contain a clear sequence or order of HTTP method calls. However, I can provide some relevant information about HTTP methods in Airbyte: According to the [HttpRequester documentation](https://docs.airbyte.com/connector-development/config-based/understanding-the-yaml-file/reference#/definitions/HttpRequester), Airbyte supports both GET and POST methods for fetching data from sources. The `http_method` field in the HttpRequester configuration specifies which method to use:
```yaml
http_method: string
```
Examples given are `GET` or `POST`.
This suggests that the HTTP method used depends on how the specific connector is configured, rather than following a predetermined order of calls. If you need more detailed information about the sequence of HTTP method calls in Airbyte, you may want to check the documentation for specific connectors or examine the source code of the relevant components.
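To illustrate that the verb comes from configuration rather than from a fixed call order, here is a minimal sketch using only the Python standard library. The `build_request` helper, the config dict, and the example URL are hypothetical and not part of Airbyte. The `GET` default is an assumption made for this sketch.

```python
from urllib.request import Request

# Hypothetical set of verbs, mirroring the GET/POST examples mentioned above.
ALLOWED_METHODS = {"GET", "POST"}


def build_request(config: dict, url: str) -> Request:
    """Build (but do not send) a request whose verb is taken from the config."""
    # Assumption for this sketch: fall back to GET when nothing is configured.
    method = str(config.get("http_method", "GET")).upper()
    if method not in ALLOWED_METHODS:
        raise ValueError(f"Unsupported http_method: {method}")
    return Request(url, method=method)


# The same code path produces a different verb depending only on the config.
post_request = build_request({"http_method": "POST"}, "https://example.com/items")
get_request = build_request({}, "https://example.com/items")
print(post_request.get_method(), get_request.get_method())
```

No network call is made here. The request objects are only constructed so the chosen verb can be inspected.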
y
@kapa.ai I meant: how do I know which method Airbyte calls when? When I use the Python CDK, I get a class that inherits from the HttpStream class. Does Airbyte call parse_response first, or read_records first? That's what I don't understand, and of course not only for these methods but for the rest as well.
u
A message from kapa.ai
u
I understand you're asking about the order of method calls when using the Python CDK with a class inheriting from `HttpStream` in Airbyte. Let me explain the sequence based on the information provided in the knowledge sources. The main flow of operations for reading data in Airbyte's Python CDK, specifically for HTTP-based connectors, typically follows this order:
1. The process starts with the `read_records` method, which is called on the stream object.
2. Inside `read_records`, the `_read_pages` method is called. This method handles pagination and making HTTP requests.
3. For each page of data:
   a. The `_send_request` method is called to make the HTTP request.
   b. After receiving the response, the `parse_response` method is called to process the data from the API response.
   c. The `next_page_token` method is then called to determine if there are more pages to fetch.

This sequence is evident from the following code snippet from the [HttpStream class](https://github.com/airbytehq/airbyte/discussions/33814):
```python
def read_records(
    self,
    sync_mode: SyncMode,
    cursor_field: List[str] = None,
    stream_slice: Mapping[str, Any] = None,
    stream_state: Mapping[str, Any] = None,
) -> Iterable[StreamData]:
    yield from self._read_pages(
        lambda req, res, state, _slice: self.parse_response(res, stream_slice=_slice, stream_state=state), stream_slice, stream_state
    )
```
As you can see, `read_records` calls `_read_pages`, which in turn calls `parse_response`. The `parse_response` method is typically implemented in your custom stream class to handle the specific format of the API response:
```python
def parse_response(self, response: requests.Response, **kwargs) -> Iterable[Mapping]:
    res = response.json()
    yield from res.get("data", [])
```
It's important to note that other methods like `path`, `request_params`, and `next_page_token` are called as needed during this process to construct requests and handle pagination. This flow ensures that Airbyte can efficiently read data from HTTP APIs, handle pagination, and process the responses into a format suitable for further processing in the Airbyte pipeline. [Source: How the Airbyte CDK read command works](https://github.com/airbytehq/airbyte/discussions/33814)
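The call order described above can be observed with a small simulation. This is a simplified sketch, not the CDK's implementation: `SketchHttpStream` is a hypothetical stand-in class, the signatures are reduced, and the "responses" are plain dicts instead of real HTTP responses. It only records which method runs when.

```python
from typing import Any, Dict, Iterable, List, Optional


class SketchHttpStream:
    """Simplified stand-in for an HTTP stream; NOT the Airbyte CDK's HttpStream."""

    def __init__(self, pages: List[Dict[str, Any]]) -> None:
        self._pages = pages         # fake API responses, one dict per page
        self.calls: List[str] = []  # records the order in which methods run

    def path(self, next_page_token: Optional[Dict[str, Any]] = None) -> str:
        self.calls.append("path")
        return "items"

    def request_params(self, next_page_token: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
        self.calls.append("request_params")
        return dict(next_page_token or {})

    def _send_request(self, path: str, params: Dict[str, Any]) -> Dict[str, Any]:
        self.calls.append("_send_request")
        return self._pages[params.get("page", 0)]  # no network: look up the fake page

    def parse_response(self, response: Dict[str, Any]) -> Iterable[Dict[str, Any]]:
        self.calls.append("parse_response")
        yield from response.get("data", [])

    def next_page_token(self, response: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        self.calls.append("next_page_token")
        return response.get("next")

    def _read_pages(self) -> Iterable[Dict[str, Any]]:
        self.calls.append("_read_pages")
        token: Optional[Dict[str, Any]] = None
        while True:
            # Build the request parts, send it, parse it, then ask for the next page.
            response = self._send_request(self.path(token), self.request_params(token))
            yield from self.parse_response(response)
            token = self.next_page_token(response)
            if token is None:
                break

    def read_records(self) -> Iterable[Dict[str, Any]]:
        self.calls.append("read_records")
        yield from self._read_pages()


stream = SketchHttpStream(
    pages=[
        {"data": [{"id": 1}, {"id": 2}], "next": {"page": 1}},
        {"data": [{"id": 3}], "next": None},
    ]
)
records = list(stream.read_records())
print(records)
print(stream.calls)
```

In this sketch `read_records` is entered first and `parse_response` runs once per page, after the request for that page has been sent and before `next_page_token` is consulted. Adding similar logging to the methods of your own stream class is one way to check the order in the CDK version you are running.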
y
@kapa.ai do you have a best practice page for the python cdk to build custom connectors?
u
A message from kapa.ai
u
Yes, Airbyte provides a guide on [Best Practices for Connector Development](https://docs.airbyte.com/connector-development/best-practices). While this guide is not specific to the Python CDK, it provides general advice that applies to all connectors, including those built with the Python CDK. For Python CDK specific best practices, you can refer to the [Python CDK documentation](https://docs.airbyte.com/connector-development/cdk-python). This includes:
- [Basic Concepts](https://docs.airbyte.com/connector-development/cdk-python/basic-concepts) for understanding the classes required to implement an Airbyte Source.
- [Full Refresh Streams](https://docs.airbyte.com/connector-development/cdk-python/full-refresh-stream) for creating your first full refresh stream.
- [Incremental Streams](https://docs.airbyte.com/connector-development/cdk-python/incremental-stream) for understanding how to write a `stream_slices` function and what a `cursor_field` is.
- [Python Concepts](https://docs.airbyte.com/connector-development/cdk-python/python-concepts) for tips on useful Python knowledge.
- [Tutorial for implementing an HTTP source connector](https://docs.airbyte.com/connector-development/tutorials/custom-python-connector/getting-started).

Remember, connectors that follow these best practices will be labelled as "Airbyte Certified" to indicate they've passed a high quality bar and will perform reliably in all production use cases.
y
@kapa.ai do I only have to override these if I am using POST, PUT or PATCH requests?
```python
def request_body_data(
    self,
    stream_state: Optional[Mapping[str, Any]],
    stream_slice: Optional[Mapping[str, Any]] = None,
    next_page_token: Optional[Mapping[str, Any]] = None,
) -> Optional[Union[Mapping[str, Any], str]]:
    """
    Override when creating POST/PUT/PATCH requests to populate the body of the request with a non-JSON payload.

    If returns a ready text that it will be sent as is.
    If returns a dict that it will be converted to a urlencoded form.
    E.g. {"key1": "value1", "key2": "value2"} => "key1=value1&key2=value2"

    At the same time only one of the 'request_body_data' and 'request_body_json' functions can be overridden.
    """
    return None

def request_body_json(
    self,
    stream_state: Optional[Mapping[str, Any]],
    stream_slice: Optional[Mapping[str, Any]] = None,
    next_page_token: Optional[Mapping[str, Any]] = None,
) -> Optional[Mapping[str, Any]]:
    """
    Override when creating POST/PUT/PATCH requests to populate the body of the request with a JSON payload.

    At the same time only one of the 'request_body_data' and 'request_body_json' functions can be overridden.
    """
    return None
```
u
A message from kapa.ai
u
Yes, you're correct. The `request_body_data` and `request_body_json` methods are typically used when you need to send data in the body of HTTP requests, which is common for POST, PUT, and PATCH requests.
- `request_body_data` is used when you want to send a non-JSON payload in the body of the request. If this method returns a string, it will be sent as is. If it returns a dictionary, it will be converted to a urlencoded form.
- `request_body_json` is used when you want to send a JSON payload in the body of the request.

Here's an example of how you might override these methods in your custom stream class:
```python
def request_body_json(
    self,
    stream_state: Optional[Mapping[str, Any]],
    stream_slice: Optional[Mapping[str, Any]] = None,
    next_page_token: Optional[Mapping[str, Any]] = None,
) -> Optional[Mapping[str, Any]]:
    # Your logic here to construct the JSON payload
    return {"key": "value"}
```
Remember, you can only override one of `request_body_data` and `request_body_json` in a given class. If you need to send both JSON and non-JSON data, you might need to create separate stream classes for each type of request. For GET requests, you typically don't need to override these methods, as GET requests usually don't have a body. Instead, you would use the `request_params` method to specify any query parameters for the request. For more information, refer to the [Python CDK documentation](https://docs.airbyte.com/connector-development/cdk-python/basic-concepts).
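To make the difference between the two body types concrete, here is a minimal sketch of what each return value would look like once serialized. `encode_body` is a hypothetical helper written for illustration with the standard library. It is not the CDK's code, and the CDK's actual serialization may differ in detail.

```python
import json
from typing import Any, Mapping, Optional, Union
from urllib.parse import urlencode


def encode_body(
    body_data: Optional[Union[Mapping[str, Any], str]] = None,
    body_json: Optional[Mapping[str, Any]] = None,
) -> Optional[str]:
    """Hypothetical helper: show how each kind of request body could be serialized."""
    if body_data is not None and body_json is not None:
        # Mirrors the rule that only one of the two methods may be overridden.
        raise ValueError("Provide only one of body_data and body_json")
    if body_json is not None:
        return json.dumps(body_json)  # JSON payload
    if isinstance(body_data, str):
        return body_data              # ready text is sent as is
    if body_data is not None:
        return urlencode(body_data)   # dict becomes a urlencoded form
    return None                       # no body, e.g. a plain GET request


print(encode_body(body_data={"key1": "value1", "key2": "value2"}))
print(encode_body(body_json={"key": "value"}))
```

The first call reproduces the `key1=value1&key2=value2` form from the docstring quoted in the question, and the second produces a JSON string.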