Jenny Brown
08/04/2021, 11:29 PM
user
08/04/2021, 11:40 PM
spec method
The way we’ve done it in Python connectors is to programmatically enrich the specific connector’s specification with the common specification.
For example, let’s say we have an abstract class representing a file-based connector. Given a stream of bytes, this file-based connector can read any file format (CSV, JSON, Parquet, Avro, etc.), compressed or otherwise.
But the stream of bytes can come from anywhere: S3, SFTP, GCS, etc.
The abstract class expects a configuration describing how to interpret the stream of bytes, e.g.
{
  "format": "csv",
  "delimiter": ",",
  "encoding": "utf-8",
  "compression": "gzip"
}
and that would be the same no matter where the stream of bytes is coming from. Then, depending on the storage layer, we want a separate “chunk” of configuration that is storage-specific, e.g. role ID and bucket name for S3, or service account credentials for GCS.
Then we could create a common superclass FilebasedConnector whose spec method returns the “abstract” specification, and override that in each of the child classes to enrich it with the storage-specific parameters.
user
08/04/2021, 11:40 PM
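Roughly, the superclass idea could look like the sketch below. This is plain Python for illustration, not the actual connector code: COMMON_SPEC, the JSON-schema-style layout, the property names, and the S3Connector / GCSConnector classes are all assumptions made up for the example.

```python
from copy import deepcopy
from typing import Any, Dict

# Hypothetical common specification: how to interpret a stream of bytes,
# regardless of where the bytes come from.
COMMON_SPEC: Dict[str, Any] = {
    "type": "object",
    "required": ["format"],
    "properties": {
        "format": {"type": "string"},
        "delimiter": {"type": "string"},
        "encoding": {"type": "string"},
        "compression": {"type": "string"},
    },
}


class FilebasedConnector:
    """Common superclass: returns the "abstract" (storage-agnostic) spec."""

    def spec(self) -> Dict[str, Any]:
        # Return a deep copy so child classes can enrich the result
        # without mutating the shared COMMON_SPEC dict.
        return deepcopy(COMMON_SPEC)


class S3Connector(FilebasedConnector):
    """Child class: enriches the common spec with S3-specific parameters."""

    def spec(self) -> Dict[str, Any]:
        spec = super().spec()
        spec["properties"]["role_id"] = {"type": "string"}
        spec["properties"]["bucket_name"] = {"type": "string"}
        spec["required"] += ["role_id", "bucket_name"]
        return spec


class GCSConnector(FilebasedConnector):
    """Child class: enriches the common spec with GCS-specific parameters."""

    def spec(self) -> Dict[str, Any]:
        spec = super().spec()
        spec["properties"]["service_account_credentials"] = {"type": "string"}
        spec["required"].append("service_account_credentials")
        return spec


if __name__ == "__main__":
    # Each connector exposes the common keys plus its own storage keys.
    print(sorted(S3Connector().spec()["properties"]))
    print(sorted(GCSConnector().spec()["properties"]))
```

The deep copy is the one design choice worth noting: without it, each child's additions would leak into the shared common spec and into every other connector.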