# contributing-to-airbyte
y
Hello team, we just created a PR #5561 for the new DynamoDB destination. Please let us know if anything else needs to be done!
u
This is awesome!
u
We'll take a look at it and walk through it with you; thanks 🙂
u
@Yiqing Wang nice! Thanks for reaching out. I'm trying to think through the UX for consuming data written by this connector. What's your use case with DynamoDB, and how do you intend to query the data? I'm specifically wondering whether the UUID key is sufficiently usable.
u
As we are using DynamoDB as the database for our backend system, it would be great to enable Airbyte to use DynamoDB as a destination. Since I can query the data using JavaBaseConstants.COLUMN_NAME_DATA in DynamoDB, I think a UUID is good enough here. sync_time, the sort key in the PR, is used to limit the query time range.
u
@Yiqing Wang when you say you query the data using COLUMN_NAME_DATA, are you doing a full scan of the tables in DynamoDB to consume this data? Isn't that very expensive? 😅 Would it be better to key the records in DynamoDB on their primary key rather than a random UUID?
u
What is the default primary key for a record? Is it determined by the specific data?
u
Yes, sometimes the data declares the primary key in the catalog.
u
My main concern is whether a random UUID primary key is very usable for the user without having to do a full scan of the data. Maybe a full scan is acceptable here though? I'm not sure how else the data would be added. Might I suggest we use the PK as the key when it is present, and otherwise use the random UUID?
y
May I ask how I can dynamically parse the primary key from the catalog without knowing whether the key exists?
u
You can inspect the catalog and, for each configured stream, see whether the user_configured_primary_key field is set.
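A minimal sketch of that per-stream check, assuming a simplified stand-in for the configured stream. `StreamConfig`, `hasPrimaryKey`, `keyStrategy`, and `fallbackKey` are hypothetical names for illustration, not Airbyte classes or methods:

```java
import java.util.List;
import java.util.UUID;

final class PrimaryKeyCheck {
    /** Hypothetical stand-in for a configured stream; not the real Airbyte class. */
    static final class StreamConfig {
        final String name;
        final List<List<String>> primaryKey; // null or empty when no key is configured

        StreamConfig(String name, List<List<String>> primaryKey) {
            this.name = name;
            this.primaryKey = primaryKey;
        }
    }

    /** True when the stream carries at least one primary-key path. */
    static boolean hasPrimaryKey(StreamConfig stream) {
        return stream.primaryKey != null && !stream.primaryKey.isEmpty();
    }

    /** Names the key strategy a writer could pick for this stream. */
    static String keyStrategy(StreamConfig stream) {
        return hasPrimaryKey(stream) ? "primary_key" : "uuid";
    }

    /** Fallback key for streams without a configured primary key. */
    static String fallbackKey() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        StreamConfig users = new StreamConfig("users", List.of(List.of("id")));
        StreamConfig events = new StreamConfig("events", List.of());
        System.out.println(users.name + " -> " + keyStrategy(users));
        System.out.println(events.name + " -> " + keyStrategy(events));
    }
}
```

The check treats both a missing list and an empty list as "no primary key", so either case falls back to the random UUID.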
u
Although the downside here is that the only method of adding data is an overwrite… (if we use the primary key)
u
We have sync_time in place as the sort key. So the partition key (user_configured_primary_key) and the sort key together form the composite primary key. Thus we can add data without overwriting, as long as the partition key is unique for each sync.
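To illustrate why the sort key avoids overwrites, here is a small in-memory sketch, not DynamoDB client code. `CompositeKeySketch` and its methods are hypothetical; the map simply models a table whose items are identified by the (partition key, sort key) pair:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CompositeKeySketch {
    /** In-memory stand-in for a table keyed by (partition key, sort key). */
    private final Map<List<Object>, String> items = new HashMap<>();

    /** Writes an item; an existing item with the same key pair is replaced. */
    void put(String partitionKey, long syncTime, String data) {
        items.put(List.of(partitionKey, syncTime), data);
    }

    String get(String partitionKey, long syncTime) {
        return items.get(List.of(partitionKey, syncTime));
    }

    int size() {
        return items.size();
    }

    public static void main(String[] args) {
        CompositeKeySketch table = new CompositeKeySketch();
        table.put("user-1", 1000L, "first sync");
        table.put("user-1", 2000L, "second sync");  // same partition key, new sync_time: kept alongside
        table.put("user-1", 2000L, "duplicate key"); // same pair within one sync: replaces the item
        System.out.println(table.size()); // prints 2
    }
}
```

The last write shows the caveat from the message above: two records with the same partition key inside one sync share the same pair, so the later one wins.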
u
that would be excellent
u
Nice, we will try to implement this change.
u
In class ConfiguredAirbyteStream, why is the primary key a List<List<String>>?
y
@JsonProperty("primary_key")
public List<List<String>> getPrimaryKey() {
    return primaryKey;
}
u
A primary key is described by a list of strings which indicates the "path" of the field in case it is nested in the record. A composite key consists of multiple lists.
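A rough sketch of how such paths could be resolved against a record, assuming the record is available as nested maps. `PrimaryKeyPaths`, `resolve`, `keyValue`, and the `#` separator are hypothetical choices for illustration, not part of Airbyte:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class PrimaryKeyPaths {
    /** Follows one path through nested maps; returns null if any step is missing. */
    static Object resolve(Map<String, Object> record, List<String> path) {
        Object current = record;
        for (String field : path) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(field);
        }
        return current;
    }

    /** Joins the values at every path of a (possibly composite) key into one string. */
    static String keyValue(Map<String, Object> record, List<List<String>> primaryKey) {
        return primaryKey.stream()
                .map(path -> String.valueOf(resolve(record, path)))
                .collect(Collectors.joining("#")); // separator is an arbitrary assumption
    }

    public static void main(String[] args) {
        Map<String, Object> record = Map.of("id", 7, "owner", Map.of("email", "a@b.c"));
        // Composite key: a top-level field plus a nested field.
        List<List<String>> primaryKey = List.of(List.of("id"), List.of("owner", "email"));
        System.out.println(keyValue(record, primaryKey)); // prints 7#a@b.c
    }
}
```

Each inner list is one path, so `["id"]` is a top-level field and `["owner", "email"]` reaches into a nested object; the outer list holds one entry per component of the key.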
u
We found that there is no user_configured_primary_key in the test files, so it is hard for us to test this feature.