# general
l
Devs, I have a couple of questions around schema compatibility.
• We have some fields which were defined as dimensions. Can we move them to metrics without recreating the table?
• If the answer to the above is no, can we enable star-tree index aggregation for some dimension columns?
• Upgrading from 0.7.1 to 0.10.0 is causing compatibility issues with old Avro boolean fields that have default values. Are there any specific migration steps to follow here?
k
• You can change dimensions to metrics.
• I don't think the star-tree index cares whether the columns are marked as dimensions or metrics.
• Not sure about the last one. What's the error? Can you please file an issue?
l
• You can change dimensions to metrics.
It's failing, @Kishore G. The FieldSpecs are different classes (DimensionFieldSpec vs MetricFieldSpec): https://github.com/apache/pinot/blob/release-0.10.0/pinot-spi/src/main/java/org/apache/pinot/spi/data/FieldSpec.java#L334 (org.apache.pinot.spi.data.FieldSpec#equals)
if (EqualityUtils.isNullOrNotSameClass(this, o)) {
  return false;
}
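(For illustration only, a minimal sketch, not from the thread, of why that check trips when a column is flipped from dimension to metric. The pinot-spi constructor signatures below are recalled from memory and may differ slightly across releases.)

import org.apache.pinot.spi.data.DimensionFieldSpec;
import org.apache.pinot.spi.data.FieldSpec;
import org.apache.pinot.spi.data.MetricFieldSpec;

public class FieldSpecEqualsSketch {
  public static void main(String[] args) {
    // Same column name and data type, declared once as a dimension and once as a metric.
    FieldSpec asDimension = new DimensionFieldSpec("views", FieldSpec.DataType.LONG, true);
    FieldSpec asMetric = new MetricFieldSpec("views", FieldSpec.DataType.LONG);

    // equals() short-circuits on isNullOrNotSameClass, so the two specs can never be equal,
    // which the schema-compatibility validation reports as an incompatible change.
    System.out.println(asDimension.equals(asMetric)); // prints: false
  }
}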
k
Ah, looks like a validation check we added for safety.
l
That means we can't change a dimension to a metric at all?
k
I don't think anything should fail internally if you force that change.
l
I can test that. How do I force an incompatible change?
k
I don't know if there is a way to force it (that would be a great contribution). If you are brave enough, you can change the schema in ZooKeeper directly.
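(For anyone attempting this, a rough sketch using the plain ZooKeeper Java client. The connect string and znode path are placeholders and assume schemas live under the cluster's Helix property store; verify both against your own deployment and take a backup first.)

import java.nio.charset.StandardCharsets;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class ForceSchemaEditSketch {
  public static void main(String[] args) throws Exception {
    // Placeholder connect string and znode path -- confirm both for your cluster.
    ZooKeeper zk = new ZooKeeper("localhost:2181", 30_000, event -> { });
    String path = "/PinotCluster/PROPERTYSTORE/SCHEMAS/myTable";

    Stat stat = new Stat();
    String current = new String(zk.getData(path, false, stat), StandardCharsets.UTF_8);

    // Hand-edit the serialized schema (e.g. move a field from dimensionFieldSpecs to
    // metricFieldSpecs), then write it back with the version read above so a concurrent
    // update is not clobbered. This bypasses all controller-side validation.
    String edited = current; // apply the change here
    zk.setData(path, edited.getBytes(StandardCharsets.UTF_8), stat.getVersion());
    zk.close();
  }
}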
l
Okay, will give it a try and post my findings here. Thanks @Kishore G for the guidance.
• You can change dimensions to metrics.
Interestingly, these changes are allowed in 0.7.1. After the 0.10.0 upgrade, they are not allowed and are reported as incompatible. However, the code snippet I posted above looks like it is present in 0.7.1 as well; somehow, that path got activated only after the upgrade. cc: @Ravi Singal
Figured out why this is happening after the 0.10.0 upgrade (from 0.7.x).
0.7.1 behavior:
• The addSchema code path (POST /schemas) was not validating compatibility with the old schema.
• The updateSchema code path (PUT /schemas) was validating compatibility with the old schema.
0.10.0 behavior:
• The code duplication was removed and the addSchema code path now also goes through the updateSchema code path. Hence, both APIs started checking schema compatibility.
cc: @Kishore G
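(For concreteness, a minimal sketch of exercising the two code paths described above with Java's built-in HttpClient; the controller address, schema file, and schema name are placeholders.)

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;

public class SchemaEndpointsSketch {
  public static void main(String[] args) throws Exception {
    String controller = "http://localhost:9000";                  // placeholder controller
    String schemaJson = Files.readString(Path.of("schema.json")); // placeholder schema payload
    HttpClient client = HttpClient.newHttpClient();

    // addSchema path: POST /schemas (skipped the compatibility check in 0.7.1, checks it in 0.10.0)
    HttpRequest add = HttpRequest.newBuilder(URI.create(controller + "/schemas"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(schemaJson))
        .build();

    // updateSchema path: PUT /schemas/{schemaName} (checks compatibility in both versions)
    HttpRequest update = HttpRequest.newBuilder(URI.create(controller + "/schemas/mySchema"))
        .header("Content-Type", "application/json")
        .PUT(HttpRequest.BodyPublishers.ofString(schemaJson))
        .build();

    System.out.println(client.send(add, HttpResponse.BodyHandlers.ofString()).statusCode());
    System.out.println(client.send(update, HttpResponse.BodyHandlers.ofString()).statusCode());
  }
}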
Yet another schema compatibility issue in 0.10.0. Just the following change in the granularity of a date-time field is causing compatibility issues. version-1:
{
  "name": "start_time_millis",
  "dataType": "LONG",
  "format": "1:MILLISECONDS:EPOCH",
  "granularity": "1:MILLISECONDS"
}
version-2:
{
  "name": "start_time_millis",
  "dataType": "LONG",
  "format": "1:MILLISECONDS:EPOCH",
  "granularity": "15:MILLISECONDS"
}
Not sure this really is an incompatible change. Can we please fix this? Currently it's too restrictive and almost doesn't allow any changes to the schema apart from adding a new field.
m
It is indeed backward incompatible, right? Once you change the granularity, the existing data that was interpreted with the previous value will now be interpreted with the new value, which is incompatible.
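(To make that concern concrete: assuming the granularity is used to bucket time values, the same stored epoch lands in different buckets under the two specs above. This is only a rough illustration of the interpretation change, not Pinot's actual rollup logic.)

public class GranularityBucketSketch {
  public static void main(String[] args) {
    long epochMillis = 1_650_000_000_007L;

    // Bucket the same value at 1:MILLISECONDS and 15:MILLISECONDS granularity.
    long bucket1ms = (epochMillis / 1) * 1;    // 1650000000007 -- unchanged
    long bucket15ms = (epochMillis / 15) * 15; // 1650000000000 -- rounded down to the 15 ms bucket

    System.out.println(bucket1ms + " vs " + bucket15ms);
  }
}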
l
Not sure that is an incompatible change, given the data type and unit are exactly the same.
m
What happens to the old data that doesn't conform to the granularity?
l
IMHO, the current implementation is too restrictive; it allows almost no change other than column addition. A few examples we came across after the upgrade (0.10.0 from 0.7.1):
• change in default value
• change from dimension to metric
• change in granularity
• changing the maxLength of a column
These changes were allowed (due to a bug, however) in 0.7.1. Checking for compatibility is very important, but making it too restrictive also makes it unusable, and that is something I believe we need to fix. cc: @Kishore G @Xiang Fu
m
IIRC, there was a config added to bypass it. @Rong R?
l
I too remember seeing a force flag in one of the old PRs. However, that looks to have been reverted.
m
Please file a GH issue.
l
Sure @Mayank, will file an issue. One more to add to the above list:
• changing the maxLength of a column
I'm sure this is in no way an incompatible change.
m
Yes, it can be incompatible if the length is reduced, but the check should catch that.
Also, in case you are interested, contributions are always welcome.
👍 1
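(As a sketch of the direction-aware check being suggested here, not the actual validation code in Pinot, something like the following would allow the limit to grow while flagging reductions. FieldSpec#getMaxLength exists in pinot-spi, but the helper itself is hypothetical.)

import org.apache.pinot.spi.data.FieldSpec;

public class MaxLengthCheckSketch {
  // Hypothetical helper: a maxLength change is treated as compatible only when it does not
  // shrink, since reducing it could truncate values already stored in the table.
  static boolean isMaxLengthChangeCompatible(FieldSpec oldSpec, FieldSpec newSpec) {
    return newSpec.getMaxLength() >= oldSpec.getMaxLength();
  }
}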