Mayank
1. Backward-compatible schema changes are safe (e.g. adding a new column, or a safe type change such as int -> long). Backward-incompatible changes, such as deleting a column or changing a column to an incompatible data type, are not allowed.
2. At LinkedIn, we usually roll out a change in phases so that it does not break a deployment. For example, you could deploy the change to all components with it turned off by default, and then turn it on in an order that does not break anything. I would need a bit more info on your specific change to comment on how to achieve that.
3. We have internal tools at LinkedIn, but it would be great to have them in open source as well. One project on our roadmap in this direction is a performance validation framework.
4. There are different ways we evaluate changes. For changes that are limited to a single node, you can use PerfBenchmarkRunner along with QueryRunner (to run at a specific QPS) on two different setups and compare them. For a change that impacts scatter/gather and needs an entire cluster, we have internal tools. We are hoping that the project mentioned above can evolve into something the community can also use.
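To make point 1 concrete, here is a minimal sketch of a backward-compatibility check between two schema versions. It is not Pinot's actual validation logic: the function name `incompatible_changes`, the `SAFE_TYPE_CHANGES` table, and the column names are all hypothetical, and the only safe type change listed is the int -> long case mentioned above.

```python
# Hypothetical table of safe type changes. Only INT -> LONG comes from the
# message above; extend it only with changes you have verified are safe.
SAFE_TYPE_CHANGES = {("INT", "LONG")}


def incompatible_changes(old_schema, new_schema):
    """Return descriptions of backward-incompatible changes.

    Both schemas are plain dicts mapping column name -> data type name.
    Columns that exist only in new_schema are additions and are allowed.
    """
    problems = []
    for column, old_type in old_schema.items():
        if column not in new_schema:
            # Deleting a column is backward incompatible.
            problems.append(f"column deleted: {column}")
            continue
        new_type = new_schema[column]
        if new_type != old_type and (old_type, new_type) not in SAFE_TYPE_CHANGES:
            # Any type change outside the safe table is rejected.
            problems.append(
                f"incompatible type change: {column} {old_type} -> {new_type}"
            )
    return problems


# Example with made-up columns: one safe type change plus one added column.
old = {"memberId": "INT", "country": "STRING"}
new = {"memberId": "LONG", "country": "STRING", "views": "LONG"}
print(incompatible_changes(old, new))
```

An empty list means every change is an addition or a listed safe type change; anything else should block the schema update.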
Dan Hill
07/05/2020, 3:29 AM