Hi Ray, we haven't really published any benchmarks because a lot depends on the workload, the richness of your metadata, the specific backends, and the resources you are running DataHub with.
There is a perf test harness at https://github.com/datahub-project/datahub/tree/master/perf-test that you can use to benchmark DataHub with your own setup.
Quite a few companies in the community are running it at scale in production, with millions of entities and, in some cases, tens of millions of relationships (edges). I think the Grab team reported ingesting their entire Hive warehouse (~80K datasets) in ~15 minutes, if memory serves me right.