Right. The entities themselves would be rather small; we'd want to use them to represent the physical resources that our datasets are tied to. The important thing would be the relationships.
So each entity might have a single string identifier field, and each one would be associated with some subset of our datasets.
I guess the question I'm getting at is: should the number of entities that DataHub can handle scale linearly with resources, or are there known bottlenecks at the application level that impose some order-of-magnitude limit on the cardinality of entity sets?
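
For concreteness, here's roughly the shape I have in mind, as a plain-Python sketch. All the names (`PhysicalResource`, `ResourceGraph`, `rack-01`, and so on) are made up for illustration and are not DataHub's actual model or API; the point is just that each entity carries one string and all the weight is in the edges.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PhysicalResource:
    """Hypothetical entity: nothing but a single string identifier."""
    resource_id: str


@dataclass
class ResourceGraph:
    """Hypothetical in-memory stand-in for the entity-to-dataset relationships."""
    # Maps each resource to the set of dataset names tied to it.
    datasets_by_resource: dict[PhysicalResource, set[str]] = field(default_factory=dict)

    def link(self, resource: PhysicalResource, dataset: str) -> None:
        """Associate a dataset with a resource (idempotent)."""
        self.datasets_by_resource.setdefault(resource, set()).add(dataset)

    def edge_count(self) -> int:
        """Total number of resource-to-dataset relationships."""
        return sum(len(d) for d in self.datasets_by_resource.values())


# Illustrative usage with made-up identifiers.
graph = ResourceGraph()
graph.link(PhysicalResource("rack-01"), "sales_daily")
graph.link(PhysicalResource("rack-01"), "sales_hourly")
graph.link(PhysicalResource("rack-02"), "sales_daily")
print(len(graph.datasets_by_resource), graph.edge_count())
```

So the thing that grows is the number of entities and edges, not the size of any one entity.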