I am using the Azure AD source to ingest users and...
# ingestion
m
I am using the Azure AD source to ingest users and groups from Azure AD, but I'm using groups_pattern and users_pattern since I only want to ingest specific users and groups. My AD contains thousands of entries, so this produces a huge log of filtered items that pollutes the logs without adding any real value. I still want the logs, since I need to know what is going on when things go sideways, so redirecting them to /dev/null is not an option. I could hack around it with grep, but I'd like to know if there is a way to disable some of the reporting. From reading the code, I don't think there is, but I might have missed something. I think the reporting is done via introspection of a dataclass, so the filtered list is printed if it is defined. Would there be a way (by modifying the existing code) to disable that list using a param passed to the AzureADSourceReport constructor? Instead of recording all the filtered names, I could simply keep a count...
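from dataclasses import dataclass, field
from typing import List
from datahub.ingestion.api.source import SourceReport

@dataclass
class AzureADSourceReport(SourceReport):
    # Every name rejected by users_pattern / groups_pattern is stored here.
    filtered: List[str] = field(default_factory=list)

    def report_filtered(self, name: str) -> None:
        self.filtered.append(name)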
e
I think you could. To my knowledge, we simply convert the object into a string without checking which fields are in it. You could add a numFiltered field with the count, and a noFilteredTracking flag that is set by the constructor.
Feel free to send out a PR on this! We can discuss the best way in the PR as well.
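A minimal sketch of what that suggestion could look like, not the actual change that went into the PR. The field names num_filtered and no_filtered_tracking are hypothetical snake_case versions of the names suggested above, and the empty SourceReport class is only a stand-in for DataHub's real base class so the snippet runs on its own.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SourceReport:
    """Stand-in for DataHub's base report class (assumption, for a runnable sketch)."""


@dataclass
class AzureADSourceReport(SourceReport):
    filtered: List[str] = field(default_factory=list)
    # Hypothetical fields, not part of the existing class:
    num_filtered: int = 0                # count is always maintained
    no_filtered_tracking: bool = False   # True -> do not record individual names

    def report_filtered(self, name: str) -> None:
        self.num_filtered += 1
        if not self.no_filtered_tracking:
            self.filtered.append(name)


# With tracking disabled, only the count grows; the list stays empty.
report = AzureADSourceReport(no_filtered_tracking=True)
for name in ("alice@example.com", "bob@example.com", "Some Group"):
    report.report_filtered(name)
print(report)
```

Since the report is turned into a string as a whole, the printed report would then show an empty filtered list and the count, instead of thousands of names. Leaving the flag at its default keeps the current behavior.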
m
I have coded something, but I'm also adding support for nested groups as it is missing and breaking my ingestion. It will all be in the same PR. I'll also reduce the code's cyclomatic complexity as it is very high. Taking next week off, so PR will be ready early March.
e
Awesome!!! We will be waiting for that amazing PR