curious why are you interested in moving off of Neo4j DataHub #getting-started

# getting-started

lively-judge-30357

06/28/2021, 6:24 PM

curious: why are you interested in moving off of Neo4j?

high-hospital-85984

06/28/2021, 6:26 PM

Main reason is probably the lack of (reasonably priced) managed solutions among our trusted providers (AWS, Aiven).

high-hospital-85984

06/28/2021, 6:26 PM

Also simplifying the deployment by using ES is a plus

➕ 1

big-carpet-38439

06/28/2021, 7:36 PM

cost is the primary reason we've heard

big-carpet-38439

06/28/2021, 7:36 PM

but yes, separately, reducing operational complexity too

mammoth-bear-12532

06/28/2021, 7:51 PM

@lively-judge-30357 @orange-night-91387: DGraph would be a great contribution. The two of you should talk. I'm happy to facilitate and help with design input.

orange-night-91387

06/28/2021, 7:58 PM

I have an initial POC that I've written up in our fork, but the performance metrics I've been getting aren't as optimistic as I had hoped. I've still been fiddling with the implementation to see if I can get it faster, but I've been focused on other, higher-priority things for us currently. I can convert it over to mesh with the OSS version and put it up in a branch when I get a chance so I can get more eyes on it. I haven't tried reaching out directly to the DGraph team for implementation suggestions yet either. There aren't a lot of open implementations to use as references, especially in Java.

orange-night-91387

06/28/2021, 8:01 PM

Our primary concern is ingestion speed; graph DBs are notoriously slow for streamed data sources. I also found this whitepaper, but I'm not sure if anyone has put out an actual implementation of it since it's relatively recent: https://arxiv.org/pdf/1905.08337.pdf. It still uses a more batch-style approach for its speed-ups, though.

lively-judge-30357

06/28/2021, 8:44 PM

Enrico has been doing some performance work on his Spark-DGraph connector. DGraph seems to scale pretty decently

lively-judge-30357

06/28/2021, 8:54 PM

at least, for what he was doing.

orange-night-91387

06/29/2021, 9:20 PM

Yeah, that was a big draw: it is designed distributed-first, as opposed to Neo4j, which is limited by its master-slave architecture. I haven't hit any problems with scaling the size of the data, I just didn't get the several-orders-of-magnitude ingestion speed boost I was hoping for with my connector to the current DAO logic 🙂 When I last checked, the upserts still got really bogged down (only performing slightly better than Neo) when I hammered it with a ton of events. Most of my tests have been local, though, and it totally could just be me poorly implementing the connector or not properly configuring the DGraph servers. In our deployments Neo4j is a big bottleneck that prevents us from ingesting lots of events in a timely fashion. Big bursts can slow us down to hour-long updates, which makes the graph not as usable.

mammoth-bear-12532

06/29/2021, 10:14 PM

@orange-night-91387 does it speed up if you are only doing appends (basically adding new edges + nodes to the graph) or removing edges / nodes, without doing read-modify-write?

orange-night-91387

06/30/2021, 3:51 PM

IIRC DGraph performed much better than Neo on modifications of existing nodes, but both performed less than ideally on the initial load. It's been a while since I ran the tests, though.
