# general
1. I did. One of the motivations for this is to use SQL for querying the experiments, and AFAIK, DVC does not support this (see the sketch below).
2. Read the artifacts? What about the parameters? I think those are stored in a database, right?
3. We store the plots as base64 strings.
4. Fair point. I didn't put much thought into the example. Maybe one where we train several models and compare them would be better for the documentation.

No worries, I highly appreciate your feedback!
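For context, here is roughly what the SQL-querying idea could look like. This is a minimal sketch: it assumes the tracker writes to a SQLite file (`experiments.db`) with an `experiments` table whose `parameters` column holds JSON, and the keys used below (`accuracy`, `confusion_matrix`) are hypothetical.

```python
import base64
import sqlite3

# Sketch only: the "experiments" table and the JSON keys below are
# assumptions about the schema, not the tool's documented layout.
conn = sqlite3.connect("experiments.db")

# Query experiments with plain SQL, e.g., top runs by a logged metric
# (json_extract needs SQLite's JSON1 extension, bundled in modern builds)
rows = conn.execute(
    """
    SELECT uuid, json_extract(parameters, '$.accuracy') AS accuracy
    FROM experiments
    ORDER BY accuracy DESC
    LIMIT 5
    """
).fetchall()

for uuid, accuracy in rows:
    print(uuid, accuracy)

# Since plots are stored as base64 strings, they can be decoded back to a file
(encoded,) = conn.execute(
    "SELECT json_extract(parameters, '$.confusion_matrix') "
    "FROM experiments LIMIT 1"
).fetchone()

with open("confusion_matrix.png", "wb") as f:
    f.write(base64.b64decode(encoded))
```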
E.g., creating an API on top of their file conventions that helps extract the needed info, or even constructing a db mirror of their files.
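A sketch of what such a db mirror could look like: shell out to `dvc exp show`, which can dump the experiments table, and load the result into SQLite. The JSON flag and output structure vary across DVC versions (older releases used `--show-json`), so treat this as an illustration rather than a working integration.

```python
import json
import sqlite3
import subprocess

# Assumption: `dvc exp show --json` dumps the experiments table; older DVC
# versions used `--show-json`, and the output structure may differ.
raw = subprocess.check_output(["dvc", "exp", "show", "--json"])
experiments = json.loads(raw)

conn = sqlite3.connect("mirror.db")
conn.execute("CREATE TABLE IF NOT EXISTS experiments (name TEXT, data TEXT)")

# Store each entry as a JSON blob; SQLite's json_extract() can pull
# individual parameters/metrics out later
for name, exp in experiments.items():
    conn.execute(
        "INSERT INTO experiments VALUES (?, ?)",
        (name, json.dumps(exp)),
    )

conn.commit()
conn.close()
```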
yes! we're considering doing this. It's too early to tell what will happen since this is the first release. we never expected such a big reaction to the project, given it's at such an early stage. I think right now we'll start focusing on talking to early adopters and figuring out next steps. so if you get a chance, please give it a try and let me know what you think!
Well, I always follow what you are doing with great interest; I have already learned a lot from you 😉 Just to be sure I am on the right track: the ultimate goal is to equip ploomber with experiment tracking capabilities, right? I am now at a new company where data scientists use notebooks a lot (at my previous one we used them very little). I was considering using ploomber, and I was thinking of using it together with dvc. But if I am not wrong, the ploomber and dvc pipeline definitions are very similar, and it's not so obvious how to make them play together. Overall, dvc is my favorite option for tracking ml runs; it would be awesome to make it play well with ploomber, but it's not clear to me how. Anyway, we are not there yet, so I haven't added either of the two (+ some colleagues insist that we should only use azure ml 😖). I am telling you this just to give you a potential user's point of view (I may be missing some information or be wrong; this was just my chain of thoughts).
Just to be sure I am on the right track: the ultimate goal is to equip ploomber with experiment tracking capabilities, right?
yes! but we also want people to be able to use the tools individually, so it'll be an independent tool with a ploomber integration
Overall dvc is my favorite option for tracking ml runs, it would be awesome to make it play well together with ploomber. But it’s not clear to me how.
interesting. I guess the dvc pipeline feature is well-integrated with the tracking feature. but they are not designed to work independently (e.g., ploomber for pipelines, dvc for tracking). if you find a workaround, let me know! we could write a blog post about it!
All Iterative tools are extremely modular, so I would be extremely surprised if it were not possible. And actually, the tracking/versioning is done by git 😅 What happens if you run a [dvc exp](https://dvc.org/doc/command-reference/exp/run) to run `fit.py` instead of `track_execution("fit.py", parameters=p, quiet=True, database="experiments.db")`? You could run a notebook instead of a .py too, no difference…
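Concretely, the swap could look like this (a sketch: it assumes `fit.py` is already wired up as a stage in `dvc.yaml` with its parameters in `params.yaml`, which is DVC's convention; the parameter name is made up):

```python
import subprocess

# Instead of track_execution("fit.py", parameters=p, ...), run the same
# script as a DVC experiment; -S/--set-param overrides a parameter for
# this run, and DVC records the result against the git history
subprocess.run(
    ["dvc", "exp", "run", "--set-param", "train.learning_rate=0.01"],
    check=True,
)
```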
You could try calling the SQLiteTracker directly. However, you won't get the feature that stores the plots. Now that I think about it, we should move that logic there, because currently it lives in a different package.
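For anyone reading along, a minimal sketch of calling SQLiteTracker directly, based on the sklearn-evaluation package it currently lives in (check its docs for the current API; the logged keys are just examples):

```python
from sklearn_evaluation import SQLiteTracker

tracker = SQLiteTracker("experiments.db")

# Create a new experiment record and get its uuid
uuid = tracker.new()

# Log parameters/metrics as a dict (example values)
tracker.update(uuid, {"model": "random_forest", "accuracy": 0.92})

# Experiments can then be queried with plain SQL
df = tracker.query(
    "SELECT uuid, json_extract(parameters, '$.accuracy') AS accuracy "
    "FROM experiments"
)
print(df)
```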