# general
p
Hi guys, I am a data engineer on an AI team. I stumbled upon Daft and see a lot of benefits in using it 🙂 I saw the llm_generate() function. I was wondering, does it also work with LLM proxy providers like LiteLLM? I also heard in the video about it that it runs in batches, like batch inference. But I was wondering if there might be a nice way to implement the real batch API from Azure OpenAI, OpenAI, or other providers: https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/batch?tabs=global-bat[…]ndard-input%2Cpython-key&pivots=programming-language-python
import os
from datetime import datetime  # needed for the expiration printout below

from openai import AzureOpenAI

client = AzureOpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version="2025-03-01-preview",
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
)

# Upload a file with a purpose of "batch"
file = client.files.create(
    file=open("test.jsonl", "rb"),
    purpose="batch",
    # Optional: expiry can be set between 1209600 and 2592000 seconds,
    # which is equivalent to 14-30 days.
    extra_body={"expires_after": {"seconds": 1209600, "anchor": "created_at"}},
)

print(file.model_dump_json(indent=2))

print(f"File expiration: {datetime.fromtimestamp(file.expires_at) if file.expires_at is not None else 'Not set'}")

file_id = file.id
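
Uploading is only the first step; after that you create the batch job and poll until it finishes. Roughly like this (a sketch following the linked docs; the endpoint path and status values may vary by provider and API version):

import time

# Create the batch job against the uploaded JSONL file
batch = client.batches.create(
    input_file_id=file_id,
    endpoint="/chat/completions",  # Azure path style; plain OpenAI uses "/v1/chat/completions"
    completion_window="24h",
)

# Poll until the job reaches a terminal state -- this can take minutes to hours
while True:
    batch = client.batches.retrieve(batch.id)
    if batch.status in ("completed", "failed", "expired", "cancelled"):
        break
    time.sleep(60)

# On success, download the results file
if batch.status == "completed":
    results = client.files.content(batch.output_file_id)
    print(results.text)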
j
Curious to hear what you think the behavior should look like for the batch APIs, since these APIs can take up to hours to run. Would you expect Daft to also be able to do things such as keeping track of unique requests to the backend?
p
That is a really good question I don't know the answer to, because one does not want to keep the cluster running for 24 hours, or even 24/7. I was thinking about something like:
• creating JSONL files
• sending the request
• creating a new column with the request ID, to fetch the status
• an additional fetch based on that column
So, maybe something like a send + fetch function (see the sketch below). But I understand that this is something untypical for dataframes so far. That is probably the reason why LangChain also has not implemented something like this yet.
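
A minimal sketch of what that send + fetch split could look like with Daft UDFs (the submit_batch / fetch_status names and the wiring are my own illustration, not an existing Daft API):

import daft
from openai import OpenAI

client = OpenAI()

@daft.udf(return_dtype=daft.DataType.string())
def submit_batch(jsonl_paths: daft.Series):
    # "send": upload each JSONL file and create a batch job, keeping its ID
    batch_ids = []
    for path in jsonl_paths.to_pylist():
        file = client.files.create(file=open(path, "rb"), purpose="batch")
        batch = client.batches.create(
            input_file_id=file.id,
            endpoint="/v1/chat/completions",
            completion_window="24h",
        )
        batch_ids.append(batch.id)
    return batch_ids

@daft.udf(return_dtype=daft.DataType.string())
def fetch_status(batch_ids: daft.Series):
    # "fetch": look up the current status for each stored batch ID
    return [client.batches.retrieve(b).status for b in batch_ids.to_pylist()]

df = daft.from_pydict({"jsonl": ["requests_0.jsonl", "requests_1.jsonl"]})
df = df.with_column("batch_id", submit_batch(df["jsonl"]))   # run once, persist this
df = df.with_column("status", fetch_status(df["batch_id"]))  # re-run later to poll

The batch_id column could be written out (e.g. to Parquet) so the fetch step can run as a separate, later job without keeping a cluster up in between.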
j
We view Daft as a data engine rather than a dataframe library, actually. For example, we don't really care about being best-in-class for analytics. Very interesting use case though.
❤️ 1
I might play around with some APIs and come back for your feedback 🙂
e
@Peer Schendel if you are looking to run inference in a vectorized fashion, llm_generate works with OpenAI-compatible proxies like LiteLLM through env vars or with explicit api_key / base_url overrides. Either way, it works great for keeping all of your inference requests organized.
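
For example, something along these lines (a sketch assuming a LiteLLM proxy on localhost:4000; the model name and key are placeholders, and the exact llm_generate signature should be checked against the Daft docs):

import daft
from daft.functions import llm_generate

df = daft.from_pydict({"prompt": ["What is a data engine?", "Summarize batch inference."]})

# Point the OpenAI-compatible client at a LiteLLM proxy instead of api.openai.com.
# base_url / api_key here are the explicit overrides mentioned above; alternatively,
# set OPENAI_BASE_URL / OPENAI_API_KEY as environment variables.
df = df.with_column(
    "answer",
    llm_generate(
        df["prompt"],
        model="gpt-4o-mini",                   # whatever model the proxy routes to
        provider="openai",
        base_url="http://localhost:4000/v1",   # assumed LiteLLM proxy address
        api_key="sk-litellm-key",              # assumed proxy key
    ),
)
df.show()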
❤️ 1
p
@Everett Kleven: thanks for the details! I will try it out, since it is really handy to work with Daft vectorized :) @jay: thank you for the answer :) I appreciate your help