# ask-for-help
Hello. Our team is running into memory management and CPU issues. We run on CPU only, not GPU, so the questions below are about CPU.

CPU questions:

1) A Python process can only keep one core busy with Python code because of the global interpreter lock (GIL). Even if I set `multi_thread=True`, threads only give concurrency, not parallelism. So to use all my cores, I would have to create as many processes as I have cores. But when I run BentoML, it creates only one runner process, which I assume can saturate only one core. How can I use all of my cores for a single model?

2) I compared one runner against two runners. First I sent 5000 requests to a single runner. Then I sent 2500 requests to one runner and the other 2500 to a second runner. Both setups took about the same time. I expected two runners to use two cores and finish faster. What am I doing wrong?

Memory question: when I run one process, TensorFlow takes a lot of memory (not model memory, but library memory). If I run multiple runners, each process takes its own copy of that library memory, so the total grows with the number of processes. I would like to share this memory among the runners. Please advise... I need help..
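For context on question 1, here is a minimal standalone sketch (plain Python, not BentoML-specific) of the GIL behavior being described: a CPU-bound function run across a thread pool stays effectively on one core, while a process pool can spread the same work across cores. The function and sizes here are illustrative, not taken from any real service.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def cpu_bound(n: int) -> int:
    # Pure-Python, CPU-bound work: the GIL serializes this across threads,
    # so only processes can run copies of it truly in parallel.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    work = [2_000_000] * 4  # four equal CPU-bound tasks

    t0 = time.perf_counter()
    with ThreadPoolExecutor(max_workers=4) as ex:
        thread_results = list(ex.map(cpu_bound, work))
    thread_time = time.perf_counter() - t0

    t0 = time.perf_counter()
    with ProcessPoolExecutor(max_workers=4) as ex:
        process_results = list(ex.map(cpu_bound, work))
    process_time = time.perf_counter() - t0

    assert thread_results == process_results
    print(f"threads:   {thread_time:.2f}s")
    print(f"processes: {process_time:.2f}s")
```

On a multi-core machine the process pool typically finishes this workload noticeably faster than the thread pool, which is the parallelism-vs-concurrency distinction raised above. Note the caveat for question 2: if the model's heavy math runs inside native code (NumPy, TensorFlow ops) that releases the GIL and is already internally multithreaded, a single process may already be using several cores, which would explain why splitting requests across two runners did not halve the latency.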