In the meantime I solved a lot of things; I will probably push today, but I got new hardware and want to get it working first.
- I will try the install process in a live setup; I think you answered my main concern.
- I meant something like a test framework for plugin devs. I created my plugin, fine; it is working on my machine, fine. How hard is it to try out with a fresh setup? I have no idea. (Actually, I have a better idea now, because I started looking into dockerized solutions and found some promising images.) It would be nice to have a small writeup with "what to do" steps that somebody with close to zero Python-ecosystem knowledge but basic ops knowledge can follow, to make sure the developed plugin will work as-is on other machines too. (But probably this is only my insecurity with the language and ecosystem, and it's not a problem for others.)
- solved, but thx
I definitely want to use Grafana. I don't think we need to add more functionality to the webui. We have good tools, and they scale better than reimplementing the same thing again and again in every project.
There is a huge debate between pull and push data consumption. Both have pros and cons. I think in general the pull method has better "real time" performance: you don't need to build and send an update whenever things change, you only need to inc/dec/set a variable. Also, most of the "problems" you mentioned are already handled. Prometheus is becoming the de-facto monitoring solution in the Docker/k8s world, and I have had no problems with it so far.
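The inc/dec/set point can be sketched like this (a minimal illustration; in real code the `prometheus_client` library provides equivalent `Counter`/`Gauge` types, and the metric name here is made up):

```python
# Sketch of pull-style instrumentation: updating a metric is just an
# in-memory variable mutation -- nothing is sent over the network.
class Gauge:
    """A value the app sets/incs/decs; the scraper reads it only on pull."""
    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self.value = 0.0

    def inc(self, amount=1.0):
        self.value += amount

    def dec(self, amount=1.0):
        self.value -= amount

    def set(self, value):
        self.value = float(value)

# Hypothetical metric: number of jobs currently waiting.
queue_depth = Gauge("queue_depth", "Number of jobs waiting")
queue_depth.set(10)   # things changed -- we just mutate the variable
queue_depth.inc()
queue_depth.dec(3)

# Whenever the server pulls, it sees whatever the current value is.
print(queue_depth.value)  # 8.0
```

The push model, by contrast, would need the application to build and transmit an update on every one of those three changes.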
I'm not pleased with the concept of InfluxDB as a monitoring solution. It's a reasonably good time-series DB, but you need to write quite a lot of machinery if you want to send/get/aggregate data (compared to Prometheus, where the whole thing is handled by the server). Also, the clients need to know who is consuming the data, they need to handle connection errors, etc. (Compare that with Prometheus, where you open a single endpoint and don't need to know whether it is consumed by one node or ten, at minute or hour intervals; the only thing you need to do is generate a text output for every request.)
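The "single endpoint, text output per request" part could look roughly like this (a stdlib-only sketch with hypothetical metric names; in practice `prometheus_client.start_http_server()` does all of this for you):

```python
# Sketch of a /metrics endpoint serving the Prometheus text exposition
# format. Metric names and values are hypothetical.
from http.server import BaseHTTPRequestHandler, HTTPServer

METRICS = {"plugin_requests_total": 42.0, "plugin_queue_depth": 8.0}

def render_metrics(metrics):
    """Render the current values as plain text, one "name value" per line."""
    return "".join(f"{name} {value}\n" for name, value in metrics.items())

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Stateless: any number of scrapers can pull at any interval;
        # the app never tracks who consumes the data or handles retries.
        body = render_metrics(METRICS).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 9100), MetricsHandler).serve_forever()
```

Connection errors, retries, and aggregation all live on the Prometheus server side; the client only answers GET requests.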
Also, the duplicated work is not really that much, I think. I wrote this from the ground up (without knowing Python, but with experience in a bunch of other programming languages) in about 8-12 hours, which includes the whole install and "I have no idea what I'm doing" phase, with documentation reading and googling "how to transform float to string in python". I can live with it if nobody else thinks it's useful.
tl;dr: I think I have close-to-good output with a lot of new metrics (not committed yet). I will work towards adding my local code and some infra code to the repo, and test how it actually works with a Prometheus/Grafana setup (and later on, start a Grafana dashboard).
I will check back (I hope) shortly!