VL.PythonNET and AI workflows like StreamDiffusion in vvvv gamma - wip

VL.PythonNET

The heart of this development is embedding the Python runtime within a vvvv process, allowing for direct interaction with Python code. This enables the use of libraries such as PyTorch, TensorFlow, and HuggingFace transformers, as well as the usual suspects like NumPy and Pandas, natively in the vvvv environment.

StreamDiffusion

As a first example, I’ve applied this to StreamDiffusion. After a lot of optimization work, we now have what I believe to be the fastest implementation available. Additionally, by achieving direct texture input and output, latency is reduced further as the data never leaves the GPU, creating a truly interactive experience.

Current Status and Early Access

This isn’t quite ready for prime time; the setup for StreamDiffusion with CUDA and TensorRT acceleration is complex, and I want to improve on that. But I’ve started a super early access program for those who can contribute to its development. A donation to support this project will get you early access, my support in setting it up, and a mention on the forthcoming project website.

If you’re interested in getting ahead of the curve and are in a position to support this project, drop me a line at my forename at gmail dot com or:

Element Chat
Instagram (some more videos there)
LinkedIn
Twitter
Facebook

Live Demo

Introduction and demo at the 24th vvvv worldwide meetup:

Outlook and further possibilities

The horizon for this integration is vast and with more development time, this can get really big.

ComfyUI

One particularly exciting possibility is to integrate ComfyUI, enabling the auto-import of ComfyUI workflows and potentially the seamless use of ComfyUI nodes as vvvv nodes. While ComfyUI is not geared towards real-time, it is a flexible and powerful GenAI toolkit.

Large Language Models

Already in the works: incorporating local LLMs, like the new LLaMA3 or Mistral, to integrate text or code generators.

Music and Audio Generation

Music generation models keep getting better, and they could be used to generate endless music streams that are interactively influenced.

Training and Fine-Tuning Models

While more complex than just running a model, training and fine-tuning from within vvvv opens the door to real-time live training for interactive projects that could learn over time.

Usability

Exploring multithreading and running Python in a background thread could improve the experience and would make it possible to run vvvv for visuals at a different frame rate than the Python code.
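To illustrate the background-thread idea, here is a minimal pure-Python sketch. It is not how VL.PythonNET works internally; `LatestValueWorker` and `slow_step` are hypothetical names made up for this example. A worker thread runs a slow step in a loop and stores only the newest result, while a faster loop (standing in for the render loop) polls that result without waiting for the computation.

```python
import threading
import time


class LatestValueWorker:
    """Runs a slow function in a background thread and keeps only its newest result."""

    def __init__(self, compute, interval=0.0):
        self._compute = compute        # hypothetical slow step, e.g. model inference
        self._interval = interval      # optional pause between runs, in seconds
        self._lock = threading.Lock()
        self._latest = None
        self._count = 0
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        while not self._stop.is_set():
            result = self._compute()   # the lock is NOT held while computing
            with self._lock:
                self._latest = result
                self._count += 1
            if self._interval:
                self._stop.wait(self._interval)

    def poll(self):
        """Return (latest_result, finished_runs) without waiting for the computation."""
        with self._lock:
            return self._latest, self._count

    def stop(self):
        self._stop.set()
        self._thread.join()


def slow_step():
    time.sleep(0.02)                   # stands in for a slow Python call
    return time.monotonic()


worker = LatestValueWorker(slow_step)
worker.start()
for _ in range(6):                     # stands in for the faster render loop
    latest, runs = worker.poll()       # may be None until the first run finishes
    time.sleep(0.01)
worker.stop()
```

The consumer always sees the most recent finished result and never a backlog, which is usually what a real-time visual wants. Python threads share the interpreter lock, so this helps most when the slow step spends its time sleeping, waiting on I/O, or inside native code that releases the lock.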

Also, vvvv’s node factory feature could be used to automatically import Python scripts or libraries and build a node set for them. For example, you could get the complete PyTorch library as nodes for high-performance data manipulation on the GPU.
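As a rough idea of the Python side of such an import, the sketch below uses the standard `inspect` module to list a module's public functions together with their parameter names, which is the kind of metadata a node set would need for its pins. The function name `describe_functions` is hypothetical, and nothing here touches the vvvv node factory itself.

```python
import importlib
import inspect


def describe_functions(module_name):
    """Map each public Python-level function of a module to its parameter names."""
    module = importlib.import_module(module_name)
    nodes = {}
    for name, obj in inspect.getmembers(module, inspect.isfunction):
        if name.startswith("_"):
            continue                   # skip private helpers
        try:
            signature = inspect.signature(obj)
        except (TypeError, ValueError):
            continue                   # no introspectable signature
        nodes[name] = list(signature.parameters)
    return nodes


# Example with a standard-library module:
for function_name, parameters in describe_functions("json").items():
    print(function_name, parameters)
```

`inspect.isfunction` only matches functions written in Python, so functions implemented in native code, as found in large compiled libraries, are skipped here and would need extra handling.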

Licensing

Currently, I do not intend to offer it for free or as open-source. The library will be available under a commercial license. However, an affordable hobbyist/personal use license will be available in a few months.

That’s it for now; I’ll post updates here when something new happens. If you have any questions or ideas, add them here. For issues or problems, please create a new thread or write in the Element channel.

Guys, I will never tire of saying that this is fascinatingly awesome.
This is one of the best things to happen to VVVV in years.

But I have a big request. Although I find neural networks and especially ComfyUI interesting, and the community is interested in them, can you specifically beta test VL.PythonNET? I have some experimental Python scripts that I cannot reproduce in VVVV and I would like to try to run them.

Although I find neural networks and especially ComfyUI interesting and community is interested in them, can you specifically beta test VL.PythonNET?

Yes, of course, VL.PythonNET is the core of this development. You do not need to use any neural network; you can run any Python code, as long as you create a venv with the right dependencies, or your Python installation or the machine already has everything installed to run the script.
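A quick way to check that requirement from Python itself is sketched below, using only the standard library. The helper names `missing_modules` and `in_virtual_env` and the example module list are assumptions for illustration, not part of VL.PythonNET.

```python
import importlib.util
import sys


def missing_modules(required):
    """Return the top-level module names the current interpreter cannot find."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in required if importlib.util.find_spec(name) is None]


def in_virtual_env():
    """True when the interpreter runs inside a venv (prefix differs from base prefix)."""
    return sys.prefix != sys.base_prefix


# Hypothetical dependency list of a script you want to run:
required = ["json", "numpy"]
print("inside a venv:", in_virtual_env())
print("missing modules:", missing_modules(required))
```

Running this with the same interpreter or venv that the script will use shows up front whether anything still needs to be installed.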

No, currently, I do not intend to offer it for free or as open-source. The library will be available under a commercial license. However, an affordable hobbyist/personal use license will be available in a few months.

EDIT: I’ve added a licensing section in the text above.

I can officially confirm that this is a game changer and an exceptional addition to the vvvv armada.

@tonfilm if there is any way I can help, or if you need me to provide content, feel free to ask!

Thanks again for all the hard work!

StreamDiffusion can now use all SD 2.1 ControlNets with the sd-turbo model, including TensorRT acceleration:

As ControlNet is another network that needs to be evaluated, the performance impact is about 40%: it went from 45 fps to about 25 fps on my laptop 4090 GPU. A desktop 4090 GPU could reach 40-60 fps.
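For reference, the quoted frame rates translate into per-frame time as follows. This is plain arithmetic on the two numbers above, not a new measurement, and the helper names are made up for the example.

```python
def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def fps_drop_percent(before, after):
    """Relative frame rate loss, in percent."""
    return (before - after) / before * 100.0


base_fps, controlnet_fps = 45.0, 25.0

base_ms = frame_time_ms(base_fps)              # about 22.2 ms per frame
controlnet_ms = frame_time_ms(controlnet_fps)  # 40.0 ms per frame
extra_ms = controlnet_ms - base_ms             # about 17.8 ms added per frame
drop = fps_drop_percent(base_fps, controlnet_fps)  # about 44.4 %

print(f"{base_ms:.1f} ms -> {controlnet_ms:.1f} ms (+{extra_ms:.1f} ms), {drop:.1f}% fewer frames")
```

So with these figures the ControlNet pass adds roughly 18 ms to each frame, which matches the rough estimate given above.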

Yes, I have looked at it. It is nice for getting the app set up, but it would need some modification to work with vvvv. I am already on their Discord and will talk with the developers to evaluate options when I have more time.

But the apps on there are mostly not real-time and my interest is mainly in high-performance real-time AI projects. I don’t see a big benefit in having non-realtime standalone apps in vvvv. But if you have use cases that you couldn’t do otherwise, please let me know.

Small update:
As commissioned work, the Wav2Vec audio AI model family has been added, and the inference works well in real-time:

In this picture: ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition · Hugging Face

Now also with Emotion2Vec in real-time with live audio input, which seems to work better than Wav2Vec:

And AST audio classification in real-time with live audio input:

It uses the MIT/ast-finetuned-audioset-10-10-0.4593 model, which can detect over 500 audio classes.

This work is absolutely stunning!

As mentioned already, this is bringing gamma to absolutely new heights, tapping into this huge ecosystem.

👏👏👏

hello @tonfilm ,

After I updated to the latest, and definitely easier to install, VL.StreamDiffusion, I ran into a performance issue.
The FPS keeps dropping as time passes. For instance, the example patch starts at around 27 fps when it initializes (fairly good for a laptop equipped with a 3070), but after a while (less than 10 mins) it drops down to 9 fps and obviously it crashes at some point.

Any hints or ideas on what could cause this?

nissidis:

but after a while (less than 10 mins) it drops down to 9 fps and obviously it crashes at some point.

It sounds like a throttling issue of the laptop, as the GPU is running at 100% all the time.

I could not reproduce this here. Could you check whether RAM or VRAM usage keeps growing over time?
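One way to check this from the Python side is to sample a memory counter at intervals and see whether it only ever goes up. The sketch below is a generic helper with hypothetical names (`sample_memory`, `is_growing`). It uses `tracemalloc` from the standard library, which only sees Python-level allocations; in a PyTorch setup one could, as an assumption about the environment, pass `torch.cuda.memory_allocated` as the reader to watch VRAM instead.

```python
import time
import tracemalloc


def sample_memory(read_bytes, samples=5, interval=0.0):
    """Call read_bytes() repeatedly and return the list of readings."""
    readings = []
    for _ in range(samples):
        readings.append(read_bytes())
        if interval:
            time.sleep(interval)
    return readings


def is_growing(readings, tolerance=0):
    """True if every reading exceeds the previous one by more than tolerance bytes."""
    steps = list(zip(readings, readings[1:]))
    return bool(steps) and all(after - before > tolerance for before, after in steps)


# Example reader: bytes currently allocated by Python code (RAM only).
tracemalloc.start()
python_heap = lambda: tracemalloc.get_traced_memory()[0]
readings = sample_memory(python_heap, samples=5, interval=0.01)
tracemalloc.stop()
print(readings, "growing:", is_growing(readings, tolerance=1024))
```

A steady climb over several minutes would point to a leak, while flat memory with falling FPS would fit the throttling explanation better.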

Let’s continue this on the Element channel or in a new thread. This thread is meant more for information purposes.