Description
Inspired by the CES 2025 keynote from NVIDIA:
https://www.youtube.com/live/k82RwXqZHY8?feature=shared&t=3030
The demo should combine Qt Quick 3D and an AI image generation service into an application for creating illustrations. The designer retains granular control over the layout of the scene by arranging 3D models and setting up the perspective. The AI then takes the rendered 3D scene as input and generates high-fidelity variations of it based on a text prompt.
Runtime environment
NVIDIA's approach of running the AI model in WSL while the application UI runs on Windows is not directly relevant for this demo, unless it turns out to be the easiest solution (e.g. because AI models are available as ready-made Docker containers). Using an online AI service rather than running the AI on the local device might be a reasonable first step.
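If an online service is used, the application would need to send the rendered scene together with the text prompt. The following is only a sketch of how such a request body might be assembled, using plain standard C++ so it stays independent of any Qt version. The JSON field names ("prompt", "image", "strength") and the helper names are hypothetical and not taken from any real service; an actual service will define its own schema. In the Qt application itself, QByteArray::toBase64() and QJsonObject would likely replace the hand-written helpers.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <sstream>
#include <string>
#include <vector>

// Standard Base64 encoding (RFC 4648 alphabet, '=' padding).
std::string base64Encode(const std::vector<std::uint8_t>& data) {
    static const char kAlphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(((data.size() + 2) / 3) * 4);
    std::size_t i = 0;
    // Encode complete 3-byte groups into 4 characters each.
    for (; i + 3 <= data.size(); i += 3) {
        const std::uint32_t n = (std::uint32_t(data[i]) << 16) |
                                (std::uint32_t(data[i + 1]) << 8) |
                                std::uint32_t(data[i + 2]);
        out += kAlphabet[(n >> 18) & 63];
        out += kAlphabet[(n >> 12) & 63];
        out += kAlphabet[(n >> 6) & 63];
        out += kAlphabet[n & 63];
    }
    // Encode the remaining 1 or 2 bytes and pad with '='.
    const std::size_t rest = data.size() - i;
    if (rest == 1) {
        const std::uint32_t n = std::uint32_t(data[i]) << 16;
        out += kAlphabet[(n >> 18) & 63];
        out += kAlphabet[(n >> 12) & 63];
        out += "==";
    } else if (rest == 2) {
        const std::uint32_t n = (std::uint32_t(data[i]) << 16) |
                                (std::uint32_t(data[i + 1]) << 8);
        out += kAlphabet[(n >> 18) & 63];
        out += kAlphabet[(n >> 12) & 63];
        out += kAlphabet[(n >> 6) & 63];
        out += '=';
    }
    return out;
}

// Escape a string so it can be placed inside a JSON string literal.
std::string jsonEscape(const std::string& text) {
    std::string out;
    for (unsigned char c : text) {
        switch (c) {
        case '"':  out += "\\\""; break;
        case '\\': out += "\\\\"; break;
        case '\n': out += "\\n";  break;
        case '\r': out += "\\r";  break;
        case '\t': out += "\\t";  break;
        default:
            if (c < 0x20) {
                char buf[7];
                std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(c));
                out += buf;
            } else {
                out += static_cast<char>(c);
            }
        }
    }
    return out;
}

// Build a request body from the prompt and the rendered scene (PNG bytes).
// ASSUMPTION: the field names and the "strength" parameter (how far the
// result may deviate from the input image) are purely illustrative.
std::string buildRequestBody(const std::string& prompt,
                             const std::vector<std::uint8_t>& renderedPng,
                             double strength) {
    std::ostringstream body;
    body << "{\"prompt\":\"" << jsonEscape(prompt) << "\","
         << "\"image\":\"" << base64Encode(renderedPng) << "\","
         << "\"strength\":" << strength << "}";
    return body.str();
}
```

The resulting string could then be posted to the service over HTTPS, for example with QNetworkAccessManager. Keeping the payload construction separate from the transport would make it easier to later swap the online service for a locally running model.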
Iterations
In a first iteration, the application should target desktop computers, but also work in WebAssembly. A second iteration should make a relevant subset of the application available on mobile platforms. A third iteration might take the scene composition process into a Virtual Reality environment and integrate voice interaction.
Technologies
The goal of the demo is to showcase how to use modern Qt frameworks and APIs in a productivity desktop application project; showing how such an application could have a customised look and feel is optional.
The application UI should be realised using Qt Quick (Controls). The design of the desktop and mobile application can either standardise on one of the Qt Quick Controls styles that are available on all platforms (Fluent WinUI3 style or one of the Qt Company design systems), or use the respective native style.
The first iteration can focus on mouse and keyboard interaction; the second iteration needs to be easy to use on a touch display; the third iteration in XR requires more UI research.
3D scene composition and model assets
The demo can use available open-source 3D models, or generate simple (untextured) models using an AI service (Hugging Face or similar). The demo application should include some pre-generated models for a good out-of-the-box experience.
Placing models?
Camera placement?
Lighting?
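The questions above are still open. As one possible starting point for camera placement, the designer could orbit the camera around a point of interest. The sketch below computes such a camera position in plain standard C++. The names Vec3 and orbitCameraPosition are hypothetical, and the convention used (Y axis up, yaw 0 placing the camera on the +Z side of the target) is an assumption chosen to match a camera that looks along -Z by default; it is not a decision made in this task.

```cpp
#include <cmath>

// Minimal 3D vector, only for this sketch.
struct Vec3 {
    double x;
    double y;
    double z;
};

// Position of a camera orbiting `target` at the given distance.
// yawDeg rotates around the vertical (Y) axis, pitchDeg lifts the
// camera above the horizontal plane; both are in degrees.
// ASSUMPTION: Y is up, and yaw 0 / pitch 0 puts the camera on the
// +Z side of the target.
Vec3 orbitCameraPosition(const Vec3& target, double distance,
                         double yawDeg, double pitchDeg) {
    const double kPi = 3.14159265358979323846;
    const double yaw = yawDeg * kPi / 180.0;
    const double pitch = pitchDeg * kPi / 180.0;
    // Length of the offset projected onto the horizontal plane.
    const double horizontal = distance * std::cos(pitch);
    return Vec3{
        target.x + horizontal * std::sin(yaw),
        target.y + distance * std::sin(pitch),
        target.z + horizontal * std::cos(yaw)
    };
}
```

The computed position could be assigned to the scene camera, which is then turned towards the target. Mapping mouse drags (first iteration) or touch gestures (second iteration) to yaw and pitch would give the designer direct control over the perspective.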