Generate lip-synced or audio-reactive video from a single image using the distilled 8-step LTX-2 model.