Post by account_disabled on Mar 7, 2024 8:47:03 GMT
micro-signals such as videos. In layman's terms, although the results of using a Transformer are good, the computing resources it requires are also frightening, which is not very economical. Of course, although OpenAI has secured various rounds of financing, it is still not that wealthy, so rather than simply throwing resources at the problem, they found another way to deal with the high compute cost. Here we must first introduce the concept of the "latent": a form of dimensionality reduction or compression that aims to express the essence of the information with less data. To give an imperfect but easy-to-understand analogy: it is as if we can use a three-view drawing to save and record the structure of a
simple three-dimensional object without having to save the three-dimensional object itself. OpenAI developed a video compression network for exactly this purpose: it first reduces the video's dimensionality into a latent space, and generation is then performed on the compressed video data. This shrinks the input and effectively relieves the computational pressure brought by the Transformer architecture. With most of these problems solved, OpenAI successfully fit its text-to-video model into the paradigm of large language models, which had already achieved huge success in the past, so it is hard to imagine it failing. In addition, OpenAI's choice of training route is also slightly different. They chose "original size
and duration training" instead of the industry's common practice of cutting videos into a preset standard size and duration before training. Training this way brings several benefits: ① the duration of the generated video can be customized more freely; ② the size of the generated video can be customized more freely; ③ the video will have better framing and composition. The first two points are easy to understand. For the third point, OpenAI gave an example: they compared a model trained on cropped-size videos with one trained on original-size videos. The left side shows video generated by the model trained on cropped videos, and the right side shows video generated by the model trained on original-size videos. In addition, for
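The two ideas above (latent compression cutting Transformer cost, and native-size training simply yielding variable-length sequences) can be made concrete with a toy back-of-envelope sketch. All strides and patch sizes below are made-up assumptions for illustration only, not OpenAI's published numbers:

```python
# Toy sketch: count the spacetime-patch "tokens" a video Transformer
# would process. Compressing the video into a latent grid first
# drastically shrinks the token count (and attention cost grows roughly
# quadratically with it); videos of different shapes just produce
# different sequence lengths, so no cropping is needed.

def ceil_div(a, b):
    return -(-a // b)

def token_count(frames, height, width, st=1, sh=1, sw=1, pt=2, ph=16, pw=16):
    """Token count after compressing each axis by stride (st, sh, sw)
    and cutting the result into pt x ph x pw patches (edges padded up)."""
    lt, lh, lw = ceil_div(frames, st), ceil_div(height, sh), ceil_div(width, sw)
    return ceil_div(lt, pt) * ceil_div(lh, ph) * ceil_div(lw, pw)

# Patching raw pixels vs. patching an 8x-spatially / 4x-temporally
# compressed latent (hypothetical strides):
raw = token_count(64, 512, 512)                    # 32*32*32 = 32768 tokens
lat = token_count(64, 512, 512, st=4, sh=8, sw=8)  # 8*4*4   = 128 tokens

# Native-size training: a square clip and a widescreen clip simply
# yield different sequence lengths.
square = token_count(64, 512, 512, 4, 8, 8)    # 128
wide = token_count(64, 720, 1280, 4, 8, 8)     # 8*6*10 = 480
print(raw, lat, square, wide)
```

Under these assumed strides the latent pass cuts the token count by a factor of 256, which is why compressing first makes the Transformer economical.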