Data Machina #229

11 months ago 96

Generative AI Video. Stability AI SVD. Meta AI EMU. The Bayesian Brain. Probabilistic ML. UltraFastBERT. Gorilla OpenFunctions. Voyage-01 Embeddings. GAIA. Orca 2. ALERTA-NET.

Generative AI Video. Tinkering around and playing with GenAI in regulated industries like finance, insurance, pharma, or telecomm can cost you a fortune. But in media, marketing, and creative industries you can play with stuff like GenAI video for fun & profit. Here’s the latest on GenAI video.

Stability AI SVD image-to video. This week, Stability AI introduced Stable Video Diffusion (SVD) Image-to-Video, a diffusion model that takes in a still image as a conditioning frame, and generates a video from it. This model was trained to generate 25 frames at resolution 576x1024 given a context frame of the same size, finetuned from SVD Image-to-Video [14 frames]. Checkout the repo, model: Stable Video Diffusion Image-to-Video Model Card.

Meta AI EMU SOTA text-to-video. Ten days ago, Meta AI researchers introduced Emu Video, a state of the art, simple method for text-to-video generation based on diffusion models. Emu Videos a unified model that can generate videos based on a variety of inputs: text only, image only, and both text and image. Checkout project website, demo, paper: Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning.

Baidu VideoGen HD text-to-video. Baidu VideoGen is a text-to-video generation model, that generates high-definition video with high frame fidelity and strong temporal consistency using reference-guided latent diffusion. VideoGen leverages Stable Diffusion, to generate an image with high content quality from the text prompt, as a reference image to guide video generation. Checkout paper, demos: VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation.

Alibaba I2VGen-XL Video Synthesis. Researchers at Baidu, introduced I2VGen-XL, a model that addresses the scarcity of well-aligned text-video data, the complex inherent structure of videos, and the difficulty of a model to simultaneously ensure semantic and qualitative excellence. was trained on 35 million single-shot text-video pairs and 6 billion text-image pairs. Checkout paper, repo and demos: I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models.

Shanghai AI Lab SEINE generative video transitions. SEINE, is a model that aims to generate high-quality long videos with smooth and creative transitions between scenes and varying lengths of shot-level videos. The model uses a random-mask video diffusion model that auto-generates transitions based on textual descriptions. and different scenes as inputs, combined with text-based control. Checkout paper, model, demos: SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction.

PKU-YUAN Video-LLaVA video-image understanding. This is a Large Vision-Language Model (LVLM) for visual-language understanding that learns from a mixed dataset of images and videos, mutually enhancing each other. Video-LLaVA unifies visual representation in a single mode, and outperforms models designed specifically for images or videos. Checkout paper, repo and demo: Video-LLaVA: Learning United Visual Representation by Alignment Before Projection.

Some AI activities for the weekend. The weather in London is cold and miserable. And I bet you’re sick & tired of reading about the OpenAI drama, and all the paid PR/ media propaganda on the Q* breakthrough, super AGI achieved, and AI existential risk. So don’t venture outside, instead:

Play Death by AI, a survival party game. Can you survive my judgment and cheat death? Invite your friends to the game. Describe a deadly scenario. Enter a survival strategy. AI decides if you survive.

Join The Church of AI. At some point AI will have God-like powers. Experience what happens when an intelligent machine is able to expand its intelligence exponentially for eternity.

Learn about Q-Learning. All of the sudden, there are so many “experts” talking about Q-learning! It’s worthwhile to teach them a lesson.

Have a nice week.

Subscribe now

10 Link-o-Troned

LeCun: “Ignore the Deluge of Complete Nonsense about Q*

Natural Intelligence is All You Need [tm]

The Bayesian Brain & Probabilistic Reasoning

ML and The Economics Team @Instacart

[free course] Probabilistic ML, 2023

Intro to Voyage-01 Embeddings - A New SOTA Model

Explaining the Stable Diffusion XL Latent Space

Scalable Causal ML for Personalisations @Uber

The Hitchhiker's Guide from CoT Reasoning to Language Agents

[free] The ML Engineering Online Book for Language & Vision Models


Share Data Machina with your friends


the ML Pythonista

Gorilla OpenFunctions - An Open source Alt to GPT-4 Functions

UltraFastBERT - 78x Faster than BERT

Intel neural-chat-7B v3: The New Leader in Small, Open LLMs

Deep & Other Learning Bits

Hands-On Deep Q-Learning (blog, repo)

Intro to Equivariant Neural Nets ?–? What, Why and How??

Diffuse, Attend, and Segment - Zero Shot Segmentation

AI/ DL ResearchDocs

Meta AI et. al - GAIA: A Benchmark for General AI Assistants

MS Research - Orca 2: Teaching Small LMs to Reason (paper, repo)

ALERTA-Net: A Temporal Distance-Aware RNN for Predicting Stocks Movements

data v-i-s-i-o-n-s

TensorSpace - Interactive, 3D Visualisations of Neural Nets

How I Improved Strava Training Log Visualisations (blog, code)

[free book] "Graphic presentation" (1939) – A Classic on Info Visualization

MLOps Untangled

Huggingface Model to Sagemaker: Auto MLOps with ZenML

How to Build a Feature Store: A Comprehensive Guide

Version Control & Rollback Strategies for ML Models with Modelbit

AI startups -> radar

Munch - AI for Increasing Video Engagement in Social Media

Forward CarePods - AI-Powered Self-Healthcare Pods

Aptus - AI-powered Machine Readable Financial Regulations

ML Datasets & Stuff

NVIDIA HelpSteer - 37K Prompts with Responses & Human Annotations

PolitiFact-Oslo: A New Dataset for Fake News Analysis & Detection

No Robots - 10K Instructions Created by Skilled Human Annotators

Postscript, etc

Enjoyed this post? Tell your friends about Data Machina. Thanks for reading.

Share

Tips? Suggestions? Feedback? email Carlos

Curated by @ds_ldn in the middle of the night.


View Entire Post

Read Entire Article