22 | AI Insights Every Business Leader Needs to Know | Google Cloud Next '24 | AI Music Innovation

AI Breakthroughs: Stanford's Report, Google's Gemini 1.5 Pro, Job Insights & Music Innovations

Wilson CELY
April 19, 2024

GEN AI AT WORK
The AI Index Report from Stanford HAI

The Stanford Institute for Human-Centered AI (HAI) has recently published its seventh annual AI Index report. This year's edition covers a range of topics, including the emergence of multimodal foundation models, significant financial investments in generative AI, new performance benchmarks, shifting global attitudes, and new major regulations.

For those who may not be familiar, HAI is a prestigious institution affiliated with Stanford University and is highly regarded within the US tech community. Its mission is to advance AI research, education, policy, and practice to improve the human condition.

We understand that you may have a busy schedule and might not have time to read the full 500-page report. That's why we've summarized the key findings in 5 points to provide a quick and easy-to-digest overview.

1. Generative AI becomes a magnet for massive investments

Corporate investment was down overall last year, but generative AI funding soared.

The report shows most of that private generative AI investment happened in the U.S.

No surprise there! 🤷🏽‍♀️

2. Google Leads the Charge in Foundation Model Releases

Foundation models are large-scale, versatile models that can perform a wide range of tasks. For instance, OpenAI's GPT-3 and GPT-4 are foundation models that empower ChatGPT users to generate code, compose songs, or create imaginative images.
Because training these models requires substantial resources, most are now developed by industry, while academia contributes only a small number.

Google released the most in 2023, which makes the criticism aimed at Google's CEO and senior management for not being the leading force in AI all the more striking. 🤷🏽‍♂️

As a reminder, the transformer architecture that enabled OpenAI's GPT models came from Google researchers, who introduced it in the 2017 paper "Attention Is All You Need".

3. Closed source vs Open source

The current hot topic in the AI field is whether foundation models should be open or closed. There are strong arguments on both sides, with some asserting that open models pose a risk, while others claim that open models fuel innovation.

But there's less discussion about whether there are meaningful performance trade-offs between the two. The report compares open and closed models on the following benchmarks:

HELM: Holistic Evaluation of Language Models, a broad benchmark covering many language tasks.
MMLU: Measures AI performance on multiple-choice questions across various subjects.
Chatbot Arena: Ranks chatbots using crowdsourced human votes on head-to-head conversations.
HumanEval: Assesses AI's coding proficiency by having it write code that solves programming problems.
SWE-bench: Evaluates AI's ability to resolve real software engineering issues taken from GitHub repositories.
MATH: A benchmark focused on the AI's capabilities in solving competition-level math problems.
GSM8K: A test of AI's ability to solve grade-school math word problems.
AgentBench: Measures the proficiency of AI in tasks that require agent-based decision-making skills.

4. Foundation models have gotten super expensive (money and carbon footprint)

Last week we explained the cost of training an LLM. AI companies rarely reveal what they spend on training their models, but the AI Index went beyond the typical speculation by collaborating with the AI research organization Epoch AI.

To come up with their cost estimates, the report explains, the Epoch team “analyzed training duration, as well as the type, quantity, and utilization rate of the training hardware” using information gleaned from publications, press releases, and technical reports.
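
To get a feel for how those inputs combine, here is a minimal Python sketch. It is our own simplification, not Epoch AI's actual methodology: it assumes the hardware is rented at a flat hourly price, and the function name and every number in the example are hypothetical placeholders that describe no real model.

```python
def estimate_training_cost(total_flop, chip_count, peak_flop_per_sec,
                           utilization, price_per_chip_hour):
    """Return (training_hours, cost_in_dollars) for a hypothetical training run."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    # Useful compute the whole cluster delivers per second.
    effective_flop_per_sec = chip_count * peak_flop_per_sec * utilization
    training_hours = total_flop / effective_flop_per_sec / 3600
    # Assumption: every chip is rented for the full duration of the run.
    cost = training_hours * chip_count * price_per_chip_hour
    return training_hours, cost


# Made-up inputs for illustration only.
hours, cost = estimate_training_cost(
    total_flop=1e24,            # total compute needed for the run
    chip_count=1000,            # quantity of training hardware
    peak_flop_per_sec=1e15,     # depends on the type of hardware
    utilization=0.4,            # fraction of peak speed actually achieved
    price_per_chip_hour=2.0,    # assumed rental price in dollars
)
print(f"{hours:,.0f} hours of training, about ${cost:,.0f}")
```

The takeaway is that utilization matters as much as the hardware itself: in this simplified model, doubling the utilization rate halves both the training time and the bill.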

The AI Index team also estimated the carbon footprint of certain large language models. The report notes that the variance between models is due to factors including model size, data center energy efficiency, and the carbon intensity of energy grids.

NOTE: Another chart in the report (not included here) shows a first guess at emissions related to inference, when a model is doing the work it was trained for, and calls for more disclosures on this topic.
As the report notes: “While the per-query emissions of inference may be relatively low, the total impact can surpass that of training when models are queried thousands, if not millions, of times daily.”
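
The arithmetic behind that quote is easy to sketch. The short Python example below computes how many days of queries it takes for cumulative inference emissions to match a one-time training footprint. The function name is ours, and all three input values are invented for illustration; none of them come from the report.

```python
def days_until_inference_matches_training(training_kg_co2e,
                                          per_query_kg_co2e,
                                          queries_per_day):
    """Days of serving after which inference emissions equal training emissions."""
    if per_query_kg_co2e <= 0 or queries_per_day <= 0:
        raise ValueError("per-query emissions and query volume must be positive")
    daily_inference_kg = per_query_kg_co2e * queries_per_day
    return training_kg_co2e / daily_inference_kg


# Hypothetical numbers: 500 tonnes for training, 2 grams per query,
# one million queries per day.
days = days_until_inference_matches_training(
    training_kg_co2e=500_000,
    per_query_kg_co2e=0.002,
    queries_per_day=1_000_000,
)
print(f"Inference matches training emissions after about {days:.0f} days")
```

With these made-up inputs the break-even point arrives in well under a year, which is why a tiny per-query number can still add up to a large total.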

5. AI can't beat humans at everything... yet

AI is Ahead
AI performance has surpassed humans on benchmarks like:

image classification,
English comprehension, and
visual reasoning.

AI is Still Behind
AI performance lags humans in complex tasks like:

planning,
visual commonsense reasoning, and
competition-level mathematics (but it's not far behind).

Source: https://aiindex.stanford.edu/report/

GEN AI STARTUP
Google Cloud Next Keynote 2024

Here is all you need to know about the keynote:

Google announces the Cloud TPU v5p, its most powerful AI accelerator yet

Google has introduced its new Gemini large language model (LLM) and Cloud TPU v5p, which is an upgraded version of its Cloud TPU v5e. The company claims that the new Cloud TPU v5p offers significantly faster performance and is more cost-effective for training large language models.

Google’s Gemini 1.5 Pro can now hear

Google's Gemini 1.5 Pro update enables the model to process audio files and extract information directly, without depending on written transcripts. This enhancement improves its performance compared to the previous version and introduces new features such as inpainting and outpainting for generating images.

Google announces Axion, its first custom Arm-based data center processor

Google Cloud has unveiled its first custom-built Arm processor, Axion, which delivers improved performance and energy efficiency compared to competing instances. Technical documentation and availability details for Axion are expected to be released later this year.

AI editing tools are coming to all Google Photos users

Google Photos is introducing AI-powered editing tools for all users, enabling them to improve their pictures even without advanced editing skills.

Google's Gemini 1.5 Pro enters public preview on Vertex AI

Google Cloud showcased updates to Vertex AI, its AI platform. Its Model Garden includes various open-source and closed-source models, including Gemini 1.5 Pro, Llama, Claude 3, Stable Diffusion, Mixtral 8x7B, and WizardCoder.

One particularly interesting model is Gemini 1.5 Pro, which can process vast amounts of information in a single stream because of its 1 million token context window. This allows it to answer questions about an hour-long video, for instance.

Vertex AI also includes CodeGemma, a new fine-tuned lightweight open model designed for coding, and Vertex AI Agent Framework, a new framework to build agents. However, the speaker finds this framework to be less sophisticated than he was expecting.

GEN AI AT WORK
AI Ecosystem Job Tracker

In this ever-evolving industry, it's crucial to keep up with the newest developments and career prospects.

Our job tracker will help you do just that by providing up-to-date information on open positions in various AI-focused organizations.

GEN AI PROTECTION
Boston Dynamics: Atlas 2

Boston Dynamics has introduced a new fully electric Atlas robot.

The new robot is said to be stronger, more dextrous, and more agile than any previous generation. It looks futuristic and is claimed to be ready for commercial use.

It is designed to be used in existing human infrastructure because the world is already built for humans.
A vast amount of data is available to train humanoid robots.
The robot is expected to be a valuable asset for Hyundai in their automotive factories.
The humanoid form factor allows robots to move in efficient ways in a world designed for people.

Take a look at the teaser video; it looks really impressive.

GEN AI STARTUP
AI Music Arms Race: Udio Challenges Suno with Crisper Generative AI Songs

The AI music generation space is heating up with Udio, a new competitor to Suno, that promises to provide musicians with a powerful tool to create and profit from their AI-generated music.

Udio's Debut

Udio, an AI model capable of generating high-fidelity songs from text prompts, emerged from stealth just a week ago. Despite being a newcomer, Udio has already raised $10 million in seed funding from prominent investors and celebrities, including Andreessen Horowitz, Instagram co-founder Mike Krieger, and musicians will.i.am and Common.

Udio's impressive debut raises the question: How is this startup already producing music comparable to the established Suno AI, with potentially crisper sound?

The Minds Behind Udio

Udio was created in December 2023 by a team of former Google DeepMind researchers, including CEO David Ding, Conor Durkan, Charlie Nash, Yaroslav Ganin, and Andrew Sanchez. This pedigree likely contributes to Udio's strong performance straight out of the gate.

This is how I created my summer hit with a simple prompt.

1) Hit up Udio (it's in Beta and free to make songs, link in the comments)
2) Drop in a simple prompt (no fancy tricks needed)
3) Have a blast!

Here are the links for both Suno and Udio.

Have fun!

Thank you, see you next week!
