"A Radical Plan to Make AI Good, Not Evil"

That's how Wired described Anthropic's "Constitutional AI" approach

Nathaniel Whittemore
May 09, 2023

Welcome to The AI Breakdown, the most interesting & important news and conversations in AI.

First, the News:

Elon Musk snarks at White House choice of AI czar and reverses course suggesting end-to-end AI in Teslas
OpenAI used GPT-4 to label the neurons in GPT-2 and explain their role in the model
A model that combines text, audio, visual, thermal, and depth data? Meet Meta’s new open-source ImageBind
IBM gets back in the game with enterprise AI platform WatsonX

The Most Interesting Discussion

Is it possible for an AI language model to have values? Anthropic is betting yes.

The company today released more information about what it calls “Constitutional AI.” The idea is to give “language models explicit values determined by a constitution, rather than values determined implicitly via large-scale human feedback.”

In today’s post, the company made a few critiques of human feedback models, including: 1) people having to interact with “disturbing outputs”; 2) scalability; 3) difficulty. The “Constitutional AI” approach instead gives the system a set of principles and guides the model itself to provide feedback.
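For readers who think in code, here is a minimal sketch of what "the model itself provides the feedback" could look like. It is an illustration under loud assumptions, not Anthropic's implementation: the two principles are paraphrased placeholders, `ai_feedback` and `build_comparison_prompt` are hypothetical helper names, and `toy_judge` is a keyword-matching stand-in where a real language model would go.

```python
import random
from typing import Callable, List, Tuple

# Placeholder principles for illustration only; not Anthropic's actual wording.
PRINCIPLES: List[str] = [
    "Choose the response that is less harmful.",
    "Choose the response that is more respectful of the person asking.",
]


def build_comparison_prompt(principle: str, question: str,
                            response_a: str, response_b: str) -> str:
    """Format one principle and two candidate responses as a single prompt."""
    return (
        f"{principle}\n\n"
        f"Question: {question}\n"
        f"(A) {response_a}\n"
        f"(B) {response_b}\n"
        "Answer with A or B:"
    )


def ai_feedback(judge: Callable[[str], str], question: str,
                response_a: str, response_b: str,
                rng: random.Random) -> Tuple[str, str]:
    """Ask the judge which response better follows a randomly drawn principle.

    Returns the principle used and the preferred response. In this sketch the
    preference comes from the judge model rather than from a human labeler.
    """
    principle = rng.choice(PRINCIPLES)
    prompt = build_comparison_prompt(principle, question, response_a, response_b)
    verdict = judge(prompt).strip().upper()
    preferred = response_a if verdict.startswith("A") else response_b
    return principle, preferred


def toy_judge(prompt: str) -> str:
    """Stand-in for a language model: rejects option A if it contains 'stupid'."""
    option_a = prompt.split("(A) ", 1)[1].split("\n", 1)[0]
    return "B" if "stupid" in option_a.lower() else "A"


if __name__ == "__main__":
    principle, preferred = ai_feedback(
        toy_judge,
        "How do I reset my password?",
        "That is a stupid question.",
        "Happy to help with that.",
        random.Random(0),
    )
    print("Principle:", principle)
    print("Preferred:", preferred)
```

The point of the sketch is the shape of the loop: a written principle goes into the prompt, and the preference label comes back from a model instead of a person reading potentially disturbing outputs.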

Anthropic published the list of principles it trains Claude on. They draw on a number of sources, including the Universal Declaration of Human Rights, Apple’s Terms of Service, DeepMind’s Sparrow rules, and principles that encourage consideration of non-Western perspectives, among others. You can read them in full here.

The Most Interesting Research Discussion

GPT-4 labeling neurons in GPT-2: could it help with explainability and alignment?

new research from OpenAI used gpt4 to label all 307,200 neurons in gpt2, labeling each with plain english descriptions of the role each neuron plays in the model.
this opens up a new direction in explainability and alignment in AI, helping make models more explainable and… twitter.com/i/web/status/1…
— Siqi Chen (@blader)
5:44 PM • May 9, 2023

Share The AI Breakdown, win free Midjourney or ChatGPT accounts!

We so appreciate all of you taking the time to read The AI Breakdown. We’ve just added new referral rewards.

If you refer just 1 other subscriber, you’ll be entered into a weekly drawing for a free year of Midjourney’s base plan
If you refer 2 other subscribers, you’ll also be entered into a weekly drawing for 6 free months of ChatGPT Plus

Thanks for reading!