Google is reclaiming its throne
And, more on Meta's ImageBind & 3D-AI space
GM! Welcome to The Status Code.
We’re like rugged rangers, protecting precious AI content for you to explore.
Here’s what we have for today:
🛋The Metaverse in your living room
👑Google is reclaiming its throne
🧊 A peek into 3D-AI space
(Estimated reading time: 4 minutes)
Not subscribed yet? Stay in the loop on AI each week by reading The Status Code for five minutes.
The two main stories of last week, if you have only ~2 minutes 45 seconds to spare:
1/ 🛋The Metaverse in your living room
Ever since the Metaverse's inception, Zuckerberg has claimed that it would be the future of the internet.
In 2016, he wanted a billion people on Facebook in virtual reality asap.
Here we are, 7 years later, and Gen-AI has massively taken over.
This week, Big M open-sourced a multisensory AI model that combines six types of data, called ImageBind.
Another exciting open-source project? Absolutely.
Meta claims that ImageBind is the first model to combine six data types into a single embedding space.
It’s similar to AI image generators like DALL-E, Stable Diffusion, and Midjourney, which pair words with images and generate visual scenes based on a text description.
But here’s what’s different with ImageBind.
It links images/video, text, 3D measurements (depth), audio, thermal (temperature) data, and motion data (from inertial measurement units).
On Midjourney, you can use prompts like,
“a pirate wearing a dress voyaging around the sea in a beach ball” and get a rather realistic photo of this absurd scene.
With multimodal AI tools like ImageBind, there’s more potential.
You can create a video of the pirate with ‘Aye Aye’ sounds and bring out the splashing noise of the water.
It’s a futuristic capability that could fuel movement in physical spaces and help machines better perceive their environment.
For now, ImageBind is still in the research phase.
But it promises a future where generative AI systems create truly multisensory experiences.
Combining data from primary senses like vision comes naturally to us.
The data-hungry system of multimodal learning is on its way to creating the ‘ultimate algorithm.’ With six modalities, this is a true multimodal AI, with more to be added in the future.
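The "single embedding space" idea is simpler than it sounds: every input, whatever its modality, gets mapped to a vector, and related inputs land near each other, so a text query can retrieve a matching audio clip. Here's a toy sketch of the retrieval side. The encoders are faked with random vectors; none of this is Meta's actual ImageBind code or API.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Pretend each modality has an encoder mapping its input to a 1024-d vector.
# We fake it: related items share a common direction plus small noise.
rng = np.random.default_rng(0)
dim = 1024
dog_concept = rng.normal(size=dim)    # shared "dog" direction
rain_concept = rng.normal(size=dim)   # shared "rain" direction

embeddings = {
    ("text",  "a dog barking"):  dog_concept  + 0.1 * rng.normal(size=dim),
    ("audio", "bark.wav"):       dog_concept  + 0.1 * rng.normal(size=dim),
    ("text",  "heavy rainfall"): rain_concept + 0.1 * rng.normal(size=dim),
    ("audio", "rain.wav"):       rain_concept + 0.1 * rng.normal(size=dim),
}

def retrieve(query_key):
    """Return the non-query item whose embedding is closest to the query's."""
    q = embeddings[query_key]
    others = {k: v for k, v in embeddings.items() if k != query_key}
    return max(others, key=lambda k: cosine_similarity(q, others[k]))

# Cross-modal retrieval: the text query finds the matching audio clip.
print(retrieve(("text", "a dog barking")))  # ('audio', 'bark.wav')
```

Because everything lives in one vector space, retrieval across any pair of modalities is the same nearest-neighbor lookup; the hard part (which the toy skips) is training the six encoders so that related inputs actually land close together.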
P.S. Meta also dropped their ad tool this Thursday. It’s called AI Sandbox.
And it lets you edit, test, and design ads for Facebook and Instagram.
2/ 👑Google is reclaiming its throne
Google is back for... everything!
Their I/O event is the talk of the town this week. And their stocks jumped 5% on Wednesday.
Last year's event focused on devices and features, but this year it's all about AI.
So, what's new?
1/ Help Me Write
No more Chrome extensions needed. Google is bringing a "Help Me Write" feature to Gmail to draft emails.
And it's coming to Docs, Slides, and Sheets too!
2/ Magic Editor
Think of it as an upgraded version of Apple's cutout feature. You can now change lighting and remove objects from photos.
3/ BARD
Rumors were flying about Big BARD, but guess what? We got an even better BARD!
Share to Gmail & Docs from results
Waitlist open to everyone
Coding upgrades and citation features
Big BARD is still in progress, but they announced something cooler - Gemini.
Gemini is a GPT competitor and a Google DeepMind project. They said BARD will slowly transition into Gemini.
This suggests BARD was planned as an experiment all along.
4/ PaLM 2
PaLM 2 is the successor to Google's PaLM model. It's trained on 5.4 trillion words (10x more than PaLM).
And it comes in four sizes: Gecko, Otter, Bison, and Unicorn.
They only shared details about Gecko, the smallest of the bunch. It'll be available offline and work on phones.
This is a sign that the next wave of AI will be native and offline.
However, there's a catch. Amid recent privacy concerns, Google hasn't said which data was used to train PaLM 2. They also haven't shared the hardware setup used for training.
They are also adding multimodality to Search. Since Search is their top revenue source, they will feature an ad right after the AI summary for a topic.
Question → search summary → ad → more results and traditional links.
They are also introducing a conversational mode. It looks similar to Bing's, but more powerful.
SEO will take a big hit from this. If you have a blog or a website, waiting things out for a while would be a good decision.
We also got the Workspace features. The "Help Me Write" email assistant is pretty solid.
Did you join the waitlist? If not, click here.
1 trend you can pounce on. Reading time: ~1 minute 20 seconds
🧊 A peek into 3D-AI space
Japanese creativity is famous for unique designs in both 3D and 2D, like Origami and Anime.
This goes back to the 80s, when Hideo Kodama pioneered 3D printing with a process called stereolithography (SLA).
Today, 3D printers start at $200, but the real cost is in the design: planning the framework, filling in the colors, and predicting movement.
But we’re here to tell you that this is changing.
This week, Google partnered with Adobe Firefly to create 3D canvases using Google's geospatial API. By combining the power of mapping and photography, they'll bring something unique.
OpenAI released a follow-up to its Point-E model, called Shap-E, which generates fine textures and complex shapes from text inputs.
This simplifies rendering and processing. The technology uses neural radiance fields (NeRFs), a technique popular in VR/AR that turns 3D scene data into photorealistic views.
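Some context on why this simplifies things: Shap-E doesn't spit out a mesh directly; it generates the parameters of an implicit function, a function that maps any 3D coordinate to the shape's properties, which can then be rendered or converted to a mesh. Here's a toy illustration of the implicit-function idea, using a hand-written signed distance function for a sphere (nothing here is generated by Shap-E):

```python
import numpy as np

def sphere_sdf(points, radius=1.0):
    """Signed distance to a sphere: negative inside, positive outside."""
    return np.linalg.norm(points, axis=-1) - radius

# Sample a coarse 3D grid and mark which cells fall inside the shape.
axis = np.linspace(-1.5, 1.5, 16)
grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
inside = sphere_sdf(grid) < 0.0   # boolean occupancy volume, shape (16, 16, 16)

# A mesh extractor (e.g. marching cubes) would turn this occupancy volume
# into triangles; here we just report the occupied fraction of the grid.
fraction = inside.mean()
print(f"{fraction:.2f} of the grid volume is inside the sphere")
```

The appeal of the implicit form is that one compact function describes the shape at any resolution; you only pay for the resolution you sample.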
But how can 3D AI development help humans?
1/ Medical diagnosis
AI models can analyze and create human body models.
Imagine a tumor's shape modeled before surgery for a rehearsal operation. A team of researchers did exactly that, and the operation was precise.
2/ Car manufacturing
Design flaws are the enemy of car manufacturing. AI can help create these models and test them in various scenarios. If we feed them enough crash-test footage, we can usher in a new era of vehicle safety.
3/ Augmented reality
The development of 3D AI models can make AR easier to introduce. Google, Apple, and Microsoft each have a project for just that:
Google: “Project Tango” - uses 3D sensors to create maps of real-world environments
Apple: “ARKit” - uses motion sensors of the iPhone to create AR experiences
Microsoft: “Hololens” - uses a holographic headset for AR experiences
Blockade Labs is also working on its project Skybox. It generates virtual environments from text prompts. It's not perfect, but it works.
Hugging Face currently hosts the fewest models in the object-detection category.
So, this is a big opportunity to invest in and contribute to this space.
🦾What happened this week
Cyborg Content released their AI blog post writer
Embedding.store, a hosted embedding marketplace, was released
No more re-prompting with InfiniteGPT
Eluna AI turns your thoughts into text
Google’s new generative AI, PaLM 2 is on its way
OpenAI’s new Shap-E tool allows you to generate 3D objects
Google is here with more AI tools for Workspace
An AI chatbot is going to replace human order-takers at Wendy’s
Google teases Project Tailwind - a prototype AI notebook
H2O AI launches H2O GPT and LLM Studio
Google upgrades Bard to compete with ChatGPT
AI model can detect Parkinson’s
Microsoft helping AMD expand into AI chips
Danish health tech startup Teton.ai collected €4.8 million in funding
Bessemer Venture Partners is committing $1B to invest in AI startups
Ashton Kutcher raised a $243M AI fund
AI startup Rewind gets a $350M valuation
Firmbase lands $12M in seed funding to upgrade AI financial planning tool
Allen Institute for AI raises $30M to boost more startups
Fintech firm Fundly.ai bags $3 million in seed funding
Lavita AI Raises $5M in Seed Funding
🐤Tweet of the week
AI Twitter these days. 👇🧵
— hardmaru (@hardmaru)
May 12, 2023
😂 Meme of the week
That’s it for this week, folks! If you want more, be sure to follow our Twitter (@CallMeAIGuy)
🤝 Share The Status Code with your friends!
Did you like today's issue?