Everyone is Doing Deep Research
Ongoing Responses to DeepSeek
Genomic AI Model, AI Co-Scientist, and New States of Matter
and more...
Last year's new gadget is this year's e-waste
Remember the Humane AI Pin? The one Marques Brownlee called the "Worst Product [He's] Ever Reviewed" last year? Well, it turns out the startup's $240M in funding couldn't save it from a bad review by MKBHD.
Humane announced this week that the AI Pin is being bricked next week, on February 28 at 3pm EST, less than 1 year after launch (April 2024), and support will be terminated effective at that time. The company will sell the remainder of its assets to HP for $116M, which will surely give deja vu to those of us who are former Palm webOS users.
All AI Pin devices purchased since the product launched are still under warranty. Users are "encourage[d] to recycle [the] Ai Pin through an e-waste recycling program."
New Models and More Deep Research Tools
Perplexity rolls out "Deep Research" tool for free
This week, Perplexity rolled out its Deep Research tool. Like the similarly titled Deep Research tools first released as part of Google Gemini and then as part of ChatGPT Pro, Perplexity's agentic research tool can review hundreds of sources on the web in a matter of minutes and generate impressive reports for users. Unlike Google Gemini and ChatGPT's Deep Research tools, Perplexity's is available for free.
If any trademark attorneys from OpenAI, Google, or Perplexity are reading this, I'm so sorry.
Grok 3 released, including DeepSearch
Entering the reasoning model race, Elon Musk's xAI released Grok 3 this week. In addition to boasting impressive benchmark results in science, math, and coding tasks, Grok 3 has comparatively loose guardrails on sensitive topics such as politics and explicit content relative to rival models from OpenAI, Anthropic, and Google. It is available to X Premium members, and for a limited time available to all.
In addition, Grok 3 comes with DeepSearch, yet another agentic research tool. At least this one isn't also called Deep Research. DeepSearch will only be available to users at the X Premium+ tier.
How do they compare?
In light of Perplexity and xAI entering the agentic research field this week, Tom's Hardware compared the performance of the research tools in Perplexity, Grok, and Gemini. The TL;DR is that Perplexity and Grok each won some of the tests, with Perplexity taking the overall crown. Given that ChatGPT's Deep Research is only available at the $200/month tier, it was left out of this comparison.
But fret not, as TechRadar compared ChatGPT and Perplexity's Deep Research tools. The ultimate conclusion was that "ChatGPT’s Deep Research was undeniably better in its final form from this brief test. ... Perplexity’s Deep Research, on the other hand, is great for those who want a lot of information collated quickly and relatively cheaply. It's a bit like a good abstract for a scholarly dissertation. ... Either way, human oversight is mandatory if you want to catch errors, verify sources, and ensure that conclusions make sense."
So there we have it: the Deep Research power rankings, unscientifically aggregated without complete independent verification, surely soon to be out-of-date:
1. ChatGPT Pro ($200/month)
2. Perplexity (free)
3. xAI Grok 3 (free for a limited time, then $32.92/month)
4. Google Gemini Advanced ($20/month)
Ongoing Responses to DeepSeek
As we all have heard by now, DeepSeek made waves when it launched its R1 reasoning model a few weeks ago. While the model is open weight and can be run locally without any concerns over data privacy, users worldwide have been rightfully concerned about the implications of a model subject to the Chinese Communist Party's censorship regime, and both users and governments have raised broader data privacy concerns. In a LinkedIn post a few weeks ago, I highlighted concerns around users submitting to the jurisdiction of Chinese courts when using DeepSeek's mobile app or website.
South Korea suspends downloads over data privacy concerns
On the subject of DeepSeek's data privacy concerns, South Korea followed in Texas' footsteps this week by ordering a halt on downloads of DeepSeek's mobile app. DeepSeek remains available to South Korean users via web or on mobile devices where it was already downloaded.
Per Reuters, the South Korean Personal Information Protection Commission cited statements from DeepSeek personnel acknowledging that the company had "partially neglected" some of its obligations under South Korea's data protection laws.
Perplexity releases R1 1776 fork of DeepSeek-R1
Perplexity saw an opportunity in the open weight nature of R1. Recognizing that "a major issue limiting R1's utility is its refusal to respond to sensitive topics, especially those that have been censored by the Chinese Communist Party (CCP)," Perplexity has released and open sourced "a version of the DeepSeek-R1 model that has been post-trained to provide unbiased, accurate, and factual information." It calls this model R1 1776.
Perplexity's announcement details the process it undertook to remove China's censorship from R1, and it's quite impressive. Check out the link above for details.
R1 1776 is available for free on Hugging Face or to developers through Perplexity's Sonar API.
Arc Institute and Nvidia release Evo 2 genomic AI
This week, Arc Institute and Nvidia announced the Evo 2 genomic AI. This open source model is trained on 9.3T tokens of DNA, and purports to be capable of understanding genomic interactions on a scale never previously possible. By purpose-building an LLM from the ground up using DNA as its native language, Arc Institute and Nvidia aim to accelerate research into genetic mutations and disease.
From Arc Institute co-founder Patrick Hsu on X:
This enables it to reason about and understand biological interactions across diverse length scales, from individual molecules to entire bacterial genomes or eukaryotic chromosomes. ... In other words, if you have a genetic mutation, Evo 2 has an opinion on whether or not it might cause disease.
Google launches AI co-scientist
Google Research has launched a new scientific research AI agent, which it calls an AI co-scientist. "AI co-scientist is a collaborative tool to help experts gather research and refine their work."
In the Limitations and Outlook portion of its announcement research paper, Google notes that they "look forward to responsible exploration of the potential of the AI co-scientist as an assistive tool for scientists." They go on to highlight that the AI co-scientist "project illustrates how collaborative and human-centred AI systems might be able to augment human ingenuity and accelerate scientific discovery."
Paranoid About Androids
Well, I needed a reason to rewatch Terminator and Blade Runner.
Clone Robotics announced Protoclone this week, "the world's first bipedal, musculoskeletal android." "The Protoclone is a faceless, anatomically accurate synthetic human with over 200 degrees of freedom, over 1,000 Myofibers, and 500 sensors." It looks terrifying.
Figure AI's Helix, the model behind its humanoid robots and also announced this week, is less terrifying. Figure touts Helix as "a generalist Vision-Language-Action (VLA) model that unifies perception, language understanding, and learned control to overcome multiple longstanding challenges in robotics."
While Clone's website and social media tout their achievements at simulating human bodies with striking anatomical precision, Helix's robots appear more focused on productive everyday tasks like picking things up and putting them down.
I just want a robot to pass the butter.
A New Challenger Approaches
Mira Murati (ex-CTO and brief interim CEO of OpenAI) has officially launched her AI startup, Thinking Machines Lab. Murati leads as CEO, with John Schulman (an OpenAI co-founder) as Chief Scientist. While the company's roadmap is not public, it is touting an "[e]mphasis on human-AI collaboration."
Altman considering open source small models
Surely having nothing to do with market trends favoring open source AI recently, OpenAI CEO Sam Altman ran a poll this week asking whether users would rather the company focus on an open source reasoning model like o3-mini or something that can run on a phone. The poll was close, but a model like o3-mini won out.
Anthropic's jailbreak challenge concludes
Following up on a prior week's story, Anthropic's jailbreak challenge has concluded. The company has paid $55,000 to the winners. From ML Researcher Jan Leike:
After 5 days, >300,000 messages, and est. 3,700 collective hours our system got broken. In the end 4 users passed all levels, 1 found a universal jailbreak.
Microsoft discovers a new state of matter
While this isn't my typical AI update, it caught my attention for the impact it is likely to have on AI development, among other areas.
This week, Microsoft announced Majorana 1, a breakthrough in quantum computing. Named for Italian physicist Ettore Majorana, the Majorana 1 chip relies on a new state of matter called topological superconductivity, which hosts the Majorana particles that its namesake theorized.
From Microsoft:
The advance stems from Microsoft’s innovations in the design and fabrication of gate-defined devices that combine indium arsenide (a semiconductor) and aluminum (a superconductor). When cooled to near absolute zero and tuned with magnetic fields, these devices form topological superconducting nanowires with Majorana Zero Modes (MZMs) at the wires’ ends.
Whereas quantum computers up to this point have impressed with around 100 qubits (Google's best-in-class Willow chip, for example, has 105), Microsoft says it believes Majorana 1 paves the way to scaling quantum processing up to 1,000,000 qubits in a small form factor.
This will still require a high level of control and extreme conditions (near absolute zero temperature, for example), but it is always impressive when concepts from theoretical physics are observed and brought under control.