The Latest
An Interview with Joanna Stern About Living With AI
Joanna Stern talks about her experiences with AI, her new book, and her plans to start a media company, highlighting the evolving role of AI in daily life.
Gaza Is Rebuilding With Lego-Like Bricks Made From Rubble
Gaza is pioneering a recycling project that transforms rubble into interlocking bricks, aiding reconstruction amid material shortages and destruction.
The Inference Shift
Cerebras Systems plans to raise its IPO price amid rising AI compute demand, signaling a shift toward heterogeneous AI hardware beyond GPUs.
vLLM V0 to V1: Correctness Before Corrections in RL
Hugging Face reports that vLLM V1 reached backend parity with V0 once logprob semantics, runtime defaults, weight updates, and the fp32 lm_head were fixed, before any changes were made to the RL objective.
Unlocking asynchronicity in continuous batching
Explores how asynchronous batching improves GPU utilization by decoupling CPU and GPU tasks, reducing idle time in continuous batching.
SpaceX and Anthropic, xAI’s Two Companies, Elon Musk and SpaceXAI’s Future
Examines SpaceX and Anthropic, xAI's two companies, and what they suggest about Elon Musk's future plans in AI and space tech.
Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context — Best Sub-100M Retrieval Quality
IBM releases two new open-source multilingual embedding models built on ModernBERT, supporting 200+ languages and code retrieval, under the Apache 2.0 license.
The best argument I’ve heard for why AI won’t take your job
A leading SaaS CEO argues AI will augment, not replace, human workers, emphasizing the durability of the last mile of human expertise amid fears of job loss.
EMO: Pretraining mixture of experts for emergent modularity
AI researchers introduce EMO, a mixture-of-experts model that naturally develops modular structure during pretraining, enabling selective expert use with minimal performance loss.
Orthrus-Qwen3: up to 7.8× tokens/forward on Qwen3, identical output distribution
Orthrus-Qwen3 adds parallel token generation to Qwen3 models, producing up to 7.8× as many tokens per forward pass while keeping the output distribution identical, according to its developers.