THIS WEEK IN AI: Week of 2nd March 2025
From Mastering LLM (Large Language Model)
In this issue, you will learn about:
- Microsoft’s Phi-4: A compact 14B parameter model excelling in mathematical reasoning.
- Mistral OCR: Processes up to 2000 pages per minute with multilingual support.
- Manus AI Agent: Autonomously executes tasks and achieves top results on the GAIA benchmark.
- Google’s AI Mode: A conversational search experience powered by Gemini 2.0.
- OpenAI’s NextGenAI Consortium: A $50M initiative to advance AI in healthcare, agriculture, and education.
Additionally, get insights on trending AI tools, open source projects, and must-read research papers.
Latest AI Developments
Phi-4: Microsoft’s Newest Small Language Model Specializing in Complex Reasoning
- Small Size, Big Impact: Phi-4 is a 14-billion-parameter model that shines in complex reasoning, especially mathematics, showing that a smaller, efficient design can rival larger models.
- A Math Genius in the AI World: It outperforms larger models on math benchmarks, making it a standout choice for advanced problem-solving tasks.
- Open to All, Explore and Learn: Available on Azure AI Foundry and Hugging Face, Phi-4 is easy to access for researchers, educators, and developers, encouraging experimentation and learning.
- Innovation with Responsibility: Microsoft pairs Phi-4 with safety tools and evaluations through Azure AI to support ethical and responsible use.
- Source: https://techcommunity.microsoft.com/blog/aiplatformblog/introducing-phi-4-microsoft%E2%80%99s-newest-small-language-model-specializing-in-comple/4357090
Mistral OCR’s AI-ready document processing
- Mistral AI has launched a new API, Mistral OCR, which can extract and comprehend detailed information from complex documents with exceptional speed and accuracy.
- The API can process up to 2000 pages per minute and supports multilingual analysis across thousands of languages, including Hindi and Arabic.
- Benchmark tests show that Mistral OCR outperforms competitors like Google’s Document AI, Azure OCR, and GPT-4o in various document analysis categories.
- The technology can be deployed on-premises, making it ideal for organizations handling sensitive data.
- Source: https://mistral.ai/en/news/mistral-ocr
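To put the quoted throughput in perspective, here is a rough back-of-the-envelope sketch. It assumes the "up to 2000 pages per minute" figure is a peak rate, so the result is a best-case lower bound on processing time, not a measured benchmark. The function name is hypothetical and is not part of any Mistral API.

```python
# Peak rate quoted in the announcement; sustained throughput may be lower.
PEAK_PAGES_PER_MINUTE = 2000


def best_case_seconds(pages: int, pages_per_minute: int = PEAK_PAGES_PER_MINUTE) -> float:
    """Lower bound on processing time, assuming the peak rate is sustained."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages / pages_per_minute * 60


# A 500-page report and a one-million-page archive at the peak rate.
report_seconds = best_case_seconds(500)
archive_hours = best_case_seconds(1_000_000) / 3600
print(f"500 pages: at least {report_seconds:.0f} s")
print(f"1,000,000 pages: at least {archive_hours:.1f} h")
```

Queueing, network transfer, and document complexity would all add to these figures in practice.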
China’s Manus AI Agent Breaks New Ground in Autonomous Task Execution
- Manus operates independently on real-world tasks — from resume screening to property research — using its own dedicated computer instance, eliminating the need for human intervention during execution.
- The agent combines advanced skills like coding, visual creation, and web browsing — even handling freelance platforms like Upwork and Fiverr, showcasing versatility beyond typical chatbots.
- It achieved state-of-the-art results on GAIA, a rigorous benchmark for AI assistants, outperforming giants like ChatGPT and Gemini in complex problem-solving.
- While currently invite-only, the startup plans to open-source Manus’s models later this year, potentially democratizing access to high-level AI autonomy.
- Source: https://manus.im/
Google’s AI Mode Ushers in a Conversational Future for Search
- AI Mode replaces traditional search with dynamic conversations: Powered by a custom Gemini 2.0 model, it executes “query fan-out” to launch simultaneous searches across diverse sources (real-time web data, Knowledge Graph, product databases), then synthesizes comprehensive answers with linked citations.
- Follow-up questions drive deeper exploration: Users can refine searches conversationally (e.g., “What happens to heart rate during deep sleep?”), mimicking chatbot interactions while retaining access to curated web links for verification.
- AI Overviews get smarter with Gemini 2.0: They have been upgraded to handle complex coding, advanced math, and multimodal queries (text/voice/images), and are now available to teens and to users who are not signed in, signaling broader adoption.
- Premium access signals Google’s AI ambitions: Currently exclusive to Google One AI Premium subscribers ($19.99/month) in the U.S., AI Mode tests a paid tier for power users, positioning Google against rivals like Perplexity and ChatGPT.
- Source: https://blog.google/products/search/ai-mode-search
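The “query fan-out” idea described above can be illustrated with a minimal sketch: send one query to several sources concurrently, then merge the results into a single answer with numbered citations. This is not Google’s implementation. The source names, the `search_source` stub, and the example.com URLs are all hypothetical placeholders standing in for real search backends.

```python
import asyncio

# Hypothetical source names mirroring the categories mentioned in the text.
SOURCES = ["web", "knowledge_graph", "products"]


async def search_source(source: str, query: str) -> dict:
    """Stub for one backend search; a real system would make a network call here."""
    await asyncio.sleep(0)  # yield control, simulating I/O
    return {
        "source": source,
        "snippet": f"{source} result for '{query}'",
        "url": f"https://example.com/{source}",  # placeholder URL
    }


async def fan_out(query: str, sources: list[str]) -> list[dict]:
    """Launch all searches concurrently; gather returns results in input order."""
    return await asyncio.gather(*(search_source(s, query) for s in sources))


def synthesize(query: str, results: list[dict]) -> str:
    """Combine snippets into one answer with numbered, linked citations."""
    lines = [f"Answer for: {query}"]
    for i, r in enumerate(results, start=1):
        lines.append(f"- {r['snippet']} [{i}]")
    lines.append("Sources:")
    for i, r in enumerate(results, start=1):
        lines.append(f"[{i}] {r['url']}")
    return "\n".join(lines)


query = "What happens to heart rate during deep sleep?"
results = asyncio.run(fan_out(query, SOURCES))
answer = synthesize(query, results)
print(answer)
```

A production system would presumably also rewrite the query per source, rank and deduplicate results, and use a language model for the synthesis step; the string concatenation here only marks where that step would sit.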
OpenAI’s $50M NextGenAI Consortium Reshapes Academic Research and Education
- NextGenAI injects $50M into 15 top institutions: Harvard, MIT, Oxford, and others will receive grants, compute resources, and API access to advance AI-driven projects in healthcare, agriculture, and education, aiming to “catalyze progress faster than any one institution could alone”.
- Real-world applications target high-impact challenges: Harvard researchers are using OpenAI tools to cut rare disease diagnosis times, while Ohio State applies AI to digital health and agriculture. Libraries such as the Boston Public Library are digitizing historical texts for public access.
- Building on ChatGPT Edu’s momentum: The consortium follows OpenAI’s 2024 launch of affordable AI tools for universities, even as rivals like Perplexity plan free Pro subscriptions for students, signaling a race to dominate academia’s AI adoption.
- Filling a U.S. research funding gap: Launched amid NSF budget cuts and layoffs, NextGenAI positions OpenAI as a critical ally for institutions, though critics note it may steer academia toward proprietary models over open-source alternatives.
- Source: https://openai.com/index/introducing-nextgenai/
Prepare for your next AI role
Prepare for Large Language Model & GenAI interviews by learning real interview questions from FAANG and Fortune 500 companies.
Learn the answers through a structured framework designed and tested at companies such as Google, Microsoft, Nvidia, and Apple.
LLM Interview Prep Course (LLM50 to get 50% off): https://www.masteringllm.com/course/llm-interview-questions-and-answers
Trending AI Tools
- Google Search AI Mode: Get well-reasoned answers to tough questions.
- Windsurf Wave 4: Agentic coding with IDE integration.
- Aya Vision: Cohere’s new SOTA multilingual vision model.
- Data Science Agent in Colab: The future of data analysis with Gemini.
- Browser Operator: Opera’s AI Browser Operator.
AI Must Read Papers
- MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents
- ABC: Achieving Better Control of Multimodal Embeddings using VLMs
- START: Self-taught Reasoner with Tools
- Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers
- Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs
- DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking
We truly value your input. Please share your thoughts in the comments to help us improve.