THIS WEEK IN AI — Week of 9th Feb 25
Feb 19, 2025
This weekly newsletter gives you a structured roundup of what you need to know in AI: news, jobs, tools, and projects.
Latest AI developments
Sam Altman speaks on GPT-5
- $500B Stargate Project: Aims to enable future models to develop new scientific knowledge.
- Programming Ranking: OpenAI’s internal model is ranked 50th globally, with potential to become №1 by year’s end.
- Open Source Shift: OpenAI plans to move towards open source, acknowledging society’s readiness for the associated trade-offs.
- Rapid AI Progress: Altman compared competing with AI to “trying to outrun the calculator” and expects AI to surpass human abilities in every general domain.
- Source: https://www.youtube.com/watch?v=8LmfkUb2uIY
GitHub Copilot: The agent awakens
- Agent Mode in VS Code: GitHub Copilot now offers an autonomous agent mode that iterates on its own code, detects errors, and even suggests terminal commands for self-healing.
- Copilot Edits GA: The new Copilot Edits feature is generally available in VS Code, enabling natural language-driven, multi-file inline edits in a conversational workflow.
- Enhanced AI Integration: The tool now integrates Gemini 2.0 Flash in the model picker, improving code completions, chat, and overall AI performance.
- Project Padawan Preview: A first look at autonomous SWE agents shows Copilot taking on routine development tasks — like generating fully tested pull requests — directly from issues.
- Source: https://github.blog/news-insights/product-news/github-copilot-the-agent-awakens/
ByteDance unveils Goku AI image and video creation
- Unified High-Performance Architecture: Goku sets benchmark records in both image and video quality with a unified design.
- Advanced Rectified Flow Technique: Enables seamless transitions between images and videos, powered by training on 160M images and 36M videos.
- Enhanced Goku+ for Marketing: Specifically optimized for advertising, it creates photorealistic human avatars and product demos.
- Specialized Goku+ Platform Tools: Offers dedicated features to turn product photos into video clips and facilitate realistic human–product interactions for commercial content.
- Source: https://arxiv.org/pdf/2502.04896
The Anthropic Economic Index
- Gradual Integration, Not Replacement: About 36% of occupations use AI in at least 25% of their tasks, indicating that AI is steadily being integrated into work rather than replacing jobs outright.
- Augmentation Over Automation: About 57% of AI usage augments human work and 43% automates tasks, pointing to a collaborative human-AI work model.
- Tech-Heavy Adoption: Software development and technical writing roles are leading AI usage, particularly in mid-to-high wage positions, highlighting varied readiness across sectors.
- Transparent Research Approach: The findings are based on millions of anonymized Claude.ai conversations, and the underlying dataset is open-sourced for deeper analysis by researchers and policy experts.
- Source: https://www.anthropic.com/news/the-anthropic-economic-index
Perplexity drops blazing new Sonar model
- Ultra-Fast Performance: Sonar delivers responses 10x faster than competitors like Gemini 2.0 Flash, powered by Cerebras inference infrastructure for near-instant answer generation.
- Superior Quality: In tests, Sonar outperformed GPT-4o and Claude 3.5 Sonnet in user satisfaction, factual accuracy, world knowledge, and other key benchmarks.
- Widespread Availability: All Perplexity Pro subscribers now receive Sonar as their default model, with API access expected soon under the same architecture.
- Upcoming Voice Mode: Perplexity CEO Aravind Srinivas teased that Voice Mode will be the only product reliably offering real-time voice answers for free.
- Source: https://www.perplexity.ai/hub/blog/meet-new-sonar
OpenAI roadmap for GPT-4.5 & GPT-5
- Integrated Advanced Reasoning: GPT-5 will embed o3’s capabilities along with other OpenAI tech, creating a unified system that dynamically adjusts intelligence levels.
- Tiered Access Model: Free users get unlimited access to GPT-5 at “standard intelligence,” while Plus and Pro tiers unlock progressively higher performance and advanced tools.
- Predecessor Release: Before GPT-5, GPT-4.5 (codenamed “Orion”) will be launched as the final non-chain-of-thought model, marking the transition toward more reasoning-based AI.
- Timeline: According to Altman, GPT-4.5 is expected in weeks, with GPT-5 following in months, and o3 will no longer be released as a standalone model.
- Source: https://x.com/sama/status/1889755723078443244
Gemini 2.0 Flash leads new AI agent leaderboard
- Comprehensive Evaluation: 17 top LLMs were benchmarked across 14 tests covering tool usage, long context, complex interactions, and more.
- Top Performer: Gemini 2.0 Flash led the leaderboard with a score of 0.938, outperforming pricier competitors.
- Open-Source Rise: Models like Mistral’s latest Small release are achieving scores on par with premium offerings at lower costs.
- Future Inclusion: DeepSeek’s V3 and R1 models were not tested due to missing function calling support, but will be evaluated if updated.
- Source: https://www.galileo.ai/blog/agent-leaderboard
Trending AI Tools
- Tough Tongue — Multimodal AI agent for navigating difficult conversations
- Pikadditions — New video-to-video feature that lets users integrate any subject or object into existing footage
- Le Chat — Mistral’s revamped AI assistant platform with 10x response speed and new iOS and Android apps
- Memex — General-purpose, Level 3 autonomy builder
AI Tutorials
- Train your own R1 reasoning model with Unsloth (GRPO) — https://unsloth.ai/blog/r1-reasoning
- Optimizing Qwen2.5-Coder Throughput with NVIDIA TensorRT-LLM Lookahead Decoding — https://developer.nvidia.com/blog/optimizing-qwen2-5-coder-throughput-with-nvidia-tensorrt-llm-lookahead-decoding/
Open Source AI Projects
- Zonos-v0.1 — Open-source, real-time TTS models with voice cloning
- Complex Function Calling Benchmark (ComplexFuncBench) — Designed for complex function calling evaluation
AI Must Read Papers
- DeepMind’s AlphaGeometry2 reaches gold-medalist performance on olympiad geometry problems — https://arxiv.org/pdf/2502.03544
- Fino1: On the Transferability of Reasoning Enhanced LLMs to Finance — https://arxiv.org/pdf/2502.08127
- Light-A-Video: Training-free Video Relighting via Progressive Light Fusion — https://arxiv.org/pdf/2502.08590
- TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation — https://arxiv.org/pdf/2502.07870
We truly value your input. Please share your thoughts in the comments to help us improve.