🖥️ Microsoft Copilot Adds “Computer Use” Capability, AI Agents Take Key Step in Desktop Automation

Microsoft has added a new “computer use” feature to its Copilot Studio platform, enabling AI agents to directly operate the graphical user interfaces (GUIs) of desktop applications and websites. Copilot agents are therefore no longer limited to systems that expose APIs; they can act like a human user, clicking buttons and filling in forms, adapting to interface changes and recovering from errors in the process. This significantly broadens the range of tasks that can be automated and marks a key step toward more powerful desktop automation.

Highlights:

  1. Interface Control: Copilot agents can now directly interact with any program or website that has a GUI, whether it’s desktop software or a browser page, treating them like operable tools. This opens new possibilities for automating legacy systems or specific applications lacking APIs.
  2. Intelligent Adaptation: Built-in AI reasoning allows the agent to “understand” and adapt to application interface changes in real time, autonomously resolving issues encountered during automation. Unlike traditional RPA, which tends to break when an interface is updated, it is designed to be more stable and reliable.
  3. No-Code Operation: Users can describe the desired automation task using natural language and test/adjust instructions via a side-by-side video preview (showing the AI’s reasoning and planned UI actions), eliminating the need for coding and significantly lowering the barrier to entry.
  4. Cloud Hosting: Automation tasks run on Microsoft’s cloud servers, so businesses do not have to deploy and manage their own RPA server environments. This reduces maintenance cost and hassle and, according to Microsoft, keeps data secure.
  5. Wide Applications: Use cases include cross-system automated data entry, automated market research (collecting and organizing online information), and simplified financial operations (extracting invoice data and entering it into accounting systems).
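
To make the contrast with traditional RPA in item 2 concrete, here is a toy sketch in Python. It is not how Copilot Studio works internally, which the announcement does not describe; the UI model, element ids, and the `find_by_intent` helper are all hypothetical. It only illustrates why a lookup tied to a fixed element id breaks after a redesign, while a lookup based on what the element says can survive it.

```python
from difflib import SequenceMatcher

def find_by_exact_id(ui, element_id):
    """Selector-style lookup: returns None as soon as the id changes."""
    return next((e for e in ui if e["id"] == element_id), None)

def find_by_intent(ui, intent, threshold=0.5):
    """Pick the element whose visible label best matches a text intent.

    String similarity is a crude stand-in for model-based reasoning.
    """
    def score(element):
        return SequenceMatcher(None, intent.lower(), element["label"].lower()).ratio()

    best = max(ui, key=score, default=None)
    if best is None or score(best) < threshold:
        return None
    return best

# Hypothetical screen before and after a UI update renames the submit button.
old_ui = [{"id": "btn_submit", "label": "Submit"},
          {"id": "btn_cancel", "label": "Cancel"}]
new_ui = [{"id": "primary_action_7", "label": "Submit form"},
          {"id": "btn_cancel", "label": "Cancel"}]

print(find_by_exact_id(new_ui, "btn_submit"))  # the hard-coded id is gone
print(find_by_intent(new_ui, "submit"))        # the label still matches
```

A production agent would work from screenshots or accessibility trees rather than a list of dictionaries, but the failure mode of the fixed selector is the same.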

Value & Impact:

  • For Professionals: This technology represents an intelligent upgrade to RPA, lowering the threshold for UI automation. Tasks previously requiring professional RPA developers might now be created by business users via natural language. Its adaptability reduces maintenance costs. For businesses already in the Microsoft ecosystem, it’s a powerful, seamlessly integrated enhancement for automation capabilities.
  • For General Users: While primarily aimed at enterprise development, this feature hints at future changes in how we interact with computers. Imagine using voice or simple text commands to have an AI perform complex computer operations for you, like “organize these files and send them to so-and-so,” greatly enhancing personal productivity.

🔍 Anthropic Claude Gains Autonomous Research Ability, Deeply Integrates Google Workspace to Create Context-Aware Work Partner

Anthropic has rolled out two major updates for its AI assistant Claude: first, a new “Research” feature enabling Claude to act like an intelligent agent, autonomously searching the web and user-authorized work files for multi-step, in-depth information gathering; second, deep integration with Google Workspace (Gmail, Calendar, Docs), allowing Claude to understand user work context to provide assistance without needing repeated explanations or file uploads. The goal is to make Claude a more knowledgeable and capable intelligent work partner.

Highlights:

  1. Autonomous Research: Claude acts like a researcher, proactively conducting multi-round searches across the web and your authorized work files based on your query, systematically exploring the issue, and finally delivering a detailed report with citations.
  2. Deep Integration: Through a secure connection, Claude can directly access your Gmail, Google Calendar, and Docs content (with permission), understanding your email exchanges, schedule, and document information to better grasp your needs and offer relevant help.
  3. Context Awareness: By combining public web information with private work context, Claude can provide highly personalized and relevant assistance. For instance, it can combine meeting invitation emails, related documents, and follow-up emails to automatically summarize meeting minutes and identify action items.
  4. Enterprise Enhancements: Enterprise plan users get enhanced document cataloging features, utilizing RAG technology for efficient and accurate information retrieval within large internal document repositories.
  5. Phased Rollout: The “Research” feature is currently in early testing for select paid users in the US, Japan, and Brazil; Workspace integration is available in beta for all paid users (Team/Enterprise plans require admin enablement).
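
Item 4 mentions RAG (retrieval-augmented generation). As a rough illustration of the retrieval half of that idea, the sketch below ranks a few documents against a query and puts the best matches into a prompt. It is a minimal example under stated assumptions, not Anthropic's implementation: the file names and contents are invented, and word-count cosine similarity stands in for the learned embeddings a real system would typically use.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query, docs, k=2):
    """Return the names of the k documents most similar to the query."""
    q = Counter(tokenize(query))
    scored = sorted(((cosine(q, Counter(tokenize(text))), name)
                     for name, text in docs.items()), reverse=True)
    return [name for score, name in scored[:k] if score > 0]

def build_prompt(query, docs, k=2):
    """Prepend the retrieved passages so the model can answer from them."""
    context = "\n".join(f"[{name}] {docs[name]}" for name in retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical internal documents.
docs = {
    "q3_budget.txt": "Q3 budget review: marketing spend rose 12 percent",
    "onboarding.txt": "New hire onboarding checklist and laptop setup",
    "offsite.txt": "Team offsite agenda and travel budget",
}

print(build_prompt("What changed in the Q3 budget?", docs))
```

The point of the pattern is that the model only sees the handful of passages that matter, which is what makes it workable over large document repositories.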

Value & Impact:

  • For Professionals: Claude is attempting to bridge the “context gap” in AI assistants. By securely accessing personal work data, it can offer personalization and contextual support far beyond generic AI, potentially evolving from a Q&A tool into a true workflow partner, improving information processing and task execution efficiency. Its autonomous research capability also allows it to handle more complex information gathering and analysis tasks.
  • For General Users: This means future AI assistants might better understand your personal needs and work habits. Imagine an AI automatically summarizing email key points, reminding you to prepare meeting materials based on your calendar, or even proactively providing relevant background info while you write a document, making work and life easier. Of course, data privacy and security will be paramount concerns for users.

💰 OpenAI Reportedly in Talks to Acquire Programming Platform Windsurf for $3 Billion, Upping Stakes in AI Coding Race

According to multiple media reports, OpenAI is in advanced talks to acquire AI coding assistant developer Windsurf (formerly Codeium) for potentially up to $3 billion. If successful, this would be OpenAI’s largest acquisition to date. Windsurf focuses on using AI to boost developer productivity and is a strong competitor to tools like GitHub Copilot and Cursor. The potential acquisition highlights OpenAI’s ambitions in the fiercely competitive AI coding market.

Highlights:

  1. Massive Deal: The rumored $3 billion price tag is an extremely high valuation for a company with reported annual recurring revenue (ARR) of about $40 million, signaling OpenAI’s strategic focus on the AI coding sector.
  2. Target Company: Windsurf (formerly Codeium), founded in 2021, is a startup that has raised over $200 million, and its AI coding tools are popular among developers.
  3. Soaring Valuation: Before the acquisition talks, Windsurf was reportedly seeking a new funding round at a valuation close to $3 billion, more than double the $1.25 billion valuation it reached in 2024.
  4. Strategic Move: Acquiring Windsurf would directly bolster OpenAI’s position in the AI coding assistant market, enabling it to better compete against rivals like Microsoft’s GitHub Copilot and Anthropic’s Claude.
  5. Potential Conflict: Interestingly, OpenAI’s venture fund previously invested in Cursor, a direct competitor to Windsurf. This acquisition could spark discussions about platform neutrality and potential conflicts of interest.
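
The multiples implied by the reported figures are easy to check. The short Python calculation below uses only the numbers quoted in this item; the variable names are illustrative, and the inputs are press-reported estimates rather than confirmed financials.

```python
# Figures as reported in this item, in US dollars.
deal_price = 3_000_000_000        # rumored acquisition price
reported_arr = 40_000_000         # reported annual recurring revenue
prior_valuation = 1_250_000_000   # previous reported valuation

def multiple(numerator, denominator):
    """Return how many times larger the numerator is than the denominator."""
    return numerator / denominator

arr_multiple = multiple(deal_price, reported_arr)
step_up = multiple(deal_price, prior_valuation)

print(f"Price / ARR: {arr_multiple:.0f}x")
print(f"Price / prior valuation: {step_up:.1f}x")
```

At roughly 75 times ARR and 2.4 times the prior valuation, the price reflects expectations about growth and strategic value far more than current revenue.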

Value & Impact:

  • For Professionals (especially Developers): OpenAI’s entry could accelerate consolidation and competition in the AI coding tool space. Developers might benefit from more powerful tools, but the market could become more concentrated with fewer choices. It underscores the trend of AI empowering development, making AI-assisted programming skills increasingly important.
  • For General Users: While this is a developer-focused tool, improved AI coding efficiency ultimately speeds up software development and iteration. The apps and software we use might get new features and improvements faster. It also reflects how AI is penetrating various industries and changing traditional work methods.

🚗 Li Auto’s MindGPT 3.0 Launches, Upgrading In-Car AI Assistant’s Deep Thinking Capabilities

Li Auto announced that its self-developed MindGPT large model has been upgraded to version 3.0 and fully integrated into its smart assistant “Li Xiang Tongxue.” The core of this upgrade is a significant enhancement in the AI’s “deep thinking” capabilities, reportedly comparable to leading industry models like DeepSeek. The new version also introduces features like chain-of-thought display, reflection-based re-retrieval, improved voice understanding, and irrelevant dialogue filtering, aiming for a smarter, more reliable, and transparent in-car AI interaction experience.

Highlights:

  1. Top-Tier Benchmark: Li Auto claims MindGPT 3.0’s deep thinking ability reaches industry-leading levels, directly comparing it to well-known DeepSeek models.
  2. Transparent Thinking: Adds a structured chain-of-thought display that lets users see how the AI reasons its way to an answer, increasing explainability and user trust.
  3. Reflective Optimization: Possesses “reflection and re-retrieval” capability; the AI proactively reviews initial answers and seeks more comprehensive, accurate information to supplement and optimize them.
  4. Enhanced Voice: Improved understanding and fault tolerance for colloquial or less clear voice commands, better adapting to interactions in noisy car environments.
  5. Dialogue Filtering: Includes “irrelevant history dialogue filtering” to prevent the AI from incorrectly linking unrelated topics in multi-turn conversations, improving response relevance and accuracy.
  6. Ecosystem Expansion: Enhanced tool-calling capabilities, such as querying real-time stock or ticketing information, expanding the in-car assistant’s functional scope.
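
Li Auto has not published how the “irrelevant history dialogue filtering” in item 5 is implemented. Purely as an illustration of the idea, the Python sketch below keeps only the earlier turns that share a content word with the current request; the stopword list, the overlap rule, and the sample dialogue are all assumptions, and a production system would more plausibly use semantic similarity than literal word overlap.

```python
import re

# A small, hand-picked stopword list for the example.
STOPWORDS = {"the", "a", "an", "to", "is", "it", "in", "on", "of", "and",
             "what", "how", "me", "my", "i", "for", "please"}

def content_words(text):
    """Return the set of lowercase words in text, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def filter_history(history, query, min_overlap=1):
    """Keep past turns that share at least min_overlap content words with the query."""
    q = content_words(query)
    return [turn for turn in history if len(content_words(turn) & q) >= min_overlap]

# Hypothetical in-car conversation.
history = [
    "Navigate to the nearest charging station",
    "Play some jazz music",
    "Is the charging station open now",
]
query = "How long will charging take"

for turn in filter_history(history, query):
    print(turn)
```

Here the music request is dropped before the model sees the context, so a question about charging is not answered with reference to the playlist.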

Value & Impact:

  • For Professionals (Auto Industry): This reflects a trend toward vertical, more deeply integrated AI applications in the auto industry. Developing or heavily customizing large models for in-car scenarios, to address specific pain points such as complex interactions, multi-turn dialogues, and noise interference, is becoming key. Li Auto aims to create a differentiated smart cockpit experience by enhancing the model’s thinking, reasoning, and dialogue management, which is crucial in smart car competition.
  • For General Users (especially Car Owners): In-car AI assistants are becoming smarter and more reliable. “Chain-of-thought display” makes the AI less of a black box, increasing trust. “Reflective optimization” and “dialogue filtering” lead to smoother interactions and more accurate answers. Future cars might be not just transportation tools but intelligent living and working spaces.

✈️ Fliggy AI “Ask” Debuts, Multi-Agent System Crafts Personalized Travel Plans

Online travel platform Fliggy has launched a new AI product called “Ask” (问一问), positioned as an intelligent travel assistant. What sets it apart is a multi-agent collaboration model that simulates several expert roles (such as itinerary planner, transport advisor, and hotel consultant) working together to understand complex, personalized user needs. It also integrates real-time data from Fliggy’s platform (flights, hotels, attractions) so that generated plans are realistic and actionable, and it guides users directly to booking, closing the loop from conversation to transaction.

Highlights:

  1. Multi-Agent Collaboration: Simulates an AI expert team (handling itinerary, transport, hotels, budget, etc.) using division of labor and autonomous decision-making to address complex travel planning needs.
  2. Real-Time Data: Directly accesses and integrates dynamic proprietary data from Fliggy, like real-time flight prices, hotel availability, attraction info, and user reviews, ensuring timely and executable plans.
  3. Conversation to Transaction: Allows users to generate, modify, and confirm travel plans within the chat interface, with direct links to corresponding flight/hotel booking pages, shortening the path from planning to booking.
  4. Personalized Customization: Users can edit specific parts of the AI-generated plan; the AI can also automatically adjust and generate new matching plans based on budget requirements.
  5. Technology Driven: Reportedly powered by Alibaba’s Tongyi Qianwen large models to support its complex natural language understanding and agent collaboration capabilities.
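
The division of labor described in item 1 can be sketched in a few lines of Python. This is a minimal toy under stated assumptions, not Fliggy's architecture: the agents are plain functions rather than LLM-backed services, and every fare, hotel name, and nightly rate is invented. It shows one coordinator passing the remaining budget from a transport specialist to a hotel specialist and merging their results into a single plan.

```python
from dataclasses import dataclass

@dataclass
class Request:
    city: str
    nights: int
    budget: float

def transport_agent(req):
    """Quote a round-trip fare (hypothetical fixed prices)."""
    fares = {"Hangzhou": 120.0, "Chengdu": 260.0}
    return {"item": f"Round trip to {req.city}", "cost": fares[req.city]}

def hotel_agent(req, remaining):
    """Pick the first hotel, listed priciest first, that fits the remaining budget."""
    options = [("Lakeside Hotel", 150.0), ("City Inn", 80.0)]  # hypothetical nightly rates
    for name, rate in options:
        cost = rate * req.nights
        if cost <= remaining:
            return {"item": f"{name} x {req.nights} nights", "cost": cost}
    return None

def plan_trip(req):
    """Coordinator: call each specialist in turn and merge their proposals."""
    transport = transport_agent(req)
    hotel = hotel_agent(req, req.budget - transport["cost"])
    if hotel is None:
        return None  # nothing fits; a real system might ask the user to adjust
    items = [transport, hotel]
    return {"items": items, "total": sum(i["cost"] for i in items)}

print(plan_trip(Request("Hangzhou", nights=2, budget=500)))
print(plan_trip(Request("Hangzhou", nights=2, budget=300)))
```

Lowering the budget from 500 to 300 makes the hotel agent fall back to the cheaper option, which mirrors the budget-driven replanning described in item 4.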

Value & Impact:

  • For Professionals (Travel Industry): “Ask” exemplifies how AI in vertical industries is evolving from an information provider into a transactional agent. By integrating core platform data and booking functions, it simplifies the complex travel planning and purchasing process, potentially reshaping online travel services. The multi-agent architecture offers a new approach to solving complex planning problems. Platforms with rich domain-specific data have a significant advantage in developing such vertical AI applications.
  • For General Users: This means future travel planning could become as simple as chatting with a human consultant. Just tell the AI your ideas and budget, and it can generate reliable, bookable personalized plans, saving the hassle of searching and comparing across various websites and apps, making travel planning easier and more efficient.

🔥 Meta Apps Block Apple Intelligence, Platform AI Ecosystem Battle Escalates

Reports indicate that Meta’s major iOS apps, including Facebook, Instagram, WhatsApp, and Threads, have blocked users from invoking Apple’s newly launched Apple Intelligence features within the apps, such as using its Writing Tools or generating Genmoji. The move is widely interpreted as Meta prioritizing its own Meta AI services and reflects long-standing competition and disagreements between Meta and Apple regarding AI strategy, privacy philosophies, and platform policies.

Highlights:

  1. Functionality Blocked: Meta’s main iOS apps have disabled or prevented calls to Apple Intelligence features.
  2. Core Features Obstructed: Users cannot use Apple Intelligence’s writing tools (generation, rewriting, proofreading) or generate/share Genmoji within these apps.
  3. Past Features Removed: Even previously available iOS keyboard stickers and Memoji options in Instagram Stories have reportedly been removed.
  4. Prioritizing Own AI: The most likely reason is Meta wants users to prioritize its deeply integrated Meta AI services over Apple’s system-level AI features.
  5. Long-Standing Feud: Meta and Apple have long had conflicts over data privacy (Apple reportedly rejected Llama integration due to privacy concerns), App Store policies, etc., which likely contribute to this decision.

Value & Impact:

  • For Professionals (Tech Industry): This clearly shows AI is the new core battleground for platform competition. Controlling AI interactions, acquiring user data for model training, and promoting proprietary, closed AI ecosystems are becoming paramount. Meta’s strategy essentially erects an “AI wall” between its app empire and Apple’s OS. While potentially sacrificing seamless cross-app user experience, it helps Meta solidify its own AI platform and lock users into its ecosystem. This suggests future user AI experiences might become more fragmented due to platform barriers.
  • For General Users: This means your AI experience might become inconsistent across different apps. Within Meta’s apps, you might be limited to Meta AI, without convenient access to the system-level AI features Apple provides. This could cause inconvenience, and it reflects how competition between tech giants can sometimes put platform interests ahead of a seamless user experience.