Tag: AI model

  • Microsoft’s newly launched Quake II AI demo struggles with blurry graphics

    On April 5, 2025, Microsoft launched an AI-generated version of the classic first-person shooter Quake II, originally released in 1997. This tech demo runs directly in a web browser and is powered by Microsoft’s Muse AI model, developed in collaboration with the UK-based game studio Ninja Theory.

    Muse AI creates gameplay and visuals in real-time, providing a playable experience on the fly without relying on pre-designed assets or the original game engine.

    The demo features a single level of Quake II at a resolution of 640×360, an improvement over earlier AI gaming experiments such as WHAM-1.6B. Although the approach is innovative, Microsoft has noted that the gameplay feels basic by modern standards.

    Players can explore, shoot, and interact with the environment; however, there are noticeable limitations, such as blurry enemy graphics and laggy controls. Furthermore, the AI has difficulty with object permanence, often “forgetting” items that move out of the player’s view for more than 0.9 seconds.
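
    Conceptually, this kind of “world model” gameplay is an autoregressive loop: the model predicts the next frame from a short window of recent frames and controller inputs, which is also why objects that fall out of that window are “forgotten”. The sketch below is purely illustrative and is not Microsoft’s actual Muse code; the model methods, context length, and frame rate are hypothetical placeholders.

      # Illustrative sketch only: a generic autoregressive world-model game loop,
      # not Microsoft's Muse implementation. The model API, context length, and
      # frame rate below are hypothetical placeholders.
      from collections import deque

      CONTEXT_FRAMES = 9          # hypothetical: roughly 0.9 s of history at 10 fps
      RESOLUTION = (640, 360)     # the demo's reported output resolution

      def run_world_model(model, read_controller, render, steps=1000):
          """Generate each new frame from recent frames plus controller input."""
          history = deque(maxlen=CONTEXT_FRAMES)    # older frames fall out and are "forgotten"
          frame = model.initial_frame(RESOLUTION)   # hypothetical API
          for _ in range(steps):
              action = read_controller()                        # player input for this step
              history.append((frame, action))
              frame = model.predict_next_frame(list(history))   # hypothetical API
              render(frame)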

    A research exploration with limitations

    Microsoft has framed this project as a research experiment instead of a fully developed gaming experience. In their announcement, researchers highlighted that Muse AI’s abilities are still developing.

    “We could roam freely, adjust the camera, shoot, and even explode barrels similar to the original game,” they noted. However, they also highlighted drawbacks like inaccuracies in damage indicators and limited interaction quality.

    The Muse AI model was trained on over one billion images and controller actions from games like Bleeding Edge. While it shows promise for preserving classic games and prototyping new ones without relying on original hardware or engines, Microsoft Gaming CEO Phil Spencer acknowledged that it still lacks the polish of traditional game development.

    “This should be regarded as a research exploration,” Microsoft stated in their blog post.

    The demo is available on Microsoft’s Copilot Labs platform but has limitations on playtime and interactivity. While it may not fully capture the enjoyment of Quake II, it offers a glimpse into how generative AI could transform game development in the future.

  • Google’s Gemini 2.0 Flash removes watermarks and fills missing pieces in images

    Google’s latest AI model, Gemini 2.0 Flash, has made headlines for its ability to remove watermarks from images, including those from renowned stock media platforms like Getty Images. While impressive, this feature raises concerns about copyright infringement.

    The model was announced on December 11, 2024, and has been available for developer testing since then. As of March 14, 2025, it remains experimental, accessible through Google’s developer-facing tools like AI Studio.

    Technical capabilities of Gemini 2.0 Flash

    Gemini 2.0 Flash is celebrated for its advanced multimodal capabilities, which allow it to generate and edit images with ease. Users have noted that it is exceptionally skilled at removing watermarks and filling in the gaps left behind, although it struggles with semi-transparent watermarks and those covering large areas of an image, and it has quickly become a popular tool among those experimenting with AI image editing.

    Gemini 2.0 Flash’s capabilities extend beyond image manipulation; it can generate images of celebrities and copyrighted characters without restrictions. This lack of guardrails has raised eyebrows, as other AI models, such as Anthropic’s Claude 3.7 Sonnet and OpenAI’s GPT-4o, explicitly refuse to remove watermarks, citing ethical and legal concerns.

    Gemini 2.0 Flash’s watermark feature raises legal red flags

    The ability of Gemini 2.0 Flash to remove watermarks has sparked controversy due to its potential for copyright infringement. Removing a watermark without permission is illegal under U.S. copyright law, except in rare cases.

    While Google has not commented on these concerns, the model’s experimental status and lack of production use guidelines suggest that it is not intended for widespread commercial use.

    Nonetheless, the ease with which Gemini 2.0 Flash can remove watermarks has already led to widespread use, highlighting the need for more precise guidelines on ethical AI usage.

    As noted by a media report, “Gemini 2.0 Flash will uncomplainingly create images depicting celebrities and copyrighted characters, and — as alluded to earlier — remove watermarks from existing photos”. This openness in functionality contrasts with other AI models prioritising ethical considerations, such as Anthropic’s Claude, which labels watermark removal as “unethical and potentially illegal”.

  • Google unveils Gemini 1.0, AI model

    Google’s announcement on Wednesday of the debut of Gemini 1.0, its most advanced large language model to date, casts doubt on OpenAI’s continued supremacy in the generative AI field.

    Chief Executive Officer Sundar Pichai characterized Gemini as “a new generation of AI models, inspired by how people understand and interact with the world” in a blog post.

    Gemini is a state-of-the-art generative AI system developed by Google DeepMind and Google Research. Pichai praised its capabilities, saying it leads in almost every area.

    Gemini outperforms previous multimodal models because it was built from the ground up as a multimodal AI, pre-trained and fine-tuned to reason about all types of inputs from the start, unlike conventional models that stitch together separately trained components for each modality.

    Gemini possesses exceptional coding skills and is well-versed in several popular programming languages. Even AlphaCode 2, the successor to Google’s award-winning AlphaCode system, is built on a specialized version of Gemini and solves roughly twice as many challenge problems as its predecessor.

    Although Google has not revealed the exact number of parameters Gemini uses, the company highlighted the model’s adaptability and how well it runs in environments ranging from massive data centers to small mobile devices. Three distinct Gemini editions, Nano, Pro, and Ultra, will be offered to cover this range.

    The lightweight Nano model is designed primarily to run tasks directly on the device. Above Nano sits Pro, a more flexible model that will eventually make its way into many Google products. Starting Wednesday, Bard will use an enhanced version of Pro that is reportedly better at reasoning and understanding.

    The enhanced Bard chatbot will be available in 170 countries, like its predecessor, with the rollout continuing to expand into 2024. Google also promises to unveil Bard Advanced next year, powered by the even more capable Gemini Ultra, which offers additional features.

    Developers can also access Pro’s features through API calls in Google Cloud Vertex AI or Google AI Studio, as sketched below. Eventually, Gemini capabilities will be incorporated into Google’s Search, Ads, Chrome, and Duet AI.
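
    As a minimal sketch, the snippet below shows one way to call the Gemini Pro model through the google-generativeai Python SDK used with Google AI Studio. It assumes an API key stored in the GOOGLE_API_KEY environment variable; this setup is an assumption for illustration, not part of Google’s announcement.

      # Minimal sketch: calling Gemini Pro via the google-generativeai SDK
      # (Google AI Studio). Assumes an API key in the GOOGLE_API_KEY env var.
      import os
      import google.generativeai as genai

      genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

      model = genai.GenerativeModel("gemini-pro")
      response = model.generate_content("Summarize Gemini 1.0 in two sentences.")
      print(response.text)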

    Nevertheless, Gemini Ultra will not be available to stakeholders for testing and feedback until at least 2024, as it requires additional red-team testing before release. Once deployed, Ultra is expected to be a highly effective tool for ongoing AI development.

    About Gemini

    Google has released a new and highly capable AI model called Gemini. It was designed to be multimodal, meaning it can comprehend and integrate various forms of information, such as text, code, audio, images, and video. Gemini can understand and produce high-quality code in a number of programming languages and can complete complex tasks across fields including mathematics and physics. With advanced multimodal capabilities, the ability to understand and respond to natural human speech, language, and content, and the ability to drive data and analytics, it is anticipated to be Google’s most capable AI model to date.