The Menu

Trending Models (06/22-29): Black Forest Labs Releases FLUX.1-Kontext-dev, Tencent Releases Hunyuan-A13B

Freeman Lewin

Sunday, June 29, 2025

The AI community has been buzzing after Black Forest Labs' FLUX.1-Kontext-Dev shot to the top of Hugging Face’s trending models list this week. This model isn’t just another text-to-image generator; it represents a new breed of multimodal image editing AI. Built in response to a long-standing frustration among creatives, FLUX.1-Kontext-Dev lets users edit existing images with text prompts, a workflow strangely reminiscent of Adobe's earliest AI demos (and, on the heels of a recent integration, the resemblance makes sense. Who's going to start the countdown to acquisition?). In other words, you can hand the model an image and tell it in natural language how to modify it (e.g. “make that car red” or “put a cat on that chair”), and it will perform the edit with remarkable fidelity. Within days of its release, the model’s Hugging Face page racked up thousands of downloads and nearly a thousand likes, reflecting enormous community interest.

An Open-Weight Multimodal Image Editing Model

FLUX.1-Kontext-Dev marks a significant departure from conventional image generation tools. Rather than producing a picture from scratch given a prompt, it focuses on in-context image editing: iteratively modifying a provided image based on instructions. The model is a 12-billion-parameter “rectified flow” transformer that can apply both precise local edits and broad global changes to an image. Early users have praised one feature in particular – its ability to preserve characters and details across edits. In practical terms, if you ask it to alter a photo with a person or a cartoon character, FLUX.1-Kontext-Dev excels at keeping that subject’s identity consistent from one edit to the next. This addresses a common pain point in generative art, where the same character often looks different in every generated image.

Critically, BFL released this developer version of FLUX.1-Kontext with open model weights, albeit under its "unique" non-commercial license. This openness is a big deal in a landscape where many state-of-the-art image models (like Google’s Imagen-4 or OpenAI’s GPT-Image-1) remain proprietary and closed-source. By publishing FLUX.1’s weights openly, BFL has effectively “given the keys to the kingdom” to researchers and developers, empowering them to study, utilize, and build upon the model. As the company put it, making model weights openly accessible is fundamental to technological innovation. Indeed, the community has rapidly spun up integrations – with day-zero support announced for frameworks like ComfyUI, Hugging Face Diffusers, and TensorRT – ensuring that anyone can experiment with FLUX.1-Kontext-Dev on consumer hardware.
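For readers who want to try it locally, a minimal sketch of the Diffusers route is below. It follows the pattern published for the model’s release and assumes a Diffusers build that includes the FluxKontextPipeline class, a CUDA GPU with enough memory for the 12B weights, and a placeholder source image; treat it as a starting point rather than a definitive recipe, and note that the non-commercial license applies to whatever you generate.

```python
import torch
from diffusers import FluxKontextPipeline  # assumes a Diffusers release with Kontext support
from diffusers.utils import load_image

# Load the open FLUX.1-Kontext-dev weights from the Hub (non-commercial license).
pipe = FluxKontextPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Kontext-dev",
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# Any RGB image can serve as the starting point; this path is a placeholder.
source = load_image("car.png")

# Describe the edit in natural language instead of re-prompting from scratch.
edited = pipe(
    image=source,
    prompt="Make the car red, keep everything else unchanged",
    guidance_scale=2.5,
).images[0]
edited.save("car_red.png")
```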

Beyond the Hype: A Shift Toward Control and Refinement

The rise of FLUX.1-Kontext-Dev signals a broader trend in the AI art world. We seem to be moving past the era of “wow, the AI made a picture from my text prompt”. Simply generating images isn’t novel anymore – users are now seeking more control and refinement over the creative process. Models like FLUX.1 enable the user to act as a director or editor, not just a passive prompter. You can iterate on an image, steering the output closer to your vision by issuing successive textual commands. This level of control was hard to achieve with earlier generative models, which often required you to retry prompts or manually tweak images between generations.
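That director-style loop is easy to picture in code. The sketch below is a hypothetical illustration under the same assumptions as the earlier snippet (Diffusers with FluxKontextPipeline, a CUDA GPU, placeholder file names): each instruction is applied to the output of the previous step, so edits accumulate toward the user's intent instead of starting over from a fresh random sample.

```python
import torch
from diffusers import FluxKontextPipeline
from diffusers.utils import load_image

pipe = FluxKontextPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16
).to("cuda")

# A sequence of plain-language directions, applied one after another.
instructions = [
    "Turn the scene into a rainy evening",
    "Add a yellow umbrella in the subject's hand",
    "Zoom out slightly to show more of the street",
]

image = load_image("street_scene.png")  # placeholder starting image
for step, instruction in enumerate(instructions, start=1):
    # Each pass edits the previous result, so earlier changes are preserved.
    image = pipe(image=image, prompt=instruction, guidance_scale=2.5).images[0]
    image.save(f"edit_step_{step}.png")
```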

It’s notable that other new models are also emphasizing controllability and precision. For instance, the recently released OmniGen2 multimodal model highlights its support for instruction-guided image editing and even in-context generation combining multiple inputs. The authors of OmniGen2 explicitly frame it as a tool for “controllable and personalized generative AI.” The same ethos is present in FLUX.1-Kontext-Dev’s design. Users can precisely target which aspects of an image to change (global scenery vs. a local object) and trust the model to maintain the rest of the image content. This reflects a demand from creators: AI should adapt to their instructions and style, rather than creators having to adapt their vision to what the AI randomly produces. The enthusiastic reception of FLUX.1-Kontext-Dev suggests that the community is eager for such fine-grained control in image generation. It also underscores the value of open-source models – when the community has “the keys,” they can innovate on top of the model, integrate it into pipelines, and collectively push the technology forward.

Roundup: Other Top Trending Models on Hugging Face Last Week

It wasn’t just FLUX.1-Kontext-Dev making headlines. A variety of cutting-edge models have been trending on Hugging Face in the past week, reflecting how diverse the AI landscape has become. Here’s a quick look at a few notable ones:

  • Tencent Hunyuan-A13B-Instruct – A newly open-sourced large language model from Tencent that uses a Mixture-of-Experts architecture: roughly 80 billion total parameters, of which only about 13B are active per token, keeping inference costs closer to those of a mid-sized dense model. Notably, Hunyuan-A13B supports an ultra-long context window of 256,000 tokens for long documents, without sacrificing performance. The model excels at reasoning and multilingual tasks, aiming to rival much larger LLMs in capability while remaining usable on more modest hardware (a rough loading sketch follows this list). Released on June 27, 2025, it quickly gained attention for its combination of scale and smart optimization.


  • Google Gemma 3n (E4B Instruct) – A multimodal model from Google DeepMind with roughly 8B raw parameters, built on the same research that underpins the Gemini family. Gemma 3n is designed to run efficiently on low-resource devices: through selective parameter activation, the E4B variant behaves like a roughly 4B model at inference time. What sets it apart is its ability to accept text, image, audio, and video inputs and generate text outputs in response. It also supports a hefty 32K-token context, making it versatile for tasks from detailed image analysis to long-form conversations. Its open-weight release continues Google's shift toward more open models, and it has been trending as researchers explore its capabilities in multimodal understanding.


  • OmniGen2 – A powerful “anything-to-anything” multimodal generator introduced in mid-June 2025. Developed by the research group VectorSpace Lab, OmniGen2 is a unified model that can handle both text and image outputs. It has distinct pathways for language and vision, enabling it to perform a spectrum of tasks: visual understanding (describing or analyzing images), text-to-image generation (producing images from prompts), and instruction-driven image editing where it executes complex edits based on text instructions. In fact, OmniGen2 achieves state-of-the-art performance among open-source models on image editing tasks. It can also do in-context visual generation, meaning it can take a mix of inputs (like an existing image plus some text prompt) and generate a coherent new image. With an Apache-2.0 license and efficiency improvements over its predecessor, OmniGen2 has attracted attention as a versatile open model pushing the frontier of controllable image generation.


  • Menlo Jan-Nano-128k – An intriguing entry in the trending list, this is a compact 4 billion-parameter language model tuned for extremely long context. Jan-Nano-128k introduces a native 128,000-token context window (far beyond the typical 4K or 8K of models like GPT-4, but decidedly lower than Tencent's A13B), allowing it to process entire research papers or lengthy multi-document conversations in one go. Impressively, it achieves this without the usual performance degradation that plagues extended-context models. Developed by Menlo Research on a Qwen-3 base model, Jan-Nano-128k is essentially an experiment in pushing context length boundaries. Its strong performance on long-form QA benchmarks and an open Apache license have made it a favorite for those interested in large-context applications – from literature analysis to long dialogue summarization.
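As promised above, here is a rough sketch of what running one of these roundup models looks like, using Hunyuan-A13B-Instruct as the example. It follows the generic transformers chat pattern; the exact prompt template, the need for trust_remote_code, the placeholder report.txt file, and the hardware requirements (an 80B-total MoE still wants serious GPU memory even with 13B active parameters) are assumptions to check against Tencent's model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tencent/Hunyuan-A13B-Instruct"

# trust_remote_code is assumed here, since MoE checkpoints often ship custom modeling code.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # spread the experts across available GPUs
    trust_remote_code=True,
)

# The 256K-token window is the headline feature: a long report fits in a single prompt.
report = open("report.txt").read()  # placeholder long document
messages = [
    {"role": "user", "content": "Summarize the key findings of the report below.\n\n" + report},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```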

Each of these trending models showcases a different aspect of the AI boom: from giant LLMs and Google’s multimodal endeavors to open image generation and novel long-context solutions. The common thread is that innovation is coming from all directions – academia, big tech, and independent research labs – and often with an open-source ethos.

Copyright © 2025 - EmetX Inc. - All rights reserved
