French AI startup Mistral has recently unveiled its latest model, Pixtral 12B, which has the ability to process images in addition to text. This 12-billion-parameter model, which is approximately 24GB in size, is now available for download on GitHub and the AI platform Hugging Face. Users can fine-tune and utilize the model under Mistral’s standard license, allowing for research and academic purposes without the need for a paid license. It is noteworthy that the number of parameters in a model is linked to its problem-solving abilities, with models possessing more parameters generally performing better.
Pixtral 12B, built upon Mistral’s existing text model Nemo 12B, can effectively respond to inquiries regarding various images of different sizes. By providing image URLs or using the base64 binary-to-text encoding scheme, users can gather information from the model. With similar multimodal models in the market like Anthropic’s Claude family and GPT-4o, Pixtral 12B theoretically has the capacity to perform tasks such as image captioning and object counting. Although a hands-on experience with Pixtral 12B was not possible at the time of this review due to the absence of functioning web demos, Mistral’s head of developer relations mentioned plans for testing the model on the company’s chatbot and API-serving platforms in the near future.
It remains unclear which image data Mistral utilized for the creation of Pixtral 12B. Many generative AI models, including Mistral’s previous models, are trained using substantial amounts of publicly sourced data from the internet, which may be protected by copyright. While some model vendors argue for the rights of “fair use” when accessing public data, disagreements with copyright holders have led to legal challenges for vendors like OpenAI and Midjourney. Pixtral 12B’s introduction follows Mistral’s successful $645 million funding round led by General Catalyst, valuing the startup at $6 billion. Despite being a relatively young company, Mistral is already considered by many within the AI community as Europe’s response to OpenAI. The company’s approach thus far has involved the release of free “open” models, the offer of managed versions for a fee, and the provision of consulting services to corporate clientele.
By developing Pixtral 12B, Mistral is expanding its AI model portfolio and reaching new heights in the field of artificial intelligence.