On Tuesday, Chinese AI company SenseTime unveiled what it described as the world’s biggest open-source multi-modal large language model, amid the surge of interest in AI that ChatGPT has triggered. Pictured: a SenseTime logo (Photo/VCG).
The latest model, Intern 2.5, created by SenseTime in collaboration with Shanghai Artificial Intelligence Laboratory, Tsinghua University, the Chinese University of Hong Kong, and Shanghai Jiao Tong University, is a step forward for China’s mission to enhance AI technology and broaden its applications.
With 3 billion parameters, Intern 2.5 is the biggest and most accurate open-source model on ImageNet, and it surpasses 65.0 mAP on COCO, an object detection benchmark dataset. This makes it stand out from other models globally, according to SenseTime.
The ImageNet project is a large visual database designed for use in visual object recognition research. Intern 2.5 offers a cross-modal open-task processing capacity that provides precise and reliable perception and comprehension support for applications such as autonomous vehicles and robots, according to SenseTime.
Intern 2.5 is a visual system with an advanced level of general scene understanding and the capability to resolve intricate problems. It accomplishes this by defining tasks in text, which makes it convenient to specify the task requirements of different scenarios.
The firm stated that, given a visual image and a task-related prompt, the model can provide instructions or answers. It said this demonstrates the model’s sophisticated understanding and its ability to work out complicated problems in contexts such as image description, question answering, visual reasoning, and text recognition.
SenseTime’s release of an open-source AI model, which comes in the wake of ChatGPT, is a notable development in artificial intelligence and natural language processing. As AI continues to advance, the release gives an indication of where the technology may be headed.
The third-generation model released by SenseTime has proven more accurate in voice recognition and conversation understanding than any previous model. It is a step forward for natural language processing, offering greater possibilities for human interaction with machines. This open-source model could have major implications for industries such as healthcare and retail.