Microsoft teases Samba 3.8B, a new SSM superior to the recent Phi3-mini

The Microsoft researchers have uploaded their documentation on GitHub.


Published on June 17, 2024


In a time when language models are becoming more complicated and sometimes confusing, Microsoft researchers working with the University of Illinois at Urbana-Champaign have created something simple but powerful: Samba 3.8B.

This is not just another model in a sea of options; it’s a hybrid that brings together State Space Models (SSMs) and attention mechanisms, outperforming models such as Phi3-mini on major benchmarks. What sets Samba apart? It handles unlimited sequence lengths while maintaining linear time complexity.

In terms of architecture, Samba combines Mamba, SwiGLU, and Sliding Window Attention (SWA) layers. The Mamba layers capture time-dependent semantics, while the SWA layers handle the trickier non-Markovian dependencies.
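To make the sliding-window idea concrete, here is a minimal sketch in plain Python (not from the paper) of the causal, windowed mask that sliding window attention uses: each token attends only to itself and a fixed number of preceding tokens.

```python
def sliding_window_mask(seq_len, window):
    """Causal sliding-window mask: token i may attend to token j
    only when 0 <= i - j < window (itself plus window-1 predecessors)."""
    return [[1 if 0 <= i - j < window else 0 for j in range(seq_len)]
            for i in range(seq_len)]

# With seq_len=6 and window=3, each row has at most 3 ones,
# e.g. the last token attends only to positions 3, 4, and 5.
mask = sliding_window_mask(6, 3)
for row in mask:
    print(row)
```

Because each token looks at a fixed-size window regardless of how long the sequence grows, the attention cost scales linearly with sequence length, which is the property the article highlights.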

Together, they form a high-performance system that decodes efficiently and handles complex dependencies. But wait, there’s more: Multi-Layer Perceptron (MLP) layers, implemented as SwiGLU, strengthen the model’s capacity for nonlinear transformations and recall of factual knowledge.
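As a rough illustration of how such a hybrid might be stacked, here is a toy sketch in Python. The specific layer ordering (Mamba, then a SwiGLU MLP, then SWA, then another MLP) and all class names are illustrative assumptions, not Microsoft’s actual implementation; the stub layers just pass data through.

```python
class Stub:
    """Placeholder layer: a real model would implement the named operation."""
    def __init__(self, name):
        self.name = name

    def __call__(self, x):
        return x  # identity stand-in; real layers would transform x

def build_samba_stack(num_blocks):
    """Interleave SSM, MLP, and attention layers, as the article describes.
    The exact ordering here is an assumption for illustration only."""
    layers = []
    for _ in range(num_blocks):
        layers += [Stub("mamba"),   # SSM layer: time-dependent semantics
                   Stub("swiglu"),  # MLP: nonlinear transforms, factual recall
                   Stub("swa"),     # sliding-window attention: non-Markovian deps
                   Stub("swiglu")]  # second MLP closing the block
    return layers

stack = build_samba_stack(2)
print([layer.name for layer in stack])
```

The point of the interleaving is division of labor: the SSM layers summarize history cheaply, the windowed attention recovers precise local retrieval, and the MLPs handle the rest.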

Introducing Samba 3.8B, a simple Mamba + Sliding Window Attention architecture that outperforms Phi3-mini on major benchmarks (e.g., MMLU, GSM8K and HumanEval) by a large margin. 😮 And it has an infinite context length with linear complexity. 🤯 Paper: https://t.co/6OnfGG71Aj pic.twitter.com/f4IZdT1wGB

Designing the architecture was only the start; the researchers scaled it to a 3.8B-parameter model pre-trained on 3.2T tokens. The results were even more impressive: Samba 3.8B surpasses open-source language models of up to 8B parameters, with excellent performance across tasks from commonsense reasoning to coding. Notably, it is 18.1% more accurate on GSM8K than Transformer++.

Now, how does this relate to you and me? For a start, Samba’s hybrid architecture could be a solution for complex tasks in natural language processing.

Its capacity to manage unlimited context lengths and its exceptional memory extrapolation make it especially fitting for practical uses that require a deep understanding of long contexts. Picture a world where language models grasp our intentions more clearly, making technology more intuitive and supportive.

The researchers’ findings and models are uploaded to GitHub if you want to explore them further. This shows their dedication to advancing language modeling and providing the community with tools for the challenges ahead.

So, if you’re a researcher, developer, or simply someone intrigued by how technology aids us in comprehending language better, Samba 3.8B is an important development to watch.


Claudiu Andone

Windows Troubleshooting Expert

Oldtimer in the tech and science press, Claudiu is focused on whatever comes new from Microsoft.

His sudden interest in computers started when he saw the first Home Computer as a kid. However, his passion for Windows and everything related became obvious when he became a sysadmin at a computer science high school.

With 14 years of experience in writing about everything there is to know about science and technology, Claudiu also likes rock music, chilling in the garden, and Star Wars. May the force be with you, always!

