Jamba
Developer(s) AI21 Labs
Initial release 28 March 2024
Type
License Apache 2.0 License

Jamba is an open-weights large language model (LLM) developed by AI21 Labs. [1] [2] It is built on a hybrid architecture that combines Mamba, a state space model (SSM), with transformer layers. [3] [1] [4] It is a 52-billion-parameter mixture-of-experts (MoE) model with 12 billion active parameters (the number of parameters used per token). [2] [1] Jamba supports a context window of up to 256K tokens, of which up to 140K tokens fit on a single 80 GB GPU, and at its release it was described as the largest Mamba-variant LLM created. [2] [3]
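The parameter figures above can be put in proportion with a little arithmetic. The following is a minimal sketch, not a description of how Jamba is deployed: the parameter counts come from the text, while the two storage precisions (16-bit and 8-bit) and the helper names `active_fraction` and `weight_memory_gb` are assumptions introduced for illustration. It estimates memory for the weights only and ignores activations and any per-token state.

```python
# Figures taken from the article text.
TOTAL_PARAMS = 52e9   # total parameters
ACTIVE_PARAMS = 12e9  # parameters used per token

# Assumed storage precisions in bytes per parameter (illustrative only,
# not taken from the cited sources).
PRECISIONS = {"16-bit": 2.0, "8-bit": 1.0}


def active_fraction(total: float, active: float) -> float:
    """Share of the model's parameters that take part in processing one token."""
    return active / total


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Approximate memory for the weights alone, in GB (10**9 bytes)."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    share = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"active share per token: {share:.1%}")
    for name, nbytes in PRECISIONS.items():
        size = weight_memory_gb(TOTAL_PARAMS, nbytes)
        print(f"{name} weights: about {size:.0f} GB")
```

Under these assumptions, roughly a quarter of the parameters are used for any one token, and the weights alone would take about 104 GB at 16-bit precision or about 52 GB at 8-bit precision. The cited sources do not state which precision the single-GPU figure refers to, so this is only a rough guide to scale.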

Jamba is reported to deliver high throughput and efficiency while matching or outperforming other state-of-the-art models in its size class on a wide range of benchmarks. [1] [2] Its significantly larger context limit enables use cases that require long inputs. [1] [2] The model is released with open weights under the Apache 2.0 license. [5] [4]

At launch, the company said it planned to release an instruction-tuned version in beta on the AI21 Platform. [6]

Characteristics

  • Context window size: 256K tokens [6]
  • Parameters: 52 billion [6]
  • Architecture: Hybrid Mamba (SSM) Transformer using Mixture of Experts (MoE) [6]
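The idea of "active parameters" in a mixture-of-experts layer can be illustrated with a toy top-k router. This is a minimal sketch of generic top-k expert routing, not Jamba's actual implementation: the number of experts, the value of `k`, the router scores, and the function names `route_top_k` and `moe_layer` are all hypothetical, and each "expert" is a simple scaling function standing in for a neural network.

```python
import math


def softmax(scores):
    """Convert raw router scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts.

    Returns a list of (expert_index, weight) pairs whose weights are
    renormalized so that they sum to 1.
    """
    probs = softmax(router_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_layer(x, experts, router_scores, k=2):
    """Run only the selected experts and mix their outputs by router weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))


if __name__ == "__main__":
    # Four hypothetical experts; each just scales its input.
    experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
    scores = [0.1, 2.0, -1.0, 1.5]  # hypothetical router scores for one token
    print(route_top_k(scores, k=2))
    print(moe_layer(1.0, experts, scores, k=2))
```

Only the experts chosen by the router are evaluated for a given token, which is why a model's active parameter count can be much smaller than its total parameter count.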

References

  1. "Introducing Jamba: AI21's Groundbreaking SSM-Transformer Model". www.ai21.com. Retrieved 2024-03-29.
  2. Kerner, Sean Michael (2024-03-28). "AI21 Labs juices up gen AI transformers with Jamba". VentureBeat. Retrieved 2024-03-29.
  3. "AI21 Labs' Jamba infuses Mamba to bring more context to transformer-based LLMs". SiliconANGLE. 2024-03-28. Retrieved 2024-03-29.
  4. "MLTimes - Time To Learn AI". mltimes.se. Retrieved 2024-03-29.
  5. AI21. "Unveiling Jamba: AI21's Groundbreaking Hybrid SSM-Transformer Open-Source Model". www.prnewswire.com. Retrieved 2024-03-29.
  6. "AI21 Labs enhances the capabilities of gen AI transformers through Jamba integration". Global Village Space | Technology. 2024-03-28. Retrieved 2024-03-29.