GCC AI Research

Beyond Attention: Orchid’s Adaptive Convolutions for Next-Level Sequence Modeling

MBZUAI · Notable

Summary

Orchid is a new neural network architecture that uses adaptive convolutions to model sequences with quasilinear computational complexity, O(N log N) in the sequence length N. Its convolution kernel adapts dynamically to the input sequence. In evaluations on language modeling and image classification, Orchid outperforms attention-based architectures such as BERT and Vision Transformers, often with smaller model sizes. Why it matters: Orchid extends the feasible sequence length beyond the practical limits of dense attention layers, whose cost grows quadratically with sequence length, marking progress toward more efficient and scalable deep learning models.
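
To make the O(N log N) claim concrete, here is a minimal NumPy sketch of a convolution computed through the FFT with a kernel that depends on the input. It is an illustration under stated assumptions, not Orchid's actual layer: the function names and the `tanh` gating rule are hypothetical, chosen only to show what an input-dependent kernel means.

```python
import numpy as np

def fft_conv(x, k):
    """Circular convolution of x with kernel k via the FFT, O(N log N)."""
    n = x.shape[-1]
    return np.fft.irfft(np.fft.rfft(x, n=n) * np.fft.rfft(k, n=n), n=n)

def adaptive_kernel(x, w):
    """Hypothetical conditioning rule (an assumption, not from the paper):
    scale a base filter w by a gate computed from the input x."""
    gate = np.tanh(x.mean())
    return w * (1.0 + gate)

def adaptive_conv(x, w):
    """Convolve x with a kernel that is itself a function of x."""
    return fft_conv(x, adaptive_kernel(x, w))

rng = np.random.default_rng(0)
x = rng.standard_normal(16)   # toy input sequence of length N = 16
w = rng.standard_normal(16)   # toy base filter of the same length
y = adaptive_conv(x, w)
print(y.shape)
```

Because the kernel changes with the input, the layer is no longer a linear map of `x`, while the FFT keeps the cost at O(N log N) instead of the O(N^2) of dense attention.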

Related

ORCA: A Challenging Benchmark for Arabic Language Understanding

arXiv ·

The paper introduces ORCA, a new public benchmark for evaluating Arabic language understanding. ORCA covers diverse Arabic varieties and includes 60 datasets across seven NLU task clusters. The benchmark was used to compare 18 multilingual and Arabic language models and includes a public leaderboard with a unified evaluation metric. Why it matters: ORCA addresses the lack of a comprehensive Arabic benchmark, enabling better progress measurement for Arabic and multilingual language models.

SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression

arXiv ·

The paper introduces Sparse-Quantized Representation (SpQR), a new compression format and quantization technique for large language models (LLMs). SpQR identifies outlier weights and stores them in higher precision while compressing the remaining weights to 3-4 bits. The method keeps the relative loss in perplexity below 1% for LLaMA and Falcon LLMs and enables a 33B-parameter LLM to run on a single 24GB consumer GPU. Why it matters: Near-lossless compression makes powerful LLMs accessible on resource-constrained devices and speeds up inference without significant accuracy degradation.

Self-supervised DNA models and scalable sequence processing with memory augmented transformers

MBZUAI ·

Dr. Mikhail Burtsev of the London Institute presented research on GENA-LM, a suite of transformer-based DNA language models. The talk addressed the challenge of scaling transformers for genomic sequences, proposing recurrent memory augmentation to handle long input sequences efficiently. This approach improves language modeling performance and holds promise for memory-intensive applications in bioinformatics. Why it matters: This research can significantly advance AI's capabilities in genomics by enabling the processing of much larger DNA sequences, with potential breakthroughs in understanding and treating diseases.