DEV Community
#transformers
Posts
TurboQuant: How a Simple Spin Saves Gigabytes of GPU Memory
Bharath Kadaluri · Apr 8 · 6 min read
#turboquant #attention #transformers #llm
Attention Residuals: How Kimi Is Rethinking Transformer Depth
Guatu · Apr 7 · 3 min read
#ai #transformers #llmarchitecture #attention
RBF Attention Reveals Dot‑Product's Hidden Norm Bias
Simon Paxton · Apr 2 · 8 min read · 1 comment
#transformers #attentionmechanisms #airesearch #aihardware
Why a Perfect-Memory AI Agent Without Persona Drift is Architecturally Impossible
Tom Lee · Mar 20 · 4 min read
#ai #agents #memory #transformers
Anonymous User Claims Proof of d^2 Complexity for Attention Mechanisms, Challenging Transformer Optimization
Valeria Solovyova · Mar 5 · 10 min read
#transformers #attention #optimization #complexity
Advancing Tiny Transformers: Achieving 100% Accuracy in 10-Digit Addition with Sub-100 Parameter Models Using Digit Tokenization
Valeria Solovyova · Mar 1 · 16 min read
#ai #transformers #efficiency #tokenization
Standard Transformer Attention vs. Attention-Residuals: A Practical Comparison
Alan West · Mar 21 · 5 min read
#transformers #deeplearning #attentionmechanism #pytorch
Attention Is All You Need — Full Paper Breakdown
seah-js · Mar 7 · 4 min read · 1 reaction · 1 comment
#ai #transformers #deeplearning #machinelearning
Transformers: Revolutionizing Natural Language Processing!
Mariano Gobea Alcoba · Feb 25 · 2 min read · 2 reactions
#transformers #nlp #attentionmechanism #huggingface
👀 Attention Explained Like You're 5
Sreekar Reddy · Jan 14 · 1 min read
#eli5 #ai #transformers #tutorial
What are Transformers, Why do they Dominate the AI World?
Yuvaraj · Feb 15 · 5 min read · 4 reactions
#ai #transformers #machinelearning