<p>Hacker News</p><p>FP8 is ~100 TFLOPS faster when the kernel name has "cutlass" in it</p><p><a href="https://twitter.com/cis_female/status/1943069934332055912" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">twitter.com/cis_female/status/</span><span class="invisible">1943069934332055912</span></a></p><p><a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/FP8" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FP8</span></a> <a href="https://mastodon.social/tags/tflops" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>tflops</span></a> <a href="https://mastodon.social/tags/cutlass" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>cutlass</span></a> <a href="https://mastodon.social/tags/performance" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>performance</span></a> <a href="https://mastodon.social/tags/optimization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>optimization</span></a> <a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a></p>