mastodontech.de is one of many independent Mastodon servers you can use to take part in the Fediverse.
Open to everyone (over 16) and provided by Markus'Blog

#simd

Hacker News<p>The messy reality of SIMD (vector) functions</p><p><a href="https://johnnysswlab.com/the-messy-reality-of-simd-vector-functions/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">johnnysswlab.com/the-messy-rea</span><span class="invisible">lity-of-simd-vector-functions/</span></a></p><p><a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> <a href="https://mastodon.social/tags/SIMDFunctions" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMDFunctions</span></a> <a href="https://mastodon.social/tags/VectorProgramming" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>VectorProgramming</span></a> <a href="https://mastodon.social/tags/TechTrends" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TechTrends</span></a> <a href="https://mastodon.social/tags/CodingInsights" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CodingInsights</span></a></p>
Hacker News<p>Finding a billion factorials in 60 ms with SIMD</p><p><a href="https://codeforces.com/blog/entry/143279" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">codeforces.com/blog/entry/1432</span><span class="invisible">79</span></a></p><p><a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/Finding" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Finding</span></a> <a href="https://mastodon.social/tags/a" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>a</span></a> <a href="https://mastodon.social/tags/billion" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>billion</span></a> <a href="https://mastodon.social/tags/factorials" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>factorials</span></a> <a href="https://mastodon.social/tags/in" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>in</span></a> #60 <a href="https://mastodon.social/tags/ms" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ms</span></a> <a href="https://mastodon.social/tags/with" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>with</span></a> <a href="https://mastodon.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> <a href="https://mastodon.social/tags/codeforces" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>codeforces</span></a> <a href="https://mastodon.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> <a href="https://mastodon.social/tags/performance" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>performance</span></a> <a href="https://mastodon.social/tags/factorials" 
class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>factorials</span></a> <a href="https://mastodon.social/tags/computing" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>computing</span></a></p>
nietras 👾<p>New blog post "Sep 0.11.0 - 9.5 GB/s CSV Parsing Using ARM NEON SIMD on Apple M1 🚀"</p><p>🛠️ New <a href="https://mastodon.social/tags/ARM" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ARM</span></a> <a href="https://mastodon.social/tags/NEON" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>NEON</span></a> <a href="https://mastodon.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> parser based on <span class="h-card" translate="no"><a href="https://mastodon.social/@geofflangdale" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>geofflangdale</span></a></span> bulk move mask</p><p>📈 Sep <a href="https://mastodon.social/tags/performance" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>performance</span></a> up from 7 GB/s on <a href="https://mastodon.social/tags/Apple" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Apple</span></a> <a href="https://mastodon.social/tags/M1" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>M1</span></a> and 1.5x faster on <a href="https://mastodon.social/tags/Microsoft" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Microsoft</span></a> <a href="https://mastodon.social/tags/Cobalt" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Cobalt</span></a> 100 (4 GB/s to 6 GB/s) </p><p>🧑‍💻 <a href="https://mastodon.social/tags/csharp" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>csharp</span></a> SIMD and <a href="https://mastodon.social/tags/ARM" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ARM</span></a> assembly on <a href="https://mastodon.social/tags/dotnet" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dotnet</span></a> 9.0</p><p>👇<br><a href="https://nietras.com/2025/06/17/sep-0-11-0/" rel="nofollow noopener" 
translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">nietras.com/2025/06/17/sep-0-1</span><span class="invisible">1-0/</span></a></p>
Gareth Lloyd (He/him)<p>I'm putting together a talk about <a href="https://fosstodon.org/tags/programming" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>programming</span></a> a Mandelbrot image generator, with insight into profiling and optimisation. The main part will cover normal optimisations, <a href="https://fosstodon.org/tags/simd" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>simd</span></a>, <a href="https://fosstodon.org/tags/multithreading" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>multithreading</span></a>, and possibly GPU acceleration. </p><p>I'll also show micro-benchmarking, hotspot/perf, Intel Advisor, and inspecting assembly code.</p><p>Any other interesting bits I should look into putting into my talk?</p><p><a href="https://fosstodon.org/tags/cpp" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>cpp</span></a> <a href="https://fosstodon.org/tags/cplusplus" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>cplusplus</span></a></p>
Hacker News<p>SIMD-friendly algorithms for substring searching</p><p><a href="http://0x80.pl/notesen/2016-11-28-simd-strfind.html" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">http://</span><span class="ellipsis">0x80.pl/notesen/2016-11-28-sim</span><span class="invisible">d-strfind.html</span></a></p><p><a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> <a href="https://mastodon.social/tags/algorithms" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>algorithms</span></a> <a href="https://mastodon.social/tags/substring" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>substring</span></a> <a href="https://mastodon.social/tags/searching" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>searching</span></a> <a href="https://mastodon.social/tags/performance" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>performance</span></a> <a href="https://mastodon.social/tags/optimization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>optimization</span></a></p>
Larry (Mr.Optimization)<p>I decided to share my Arm NEON optimizations for the FFmpeg Cinepak encoder. On Apple Silicon / RPI / NEON 32/64-bit, it gets a 250-300% speedup for encoding:</p><p><a href="https://github.com/bitbank2/FFmpeg-in-Xcode" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">github.com/bitbank2/FFmpeg-in-</span><span class="invisible">Xcode</span></a></p><p><a href="https://floss.social/tags/FOSS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FOSS</span></a> <br><a href="https://floss.social/tags/Optimization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Optimization</span></a> <br><a href="https://floss.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> <br><a href="https://floss.social/tags/NEON" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>NEON</span></a></p>
FCLC<p>and for the <a href="https://mast.hpc.social/tags/IEEE754" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>IEEE754</span></a> / <a href="https://mast.hpc.social/tags/floatingpoint" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>floatingpoint</span></a> nerds (you know who you are!) here's a much more definitive answer/breakdown of our IEEE Binary FP32 conformance for the Vector Unit! <a href="https://github.com/tenstorrent/tt-isa-documentation/blob/main/WormholeB0/TensixTile/TensixCoprocessor/SFPMAD.md#ieee754-conformance--divergence" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">github.com/tenstorrent/tt-isa-</span><span class="invisible">documentation/blob/main/WormholeB0/TensixTile/TensixCoprocessor/SFPMAD.md#ieee754-conformance--divergence</span></a></p><p><a href="https://mast.hpc.social/tags/RVV" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>RVV</span></a> <a href="https://mast.hpc.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> <a href="https://mast.hpc.social/tags/HPC" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HPC</span></a></p>
Tweede golf<p>SIMD blog series: <span class="h-card" translate="no"><a href="https://hachyderm.io/@folkertdev" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>folkertdev</span></a></span> shows examples of using SIMD in the zlib-rs project. </p><p>Part 2 explains what to do when the compiler is not capable of using the SIMD capabilities of modern CPUs effectively. We end up with a basic, but very effective, example of a custom SIMD implementation beating the compiler. </p><p><a href="https://tweedegolf.nl/en/blog/155/simd-in-zlib-rs-part-2-compare256" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">tweedegolf.nl/en/blog/155/simd</span><span class="invisible">-in-zlib-rs-part-2-compare256</span></a> </p><p><span class="h-card" translate="no"><a href="https://fosstodon.org/@trifectatech" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>trifectatech</span></a></span></p><p><a href="https://fosstodon.org/tags/rustlang" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>rustlang</span></a> <a href="https://fosstodon.org/tags/datacompression" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>datacompression</span></a> <a href="https://fosstodon.org/tags/simd" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>simd</span></a></p>
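The post above does not spell out what compare256 computes. Assuming it counts how many leading bytes of two 256-byte buffers are equal (as the name suggests), a minimal sketch of the idea of comparing several bytes per step could look like the following. This is not the zlib-rs code: the function names are hypothetical, and the wide comparison uses a plain 64-bit XOR rather than real SIMD intrinsics.

```rust
/// Scalar baseline: count how many leading bytes of `a` and `b` are equal.
fn compare256_scalar(a: &[u8; 256], b: &[u8; 256]) -> usize {
    a.iter().zip(b.iter()).take_while(|(x, y)| x == y).count()
}

/// Chunked variant (hypothetical): compare 8 bytes per step with one 64-bit XOR.
fn compare256_chunked(a: &[u8; 256], b: &[u8; 256]) -> usize {
    for (i, (ca, cb)) in a.chunks_exact(8).zip(b.chunks_exact(8)).enumerate() {
        let mut wa = [0u8; 8];
        let mut wb = [0u8; 8];
        wa.copy_from_slice(ca);
        wb.copy_from_slice(cb);
        // Little-endian load: the lowest byte of the u64 is the first byte in memory.
        let diff = u64::from_le_bytes(wa) ^ u64::from_le_bytes(wb);
        if diff != 0 {
            // The first mismatching byte is the lowest non-zero byte of `diff`.
            return i * 8 + (diff.trailing_zeros() / 8) as usize;
        }
    }
    256
}
```

Both functions should agree on every input. A real SIMD version would presumably follow the same shape with wider registers: compare a block, turn the result into a bit mask, and locate the first mismatch by counting zero bits.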
RustNL<p>On stage at the Main Track of RustWeek: Martin Larralde. Giving us a crash course on DNA, the statistics behind genome research, and how Rust has sped up this research 10x 🦀 </p><p>And then SIMD to the rescue!</p><p>"SIMD in Rust is so much easier than in C!"</p><p><a href="https://fosstodon.org/tags/RustWeek" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>RustWeek</span></a> <a href="https://fosstodon.org/tags/rustlang" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>rustlang</span></a> <a href="https://fosstodon.org/tags/simd" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>simd</span></a> <a href="https://fosstodon.org/tags/research" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>research</span></a></p>
nietras 👾<p>Updated "Sep 0.10.0 - 21 GB/s CSV Parsing Using SIMD on AMD 9950X 🚀" to make it 300% clear the graph shows <a href="https://mastodon.social/tags/performance" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>performance</span></a> progression over different Sep, <a href="https://mastodon.social/tags/dotnet" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dotnet</span></a> versions and CPUs.</p><p>To show how runtime and library improvements go hand in hand with hardware changes. E.g. AVX512 <a href="https://mastodon.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> <br>As it should be 👾</p>
nietras 👾<p>New blog post "Sep 0.10.0 - 21 GB/s CSV Parsing Using SIMD on AMD 9950X 🚀"</p><p>📈 Sep <a href="https://mastodon.social/tags/performance" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>performance</span></a> from 7 GB/s to 21 GB/s over last two years<br>🧑‍💻 <a href="https://mastodon.social/tags/csharp" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>csharp</span></a> <a href="https://mastodon.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> and <a href="https://mastodon.social/tags/x64" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>x64</span></a> assembly on <a href="https://mastodon.social/tags/dotnet" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dotnet</span></a> 9.0<br>🛠️ Tweaks and new <a href="https://mastodon.social/tags/AVX512" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AVX512</span></a>-to-256 parser<br>🔢 Lots of benchmarks</p><p>👇<br><a href="https://nietras.com/2025/05/09/sep-0-10-0/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">nietras.com/2025/05/09/sep-0-1</span><span class="invisible">0-0/</span></a></p>
Hacker News<p>21 GB/s CSV Parsing Using SIMD on AMD 9950X</p><p><a href="https://nietras.com/2025/05/09/sep-0-10-0/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">nietras.com/2025/05/09/sep-0-1</span><span class="invisible">0-0/</span></a></p><p><a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/CSVParsing" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CSVParsing</span></a> <a href="https://mastodon.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> <a href="https://mastodon.social/tags/AMD9950X" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AMD9950X</span></a> <a href="https://mastodon.social/tags/HighPerformance" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HighPerformance</span></a> <a href="https://mastodon.social/tags/Computing" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Computing</span></a> <a href="https://mastodon.social/tags/DataProcessing" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataProcessing</span></a></p>
Hacker News<p>Faster sorting with SIMD CUDA intrinsics (2024)</p><p><a href="https://winwang.blog/posts/bitonic-sort/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">winwang.blog/posts/bitonic-sor</span><span class="invisible">t/</span></a></p><p><a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/FasterSorting" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FasterSorting</span></a> <a href="https://mastodon.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> <a href="https://mastodon.social/tags/CUDA" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CUDA</span></a> <a href="https://mastodon.social/tags/Intrinsics" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Intrinsics</span></a> <a href="https://mastodon.social/tags/BitonicSort" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>BitonicSort</span></a> <a href="https://mastodon.social/tags/TechInnovation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TechInnovation</span></a> #2024</p>
Hacker News<p>Three Fundamental Flaws of SIMD ISAs (2023)</p><p><a href="https://www.bitsnbites.eu/three-fundamental-flaws-of-simd/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">bitsnbites.eu/three-fundamenta</span><span class="invisible">l-flaws-of-simd/</span></a></p><p><a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> <a href="https://mastodon.social/tags/ISAs" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ISAs</span></a> <a href="https://mastodon.social/tags/Flaws" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Flaws</span></a> <a href="https://mastodon.social/tags/ISAs" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ISAs</span></a> <a href="https://mastodon.social/tags/Architecture" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Architecture</span></a> <a href="https://mastodon.social/tags/TechInsights" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TechInsights</span></a></p>
Tweede golf<p>New blog series: <span class="h-card" translate="no"><a href="https://hachyderm.io/@folkertdev" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>folkertdev</span></a></span> shows how we use SIMD in the zlib-rs project.</p><p>SIMD is crucial to good performance, but learning how to use it can be daunting. In this series we'll show concrete examples of using SIMD in a real world project.</p><p>Part 1 explains how the compiler already uses SIMD for us, how to evaluate whether it's doing a good job, and how to use a more optimal version when the current CPU supports it. </p><p><a href="https://tweedegolf.nl/en/blog/153/simd-in-zlib-rs-part-1-autovectorization-and-target-features" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">tweedegolf.nl/en/blog/153/simd</span><span class="invisible">-in-zlib-rs-part-1-autovectorization-and-target-features</span></a></p><p><span class="h-card" translate="no"><a href="https://fosstodon.org/@trifectatech" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>trifectatech</span></a></span></p><p> <a href="https://fosstodon.org/tags/rustlang" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>rustlang</span></a> <a href="https://fosstodon.org/tags/datacompression" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>datacompression</span></a> <a href="https://fosstodon.org/tags/simd" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>simd</span></a></p>
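As a rough illustration of "use a more optimal version when the current CPU supports it", here is a minimal sketch of runtime dispatch in Rust. It assumes an x86_64 target and AVX2 as the optional feature; the function names are made up for this example, and the linked blog post may structure its code differently.

```rust
/// Portable fallback; the compiler may auto-vectorize this loop for the baseline target.
fn sum_generic(data: &[u32]) -> u32 {
    let mut total = 0u32;
    for &x in data {
        total = total.wrapping_add(x);
    }
    total
}

/// Same loop, compiled with AVX2 enabled so the auto-vectorizer is allowed to use it.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(data: &[u32]) -> u32 {
    let mut total = 0u32;
    for &x in data {
        total = total.wrapping_add(x);
    }
    total
}

/// Pick an implementation at runtime based on what the running CPU supports.
fn sum(data: &[u32]) -> u32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: the check above confirmed that AVX2 is available.
            return unsafe { sum_avx2(data) };
        }
    }
    sum_generic(data)
}
```

Callers only ever see `sum`. On other architectures the `cfg` block compiles away and the generic version is used. Whether the AVX2 variant is actually faster has to be measured, for example by inspecting the generated assembly as the post suggests.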
mkretz<p>While implementing complex numbers for <a href="https://floss.social/tags/simd" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>simd</span></a> I tripped over failures wrt. negative zero. After multiple re-readings of C23 Annex G and considering the meaning of infinite infinities on a 2D plane (with zeros simply being their inverse) I believe <a href="https://floss.social/tags/C" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>C</span></a> and <a href="https://floss.social/tags/CPlusPlus" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CPlusPlus</span></a> should ignore the sign of zeros and infinities in their x+iy representations of complex numbers. <a href="https://compiler-explorer.com/z/YavE4MnMj" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">compiler-explorer.com/z/YavE4M</span><span class="invisible">nMj</span></a> provides some motivation.<br>Am I missing something?</p>
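For readers unfamiliar with signed zeros, a small sketch of one place where the sign of a zero component becomes observable: the phase angle of x+iy, computed with atan2. This is not the Compiler Explorer example linked in the post. It is written in Rust rather than C or C++, and the helper names are hypothetical.

```rust
/// Phase angle of the complex number re + i*im, computed as atan2(im, re).
fn phase(re: f64, im: f64) -> f64 {
    im.atan2(re)
}

/// Phases of (+0, +0), (-0, +0) and (-0, -0).
/// Under the IEEE 754 atan2 rules these are 0, pi and -pi,
/// even though all three inputs compare equal to zero.
fn zero_phases() -> [f64; 3] {
    [phase(0.0, 0.0), phase(-0.0, 0.0), phase(-0.0, -0.0)]
}
```

The three inputs are indistinguishable with `==` yet map to three different angles, which is the kind of behaviour any rule about ignoring the sign of zeros would have to account for.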
Hacker News<p>Towards fearless SIMD, 7 years later</p><p><a href="https://linebender.org/blog/towards-fearless-simd/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">linebender.org/blog/towards-fe</span><span class="invisible">arless-simd/</span></a></p><p><a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/Towards" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Towards</span></a> <a href="https://mastodon.social/tags/fearless" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>fearless</span></a> <a href="https://mastodon.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> #7 <a href="https://mastodon.social/tags/years" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>years</span></a> <a href="https://mastodon.social/tags/later" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>later</span></a> <a href="https://mastodon.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> <a href="https://mastodon.social/tags/Programming" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Programming</span></a> <a href="https://mastodon.social/tags/Performance" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Performance</span></a> <a href="https://mastodon.social/tags/Optimization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Optimization</span></a> <a href="https://mastodon.social/tags/Technology" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Technology</span></a> <a href="https://mastodon.social/tags/Blog" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Blog</span></a></p>
Larry (Mr.Optimization)<p>Now it's done :)<br>66ms = 3x faster than the C version. The NEON code is processing 6 pixels at a time, but must do unaligned reads and writes to make efficient use of 8-slot registers. Only shifts and adds, no multiplies or divides. This version is much easier to port to the ESP32-S3.<br><a href="https://floss.social/tags/simd" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>simd</span></a> <a href="https://floss.social/tags/NEON" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>NEON</span></a> <a href="https://floss.social/tags/Optimization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Optimization</span></a></p>
Jiří Činčura ↹<p>(Not) Vectorizing the .NET Dictionary class</p><p><a href="https://gist.github.com/kg/f5bfe4c095f66d2dcda5f1e43e015cf1" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">gist.github.com/kg/f5bfe4c095f</span><span class="invisible">66d2dcda5f1e43e015cf1</span></a></p><p><a href="https://mas.to/tags/dotnet" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dotnet</span></a> <a href="https://mas.to/tags/simd" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>simd</span></a></p>