{"id":305,"date":"2022-04-01T21:31:10","date_gmt":"2022-04-01T19:31:10","guid":{"rendered":"https:\/\/threedots.ovh\/blog\/?p=305"},"modified":"2022-04-01T21:31:10","modified_gmt":"2022-04-01T19:31:10","slug":"opencl-on-metal-2-what-if-clvk-works","status":"publish","type":"post","link":"https:\/\/threedots.ovh\/blog\/2022\/04\/opencl-on-metal-2-what-if-clvk-works\/","title":{"rendered":"OpenCL on Metal #2 &#8211; what if clvk works?"},"content":{"rendered":"\n<p><em>For April Fools&#8217; Day, an alternate approach is discussed for the OpenCL on Metal project.<\/em><\/p>\n\n\n\n<p>Writing an OpenCL implementation from scratch would take a long time &#8211; what if we can leverage existing building blocks for a proof of concept?<\/p>\n\n\n\n<p><a rel=\"noreferrer noopener\" href=\"https:\/\/github.com\/kpet\/clvk\" target=\"_blank\">CLVK<\/a> is an experimental, limited OpenCL implementation that runs on top of Vulkan. It uses <a rel=\"noreferrer noopener\" href=\"https:\/\/github.com\/google\/clspv\" target=\"_blank\">clspv<\/a>, a compiler that transforms OpenCL C code into Vulkan compute shaders, with <a rel=\"noreferrer noopener\" 
href=\"https:\/\/github.com\/google\/clspv\/blob\/main\/docs\/OpenCLCOnVulkan.md\" target=\"_blank\">some important limitations<\/a>.<\/p>\n\n\n\n<p><a href=\"https:\/\/github.com\/KhronosGroup\/MoltenVK\">MoltenVK<\/a> is a Vulkan (Portability) implementation targeting Apple operating systems, allowing Vulkan applications to run on top of <em>Metal<\/em>.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ cd ~\/VulkanSDK\/1.3.204.1\/ &amp;&amp; source setup-env.sh &amp;&amp; cd -\n~\/devlab\/opencl\/clvk\n$ git clone --recurse https:\/\/github.com\/kpet\/clvk\n&#91;...]\n$ cd clvk\/external\/clspv\n$ git diff # small change to use python3\ndiff --git a\/utils\/fetch_sources.py b\/utils\/fetch_sources.py\nindex f0ba884..a0d6a35 100755\n--- a\/utils\/fetch_sources.py\n+++ b\/utils\/fetch_sources.py\n@@ -1,4 +1,4 @@\n-#!\/usr\/bin\/env python\n+#!\/usr\/bin\/env python3\n \n # Copyright 2017 The Clspv Authors. All rights reserved.\n #\n$ cd ..\/.. ; .\/external\/clspv\/utils\/fetch_sources.py --deps llvm\n&#91;...]\n$ mkdir build &amp;&amp; cd build\n$ cmake ..\n$ make -j8<\/code><\/pre>\n\n\n\n<p>Does it work at first glance? 
Yes it does.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ .\/clpeak # linking to libOpenCL.dylib instead of OpenCL.framework\nPlatform: clvk\n  Device: Apple M1\n    Driver version  : 1.2 CLVK on Vulkan v1.1.189 driver 10105 (Macintosh)\n    Compute units   : 1\n    Clock frequency : 0 MHz\n\n    Global memory bandwidth (GBPS)\n      float   : 55.63\n      float2  : 57.66\n      float4  : 57.50\n      float8  : 57.88\n      float16 : 62.41\n\n    Single-precision compute (GFLOPS)\n      float   : 1267.18\n      float2  : 1502.12\n      float4  : 1525.60\n      float8  : 892.10\n      float16 : 1470.11\n\n    Half-precision compute (GFLOPS)\n      half   : 1333.62\n      half2  : 1508.59\n      half4  : 1591.51\n      half8  : 1525.85\n      half16 : 1423.71\n\n    No double precision support! Skipped\n\n    Integer compute (GIOPS)\n      int   : 472.07\n      int2  : 467.69\n      int4  : 469.41\n      int8  : 476.42\n      int16 : 461.98\n\n    Integer compute Fast 24bit (GIOPS)\n      int   : 480.22\n      int2  : 478.74\n      int4  : 437.78\n      int8  : 474.67\n      int16 : 475.28\n\n    Transfer bandwidth (GBPS)\n      enqueueWriteBuffer              : 28.62\n      enqueueReadBuffer               : 28.93\n      enqueueWriteBuffer non-blocking : 28.99\n      enqueueReadBuffer non-blocking  : 28.97\n      enqueueMapBuffer(for read)      : 550636.56\n        memcpy from mapped ptr        : 28.75\n      enqueueUnmap(after write)       : 727960.19\n        memcpy to mapped ptr          : 28.96\n\n    Kernel launch latency : 6.92 us<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>For April Fools day, an alternate approach is discussed for the OpenCL on Metal project. Writing an OpenCL implementation from scratch would take a long time &#8211; what if we can leverage existing bricks for a proof of concept? CLVK is an experimental, limited OpenCL implementation that runs on top of Vulkan. 
It uses clspv,&hellip;&nbsp;<a href=\"https:\/\/threedots.ovh\/blog\/2022\/04\/opencl-on-metal-2-what-if-clvk-works\/\" rel=\"bookmark\">Read More &raquo;<span class=\"screen-reader-text\">OpenCL on Metal #2 &#8211; what if clvk works?<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"neve_meta_sidebar":"","neve_meta_container":"","neve_meta_enable_content_width":"","neve_meta_content_width":0,"neve_meta_title_alignment":"","neve_meta_author_avatar":"","neve_post_elements_order":"","neve_meta_disable_header":"","neve_meta_disable_footer":"","neve_meta_disable_title":"","footnotes":""},"categories":[1],"tags":[],"class_list":["post-305","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/threedots.ovh\/blog\/wp-json\/wp\/v2\/posts\/305","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/threedots.ovh\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/threedots.ovh\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/threedots.ovh\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/threedots.ovh\/blog\/wp-json\/wp\/v2\/comments?post=305"}],"version-history":[{"count":3,"href":"https:\/\/threedots.ovh\/blog\/wp-json\/wp\/v2\/posts\/305\/revisions"}],"predecessor-version":[{"id":308,"href":"https:\/\/threedots.ovh\/blog\/wp-json\/wp\/v2\/posts\/305\/revisions\/308"}],"wp:attachment":[{"href":"https:\/\/threedots.ovh\/blog\/wp-json\/wp\/v2\/media?parent=305"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/threedots.ovh\/blog\/wp-json\/wp\/v2\/categories?post=305"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/threedots.ovh\/blog\/wp-json\/wp\/v2\/tags?post=305"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}