<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Vision Geek Newsletter]]></title><description><![CDATA[Essential news and learning resources for computer vision enthusiasts.]]></description><link>https://newsletter.visiongeek.io</link><image><url>https://newsletter.visiongeek.io/img/substack.png</url><title>Vision Geek Newsletter</title><link>https://newsletter.visiongeek.io</link></image><generator>Substack</generator><lastBuildDate>Sat, 18 Apr 2026 04:50:37 GMT</lastBuildDate><atom:link href="https://newsletter.visiongeek.io/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Vision Geek]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[arunponnusamy@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[arunponnusamy@substack.com]]></itunes:email><itunes:name><![CDATA[Arun Ponnusamy]]></itunes:name></itunes:owner><itunes:author><![CDATA[Arun Ponnusamy]]></itunes:author><googleplay:owner><![CDATA[arunponnusamy@substack.com]]></googleplay:owner><googleplay:email><![CDATA[arunponnusamy@substack.com]]></googleplay:email><googleplay:author><![CDATA[Arun Ponnusamy]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[OpenAI Codex, OpenCV Conference, Foundations of Computer Vision Book, Wildlife Species Detection Challenge]]></title><description><![CDATA[Vision Geek Newsletter #12]]></description><link>https://newsletter.visiongeek.io/p/openai-codex-opencv-conference-foundations</link><guid isPermaLink="false">https://newsletter.visiongeek.io/p/openai-codex-opencv-conference-foundations</guid><dc:creator><![CDATA[Arun Ponnusamy]]></dc:creator><pubDate>Sun, 25 May 2025 02:54:29 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/youtube/w_728,c_limit/hhdpnbfH6NU" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello there,</p><p>Welcome to the next edition of Vision Geek Newsletter, covering the essential news and learning resources in Computer Vision and Machine Learning, read by 1000+ Computer Vision practitioners.</p><p>Let&#8217;s get into it.</p><h2>OpenAI Codex</h2><div id="youtube2-hhdpnbfH6NU" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;hhdpnbfH6NU&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/hhdpnbfH6NU?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>OpenAI has introduced <a href="https://openai.com/index/introducing-codex/">Codex</a>, a powerful AI coding agent that converts natural language into code, streamlining programming tasks and enhancing developer productivity. It&#8217;s a cloud-based software engineering agent that can work on many tasks in parallel, powered by codex-1, a version of the OpenAI o3 model optimized for software engineering.</p><p>It is not a full-fledged IDE like <a href="https://www.cursor.com/">Cursor</a>. It is designed mainly to work with your GitHub repos. It can perform tasks like writing features, fixing bugs, answering codebase questions, and running tests (for up to 30 minutes). Once it completes an assigned coding task, it can create a PR, and a human can review it before merging into the codebase, keeping the human in control.</p><p>Codex-1 demonstrates high accuracy on internal coding benchmarks (75%) and SWE-bench (72.1%). The agents run in isolated, containerized environments, providing better security and autonomy.
Codex is currently available to <a href="https://chatgpt.com/codex">ChatGPT</a> Pro, Team, and Enterprise users, and will be available to Plus users soon.</p><h2>OpenCV Conference</h2><div id="youtube2-w6QxjZO211w" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;w6QxjZO211w&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/w6QxjZO211w?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>OpenCV recently hosted its first-ever <a href="https://www.displayweek.org/event-enhancements/oscca/">conference</a>, the &#8220;OpenCV-SID Conference on Computer Vision &amp; AI (OSCCA)&#8221;, as part of <a href="https://www.displayweek.org/">Display Week</a> 2025 (the premier international event for electronic display technologies) in partnership with <a href="https://www.sid.org/">SID</a> (Society for Information Display).</p><p>It was a one-day in-person <a href="https://opencv.org/blog/introducing-the-opencv-sid-conference-on-computer-vision-ai/">event</a> featuring <a href="https://opencv.org/blog/speaker-lineup-for-oscca-opencvs-first-conference/">talks</a> from industry experts such as <a href="https://en.wikipedia.org/wiki/Gary_Bradski">Gary Bradski</a> (Founder of OpenCV), <a href="https://www.linkedin.com/in/satyamallick/">Satya Mallick</a> (CEO of OpenCV), <a href="https://www.linkedin.com/in/monicadsong/">Monica Song</a> (Product Manager at Google AI Frameworks), and <a href="https://www.linkedin.com/in/josephofiowa">Joseph Nelson</a> (CEO of Roboflow)
on various topics ranging from the Keras deep learning framework to state-of-the-art augmented reality gaming to the latest advancements in single-shot detection.</p><p>Unfortunately, neither recordings of the talks nor the slides have been shared publicly online. Hopefully, relevant materials from the talks will appear in <a href="https://io.google/2025/explore/technical-session-3">other forms</a> online in the near future.</p><h2>Foundations of Computer Vision Book</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MmX5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a32d5-30d9-4db6-b5d9-29fedb1b0c13_1024x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MmX5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a32d5-30d9-4db6-b5d9-29fedb1b0c13_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!MmX5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a32d5-30d9-4db6-b5d9-29fedb1b0c13_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!MmX5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a32d5-30d9-4db6-b5d9-29fedb1b0c13_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!MmX5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a32d5-30d9-4db6-b5d9-29fedb1b0c13_1024x768.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!MmX5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a32d5-30d9-4db6-b5d9-29fedb1b0c13_1024x768.png" width="728" height="546" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a98a32d5-30d9-4db6-b5d9-29fedb1b0c13_1024x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1024,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:831869,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.visiongeek.io/i/164229663?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a32d5-30d9-4db6-b5d9-29fedb1b0c13_1024x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MmX5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a32d5-30d9-4db6-b5d9-29fedb1b0c13_1024x768.png 424w, https://substackcdn.com/image/fetch/$s_!MmX5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a32d5-30d9-4db6-b5d9-29fedb1b0c13_1024x768.png 848w, https://substackcdn.com/image/fetch/$s_!MmX5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a32d5-30d9-4db6-b5d9-29fedb1b0c13_1024x768.png 1272w, https://substackcdn.com/image/fetch/$s_!MmX5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa98a32d5-30d9-4db6-b5d9-29fedb1b0c13_1024x768.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>&#8220;<a href="https://visionbook.mit.edu/">Foundations of Computer Vision</a>&#8221; is a comprehensive <a href="https://www.amazon.com/Foundations-Computer-Adaptive-Computation-Learning/dp/0262048973">textbook</a> authored by Antonio Torralba, Phillip Isola, and William T. Freeman, published by the <a href="https://mitpress.mit.edu/9780262048972/foundations-of-computer-vision/">MIT Press</a> on April 16, 2024.
Designed for students, educators, and practitioners, the book offers an in-depth exploration of computer vision, integrating both classical methods and contemporary deep learning advancements.</p><p>It incorporates the latest deep learning techniques, offering insights into how modern approaches have transformed computer vision. Beyond algorithms, the text addresses the relationship between machine vision and human perception, emphasizing the interdisciplinary nature of the field.</p><p>Its coverage includes transformers, diffusion models, statistical image models, fairness, ethics, and research methodologies: topics not commonly found in other textbooks. Concepts are presented in concise chapters with extensive illustrations, examples, and exercises to facilitate intuitive learning.</p><p>For those interested in a structured and modern exploration of computer vision, this textbook serves as a valuable resource, bridging traditional concepts with the latest advancements in the field. Having said that, it&#8217;s hard to cover everything in a single book, and this one does not cover in depth many applications of computer vision, such as shape analysis, object tracking, person pose analysis, and face recognition.</p><h4>Fun Fact</h4><p>The book took over a decade to complete (work started in November 2010) because of the rapid development in the field after writing began, mainly due to the explosion of deep learning.
</p><h2>Wildlife Species Detection Challenge</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!B_8R!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927b8099-cdf9-416f-9dfe-a3d4681d238b_560x280.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!B_8R!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927b8099-cdf9-416f-9dfe-a3d4681d238b_560x280.png 424w, https://substackcdn.com/image/fetch/$s_!B_8R!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927b8099-cdf9-416f-9dfe-a3d4681d238b_560x280.png 848w, https://substackcdn.com/image/fetch/$s_!B_8R!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927b8099-cdf9-416f-9dfe-a3d4681d238b_560x280.png 1272w, https://substackcdn.com/image/fetch/$s_!B_8R!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927b8099-cdf9-416f-9dfe-a3d4681d238b_560x280.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!B_8R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927b8099-cdf9-416f-9dfe-a3d4681d238b_560x280.png" width="726" height="363" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/927b8099-cdf9-416f-9dfe-a3d4681d238b_560x280.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:280,&quot;width&quot;:560,&quot;resizeWidth&quot;:726,&quot;bytes&quot;:259342,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.visiongeek.io/i/164229663?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927b8099-cdf9-416f-9dfe-a3d4681d238b_560x280.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!B_8R!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927b8099-cdf9-416f-9dfe-a3d4681d238b_560x280.png 424w, https://substackcdn.com/image/fetch/$s_!B_8R!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927b8099-cdf9-416f-9dfe-a3d4681d238b_560x280.png 848w, https://substackcdn.com/image/fetch/$s_!B_8R!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927b8099-cdf9-416f-9dfe-a3d4681d238b_560x280.png 1272w, https://substackcdn.com/image/fetch/$s_!B_8R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F927b8099-cdf9-416f-9dfe-a3d4681d238b_560x280.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The <a href="https://www.kaggle.com/competitions/cupybara">Cup-ybara Challenge 2025</a> is an 8-week global AI competition hosted by <a href="https://tryolabs.com/cupybara-challenge">Tryolabs</a> (as part of its &#8220;<a href="https://tryolabs.com/ai-for-good">AI for Good</a>&#8221; initiative) on the Kaggle platform. It tasks participants with building machine-learning models to automatically identify native Uruguayan wildlife species in <a href="https://www.linkedin.com/posts/tryolabs_cup-ybara-challenge-is-coming-an-open-activity-7305242039732891650-cYze/">short camera-trap videos</a>. </p><p>Tryolabs describes it as &#8220;an open competition to develop cutting-edge AI models for automated wildlife species detection&#8221;, aimed at speeding up a traditionally slow, manual monitoring process. 
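</p><p>To make the task concrete: a common baseline for clip-level species recognition is to run an image classifier on sampled frames and aggregate the per-frame predictions, for example by majority vote. Below is a minimal, generic sketch of the aggregation step; the species names and per-frame predictions are invented purely for illustration.</p><pre><code>from collections import Counter

def clip_label(frame_predictions):
    """Aggregate per-frame class predictions into a single clip-level
    label by majority vote (ties broken by first-seen order)."""
    return Counter(frame_predictions).most_common(1)[0][0]

# Hypothetical per-frame model outputs for one 15-second clip.
frames = ["fox", "capybara", "capybara", "empty", "capybara"]
print(clip_label(frames))  # prints capybara</code></pre><p>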
The contest connects the data science community with conservation: the winning models will be shared with the local NGO <a href="https://amba.org.uy/en/">AMB&#193;</a> to be put into real-world use in Uruguay&#8217;s forests and reserves.</p><p>The dataset for this competition is composed of real-world footage collected by camera traps in Uruguayan ecosystems. It includes unlabeled training data (train), labeled test data for public leaderboard evaluation (test), and held-out private test data for final scoring (private_test).</p><p>Each video is exactly 15 seconds long, encoded in .mp4 format, and recorded using motion-triggered camera traps deployed in natural, uncontrolled environments.</p><ul><li><p>Competition start date: Monday, April 21st, 2025</p></li><li><p>Submissions close date: Sunday, June 15th, 2025</p></li></ul><p>During the competition, participants may collaborate in teams, iterate on models, and discuss ideas on the Kaggle forums. Kaggle enforces its standard competition rules: for example, no external data or manual re-labeling beyond what is provided, and a specific submission format (CSV) via the provided BaseModel class.</p><p>Submissions are evaluated by a &#8220;weighted F1 score&#8221; on the held-out test set, reflecting the multi-class classification accuracy of species labels. (Weighted F1 is common in Kaggle&#8217;s multi-class tasks.)</p><p>The challenge is open to anyone worldwide. Participants need to register for free on Kaggle and accept the competition rules. Monetary prizes will be awarded based on an evaluation performed on a private dataset after the main competition ends. (1st Place: $500, 2nd Place: $300, 3rd Place: $100, Honorable Mention: $100)</p><p>Prizes will be paid using <a href="https://buymeacoffee.com/">Buy Me a Coffee</a>. 
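</p><p>For clarity on the metric: weighted F1 computes one F1 score per class and averages them, weighting each class by its support (its share of the true labels). Here is a minimal pure-Python sketch of that computation; the species labels are invented for illustration, and scikit-learn&#8217;s <code>f1_score</code> with <code>average="weighted"</code> computes the equivalent.</p><pre><code>from collections import Counter

def weighted_f1(y_true, y_pred):
    """Per-class F1 scores, averaged with weights proportional to
    each true class frequency (the class support)."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for c in support:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = support[c] - tp  # true c samples the model missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        score += (support[c] / total) * f1
    return score

# Invented species labels, purely for illustration.
truth = ["capybara", "capybara", "fox", "armadillo", "fox"]
preds = ["capybara", "fox", "fox", "armadillo", "fox"]
print(round(weighted_f1(truth, preds), 3))  # prints 0.787</code></pre><p>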
In summary, Cup-ybara aligns AI and wildlife conservation, inviting participants to leverage modern video classification techniques to help protect biodiversity.</p>]]></content:encoded></item><item><title><![CDATA[LlamaCon 2025, RF-DETR, Cursor Free Plan]]></title><description><![CDATA[Vision Geek Newsletter #11]]></description><link>https://newsletter.visiongeek.io/p/llamacon-2025-rf-detr-cursor-free</link><guid isPermaLink="false">https://newsletter.visiongeek.io/p/llamacon-2025-rf-detr-cursor-free</guid><dc:creator><![CDATA[Arun Ponnusamy]]></dc:creator><pubDate>Fri, 16 May 2025 18:45:10 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/1VUfyfeeURw" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello there,</p><p>Welcome to the next edition of Vision Geek Newsletter, covering the essential news and learning resources in Computer Vision and Machine Learning, read by 1000+ Computer Vision practitioners.</p><p>Let&#8217;s get into it.</p><h2><strong>LlamaCon 2025</strong></h2><div id="youtube2-1VUfyfeeURw" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;1VUfyfeeURw&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/1VUfyfeeURw?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Meta recently held its first-ever <a href="https://www.llama.com/events/llamacon/2025/">LlamaCon</a>, focusing entirely on Llama models, open source initiatives, and the company&#8217;s work in AI. 
</p><h4>Key Announcements &amp; Highlights</h4><ul><li><p>The conference kicked off with a showcase of the capabilities and strengths of the recently released natively multimodal <a href="https://ai.meta.com/blog/llama-4-multimodal-intelligence/">Llama 4</a> models.</p></li><li><p>A new standalone Meta AI <a href="https://ai.meta.com/meta-ai/">app</a>, available on Android &amp; iOS.</p></li><li><p>Launch of the new <a href="https://www.llama.com/products/llama-api/">Llama API</a>, hosted on Meta&#8217;s cloud, which simplifies access to Llama models for inference, built in collaboration with Cerebras and Groq to deliver faster performance.</p></li><li><p>The team also shared the technical details and behind-the-scenes decisions involved in building the Llama 4 models and the API.</p></li><li><p>And showcased one-click API key generation, an interactive playground for testing, SDKs for Python &amp; TypeScript, compatibility with OpenAI SDKs, support for Llama 4 variants like Scout and Maverick, and tools for fine-tuning the models with custom datasets.</p></li><li><p>Meta has unveiled a suite of open-source <a href="https://www.llama.com/llama-protections/">Llama Protection tools</a> and the <a href="https://www.llama.com/llama-protections/ai-defenders/">Llama Defenders Program</a>, aimed at improving AI safety and security.</p></li><li><p>And announced 10 international recipients of the second <a href="https://about.fb.com/news/2025/04/llama-impact-grant-recipients/">Llama Impact Grants</a>. 
With over $1.5 million USD awarded, these grants support companies, startups, and universities using Llama to drive transformative change.</p></li><li><p>Announced that <a href="https://ai.meta.com/sam3/">SAM 3</a> will be released this summer with support for text-based prompts for segmenting objects/regions in images &amp; videos (whereas SAM 2 supported only clicks (points), masks, and bounding boxes as prompts), and showed a simple demo of SAM 3.</p></li><li><p>Mark Zuckerberg&#8217;s fireside chats with <a href="https://www.youtube.com/watch?v=1VUfyfeeURw&amp;t=2099">Ali Ghodsi</a> (CEO of Databricks) and <a href="https://www.youtube.com/watch?v=WaJOONFllLc">Satya Nadella</a> (CEO of Microsoft), and a <a href="https://www.youtube.com/playlist?list=PLb0IAmt7-GS2eX0SYxHXdKPDwTNUxXB15">Hackathon</a>.</p></li></ul><h3>RF-DETR by Roboflow</h3><div id="youtube2-Mlcap4KGddg" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;Mlcap4KGddg&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/Mlcap4KGddg?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Roboflow has recently released <a href="https://roboflow.com/model/rf-detr">RF-DETR</a>, a real-time, transformer-based object detection architecture, under the Apache 2.0 license.</p><p>RF-DETR is the first real-time model to exceed 60 AP on the <a href="https://cocodataset.org/#home">Microsoft COCO benchmark</a>. While models like DINO and SwinV2-G achieve slightly higher mAP scores on COCO, RF-DETR distinguishes itself by combining high accuracy with real-time performance, making it particularly suitable for applications requiring both speed and precision. 
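</p><p>Part of how DETR-style architectures like RF-DETR achieve this is that they predict a final set of boxes directly, with no non-maximum suppression (NMS) post-processing. For context, here is a minimal, generic sketch of the greedy IoU-based NMS step that traditional detectors run; this is illustrative code, not Roboflow&#8217;s implementation.</p><pre><code>def iou(a, b):
    # Boxes given as (x1, y1, x2, y2) corner coordinates.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, suppress any box
    that overlaps an already-kept box by more than iou_thresh."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) &lt;= iou_thresh for j in keep):
            keep.append(i)
    return keep

# Two overlapping detections of one object plus one distinct detection.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # prints [0, 2]</code></pre><p>RF-DETR skips this entire stage, which is what yields smoother predictions between video frames and better handling of overlapping objects.</p><p>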
It also achieves state-of-the-art performance on <a href="https://github.com/roboflow/rf100-vl">RF100-VL</a>, an object detection benchmark that measures model domain adaptability to real-world problems.</p><p>The model comes in two variants: RF-DETR Base (29M parameters) and RF-DETR Large (129M parameters). RF-DETR is designed for projects that need a model that runs at high speed with a high degree of accuracy, often on limited compute (such as edge or low-latency deployments).</p><h4>Key Benefits</h4><ul><li><p>Converges faster than other models, as pretraining helps.</p></li><li><p>No NMS (smoother predictions between video frames and better understanding of overlapping objects)</p></li><li><p>Gets the same results on cheaper hardware, as LayerNorm is used instead of BatchNorm</p></li><li><p>Multi-resolution training.</p></li></ul><h4>RT-DETR (Baidu) vs RF-DETR (Roboflow):</h4><ul><li><p>Both RT-DETR and RF-DETR work without NMS.</p></li><li><p>In terms of speed, RT-DETR runs at up to 114 FPS, whereas RF-DETR runs at 25+ FPS. 
</p></li><li><p>RF-DETR is designed for easier domain transfer &amp; fine-tuning, while RT-DETR&#8217;s transferability is not heavily optimized for small custom datasets.</p></li></ul><p>RT-DETR is best suited when speed is a higher priority than accuracy, while RF-DETR is best suited when accuracy is the priority while maintaining real-time speed.</p><h3>Cursor Free Plan</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!P32c!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6781de99-af71-458d-8ee8-f6d1d73b4e1e_800x260.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!P32c!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6781de99-af71-458d-8ee8-f6d1d73b4e1e_800x260.jpeg 424w, https://substackcdn.com/image/fetch/$s_!P32c!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6781de99-af71-458d-8ee8-f6d1d73b4e1e_800x260.jpeg 848w, https://substackcdn.com/image/fetch/$s_!P32c!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6781de99-af71-458d-8ee8-f6d1d73b4e1e_800x260.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!P32c!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6781de99-af71-458d-8ee8-f6d1d73b4e1e_800x260.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!P32c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6781de99-af71-458d-8ee8-f6d1d73b4e1e_800x260.jpeg" width="800" height="260" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6781de99-af71-458d-8ee8-f6d1d73b4e1e_800x260.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:260,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:13954,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.visiongeek.io/i/163380127?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6781de99-af71-458d-8ee8-f6d1d73b4e1e_800x260.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!P32c!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6781de99-af71-458d-8ee8-f6d1d73b4e1e_800x260.jpeg 424w, https://substackcdn.com/image/fetch/$s_!P32c!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6781de99-af71-458d-8ee8-f6d1d73b4e1e_800x260.jpeg 848w, https://substackcdn.com/image/fetch/$s_!P32c!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6781de99-af71-458d-8ee8-f6d1d73b4e1e_800x260.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!P32c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6781de99-af71-458d-8ee8-f6d1d73b4e1e_800x260.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><a href="https://cursor.com">Cursor</a>, the AI coding assistant built on top of VS Code, now offers a free <a href="https://www.cursor.com/students">student plan</a>. You will have access to all <a href="https://docs.cursor.com/account/plans-and-usage">Pro features</a> for a year. 
This includes 500 fast premium requests per month and unlimited slow premium requests, a tier that usually costs $20 per month.</p><p>With this plan, students get access to key features like:</p><ul><li><p>AI pair programming (like GitHub Copilot)</p></li><li><p>Smart code suggestions and autocompletion</p></li><li><p>Inline code explanations and refactoring</p></li><li><p>Help with debugging and navigating large codebases</p></li></ul><p>Cursor is especially useful for beginners and those working on personal or academic projects in Python, JavaScript, TypeScript, and more. To get access, you&#8217;ll need to verify with a valid .edu email or upload proof of enrollment. Once approved, you&#8217;ll unlock premium features at no cost for a year.</p><p>However, Cursor has restricted its free student plan in various regions, including India. Cursor&#8217;s team <a href="https://forum.cursor.com/t/student-discount-details-updates-q-as/88907">says</a> &#8220;we are working hard with our partners to bring this to more students soon&#8221;.</p><p>Students in the restricted regions can check out other offers such as the <a href="https://education.github.com/pack">GitHub Copilot Student Pack</a> and <a href="https://azure.microsoft.com/en-us/free/students">Microsoft Azure for Students</a> in the meantime. 
Hopefully, as the program matures and verification workflows improve, more regions will be added to the free-Pro list.</p>]]></content:encoded></item><item><title><![CDATA[Nvidia Jetson Orin Nano Super, PaliGemma 2, Segment Anything Model (SAM) 2.1]]></title><description><![CDATA[Vision Geek Newsletter #10]]></description><link>https://newsletter.visiongeek.io/p/nvidia-jetson-orin-nano-super-paligemma</link><guid isPermaLink="false">https://newsletter.visiongeek.io/p/nvidia-jetson-orin-nano-super-paligemma</guid><dc:creator><![CDATA[Arun Ponnusamy]]></dc:creator><pubDate>Tue, 31 Dec 2024 13:06:30 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/S9L2WGf1KrM" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello there,</p><p>Welcome to the next edition of Vision Geek Newsletter, covering the essential news in Computer Vision and Machine Learning, read by 1000+ Computer Vision practitioners. </p><p>Let&#8217;s get into it.</p><h2>Nvidia Jetson Orin Nano Super</h2><div id="youtube2-S9L2WGf1KrM" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;S9L2WGf1KrM&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/S9L2WGf1KrM?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Nvidia has <a href="https://developer.nvidia.com/blog/nvidia-jetson-orin-nano-developer-kit-gets-a-super-boost/">announced</a> a &#8220;super&#8221; update to their existing edge device, the &#8220;Jetson Orin Nano&#8221;, renaming it &#8220;Jetson Orin Nano <strong>Super</strong>&#8221; and dropping the price from $499 to just $249 (8GB variant), claiming it to be &#8220;<strong>The World&#8217;s Most Affordable Generative AI 
Computer&#8221;.</strong></p><p>There is no hardware change. Through a pure software update alone, Nvidia has introduced a new 25W power mode that lifts performance across the board: </p><ul><li><p>1.7x higher generative AI model performance. </p></li><li><p>67 Sparse TOPS, a significant increase from the previous 40 Sparse TOPS.</p></li><li><p>102 GB/s of memory bandwidth, a significant leap from the previous 65 GB/s.</p></li><li><p>1.7 GHz of CPU clock speed, up from 1.5 GHz.</p></li><li><p>1020 MHz of GPU clock speed, up from 635 MHz.</p></li></ul><p>This compact yet powerful system can effortlessly handle a wide range of LLMs, VLMs, and Vision Transformers (ViTs), from smaller models to those with up to 8B parameters, such as the Llama-3.1-8B model.</p><p>For the given price point and performance, the Orin Nano Super can be a great choice as a baseline edge device for modern-day AI workloads. Though it is advertised as a generative AI computer, it can handle all forms of ML models.</p><h2>PaliGemma 2</h2><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;f9402eda-ebf1-4bb2-8ae8-cec2b1479664&quot;,&quot;duration&quot;:null}"></div><p>Google has <a href="https://developers.googleblog.com/en/introducing-paligemma-2-powerful-vision-language-models-simple-fine-tuning/">released</a> the next version of PaliGemma, the open-source vision-language model in the <a href="https://ai.google.dev/gemma">Gemma</a> family of models. Like its predecessor, PaliGemma 2 uses the same powerful <a href="https://huggingface.co/collections/google/siglip-659d5e62f0ae1a57ae0e83ba">SigLIP</a> and <a href="https://arxiv.org/abs/2310.09199">PaLI-3</a> for vision, but it upgrades to the latest <a href="https://blog.google/technology/developers/google-gemma-2/">Gemma 2</a> for the text decoder part.</p><h4>What&#8217;s New</h4><ul><li><p><strong>Scalable performance:</strong> 
Optimize performance for any task with PaliGemma 2's multiple model sizes (3B, 10B, 28B parameters) and resolutions (224px, 448px, 896px), whereas the previous version had only one variant (3B).</p></li></ul><ul><li><p><strong>Long captioning:</strong> PaliGemma 2 generates detailed, contextually relevant captions for images, going beyond simple object identification to describe actions, emotions, and the overall narrative of the scene.</p></li></ul><ul><li><p><strong>Expanding to new horizons:</strong> Leading performance on chemical formula recognition, music score recognition, spatial reasoning, and chest X-ray report generation, as detailed in the <a href="https://arxiv.org/abs/2412.03555">technical report</a>.</p></li></ul><p>PaliGemma 2 is distributed under the Gemma license, which allows for redistribution, commercial use, fine-tuning, and creation of model derivatives. The pre-trained models have been designed for easy <a href="https://github.com/merveenoyan/smol-vision/blob/main/Fine_tune_PaliGemma.ipynb">fine-tuning</a> on custom datasets for specific tasks. Model files are available on <a href="https://www.kaggle.com/models/google/paligemma-2">Kaggle</a> and <a href="https://huggingface.co/collections/google/paligemma-2-release-67500e1e1dbfdd4dee27ba48">HuggingFace</a>. Get started with <a href="https://huggingface.co/blog/paligemma2">HuggingFace Transformers</a> or <a href="https://ai.google.dev/gemma/docs/paligemma/inference-with-keras">Keras</a>. 
(<a href="https://ai.google.dev/gemma/docs/paligemma">Docs</a> | <a href="https://github.com/google-gemini/gemma-cookbook/tree/main/PaliGemma_2">Notebooks</a> | <a href="https://huggingface.co/spaces/merve/paligemma2-vqav2">Demo</a>)</p><h2>Segment Anything Model (SAM) 2.1</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!F4iK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6baabfc-d299-471e-b971-09436dee15a2_543x497.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!F4iK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6baabfc-d299-471e-b971-09436dee15a2_543x497.png 424w, https://substackcdn.com/image/fetch/$s_!F4iK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6baabfc-d299-471e-b971-09436dee15a2_543x497.png 848w, https://substackcdn.com/image/fetch/$s_!F4iK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6baabfc-d299-471e-b971-09436dee15a2_543x497.png 1272w, https://substackcdn.com/image/fetch/$s_!F4iK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6baabfc-d299-471e-b971-09436dee15a2_543x497.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!F4iK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6baabfc-d299-471e-b971-09436dee15a2_543x497.png" width="458" height="419.2007366482505" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b6baabfc-d299-471e-b971-09436dee15a2_543x497.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:497,&quot;width&quot;:543,&quot;resizeWidth&quot;:458,&quot;bytes&quot;:88196,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!F4iK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6baabfc-d299-471e-b971-09436dee15a2_543x497.png 424w, https://substackcdn.com/image/fetch/$s_!F4iK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6baabfc-d299-471e-b971-09436dee15a2_543x497.png 848w, https://substackcdn.com/image/fetch/$s_!F4iK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6baabfc-d299-471e-b971-09436dee15a2_543x497.png 1272w, https://substackcdn.com/image/fetch/$s_!F4iK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6baabfc-d299-471e-b971-09436dee15a2_543x497.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Meta&#8217;s Fundamental AI Research (FAIR) team has <a href="https://ai.meta.com/blog/fair-news-segment-anything-2-1-meta-spirit-lm-layer-skip-salsa-lingua/">released</a> an update to SAM 2 featuring updated <a href="https://github.com/facebookresearch/sam2?tab=readme-ov-file#sam-21-checkpoints">checkpoints</a> with better accuracy. </p><h4>What&#8217;s New</h4><ul><li><p>Additional data augmentation techniques to simulate the presence of visually similar objects and small objects where SAM 2 previously struggled.</p></li><li><p>Improved occlusion handling capability by training the model on longer sequences of frames and making some tweaks to positional encoding of spatial and object pointer memory (<a href="https://arxiv.org/abs/2408.00714v2">updated paper</a>).</p></li><li><p>SAM 2 <a href="https://github.com/facebookresearch/sam2">Developer Suite</a>, a package of open source code to make it easier than ever to build with SAM 2. 
This release includes <a href="https://github.com/facebookresearch/sam2/blob/main/training/README.md">training code</a> for fine-tuning SAM 2 with your own data.</p></li><li><p>Front-end and back-end <a href="https://github.com/facebookresearch/sam2/blob/main/demo/README.md">code</a> for the web demo.</p></li></ul><p>Not much has changed in model sizes, number of model variants, or inference speed. </p>]]></content:encoded></item><item><title><![CDATA[Meta SAM2; FastHTML; Google Imagen 3; Grok 2 ]]></title><description><![CDATA[Vision Geek Newsletter #9]]></description><link>https://newsletter.visiongeek.io/p/meta-sam2-fasthtml-google-imagen</link><guid isPermaLink="false">https://newsletter.visiongeek.io/p/meta-sam2-fasthtml-google-imagen</guid><dc:creator><![CDATA[Arun Ponnusamy]]></dc:creator><pubDate>Sun, 25 Aug 2024 11:58:24 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe32f7e1-d1b8-40ac-8f45-16be6e0f3993_1280x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3>Segment Anything Model 2</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!clOc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe32f7e1-d1b8-40ac-8f45-16be6e0f3993_1280x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!clOc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe32f7e1-d1b8-40ac-8f45-16be6e0f3993_1280x720.png 424w, 
https://substackcdn.com/image/fetch/$s_!clOc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe32f7e1-d1b8-40ac-8f45-16be6e0f3993_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!clOc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe32f7e1-d1b8-40ac-8f45-16be6e0f3993_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!clOc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe32f7e1-d1b8-40ac-8f45-16be6e0f3993_1280x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!clOc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe32f7e1-d1b8-40ac-8f45-16be6e0f3993_1280x720.png" width="1280" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fe32f7e1-d1b8-40ac-8f45-16be6e0f3993_1280x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:479797,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!clOc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe32f7e1-d1b8-40ac-8f45-16be6e0f3993_1280x720.png 424w, 
https://substackcdn.com/image/fetch/$s_!clOc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe32f7e1-d1b8-40ac-8f45-16be6e0f3993_1280x720.png 848w, https://substackcdn.com/image/fetch/$s_!clOc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe32f7e1-d1b8-40ac-8f45-16be6e0f3993_1280x720.png 1272w, https://substackcdn.com/image/fetch/$s_!clOc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe32f7e1-d1b8-40ac-8f45-16be6e0f3993_1280x720.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Meta has <a href="https://ai.meta.com/sam2/">released</a> the second version of SAM with significant improvements over the first version released last year. SAM2 can now segment and track selected objects in videos, made possible by a memory bank added to the architecture. SAM2 is 6x faster than SAM1, and the model weights are much smaller. SAM2 comes in four variants: tiny, small, base plus, and large. They have also released both the model weights and the <a href="https://ai.meta.com/datasets/segment-anything-video/">dataset</a> used for training. (<a href="https://github.com/facebookresearch/segment-anything-2">github</a> | <a href="https://ai.meta.com/blog/segment-anything-2/">blog</a> | <a href="https://arxiv.org/abs/2408.06305v1">paper</a> | <a href="https://sam2.metademolab.com/">demo</a>)</p><h3>FastHTML</h3><div id="youtube2-QqZUzkPcU7A" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;QqZUzkPcU7A&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/QqZUzkPcU7A?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Jeremy Howard and the team at <a href="http://answer.ai">answer.ai</a> have released a new library for building modern interactive web apps in pure Python in just a few lines of code. <a href="http://fastht.ml">FastHTML</a> can be used for everything from collaborative games to multi-modal UI. It can be a potential alternative to existing libraries like <a href="http://streamlit.io">streamlit</a>, <a href="https://www.gradio.app/">gradio</a> and <a href="https://google.github.io/mesop/">mesop</a>. 
(<a href="https://docs.fastht.ml/">docs</a> | <a href="https://www.answer.ai/posts/2024-08-03-fasthtml.html">blog</a> | <a href="https://github.com/AnswerDotAI/fasthtml">github</a>)</p><h3>Google Imagen 3</h3><div id="youtube2-nEuNwULfGXk" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;nEuNwULfGXk&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/nEuNwULfGXk?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Google announced its latest image generation model &#8220;<a href="https://deepmind.google/technologies/imagen-3/">Imagen 3</a>&#8221; at Google I/O a few months back. Now it is being released in select countries through the <a href="http://labs.google/imagefx">ImageFX</a> web interface and Google Cloud <a href="https://cloud.google.com/generative-ai-studio">Vertex AI Studio</a>. Imagen 3 seems to be capable of creating stunning photorealistic images from text prompts, competing with other models like DALL-E 3, FLUX.1 and Midjourney, with better safety guardrails.  
</p><h3>Grok 2 with Vision</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!E4aV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3416742f-1a75-4b90-a67a-149b01d1cb53_743x506.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!E4aV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3416742f-1a75-4b90-a67a-149b01d1cb53_743x506.png 424w, https://substackcdn.com/image/fetch/$s_!E4aV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3416742f-1a75-4b90-a67a-149b01d1cb53_743x506.png 848w, https://substackcdn.com/image/fetch/$s_!E4aV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3416742f-1a75-4b90-a67a-149b01d1cb53_743x506.png 1272w, https://substackcdn.com/image/fetch/$s_!E4aV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3416742f-1a75-4b90-a67a-149b01d1cb53_743x506.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!E4aV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3416742f-1a75-4b90-a67a-149b01d1cb53_743x506.png" width="743" height="506" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3416742f-1a75-4b90-a67a-149b01d1cb53_743x506.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:506,&quot;width&quot;:743,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:151032,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!E4aV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3416742f-1a75-4b90-a67a-149b01d1cb53_743x506.png 424w, https://substackcdn.com/image/fetch/$s_!E4aV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3416742f-1a75-4b90-a67a-149b01d1cb53_743x506.png 848w, https://substackcdn.com/image/fetch/$s_!E4aV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3416742f-1a75-4b90-a67a-149b01d1cb53_743x506.png 1272w, https://substackcdn.com/image/fetch/$s_!E4aV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3416742f-1a75-4b90-a67a-149b01d1cb53_743x506.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><a href="http://x.ai">xAI</a>, the AI lab from X (formerly Twitter) headed by Elon Musk, has released <a href="https://x.ai/blog/grok-2">Grok 2</a> with vision capability. Grok started off as an LLM with just text input and output. The recent version expands it&#8217;s capabilities in vision and text understanding, also integrating real-time information from the &#120143; platform. Currently it is in beta. &#120143; Premium and Premium+ users will have access to two new models: Grok-2 and Grok-2 mini in the X app. 
xAI is working with <a href="https://blackforestlabs.ai/">Black Forest Labs</a> to experiment with their uncensored image generation model <a href="https://blackforestlabs.ai/#get-flux">FLUX.1</a>.</p><h3>Mark&#8217;s Open Source Vision</h3><div id="youtube2-CMd4fMKCIso" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;CMd4fMKCIso&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/CMd4fMKCIso?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Meta has been making significant contributions to the open source AI community lately with state-of-the-art foundational models like <a href="https://ai.meta.com/sam2/">SAM2</a> and <a href="https://llama.meta.com/">Llama 3.1</a>, which compete with commercial models from companies like Google and OpenAI. If you have been wondering why Meta is actually open sourcing these models, Meta&#8217;s CEO Mark Zuckerberg shares his vision for open source in this Bloomberg <a href="https://www.youtube.com/watch?v=YuIc4mq7zMU">interview</a>.  
</p>]]></content:encoded></item><item><title><![CDATA[PyTorch Documentary; Karpathy's Keynote Speech; Kyutai's Moshi; Andrew Ng's Talk]]></title><description><![CDATA[Vision Geek Newsletter #8]]></description><link>https://newsletter.visiongeek.io/p/issue-8</link><guid isPermaLink="false">https://newsletter.visiongeek.io/p/issue-8</guid><dc:creator><![CDATA[Arun Ponnusamy]]></dc:creator><pubDate>Fri, 26 Jul 2024 17:13:22 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/tsTeEkzO9xc" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello there,</p><p>If you are looking for some inspiring content to watch over the weekend in the AI space, below is a quick list of interesting videos for you. </p><h3>Official PyTorch Documentary</h3><div id="youtube2-rgP_LBtaUEc" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;rgP_LBtaUEc&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/rgP_LBtaUEc?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>The PyTorch team, along with sponsors like AMD, AWS, Meta and Microsoft, has produced a documentary covering the library&#8217;s journey from its initial days and its impact on the ongoing AI revolution. The documentary unveils the authentic narrative of PyTorch&#8217;s inception, attributing its existence to a dedicated group of unsung heroes, and shares the strength of the PyTorch community. 
An inspiring watch.</p><h3>Andrej Karpathy&#8217;s Keynote Speech</h3><div id="youtube2-tsTeEkzO9xc" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;tsTeEkzO9xc&quot;,&quot;startTime&quot;:&quot;245s&quot;,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/tsTeEkzO9xc?start=245s&amp;rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Andrej Karpathy recently gave the keynote speech at the UC Berkeley AI Hackathon 2024 Awards ceremony. Apparently he loves the vibe of hackathons. He discussed the current state of AI, his early days at OpenAI, LLM OS, his vision for AGI, and how working on side projects has the potential to snowball into something bigger. This video also showcases the winning projects from the hackathon. </p><h3>Open Source Alternative to GPT-4o</h3><div id="youtube2-hm2IJSKcYvo" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;hm2IJSKcYvo&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/hm2IJSKcYvo?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>A French AI lab called Kyutai Labs has released a real-time, voice-enabled model called Moshi, a potential open source alternative to OpenAI&#8217;s GPT-4o. The team claims that in just six months, with a team of eight, the Kyutai research lab developed the model from scratch; their mission is to build and democratize artificial general intelligence through open science (the real OpenAI?). 
</p><p>During the presentation, the Kyutai team interacted with Moshi to illustrate its potential as, for example, a coach or companion, and its creativity through the incarnation of characters in roleplays. Code and model weights are to be open sourced soon. Chat with Moshi at <a href="https://moshi.chat">moshi.chat</a>. Currently, conversations are limited to 5 mins.</p><h3>Andrew Ng on AI Agentic Workflows</h3><div id="youtube2-sal78ACtGTc" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;sal78ACtGTc&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/sal78ACtGTc?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Andrew Ng recently gave a talk at Sequoia&#8217;s AI Ascent event on AI agentic workflows. &#8220;AI Agents&#8221; has become a buzzword recently and is being tossed around in many places. In this talk, Andrew Ng presents concrete examples of agentic workflows and how they can improve the performance of LLMs. He also mentioned his experience using an open source version of <a href="https://www.youtube.com/watch?v=fjHtjT7GO1c">Devin</a> (the AI Software Engineer) - <a href="https://github.com/OpenBMB/ChatDev">ChatDev</a>.</p><p>Hope you find these videos engaging. Let us know what you have been watching lately. 
</p><p>Cheers.</p>]]></content:encoded></item><item><title><![CDATA[Tesla AI Day 2022; Meta Universal Speech Translator; Google AI@22; NN Zero-to-hero]]></title><description><![CDATA[Vision Geek AI Newsletter #7]]></description><link>https://newsletter.visiongeek.io/p/tesla-ai-day-2022-meta-universal</link><guid isPermaLink="false">https://newsletter.visiongeek.io/p/tesla-ai-day-2022-meta-universal</guid><dc:creator><![CDATA[Arun Ponnusamy]]></dc:creator><pubDate>Mon, 28 Nov 2022 10:07:44 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/suv8ex8xlZA" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Tesla AI Day 2022</h2><div id="youtube2-suv8ex8xlZA" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;suv8ex8xlZA&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/suv8ex8xlZA?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h3>Tesla Bot</h3><p>Last year, at their AI Day event, Elon Musk announced that Tesla would be building a humanoid robot and revealed the conceptual outlook of the Tesla Bot. &#8220;Tesla AI Day 2022&#8221; was held recently, and the team showcased the prototype they have been working on. There are two variants: Bumblebee and Optimus.</p><p>The team started off with Bumblebee, which used readily available off-the-shelf hardware components, and moved on to Optimus, where all the hardware components are designed in-house by Tesla, giving them a lot more control and performance.<br><br>Most of the software built for Autopilot in the cars is repurposed for the Bot. We have seen impressive humanoid robots from other companies in the past, but they are very expensive and made in small quantities. 
</p><p>What makes Tesla Bot interesting is that they are planning to build these Bots in large quantities such that the cost of a bot will become lower than a car, hopefully less than $20,000. They already have the hardware, AI software and real world data from the cars. Tesla has the potential to make humanoid robots mainstream. We just have to wait and see if they can pull this off.</p><h3>Autopilot</h3><p>The Autopilot team gave a glimpse of how they are handling complex real world scenarios. They are slowly rolling out FSD (Full Self Driving) Beta to more and more customers.<br><br>One interesting thing about Tesla is that they don't use LiDAR at all in their cars. Everything is done using the 8 cameras present in the vehicle, which is quite hard. Most other self-driving car companies use expensive LiDARs.<br><br>They showcased the Occupancy Network they use to predict what's present in the scene around the vehicle in 3D, and how they are taking inspiration from language modelling to tackle lane prediction at complex intersections.<br><br>They have developed their own compiler, file format (.smol) and neural network accelerator (TRIP Engine) to make the best use of their FSD hardware. They also use simulation to create 3D scenes and generate training data for specific scenarios to improve model accuracy, completely automated, with no 3D artists involved. It's fascinating to see how the team is optimizing every single aspect to achieve better performance and accuracy.</p><h3>Dojo Supercomputer</h3><p>Tesla gets tens of thousands of real world video clips from the cameras on their fleet every day. Training machine learning models on this ever-growing dataset is not an easy task.<br><br>Tesla uses a cluster of 14,000 GPUs, of which 10,000 are used for model training and 4,000 for auto labeling. 
Even with this massive GPU cluster, it takes months to train the models on huge datasets.<br><br>The team has built a supercomputer named "Dojo" to tackle this problem. Designing the entire hardware and software stack from the ground up has given them a huge boost in performance at a fraction of the usual GPU cost. The same models can now be trained in less than a week instead of a month.<br><br>From building a completely new compiler to creating a custom protocol (TTP - Tesla Transfer Protocol) to building a custom network interface card (DNIC - Dojo Network Interface Card), the team has innovated at every layer. Truly inspiring work.</p><h2>Meta Universal Speech Translator</h2><div id="youtube2-u0Y6aRoqfAc" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;u0Y6aRoqfAc&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/u0Y6aRoqfAc?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Eight months ago Meta announced their &#8220;Universal Speech Translator&#8221; project, which aims to develop new AI methods that will allow real-time speech-to-speech translation across many languages. Now the team has come up with an interesting demo.<br><br>So far, AI-powered speech translation systems have been cascaded: they first convert the speech to source-language text, use NLP (Natural Language Processing) models to translate that text into the destination language, and then convert the translated text back to speech.<br><br>This approach works well for languages that have a standard writing system. 
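<p>The cascaded pipeline just described can be sketched as follows. This is a minimal illustration with toy stand-in functions (a dictionary lookup in place of real ASR, machine translation and TTS models); the function names are hypothetical, not Meta&#8217;s actual API.</p>

```python
# Sketch of a cascaded speech-to-speech translation pipeline.
# All three stages are toy stand-ins for real ASR, MT and TTS models.

def transcribe(audio):
    # ASR stage: speech -> source-language text.
    # Stand-in: pretend recognition is perfect and read the text directly.
    return audio["spoken_text"]

def translate_text(text, src, dst):
    # MT stage: source-language text -> destination-language text.
    # Stand-in: a toy word-for-word dictionary instead of an NLP model.
    lexicon = {("en", "fr"): {"hello": "bonjour", "world": "monde"}}
    return " ".join(lexicon[(src, dst)].get(w, w) for w in text.split())

def synthesize(text):
    # TTS stage: text -> waveform. Stand-in: return the text as bytes.
    return text.encode("utf-8")

def speech_to_speech(audio, src="en", dst="fr"):
    # The full cascade: each stage feeds the next.
    text = transcribe(audio)
    translated = translate_text(text, src, dst)
    return synthesize(translated)

print(speech_to_speech({"spoken_text": "hello world"}))  # b'bonjour monde'
```

<p>The point of the sketch is the dependency chain: the translation stage only ever sees text, which is exactly why this design fails for primarily oral languages.</p>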
But nearly half of the world&#8217;s 7,000+ living languages are primarily oral and do not have a standard or widely used writing system.<br><br>This makes it impossible to build machine translation tools using standard techniques, which require large amounts of written text in order to train language models.<br><br>To address this challenge, Meta&#8217;s AI team has built the first AI-powered <a href="https://ai.facebook.com/blog/ai-translation-hokkien/">speech-to-speech translation</a> system for Hokkien, a primarily oral language that&#8217;s widely spoken in China, Taiwan and a few other countries but lacks a standard written form.<br><br>The team has developed a variety of novel approaches and systems to achieve this. Meta is open-sourcing the translation models, evaluation datasets and research papers so that others can reproduce and build on their work.<br><br>Though the Hokkien translation model is still a work in progress and can translate only one full sentence at a time, it has the potential to allow anyone to communicate with anyone else from anywhere in the world in their own native language. Exciting times we live in!</p><h2>Google AI@22</h2><div id="youtube2-X5iLF-cszu0" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;X5iLF-cszu0&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/X5iLF-cszu0?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>In the recent Google event &#8220;AI @ 22&#8221;, their research team shared the recent advancements in generative modelling. 
There has been a lot of improvement, particularly in image and video generation from text.<br><br>&#8220;Imagen&#8221; and &#8220;Parti&#8221; are two models the team has built with slightly different approaches to generate images from text. The results from these models are remarkably crisp and of very high quality. We have definitely come a long way from generating low resolution handwritten numbers with MNIST.<br><br>Now that we are able to generate realistic high definition images from text, naturally the next step is to try generating videos from text. &#8220;Imagen Video&#8221; and &#8220;Phenaki&#8221; are two such attempts at creating consistent videos from text.<br><br>Taking inspiration from language modelling in solving computer vision problems has become common now. &#8220;Parti&#8221; and &#8220;Phenaki&#8221; are two such examples. Text-to-video models are still in their infancy, but the revolution has definitely started.</p><h2>NN Zero-to-hero </h2><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://twitter.com/karpathy/status/1547332300186066944?lang=en&quot;,&quot;full_text&quot;:&quot;It&#8217;s been a great pleasure to help Tesla towards its goals over the last 5 years and a difficult decision to part ways. 
In that time, Autopilot graduated from lane keeping to city streets and I look forward to seeing the exceptionally strong Autopilot team continue that momentum.&quot;,&quot;username&quot;:&quot;karpathy&quot;,&quot;name&quot;:&quot;Andrej Karpathy&quot;,&quot;profile_image_url&quot;:&quot;&quot;,&quot;date&quot;:&quot;Wed Jul 13 21:29:03 +0000 2022&quot;,&quot;photos&quot;:[],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:0,&quot;retweet_count&quot;:1253,&quot;like_count&quot;:26021,&quot;impression_count&quot;:0,&quot;expanded_url&quot;:{},&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div><p>In case you noticed, at the Tesla AI Day 2022 event the Autopilot session was not presented by <a href="http://karpathy.ai">Andrej Karpathy</a>. He was leading the AI team at Tesla until last year.<br><br>It seems that after a four-month break from work, Andrej has decided to leave Tesla. Elon Musk replied to his tweet: &#8220;Thanks for everything you have done for Tesla! It has been an honor working with you.&#8221;<br><br>Andrej has mentioned that he has no concrete plans for what&#8217;s next but looks to spend more time revisiting his long-term passions around technical work in AI, open source and education. He is currently working on creating a course called &#8220;Neural Networks - Zero to Hero&#8221;. He is making the course available for free to everyone.<br><br>It is hosted on <a href="https://github.com/karpathy/nn-zero-to-hero">GitHub</a>. Course lectures are available as <a href="https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ">YouTube</a> videos and the coding exercises are available as Google Colab notebooks. It is currently in progress. 
Some of the videos and notebooks are already available.</p><h2>PyImageSearch got acquired</h2><div id="youtube2-AKANzByKFGc" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;AKANzByKFGc&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/AKANzByKFGc?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p><a href="https://pyimagesearch.com">PyImageSearch</a> is one of the most widely read blogs in the field of Computer Vision and Deep Learning. If you are a computer vision practitioner, you have most probably read some of the blog posts on PyImageSearch in your learning journey.<br><br>Adrian Rosebrock is the core author and owner of the blog. It seems that PyImageSearch was acquired last year. The details of the acquisition have not yet been publicly disclosed.<br><br>In his new venture <a href="http://infoproductmastery.com">Info Product Mastery</a>, Adrian has mentioned that </p><blockquote><p>&#8220;I launched my company, PyImageSearch.com, in 2014. By 2017 I had grown it to 7 figures per year in revenue by selling eBooks and online courses I had created. In 2021 PyImageSearch was acquired for a life changing exit. I&#8217;m here to share my experiences so you can learn from the mistakes I&#8217;ve made while building and growing your own info product business.&#8221;</p></blockquote><p>The <a href="https://open.spotify.com/show/1ESZbd7dHL10keKKEPtiEf">podcast</a> aims to help developers, educators, and entrepreneurs launch and grow their online education businesses, whether they are just looking to create a passive income stream or to build a full-time living. 
</p><h2>AI fun :)</h2><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://twitter.com/DeepLearningAI_/status/1592185593827659776&quot;,&quot;full_text&quot;:&quot;&#129313;\n\nOriginal: ML Tech Hub &quot;,&quot;username&quot;:&quot;DeepLearningAI_&quot;,&quot;name&quot;:&quot;DeepLearning.AI&quot;,&quot;profile_image_url&quot;:&quot;&quot;,&quot;date&quot;:&quot;Mon Nov 14 16:00:01 +0000 2022&quot;,&quot;photos&quot;:[{&quot;img_url&quot;:&quot;https://pbs.substack.com/media/FhiUXDpXoAA_XO0.jpg&quot;,&quot;link_url&quot;:&quot;https://t.co/2fluOFmtNt&quot;,&quot;alt_text&quot;:null}],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:0,&quot;retweet_count&quot;:18,&quot;like_count&quot;:149,&quot;impression_count&quot;:0,&quot;expanded_url&quot;:{},&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div><h2>Support this newsletter &#10084;&#65039;</h2><p>If you are getting value out of my work, consider supporting me on Patreon and unlock exclusive benefits.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.patreon.com/arunponnusamy&quot;,&quot;text&quot;:&quot;Become a Patron&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.patreon.com/arunponnusamy"><span>Become a Patron</span></a></p>]]></content:encoded></item><item><title><![CDATA[GAN Specialization; Nvidia GTC Fall 2020; Vision Transformer; OpenCV 4.5.0]]></title><description><![CDATA[Vision Geek AI Newsletter #6]]></description><link>https://newsletter.visiongeek.io/p/issue-6</link><guid isPermaLink="false">https://newsletter.visiongeek.io/p/issue-6</guid><dc:creator><![CDATA[Arun Ponnusamy]]></dc:creator><pubDate>Sun, 29 Nov 2020 07:19:16 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/9d4jmPmTWmc" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>GAN 
Specialization</h2><div id="youtube2-9d4jmPmTWmc" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;9d4jmPmTWmc&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/9d4jmPmTWmc?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>deeplearning.ai has launched a new <a href="https://www.coursera.org/specializations/generative-adversarial-networks-gans">specialization</a> on Coursera containing 3 courses on Generative Adversarial Networks (<a href="https://en.wikipedia.org/wiki/Generative_adversarial_network">GAN</a>). Since its inception in 2014 by <a href="https://twitter.com/goodfellow_ian">Ian Goodfellow</a>, GANs have created a whole new subfield in AI, giving machines the ability to imagine and learn to create new content (text, images, video, music, etc.). This is already proving to be a game changer (good or bad?) with things like OpenAI <a href="https://en.wikipedia.org/wiki/GPT-3">GPT-3</a> and <a href="https://en.wikipedia.org/wiki/Deepfake">DeepFakes</a>. 
Apart from this <a href="https://www.manning.com/books/gans-in-action">book</a> (which uses TensorFlow/Keras), this set of courses is going to be really helpful for people interested in getting a better understanding of GANs and getting hands-on with PyTorch.</p><h2>Nvidia GTC Fall 2020</h2><div id="youtube2-CKnipnFsuFo" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;CKnipnFsuFo&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/CKnipnFsuFo?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Nvidia announced a bunch of new products and partnerships at the <a href="https://www.nvidia.com/en-us/gtc/">GPU Technology Conference (GTC)</a>, which happened virtually in October. The announcements ranged from a new <a href="https://www.nvidia.com/en-in/autonomous-machines/embedded-systems/jetson-nano/education-projects/">2GB Jetson Nano</a> ($59) to data-center-specific products like DPUs and new architectures like Bluefield, in partnership with ARM and VMware. Nvidia is definitely spearheading the AI hardware movement at full throttle. They have completely repositioned themselves from just a gaming company to a premier AI computing company. 
</p><h2>Transformers for Image Recognition</h2><div id="youtube2-TrdevFK_am4" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;TrdevFK_am4&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/TrdevFK_am4?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>CNNs have been the go-to method for computer vision tasks for almost a decade, ever since <a href="https://en.wikipedia.org/wiki/AlexNet">AlexNet</a> won the <a href="http://www.image-net.org/challenges/LSVRC/">ImageNet</a> competition in 2012. In recent times, Transformer networks have been slowly gaining momentum since the success of models like <a href="https://en.wikipedia.org/wiki/GPT-3">GPT-3</a>. Though transformer networks have mainly been used in NLP, they are slowly getting into computer vision as well. </p><p>The Facebook AI Research team successfully applied transformers to <a href="https://ai.facebook.com/blog/end-to-end-object-detection-with-transformers/">object detection</a> recently, and the OpenAI team also released <a href="https://openai.com/blog/image-gpt/">Image-GPT</a>. And now a team from Google has used transformers for image classification. The paper is currently under <a href="https://openreview.net/forum?id=YicbFdNTTy">review</a>. Can transformers replace CNNs? Well, we have to wait and see. But CNNs are definitely starting to face competition.  
(<a href="https://arxiv.org/abs/2010.11929">paper</a> | <a href="https://github.com/google-research/vision_transformer">github</a>)</p><h2>Deep Learning with PyTorch Book</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!B2eu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F1b4227b5-073b-499b-82b4-db63aa126529_1246x386.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!B2eu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F1b4227b5-073b-499b-82b4-db63aa126529_1246x386.jpeg 424w, https://substackcdn.com/image/fetch/$s_!B2eu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F1b4227b5-073b-499b-82b4-db63aa126529_1246x386.jpeg 848w, https://substackcdn.com/image/fetch/$s_!B2eu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F1b4227b5-073b-499b-82b4-db63aa126529_1246x386.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!B2eu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F1b4227b5-073b-499b-82b4-db63aa126529_1246x386.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!B2eu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F1b4227b5-073b-499b-82b4-db63aa126529_1246x386.jpeg" width="1246" height="386" 
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/1b4227b5-073b-499b-82b4-db63aa126529_1246x386.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:386,&quot;width&quot;:1246,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!B2eu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F1b4227b5-073b-499b-82b4-db63aa126529_1246x386.jpeg 424w, https://substackcdn.com/image/fetch/$s_!B2eu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F1b4227b5-073b-499b-82b4-db63aa126529_1246x386.jpeg 848w, https://substackcdn.com/image/fetch/$s_!B2eu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F1b4227b5-073b-499b-82b4-db63aa126529_1246x386.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!B2eu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F1b4227b5-073b-499b-82b4-db63aa126529_1246x386.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" 
fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Just in case you missed it, a few months back the PyTorch team made the full version of the book &#8220;Deep Learning with PyTorch&#8221; freely available to everyone. Apart from the official PyTorch <a href="https://pytorch.org/tutorials/">tutorials</a>, this book will be a great place to start for anyone new to PyTorch. You can download the book <a href="https://pytorch.org/assets/deep-learning/Deep-Learning-with-PyTorch.pdf">here</a>.</p><h2>OpenCV 4.5.0 </h2><p>OpenCV 4.5.0 has been <a href="https://opencv.org/opencv-4-5-0/">released</a>. 
Some of the improvements include: </p><ul><li><p>Better SIFT in the main repository</p></li><li><p>Real-time Single Object Tracking using Deep Learning</p></li><li><p>(opencv_contrib):&nbsp;OpenCV&nbsp;bindings for the Julia Programming Language</p></li><li><p>(dnn module) 3-5x faster inference on ARM, improved ONNX support, fixes and optimizations in the DNN CUDA backend</p></li><li><p>The license has been switched from the BSD 3-clause license to Apache 2. It doesn&#8217;t affect how we use OpenCV: it&#8217;s still free to use in all forms (including commercial use), and the Apache 2 license provides more legal protections. Read more <a href="https://opencv.org/opencv-is-to-change-the-license-to-apache-2/">here</a> for the rationale behind this decision.</p></li></ul><h2>AI Pay Grades</h2><p>Though there is a high demand for good AI talent in the industry, it may be unclear what actual job offers look like around the world. Oftentimes, ML Engineers like us are not really sure whether we are underpaid and (ironically) don&#8217;t have enough data to get an overall picture and better negotiate compensation. <a href="https://aipaygrad.es">aipaygrad.es</a> is a good initiative from the community to collect and show statistics from real job offers. </p><p>Keep in mind, currently there is only a small amount of data available, mostly from the US, so it&#8217;s not yet fully representative of the industry. Nevertheless, keep an eye on this site as more data comes in over time, and if possible spread the word and contribute. 
</p><h2>Landing an ML job</h2><div id="youtube2-V9fjtkPWD4E" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;V9fjtkPWD4E&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/V9fjtkPWD4E?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Even though many <a href="https://www2.deloitte.com/us/en/insights/industry/technology/ai-talent-challenges-shortage.html">studies</a> say that there is a shortage of ML Engineers, getting into an actual ML role might not be that easy.  In this session hosted by deeplearning.ai , technical recruiters from top companies like Pinterest and Scale AI explain practical approaches on how to land an ML job and build a career in the AI industry. Informative session for ML Engineers.</p><h2>AI fun :)</h2><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://twitter.com/KritiKKohli/status/1316814511567966209&quot;,&quot;full_text&quot;:&quot;&#8294;<span class=\&quot;tweet-fake-link\&quot;>@AnimaAnandkumar</span>&#8297; gives a memorable explanation of tensors &#8294;<span class=\&quot;tweet-fake-link\&quot;>@JupyterCon</span>&#8297; &#128516; &quot;,&quot;username&quot;:&quot;KritiKKohli&quot;,&quot;name&quot;:&quot;Kriti Kohli&quot;,&quot;profile_image_url&quot;:&quot;&quot;,&quot;date&quot;:&quot;Thu Oct 15 18:53:39 +0000 2020&quot;,&quot;photos&quot;:[{&quot;img_url&quot;:&quot;https://pbs.substack.com/media/EkZD0HlXsAEPPia.jpg&quot;,&quot;link_url&quot;:&quot;https://t.co/cvkh5amnws&quot;}],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:0,&quot;retweet_count&quot;:91,&quot;like_count&quot;:611,&quot;impression_count&quot;:0,&quot;expanded_url&quot;:{},&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" 
data-component-name="Twitter2ToDOM"></div><h2>Support this newsletter &#10084;&#65039;</h2><p>If you are getting value out of my work, consider supporting me on Patreon and unlock exclusive benefits.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.patreon.com/arunponnusamy&quot;,&quot;text&quot;:&quot;Become a Patron&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.patreon.com/arunponnusamy"><span>Become a Patron</span></a></p>]]></content:encoded></item><item><title><![CDATA[Detection Transformers; Nvidia Jetson Xavier NX DevKit; Google's Big Transfer (BiT); CVPR 2020]]></title><description><![CDATA[Vision Week Issue #5]]></description><link>https://newsletter.visiongeek.io/p/detection-transformers-nvidia-jetson</link><guid isPermaLink="false">https://newsletter.visiongeek.io/p/detection-transformers-nvidia-jetson</guid><dc:creator><![CDATA[Arun Ponnusamy]]></dc:creator><pubDate>Fri, 26 Jun 2020 07:36:52 GMT</pubDate><enclosure url="https://cdn.substack.com/image/fetch/h_600,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F043ff957-e896-421d-8135-eda0a629dd79_2400x1351.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3>Detection Transformers (DETR)</h3><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!26SX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F043ff957-e896-421d-8135-eda0a629dd79_2400x1351.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!26SX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F043ff957-e896-421d-8135-eda0a629dd79_2400x1351.jpeg 424w, https://substackcdn.com/image/fetch/$s_!26SX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F043ff957-e896-421d-8135-eda0a629dd79_2400x1351.jpeg 848w, https://substackcdn.com/image/fetch/$s_!26SX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F043ff957-e896-421d-8135-eda0a629dd79_2400x1351.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!26SX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F043ff957-e896-421d-8135-eda0a629dd79_2400x1351.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!26SX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F043ff957-e896-421d-8135-eda0a629dd79_2400x1351.jpeg" width="1456" height="820" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/043ff957-e896-421d-8135-eda0a629dd79_2400x1351.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:820,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:907036,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!26SX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F043ff957-e896-421d-8135-eda0a629dd79_2400x1351.jpeg 424w, https://substackcdn.com/image/fetch/$s_!26SX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F043ff957-e896-421d-8135-eda0a629dd79_2400x1351.jpeg 848w, https://substackcdn.com/image/fetch/$s_!26SX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F043ff957-e896-421d-8135-eda0a629dd79_2400x1351.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!26SX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F043ff957-e896-421d-8135-eda0a629dd79_2400x1351.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><p><a href="https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html">Transformer networks</a> have gained a lot of popularity in recent years, but mostly in the sequence modeling space like NLP; they are not much <a href="https://www.youtube.com/watch?v=bPmDX-dzNHw">used</a> in the computer vision space. </p><p>Recently, the Facebook AI Research team successfully used transformer networks for object detection, making the end-to-end detection pipeline much simpler. (<a href="https://arxiv.org/abs/2005.12872">paper</a> | <a href="https://github.com/facebookresearch/detr">code</a> | <a href="https://ai.facebook.com/blog/end-to-end-object-detection-with-transformers">blog</a>)</p><h3>Nvidia Jetson Xavier NX DevKit</h3><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uYhI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F61d01d79-950b-4132-8476-922024623e64_1508x633.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uYhI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F61d01d79-950b-4132-8476-922024623e64_1508x633.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!uYhI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F61d01d79-950b-4132-8476-922024623e64_1508x633.jpeg 848w, https://substackcdn.com/image/fetch/$s_!uYhI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F61d01d79-950b-4132-8476-922024623e64_1508x633.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!uYhI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F61d01d79-950b-4132-8476-922024623e64_1508x633.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uYhI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F61d01d79-950b-4132-8476-922024623e64_1508x633.jpeg" width="1456" height="611" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/61d01d79-950b-4132-8476-922024623e64_1508x633.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:611,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:90858,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uYhI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F61d01d79-950b-4132-8476-922024623e64_1508x633.jpeg 
424w, https://substackcdn.com/image/fetch/$s_!uYhI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F61d01d79-950b-4132-8476-922024623e64_1508x633.jpeg 848w, https://substackcdn.com/image/fetch/$s_!uYhI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F61d01d79-950b-4132-8476-922024623e64_1508x633.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!uYhI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F61d01d79-950b-4132-8476-922024623e64_1508x633.jpeg 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 
21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><p>Nvidia <a href="https://developer.nvidia.com/blog/jetson-xavier-nx-the-worlds-smallest-ai-supercomputer/">announced</a> Jetson Xavier Nx last November but it wasn&#8217;t available to buy right away. After the wait, the <a href="https://www.nvidia.com/en-in/autonomous-machines/embedded-systems/jetson-xavier-nx/">devkit</a> has finally arrived. </p><h4>What&#8217;s new ?</h4><p>Nvidia announced <a href="https://developer.nvidia.com/embedded/jetson-nano-developer-kit">Jetson Nano</a> last year as entry level edge device for AI at $99. Nano was really well received in the market. Now Nvidia is upping the game with Xavier NX for heavy compute applications on edge which Nano can&#8217;t handle. In fact, Nvidia calls it &#8220;the World&#8217;s Smallest AI Supercomputer&#8221;. It comes at the same form factor as Nano but with way better compute capability at $399.</p><h3>Google opensources Big Transfer (BiT)</h3><p></p><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vk4N!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F82f0afd7-cf1b-4ecb-87b0-784543004660_1156x468.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vk4N!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F82f0afd7-cf1b-4ecb-87b0-784543004660_1156x468.png 424w, https://substackcdn.com/image/fetch/$s_!vk4N!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F82f0afd7-cf1b-4ecb-87b0-784543004660_1156x468.png 848w, 
https://substackcdn.com/image/fetch/$s_!vk4N!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F82f0afd7-cf1b-4ecb-87b0-784543004660_1156x468.png 1272w, https://substackcdn.com/image/fetch/$s_!vk4N!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F82f0afd7-cf1b-4ecb-87b0-784543004660_1156x468.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vk4N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F82f0afd7-cf1b-4ecb-87b0-784543004660_1156x468.png" width="1156" height="468" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/82f0afd7-cf1b-4ecb-87b0-784543004660_1156x468.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:468,&quot;width&quot;:1156,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:126880,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vk4N!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F82f0afd7-cf1b-4ecb-87b0-784543004660_1156x468.png 424w, https://substackcdn.com/image/fetch/$s_!vk4N!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F82f0afd7-cf1b-4ecb-87b0-784543004660_1156x468.png 848w, 
https://substackcdn.com/image/fetch/$s_!vk4N!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F82f0afd7-cf1b-4ecb-87b0-784543004660_1156x468.png 1272w, https://substackcdn.com/image/fetch/$s_!vk4N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F82f0afd7-cf1b-4ecb-87b0-784543004660_1156x468.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><p>Researchers at Google have found that pre-training computer vision models on very large scale 
datasets (far larger than <a href="http://image-net.org/">ImageNet</a>) helps the model learn quickly and generalize better when fine-tuned for other tasks via transfer learning, even with very small training data. A similar approach has recently proved useful in the <a href="https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html">language domain</a>. This work also highlights the need for very large open datasets: one of the datasets used in the research, with 300M images, was internal. (<a href="https://arxiv.org/abs/1912.11370">paper</a> | <a href="https://github.com/google-research/big_transfer">code</a> | <a href="https://ai.googleblog.com/2020/05/open-sourcing-bit-exploring-large-scale.html">blog</a>)</p><h3>AI for Healthcare Nanodegree</h3><div id="youtube2-0Dhd-T4wDew" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;0Dhd-T4wDew&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/0Dhd-T4wDew?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Following <a href="https://www.coursera.org/specializations/ai-for-medicine">Coursera</a>, Udacity has launched an &#8220;<a href="https://www.udacity.com/course/ai-for-healthcare-nanodegree--nd320">AI for Healthcare</a>&#8221; nanodegree, a specialized course on applying machine learning to medical data. With everything going on right now, applying AI to healthcare has gained more importance than ever. Check out the <a href="https://www.udacity.com/course/ai-for-healthcare-nanodegree--nd320">curriculum</a> to learn more.
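The Big Transfer recipe above comes down to reusing a frozen pre-trained backbone and fitting only a lightweight head on the small target dataset. Here is a toy NumPy sketch of that final step, with synthetic features standing in for real backbone outputs — an illustration of the idea, not BiT's actual fine-tuning protocol:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for frozen backbone features: two separable clusters.
# In a real transfer-learning setup these would come from the pre-trained model.
X = np.vstack([rng.normal(-2, 1, (50, 8)), rng.normal(2, 1, (50, 8))])
y = np.array([0] * 50 + [1] * 50)

# Only the head (w, b) is trained; the "backbone" that produced X stays untouched.
w, b = np.zeros(8), 0.0
for _ in range(200):                     # plain gradient descent on log-loss
    p = 1 / (1 + np.exp(-(X @ w + b)))   # sigmoid predictions
    g = p - y                            # gradient of log-loss w.r.t. logits
    w -= 0.1 * (X.T @ g) / len(y)
    b -= 0.1 * g.mean()

acc = ((1 / (1 + np.exp(-(X @ w + b))) > 0.5) == y).mean()
```

The point is only that the cheap part — a small head on frozen features — is what gets retrained per task, which is why a strong pre-trained backbone helps even with very little target data.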
</p><h3>CVPR 2020</h3><div id="youtube2-aHUYXtbwl_8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;aHUYXtbwl_8&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/aHUYXtbwl_8?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p><a href="http://cvpr2020.thecvf.com/">CVPR</a> this year was held as a virtual conference due to the pandemic. It featured two special keynotes: <a href="https://www.youtube.com/watch?v=vgdVIeQKH-E">Satya Nadella</a>, CEO of Microsoft, and <a href="https://www.youtube.com/watch?v=2vEay8zZfSo">Charlie Bell</a>, SVP at AWS. Many <a href="http://cvpr2020.thecvf.com/program/tutorials">tutorial</a> and <a href="http://cvpr2020.thecvf.com/workshops-schedule">workshop</a> organizers have generously posted their video recordings online; check out the individual websites of the tutorials and workshops you are interested in.</p><h3>Deep learning lectures from DeepMind</h3><div id="youtube2-7R52wiUgxZI" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;7R52wiUgxZI&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/7R52wiUgxZI?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>DeepMind, in collaboration with UCL, has shared a series of deep learning lectures by DeepMind research scientists. 10 of the 12 planned lectures are available on <a href="https://www.youtube.com/playlist?list=PLqYmG7hTraZCDxZ44o4p3N5Anz3lLRVZF">YouTube</a>.
The course was probably disrupted by the pandemic. Nevertheless, the available lectures are definitely worth checking out.</p><h3>Fei-Fei Li on Exponential View Podcast</h3><p><a href="https://twitter.com/drfeifei">Dr. Fei-Fei Li</a>, the creator of ImageNet, was featured on the Exponential View podcast. She discussed the early days of <a href="http://image-net.org/">ImageNet</a>, the impact it has created, her current work on healthcare, and the vision of the <a href="https://hai.stanford.edu/">Human-Centered AI Institute</a> (HAI) she co-founded at Stanford. Check out the full episode <a href="https://open.spotify.com/episode/7e2vLkJbzTLDloE6PT0n1M?si=r02M7aftQfGHrhExDykVxw">here</a> or wherever you usually listen to podcasts.</p><h3>AI fun :)</h3><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!z3yL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F758c5960-95a9-407a-b28d-02875cdea1e5_768x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><img src="https://substackcdn.com/image/fetch/$s_!z3yL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F758c5960-95a9-407a-b28d-02875cdea1e5_768x768.jpeg" width="530" height="530" class="sizing-normal" alt="" sizes="100vw" loading="lazy"></picture></div></a>]]></content:encoded></item><item><title><![CDATA[YOLOv4; PyTorch 1.5; Nvidia DGX A100; Tesla at ScaledML Conference]]></title><description><![CDATA[Vision Week Issue #4]]></description><link>https://newsletter.visiongeek.io/p/yolov4-pytorch-15-nvidia-dgx-a100</link><guid
isPermaLink="false">https://newsletter.visiongeek.io/p/yolov4-pytorch-15-nvidia-dgx-a100</guid><dc:creator><![CDATA[Arun Ponnusamy]]></dc:creator><pubDate>Fri, 22 May 2020 10:05:40 GMT</pubDate><enclosure url="https://cdn.substack.com/image/fetch/h_600,c_limit,f_auto,q_auto:good/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F38a9d794-ed46-4c0c-b293-ed175c57223c_1216x587.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3>YOLO v4 - Better Speed and Accuracy</h3><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5b52!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F7010bc03-b55a-4fcc-b2cb-9ef57deaee38_536x423.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><img src="https://substackcdn.com/image/fetch/$s_!5b52!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F7010bc03-b55a-4fcc-b2cb-9ef57deaee38_536x423.png" width="536" height="423" class="sizing-normal" alt="" sizes="100vw"></picture></div></a><p>You might have heard by now: YOLOv4 has been released, but not by the original author of YOLO, Joseph Redmon. It comes from a different group of researchers, including Alexey Bochkovskiy, who is known for his popular <a href="https://github.com/AlexeyAB/darknet">fork</a> of the original <a href="https://github.com/pjreddie/darknet">DarkNet</a> repo, to which he has made several improvements. In a way, his repo became more popular than the original.
</p><p>YOLOv4 employs modern state-of-the-art techniques such as Weighted Residual Connections (WRC), Cross-Stage Partial connections (CSP), Cross mini-Batch Normalization (CmBN), Self-Adversarial Training (SAT), and the Mish activation to achieve better speed and accuracy. Check out the <a href="https://arxiv.org/abs/2004.10934">paper</a> and <a href="https://github.com/AlexeyAB/darknet">code</a> for more details.</p><h3>Nvidia DGX A100</h3><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Sar6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F38a9d794-ed46-4c0c-b293-ed175c57223c_1216x587.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><img src="https://substackcdn.com/image/fetch/$s_!Sar6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F38a9d794-ed46-4c0c-b293-ed175c57223c_1216x587.png" width="720" height="348" class="sizing-normal" alt="" sizes="100vw"></picture></div></a><p>Nvidia CEO Jensen Huang announced new products in his &#8220;<a href="https://youtu.be/nurL3N1Etuc">kitchen keynote</a>&#8221; for GTC 2020. The highlight was the DGX A100, an absolute GPU beast aimed at datacenter use or at serving as a node for compute-intensive training and inference for your research team, amazingly priced at $199K (yes, you read that right), alongside other announcements such as real-time ray tracing and audio-to-face (A2F).</p><h3>PyTorch 1.5</h3><p>PyTorch 1.5 has been released with several changes and additions, along with updates for torchvision, torchaudio and torchtext.
Facebook (the maintainer of PyTorch) is also partnering with AWS on TorchServe, a tool that helps you deploy PyTorch models to the cloud and serve them in production through API calls over the internet. Check out the official <a href="https://pytorch.org/blog/pytorch-library-updates-new-model-serving-library/">blog post</a> for more details.</p><h3>Machine Learning Yearning</h3><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rfRi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3947b3fa-dbbc-4e64-9aab-cb0a18c5008d_704x436.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><img src="https://substackcdn.com/image/fetch/$s_!rfRi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3947b3fa-dbbc-4e64-9aab-cb0a18c5008d_704x436.jpeg" width="704" height="436" class="sizing-normal" alt="" sizes="100vw" loading="lazy"></picture></div></a><p>Doing online courses and assignments is one thing; applying machine learning in the real world and deploying models to production is another thing entirely. <a href="https://twitter.com/andrewyng">Dr. Andrew Ng</a> shares his practical experience building real-world products at companies like Baidu and Google.</p><p>This book is a treasure trove of practical knowledge and best practices for AI engineers. Be sure to get the draft of the book for free <a href="https://www.deeplearning.ai/machine-learning-yearning/">here</a>.
A must read.</p><h3>Tesla at ScaledML Conference</h3><div id="youtube2-hx7BXih7zx8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;hx7BXih7zx8&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/hx7BXih7zx8?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p><a href="https://twitter.com/karpathy">Andrej Karpathy</a> (Director of AI at Tesla) gave a talk at the <a href="http://scaledml.org/2020/">ScaledML</a> 2020 conference on how the company is using AI to get closer to Full Self Driving. In particular, he discussed stop sign detection, which looks like a simple problem but turns out to be remarkably challenging in production. He also discussed how Tesla achieves these results without a LiDAR, using just cameras and a few other sensors (radar, ultrasonic), which is quite difficult and impressive. </p><h3>ICLR 2020</h3><p>The International Conference on Learning Representations (ICLR) is one of the premier conferences in the field of ML/AI. This year it was planned to be held in Ethiopia, but due to the coronavirus situation the organizing committee decided to make it a fully virtual conference.</p><p>I think this is the first premier conference to go completely online. Last year <a href="http://neurips.cc">NeurIPS</a> was live streamed online, but that was in addition to the physical event held in Canada. A portion of the talks (including those from Yann LeCun, Yoshua Bengio, Andrew Ng etc.) and workshops from ICLR have been made available <a href="https://iclr.cc/virtual_2020/index.html">online</a>.
Feel free to check it out.</p><h3>The age of AI</h3><div id="youtube2-5IvQ3fYKnfM" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;5IvQ3fYKnfM&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/5IvQ3fYKnfM?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>YouTube has released an original <a href="https://www.youtube.com/playlist?list=PLjq6DwYksrzz_fsWIpPcf6V7p2RNAneKc">series</a> on AI featuring Robert Downey Jr. as host. The series explores the current state-of-the-art work in AI, the impact it can have on our lives and what the future has in store for us. An interesting watch if you have some time to kill.</p><h3>AI fun :)</h3><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://twitter.com/xkcdComic/status/864703723196391424&quot;,&quot;full_text&quot;:&quot;Machine Learning <a class=\&quot;tweet-url\&quot; href=\&quot;https://xkcd.com/1838/\&quot;>xkcd.com/1838/</a> <a class=\&quot;tweet-url\&quot; href=\&quot;https://m.xkcd.com/1838/\&quot;>m.xkcd.com/1838/</a> &quot;,&quot;username&quot;:&quot;xkcdComic&quot;,&quot;name&quot;:&quot;XKCD Comic&quot;,&quot;profile_image_url&quot;:&quot;&quot;,&quot;date&quot;:&quot;Wed May 17 04:46:47 +0000 2017&quot;,&quot;photos&quot;:[{&quot;img_url&quot;:&quot;https://pbs.substack.com/media/DAALdBEVYAAwao2.jpg&quot;,&quot;link_url&quot;:&quot;https://t.co/Uc0b1z0mgU&quot;}],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:0,&quot;retweet_count&quot;:3875,&quot;like_count&quot;:4862,&quot;impression_count&quot;:0,&quot;expanded_url&quot;:{},&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div>]]></content:encoded></item><item><title><![CDATA[Social
Distance Monitoring; AI for Medicine Specialization; fast.ai covid-19 report]]></title><description><![CDATA[Vision Week Issue #3]]></description><link>https://newsletter.visiongeek.io/p/social-distance-monitoring-ai-for</link><guid isPermaLink="false">https://newsletter.visiongeek.io/p/social-distance-monitoring-ai-for</guid><dc:creator><![CDATA[Arun Ponnusamy]]></dc:creator><pubDate>Sun, 26 Apr 2020 04:20:41 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/15iIV1Lff-M" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3>Social Distance Monitoring</h3><div id="youtube2-15iIV1Lff-M" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;15iIV1Lff-M&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/15iIV1Lff-M?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Prof. Andrew Ng's venture landing.ai has developed a tool for monitoring social distancing. 
They have also <a href="https://landing.ai/landing-ai-creates-an-ai-tool-to-help-customers-monitor-social-distancing-in-the-workplace/">shared</a> the techniques behind it so that others can build a similar tool if needed (though the actual source code is not open sourced).</p><h3>TensorFlow Dev Summit 2020</h3><div id="youtube2-_lsjCH3fd00" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;_lsjCH3fd00&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/_lsjCH3fd00?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>TensorFlow Dev Summit this year was a completely online event due to covid-19 concerns. All the talks are available on the TensorFlow <a href="https://www.youtube.com/playlist?list=PLQY2H8rRoyvzuJw20FG82Lgm2SZjTdIXU">YouTube</a> channel. In case you missed it, feel free to catch up.</p><h3>AI for Medicine Specialization</h3><div id="youtube2-Rp7qqjlBeRY" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;Rp7qqjlBeRY&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/Rp7qqjlBeRY?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>deeplearning.ai has launched a new specialization on Coursera, &#8220;<a href="https://www.coursera.org/specializations/ai-for-medicine">AI for Medicine</a>&#8221;, consisting of three courses taught by experts from Stanford.
If you are interested in applying AI in the healthcare space it's definitely worth checking out.</p><h3>Covid-19 - a realistic look</h3><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UY5I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F30f4251c-8a0b-4a5b-9f22-67a5ff5699eb_1424x965.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UY5I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F30f4251c-8a0b-4a5b-9f22-67a5ff5699eb_1424x965.png 424w, https://substackcdn.com/image/fetch/$s_!UY5I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F30f4251c-8a0b-4a5b-9f22-67a5ff5699eb_1424x965.png 848w, https://substackcdn.com/image/fetch/$s_!UY5I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F30f4251c-8a0b-4a5b-9f22-67a5ff5699eb_1424x965.png 1272w, https://substackcdn.com/image/fetch/$s_!UY5I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F30f4251c-8a0b-4a5b-9f22-67a5ff5699eb_1424x965.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UY5I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F30f4251c-8a0b-4a5b-9f22-67a5ff5699eb_1424x965.png" width="1424" height="965" 
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/30f4251c-8a0b-4a5b-9f22-67a5ff5699eb_1424x965.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:965,&quot;width&quot;:1424,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:80618,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UY5I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F30f4251c-8a0b-4a5b-9f22-67a5ff5699eb_1424x965.png 424w, https://substackcdn.com/image/fetch/$s_!UY5I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F30f4251c-8a0b-4a5b-9f22-67a5ff5699eb_1424x965.png 848w, https://substackcdn.com/image/fetch/$s_!UY5I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F30f4251c-8a0b-4a5b-9f22-67a5ff5699eb_1424x965.png 1272w, https://substackcdn.com/image/fetch/$s_!UY5I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F30f4251c-8a0b-4a5b-9f22-67a5ff5699eb_1424x965.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" 
stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><p>Jeremy and Rachel from fast.ai has put together a <a href="https://www.fast.ai/2020/03/09/coronavirus/">realistic analysis</a> on the current situation. As they point out, just trying to stay calm and not panic is not enough. Staying informed and preparing ourselves both mentally and physically is as important as staying sane in this difficult time.</p><h3>GANs in Action</h3><p>If you have been thinking to learn to work with GANs but didn&#8217;t really get hold of a good comprehensive resource then this book &#8220;<a href="https://www.manning.com/books/gans-in-action">GANs in Action&nbsp;- Deep learning with Generative Adversarial Networks</a>&#8221; is for you. 
This Manning publication is definitely one of the better resources out there on GANs, covering everything from the basic idea to state-of-the-art results.</p><h3>TensorFlow without a PhD</h3><p>If you are a fan of <a href="https://twitter.com/martin_gorner">Martin G&#246;rner</a>&#8217;s &#8220;without a PhD&#8221; series, then you should definitely star this <a href="https://github.com/GoogleCloudPlatform/tensorflow-without-a-phd">GitHub repo</a>. It collects the resources for all the talks he has given in the series. </p><h3>AlphaGO movie </h3><div id="youtube2-WXuK6gekU1Y" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;WXuK6gekU1Y&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/WXuK6gekU1Y?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>DeepMind has released the full documentary &#8220;AlphaGO&#8221; on <a href="https://www.youtube.com/watch?v=WXuK6gekU1Y">YouTube</a>. Previously it was available on Netflix. For some background, AlphaGO is the computer program that beat 18-time world champion Lee Sedol (a professional GO player from South Korea) at the board game &#8220;GO&#8221;. </p><p>GO is considered a complex game for computer programs to crack, and AlphaGO beating a world champion is regarded as a major breakthrough in the history of AI.
If you are looking for something inspiring to watch during this quarantine period, GO for it.</p><h3>AI fun :)</h3><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://twitter.com/deeplearningai_/status/1196498862958989312&quot;,&quot;full_text&quot;:&quot;We are realists if nothing else <span class=\&quot;tweet-fake-link\&quot;>#AIFun</span> &quot;,&quot;username&quot;:&quot;deeplearningai_&quot;,&quot;name&quot;:&quot;deeplearning.ai&quot;,&quot;profile_image_url&quot;:&quot;&quot;,&quot;date&quot;:&quot;Mon Nov 18 18:42:13 +0000 2019&quot;,&quot;photos&quot;:[{&quot;img_url&quot;:&quot;https://pbs.substack.com/media/EJrRX7-UcAAp-WR.jpg&quot;,&quot;link_url&quot;:&quot;https://t.co/2XPkwQOpNj&quot;}],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:0,&quot;retweet_count&quot;:226,&quot;like_count&quot;:821,&quot;impression_count&quot;:0,&quot;expanded_url&quot;:{},&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div><h3>Bonus: </h3><h3>How China tracks everyone</h3><p>Sneak peek into how China does surveillance at scale.</p><div id="youtube2-CLo3e1Pak-Y" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;CLo3e1Pak-Y&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/CLo3e1Pak-Y?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div>]]></content:encoded></item><item><title><![CDATA[TensorFlow World; Microsoft Azure Kinect; Google Coral out of beta]]></title><description><![CDATA[Vision Week Issue #2]]></description><link>https://newsletter.visiongeek.io/p/tensorflow-world-microsoft-azure</link><guid
isPermaLink="false">https://newsletter.visiongeek.io/p/tensorflow-world-microsoft-azure</guid><dc:creator><![CDATA[Arun Ponnusamy]]></dc:creator><pubDate>Mon, 25 Nov 2019 15:11:31 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!yzHY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F48ba745d-a9b8-457e-889b-ce71c4aed660_1921x1081.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3>O&#8217;Reilly TensorFlow World</h3><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yzHY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F48ba745d-a9b8-457e-889b-ce71c4aed660_1921x1081.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yzHY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F48ba745d-a9b8-457e-889b-ce71c4aed660_1921x1081.png 424w, https://substackcdn.com/image/fetch/$s_!yzHY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F48ba745d-a9b8-457e-889b-ce71c4aed660_1921x1081.png 848w, https://substackcdn.com/image/fetch/$s_!yzHY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F48ba745d-a9b8-457e-889b-ce71c4aed660_1921x1081.png 1272w, https://substackcdn.com/image/fetch/$s_!yzHY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F48ba745d-a9b8-457e-889b-ce71c4aed660_1921x1081.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!yzHY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F48ba745d-a9b8-457e-889b-ce71c4aed660_1921x1081.png" width="1100" height="619" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/48ba745d-a9b8-457e-889b-ce71c4aed660_1921x1081.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:619,&quot;width&quot;:1100,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:58271,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yzHY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F48ba745d-a9b8-457e-889b-ce71c4aed660_1921x1081.png 424w, https://substackcdn.com/image/fetch/$s_!yzHY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F48ba745d-a9b8-457e-889b-ce71c4aed660_1921x1081.png 848w, https://substackcdn.com/image/fetch/$s_!yzHY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F48ba745d-a9b8-457e-889b-ce71c4aed660_1921x1081.png 1272w, https://substackcdn.com/image/fetch/$s_!yzHY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F48ba745d-a9b8-457e-889b-ce71c4aed660_1921x1081.png 1456w" sizes="100vw" 
fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><p>TensorFlow team has teamed up with O&#8217;Reilly to host their first <a href="https://conferences.oreilly.com/tensorflow/tf-ca">TensorFlow World</a> conference earlier this month. If you are wondering how does this differ from TensorFlow Dev Summit, well, the key difference is in Dev Summit mostly people from the TensorFlow team will present their work. But TensorFlow World is a place for everyone in the community to learn and share what they are building with TensorFlow.</p><p>That means you can see talks and sessions from diverse set of people including TensorFlow team. 
All the sessions from the TensorFlow team are up on the TensorFlow <a href="https://www.youtube.com/playlist?list=PLQY2H8rRoyvxcmHHRftsuiO1GyinVAwUg">YouTube channel</a>. Other talks are on the O&#8217;Reilly <a href="https://www.oreilly.com/">online learning platform</a>. O&#8217;Reilly says all the recorded sessions will be available on the platform three weeks after the conference. They have a 10-day free trial with no credit card required. Give it a try to watch all the <a href="https://conferences.oreilly.com/tensorflow/tf-ca/public/schedule/full/public">sessions</a>.</p><h3>Azure Kinect</h3><div id="youtube2-jJglCYFiodI" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;jJglCYFiodI&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/jJglCYFiodI?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Microsoft has released a new version of Kinect called &#8216;<a href="https://azure.microsoft.com/en-us/services/kinect-dk/">Azure Kinect DK</a>&#8217;, where DK stands for developer kit. The original <a href="https://en.wikipedia.org/wiki/Kinect">Kinect</a> was released almost a decade ago, mainly intended for gaming with Xbox, but people also used it for computer vision research because of the depth sensing capability it had. </p><p>This time, Azure Kinect is aimed squarely at developers and companies building things, not at regular consumers, and it is not meant to replace the existing Kinect for Xbox. Microsoft says they have put together their best sensors for building AI applications: it has a 12 MP RGB camera, a 1 MP depth sensing camera and microphone arrays.
It doesn&#8217;t have an onboard processor, but it can be connected to a CPU to process the wealth of information it captures for vision and speech applications. </p><h3>Real-time video gesture recognition</h3><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xE_d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4b3fd37-8892-4389-8e44-79ee79086f16_600x337.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xE_d!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4b3fd37-8892-4389-8e44-79ee79086f16_600x337.gif 424w, https://substackcdn.com/image/fetch/$s_!xE_d!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4b3fd37-8892-4389-8e44-79ee79086f16_600x337.gif 848w, https://substackcdn.com/image/fetch/$s_!xE_d!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4b3fd37-8892-4389-8e44-79ee79086f16_600x337.gif 1272w, https://substackcdn.com/image/fetch/$s_!xE_d!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4b3fd37-8892-4389-8e44-79ee79086f16_600x337.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xE_d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4b3fd37-8892-4389-8e44-79ee79086f16_600x337.gif" width="600" height="337"
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/e4b3fd37-8892-4389-8e44-79ee79086f16_600x337.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:337,&quot;width&quot;:600,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1579121,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xE_d!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4b3fd37-8892-4389-8e44-79ee79086f16_600x337.gif 424w, https://substackcdn.com/image/fetch/$s_!xE_d!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4b3fd37-8892-4389-8e44-79ee79086f16_600x337.gif 848w, https://substackcdn.com/image/fetch/$s_!xE_d!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4b3fd37-8892-4389-8e44-79ee79086f16_600x337.gif 1272w, https://substackcdn.com/image/fetch/$s_!xE_d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4b3fd37-8892-4389-8e44-79ee79086f16_600x337.gif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" 
stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><p>Researchers at MIT have developed a new technique &#8220;Temporal Shift Module (TSM)&#8221; to do video classification efficiently on low compute devices. Generally doing video activity recognition in real time on edge devices is hard because of the high compute. In video classification we look at sequence of frames to predict the class as opposed to looking at a single frame at a time for image classification or object detection. The <a href="https://www.youtube.com/watch?v=0T6u7S_gq-4">demo</a> runs in real time on Nvidia Jetson Nano under 10 Watts. 
(<a href="https://arxiv.org/abs/1811.08383">Paper</a> | <a href="https://github.com/mit-han-lab/temporal-shift-module">GitHub</a> | <a href="https://hanlab.mit.edu/projects/tsm/">Site</a>)</p><h3>CVPR 2019</h3><p><a href="http://cvpr2019.thecvf.com/">Computer Vision and Pattern Recognition</a> (CVPR) is one of the premier conferences in computer vision. CVPR 2019 wrapped up earlier this year. Not all of us can afford to travel and attend the conference in person. Luckily, there is this thing called the &#8216;internet&#8217;. The <a href="https://www.thecvf.com/">Computer Vision Foundation</a> has made a lot of the sessions available online. You can find the video recordings (where available) on <a href="https://www.youtube.com/channel/UC0n76gicaarsN_Y9YShWwhw/videos">YouTube</a> and the slides under each session page on the conference <a href="http://cvpr2019.thecvf.com/">website</a>. This really helps to get a sense of what&#8217;s going on at the research frontier.</p><h3>Ancient Secrets of Computer Vision</h3><div id="youtube2-8jXIAWg_yHU" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;8jXIAWg_yHU&quot;,&quot;startTime&quot;:&quot;677s&quot;,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/8jXIAWg_yHU?start=677s&amp;rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Joseph Redmon (the author of YOLO/DarkNet) teaches a <a href="https://pjreddie.com/courses/computer-vision/">computer vision course</a> at the University of Washington. He has generously posted the video lectures on <a href="https://www.youtube.com/playlist?list=PLjMXczUzEYcHvw5YYSU92WrY8IwhTuq7p">YouTube</a>. It&#8217;s definitely one of the better introductory CV courses available online. Feel free to check it out. 
</p><h3>Google Coral TPU graduates out of beta</h3><p>Google launched its new hardware, the <a href="https://coral.ai">Coral Edge TPU</a>, in March this year for AI at the edge. Six months later, it&#8217;s now stable and <a href="https://developers.googleblog.com/2019/10/coral-moves-out-of-beta.html">out of beta</a>. It runs models in a specific TensorFlow Lite edgetpu format very efficiently for low-latency, real-time applications. AI on the edge is off to a good start. Long way to go, though!</p><h3>3Blue1Brown on Siraj Raval Podcast</h3><p>You might know <a href="https://twitter.com/3blue1brown">Grant Sanderson</a> from his awesome YouTube channel &#8220;<a href="https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw">3Blue1Brown</a>&#8221;. He was recently interviewed by Siraj on his <a href="https://anchor.fm/sirajraval/episodes/3Blue1Brown--Siraj-Raval-Podcast-3-e518il">podcast</a>, where they discussed doing math animations, Grant&#8217;s recent visit to India and more. Listen to the episode to learn more. 
It is available on <a href="https://podcasts.google.com/?feed=aHR0cHM6Ly9hbmNob3IuZm0vcy9iZmIxMGMwL3BvZGNhc3QvcnNz&amp;episode=NGIxOGIxMDItOWU3My1jNWE4LTY4NjktYWUzZTRkODQ3NzY1">Google Podcasts</a>, <a href="https://open.spotify.com/show/4qf0D4LRvdlfZBkq1qqywT">Spotify</a> and possibly wherever you consume your podcasts.</p><h3>AI fun :)</h3><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://twitter.com/xkcdComic/status/514649660318056448&quot;,&quot;full_text&quot;:&quot;Tasks <a class=\&quot;tweet-url\&quot; href=\&quot;http://xkcd.com/1425/\&quot;>xkcd.com/1425/</a> <a class=\&quot;tweet-url\&quot; href=\&quot;http://m.xkcd.com/1425/\&quot;>m.xkcd.com/1425/</a> &quot;,&quot;username&quot;:&quot;xkcdComic&quot;,&quot;name&quot;:&quot;XKCD Comic&quot;,&quot;profile_image_url&quot;:&quot;&quot;,&quot;date&quot;:&quot;Wed Sep 24 05:36:55 +0000 2014&quot;,&quot;photos&quot;:[{&quot;img_url&quot;:&quot;https://pbs.substack.com/media/ByRnI6HIQAA5p_x.png&quot;,&quot;link_url&quot;:&quot;http://t.co/mUrGCCZ6tK&quot;}],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:0,&quot;retweet_count&quot;:301,&quot;like_count&quot;:141,&quot;impression_count&quot;:0,&quot;expanded_url&quot;:{},&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div>]]></content:encoded></item><item><title><![CDATA[RL Specialization; Tesla acquires DeepScale; OpenAI at TC DisruptSF]]></title><description><![CDATA[Vision Week Issue #1]]></description><link>https://newsletter.visiongeek.io/p/rl-specialization-tesla-acquires</link><guid isPermaLink="false">https://newsletter.visiongeek.io/p/rl-specialization-tesla-acquires</guid><dc:creator><![CDATA[Arun Ponnusamy]]></dc:creator><pubDate>Thu, 24 Oct 2019 11:37:26 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!jEmM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb2fd4d-64f6-4a6a-a423-c9409fcc9916_754x378.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3>RL Specialization from UAlberta on Coursera</h3><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jEmM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb2fd4d-64f6-4a6a-a423-c9409fcc9916_754x378.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jEmM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb2fd4d-64f6-4a6a-a423-c9409fcc9916_754x378.png 424w, https://substackcdn.com/image/fetch/$s_!jEmM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb2fd4d-64f6-4a6a-a423-c9409fcc9916_754x378.png 848w, https://substackcdn.com/image/fetch/$s_!jEmM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb2fd4d-64f6-4a6a-a423-c9409fcc9916_754x378.png 1272w, https://substackcdn.com/image/fetch/$s_!jEmM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb2fd4d-64f6-4a6a-a423-c9409fcc9916_754x378.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!jEmM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb2fd4d-64f6-4a6a-a423-c9409fcc9916_754x378.png" width="754" height="378" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/8bb2fd4d-64f6-4a6a-a423-c9409fcc9916_754x378.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:378,&quot;width&quot;:754,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:9187,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jEmM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb2fd4d-64f6-4a6a-a423-c9409fcc9916_754x378.png 424w, https://substackcdn.com/image/fetch/$s_!jEmM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb2fd4d-64f6-4a6a-a423-c9409fcc9916_754x378.png 848w, https://substackcdn.com/image/fetch/$s_!jEmM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb2fd4d-64f6-4a6a-a423-c9409fcc9916_754x378.png 1272w, https://substackcdn.com/image/fetch/$s_!jEmM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb2fd4d-64f6-4a6a-a423-c9409fcc9916_754x378.png 1456w" sizes="100vw" 
fetchpriority="high"></picture></div></a><p>The University of Alberta and the Alberta Machine Intelligence Institute (AMII) have come together to launch a four-course <a href="https://www.coursera.org/specializations/reinforcement-learning">specialization</a> on Reinforcement Learning on Coursera. If you have been trying to learn RL, you might already know that there are not a lot of well-structured, go-to courses out there. 
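</p>
<p>If RL is new to you, here is the flavour of what such courses teach, boiled down to tabular Q-learning on a toy five-state corridor (a from-scratch sketch for illustration, not material from the specialization):</p>

```python
import random

random.seed(0)

N, ACTIONS = 5, (0, 1)                       # five states; action 0 = left, 1 = right
Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}

def step(s, a):
    """Deterministic corridor: reward 1 only for reaching the rightmost state."""
    s2 = max(0, s - 1) if a == 0 else min(N - 1, s + 1)
    return s2, float(s2 == N - 1), s2 == N - 1

def greedy(s):
    best = max(Q[(s, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(s, a)] == best])

alpha, gamma, eps = 0.5, 0.9, 0.1
for _ in range(200):                         # episodes
    s = 0
    for _ in range(100):                     # step cap per episode
        a = random.choice(ACTIONS) if random.random() < eps else greedy(s)
        s2, r, done = step(s, a)
        target = r + gamma * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (target - Q[(s, a)])   # the TD update at the heart of RL
        s = s2
        if done:
            break

# After training, the greedy policy should be "go right" in every non-terminal state.
policy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N - 1)]
```

<p>Sutton &amp; Barto&#8217;s book develops exactly this update, and the many ways it can break, with far more care. </p><p>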
There are some really good resources like Sutton &amp; Barto&#8217;s <a href="http://incompleteideas.net/book/the-book.html">textbook</a>, David Silver&#8217;s <a href="https://www.youtube.com/playlist?list=PLqYmG7hTraZDM-OYHWgPebj2MfCFzFObQ">course lectures</a>, the UC Berkeley RL <a href="https://www.youtube.com/playlist?list=PLkFD6_40KJIxJMR-j5A1mkxK26gh_qg37">course</a> <a href="https://sites.google.com/view/deep-rl-bootcamp/lectures">lectures</a> on YouTube and OpenAI&#8217;s <a href="https://spinningup.openai.com/en/latest/">Spinning Up</a> in deep RL. But they lack well-structured assignments and a proper beginner-friendly MOOC setup. </p><h4>Why this matters: </h4><p>This specialization looks promising because it comes from people who work directly with great minds in RL such as Prof. Sutton. In fact, Sutton himself is involved in the creation of this course. I kind of feel these courses might be the video version of Sutton&#8217;s textbook. But it&#8217;s definitely worth checking out. Give it a try and let me know your thoughts if you manage to finish any of the courses in the specialization.  </p><h3>OpenAI at TechCrunch Disrupt SF </h3><div id="youtube2-14Qfi6n-U4U" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;14Qfi6n-U4U&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/14Qfi6n-U4U?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Sam Altman (CEO) and Greg Brockman (CTO) of OpenAI were at TechCrunch Disrupt San Francisco, where they discussed some of the company&#8217;s earlier decisions (forming the capped-profit OpenAI LP, the partnership with Microsoft, GPT-2, etc.) and the future roadmap for OpenAI. 
Greg also showed a demo of their recent experiment with multi-agent RL and how the agents discovered tool use.</p><h3>Tesla acquires DeepScale</h3><p>Tesla has <a href="https://techcrunch.com/2019/10/01/tesla-acquires-computer-vision-startup-deepscale-in-push-towards-autonomy/">acquired</a> DeepScale (the company behind the SqueezeNet <a href="https://arxiv.org/abs/1602.07360">paper</a>). SqueezeNet was one of the first attempts at creating smaller models without losing too much accuracy. This acquisition clearly shows the need for efficient models that can run on edge devices with a smaller footprint. AI on the edge is definitely booming. </p><h3>Introduction to TensorFlow Lite on Udacity</h3><p>The <a href="https://www.tensorflow.org/lite">TensorFlow Lite</a> team has launched a <a href="https://www.udacity.com/course/intro-to-tensorflow-lite--ud190">course</a> on Udacity covering deploying TFLite models to mobile and edge devices. Google also released their <a href="https://coral.ai">Coral Edge TPU</a> earlier this year to advance AI applications on the edge. The course may not be an advanced, in-depth one. 
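</p>
<p>To make the "efficient models on the edge" theme concrete: a big part of shrinking a model for phones and edge TPUs is 8-bit quantization. The NumPy sketch below illustrates the affine scale/zero-point scheme that TFLite-style converters use; it is an illustration of the idea only, not TFLite&#8217;s actual API.</p>

```python
import numpy as np

def quantize(w, num_bits=8):
    """Map float weights to uint8 using an affine scale and zero point."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = float(w.max() - w.min()) / (qmax - qmin)
    zero_point = int(round(float(qmin - w.min() / scale)))
    q = np.clip(np.round(w / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate floats; error is bounded by about one scale step."""
    return scale * (q.astype(np.float32) - zero_point)

w = np.array([-1.0, -0.5, 0.0, 0.5, 1.0], dtype=np.float32)
q, scale, zp = quantize(w)       # weights now fit in 1 byte each
w_hat = dequantize(q, scale, zp)
```

<p>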
But it can definitely serve as a good comprehensive introduction to TensorFlow Lite.</p><h3>PyTorch Mobile</h3><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bIfF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F432a0b49-3842-43f5-ada5-74e74d669d57_1067x381.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bIfF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F432a0b49-3842-43f5-ada5-74e74d669d57_1067x381.jpeg 424w, https://substackcdn.com/image/fetch/$s_!bIfF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F432a0b49-3842-43f5-ada5-74e74d669d57_1067x381.jpeg 848w, https://substackcdn.com/image/fetch/$s_!bIfF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F432a0b49-3842-43f5-ada5-74e74d669d57_1067x381.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!bIfF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F432a0b49-3842-43f5-ada5-74e74d669d57_1067x381.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bIfF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F432a0b49-3842-43f5-ada5-74e74d669d57_1067x381.jpeg" width="1067" height="381" 
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/432a0b49-3842-43f5-ada5-74e74d669d57_1067x381.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:381,&quot;width&quot;:1067,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:55626,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bIfF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F432a0b49-3842-43f5-ada5-74e74d669d57_1067x381.jpeg 424w, https://substackcdn.com/image/fetch/$s_!bIfF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F432a0b49-3842-43f5-ada5-74e74d669d57_1067x381.jpeg 848w, https://substackcdn.com/image/fetch/$s_!bIfF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F432a0b49-3842-43f5-ada5-74e74d669d57_1067x381.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!bIfF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F432a0b49-3842-43f5-ada5-74e74d669d57_1067x381.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" 
stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><p>Until now, if you wanted to deploy a model to mobile, your best bet was TensorFlow Lite. PyTorch has added <a href="https://pytorch.org/mobile/home/">support for mobile</a> (Android and iOS) with its 1.3 release. 
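</p>
<p>The workflow is to convert your model to TorchScript, save it, and load that file from the Android/iOS APIs. A minimal sketch of the Python side, using a toy module (the TinyNet class and file name here are stand-ins, not part of any real model zoo):</p>

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Stand-in for a real network such as MobileNetV2."""
    def forward(self, x):
        return torch.relu(x).mean(dim=1)

model = TinyNet().eval()
example = torch.rand(1, 3, 8, 8)
scripted = torch.jit.trace(model, example)  # TorchScript: runs without a Python interpreter
scripted.save("tinynet.pt")                 # ship this file inside your app bundle
restored = torch.jit.load("tinynet.pt")     # on device you'd use the Java/Obj-C loaders instead
```

<p>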
(PyTorch is upping its game!)</p><h3>DeepMind Podcast</h3><div class="soundcloud-wrap" data-attrs="{&quot;url&quot;:&quot;https://api.soundcloud.com/tracks/665730575&quot;,&quot;title&quot;:&quot;DeepMind Podcast Trailer by DeepMind&quot;,&quot;description&quot;:&quot;This eight part series hosted by mathematician and broadcaster Hannah Fry aims to give listeners an inside look at the fascinating world of AI research and explores  some of the questions and challenges the whole field is wrestling with today.&quot;,&quot;thumbnail_url&quot;:&quot;https://i1.sndcdn.com/artworks-000582440372-wxt27u-t500x500.jpg&quot;,&quot;author_name&quot;:&quot;DeepMind&quot;,&quot;author_url&quot;:&quot;https://soundcloud.com/user-281169976&quot;,&quot;targetUrl&quot;:&quot;&quot;}" data-component-name="SoundcloudToDOM"><iframe src="https://w.soundcloud.com/player/?auto_play=false&amp;buying=false&amp;liking=false&amp;download=false&amp;sharing=false&amp;show_artwork=true&amp;show_comments=false&amp;show_playcount=false&amp;show_user=true&amp;hide_related=true&amp;visual=false&amp;start_track=0&amp;url=https%3A%2F%2Fapi.soundcloud.com%2Ftracks%2F665730575" frameborder="0" gesture="media" scrolling="no" allowfullscreen="true"></iframe></div><p>DeepMind has completely restructured the organization recently. They have released a limited-series podcast with mathematician Hannah Fry hosting the show and giving us an insider look. All eight episodes are out. 
You can find it on Spotify, Google Podcasts or wherever you consume your podcasts.</p><h3>PyTorch official YouTube channel</h3><div id="youtube2-RwaVqvZ3xo8" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;RwaVqvZ3xo8&quot;,&quot;startTime&quot;:&quot;176s&quot;,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/RwaVqvZ3xo8?start=176s&amp;rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>PyTorch gets its official YouTube channel (finally!). I have been waiting for this. Earlier, the videos were scattered across the Facebook Developers YouTube channel and Facebook pages. Now we can watch all the content in one place. Go ahead and watch the videos (PyTorch Developer Conference, Summer Hackathon, etc.) when you are free.</p><h3>AI fun :)</h3><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://twitter.com/karpathy/status/868178954032513024&quot;,&quot;full_text&quot;:&quot;I've been using PyTorch a few months now and I've never felt better. I have more energy. My skin is clearer. My eye sight has improved.&quot;,&quot;username&quot;:&quot;karpathy&quot;,&quot;name&quot;:&quot;Andrej Karpathy&quot;,&quot;profile_image_url&quot;:&quot;&quot;,&quot;date&quot;:&quot;Fri May 26 18:56:07 +0000 2017&quot;,&quot;photos&quot;:[],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:0,&quot;retweet_count&quot;:407,&quot;like_count&quot;:1587,&quot;impression_count&quot;:0,&quot;expanded_url&quot;:{},&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div>]]></content:encoded></item><item><title><![CDATA[Announcing Vision Week! 
]]></title><description><![CDATA[Essential news for computer vision enthusiasts.]]></description><link>https://newsletter.visiongeek.io/p/coming-soon</link><guid isPermaLink="false">https://newsletter.visiongeek.io/p/coming-soon</guid><dc:creator><![CDATA[Arun Ponnusamy]]></dc:creator><pubDate>Fri, 18 Oct 2019 03:29:51 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!VG2I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa95f9a52-b9ac-4501-90e9-538cfc1803fe_1440x500.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VG2I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa95f9a52-b9ac-4501-90e9-538cfc1803fe_1440x500.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VG2I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa95f9a52-b9ac-4501-90e9-538cfc1803fe_1440x500.jpeg 424w, https://substackcdn.com/image/fetch/$s_!VG2I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa95f9a52-b9ac-4501-90e9-538cfc1803fe_1440x500.jpeg 848w, https://substackcdn.com/image/fetch/$s_!VG2I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa95f9a52-b9ac-4501-90e9-538cfc1803fe_1440x500.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!VG2I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa95f9a52-b9ac-4501-90e9-538cfc1803fe_1440x500.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VG2I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa95f9a52-b9ac-4501-90e9-538cfc1803fe_1440x500.jpeg" width="1100" height="382" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/a95f9a52-b9ac-4501-90e9-538cfc1803fe_1440x500.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:382,&quot;width&quot;:1100,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:220374,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VG2I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa95f9a52-b9ac-4501-90e9-538cfc1803fe_1440x500.jpeg 424w, https://substackcdn.com/image/fetch/$s_!VG2I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa95f9a52-b9ac-4501-90e9-538cfc1803fe_1440x500.jpeg 848w, https://substackcdn.com/image/fetch/$s_!VG2I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa95f9a52-b9ac-4501-90e9-538cfc1803fe_1440x500.jpeg 
1272w, https://substackcdn.com/image/fetch/$s_!VG2I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa95f9a52-b9ac-4501-90e9-538cfc1803fe_1440x500.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><p>Welcome to <strong>Vision Week</strong> - a biweekly (once every two weeks) newsletter on Computer Vision and AI.</p><p>It will be a healthy mix of carefully curated industry news, interesting articles/blog posts/interviews/podcasts, recent advancements, research papers, demos, learning resources and hidden gems in the field of computer 
vision.</p><p>Sign up now so you don&#8217;t miss the first issue.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.visiongeek.io/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.visiongeek.io/subscribe?"><span>Subscribe now</span></a></p><p>In the meantime, <a href="https://newsletter.visiongeek.io/p/coming-soon?utm_source=substack&utm_medium=email&utm_content=share&action=share">tell your friends</a>!</p>]]></content:encoded></item></channel></rss>