det.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
Mastodon server of Unterhaltungsfernsehen Ehrenfeld for decentralized discourse.

Server stats: 1.8K active users

#ComputerVision

4 posts · 4 participants · 0 posts today
IT News · This Polaroid-esque OCR Machine Turns Text to Braille in the Wild - One of the practical upsides of improved computer vision systems and machine learn... - https://hackaday.com/2025/08/15/this-polaroid-esque-ocr-machine-turns-text-to-braille-in-the-wild/ #computervision #accessibility #tesseract-ocr #arduinohacks #raspberrypi #computer #impaired #braille #seeing #vision #blind #read #ocr
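
For a sense of the software side of such a device: a minimal sketch of the OCR-to-braille step, assuming pytesseract and the Tesseract engine are installed. The braille table is a tiny illustrative subset of Grade 1 braille, not a complete translation standard, and the input filename is hypothetical.

```python
# Minimal sketch: OCR a camera frame with Tesseract, then map the text to
# Unicode braille cells. The table below is a tiny illustrative subset of
# Grade 1 braille, not a complete translation standard.
import pytesseract
from PIL import Image

BRAILLE = {"a": "⠁", "b": "⠃", "c": "⠉", "d": "⠙", "e": "⠑", " ": " "}

def frame_to_braille(path: str) -> str:
    # Run Tesseract on the captured image and transliterate what it found.
    text = pytesseract.image_to_string(Image.open(path))
    return "".join(BRAILLE.get(ch, "?") for ch in text.lower())

if __name__ == "__main__":
    print(frame_to_braille("snapshot.png"))  # hypothetical camera capture
```

A real device would also drive the braille cell actuators from these Unicode patterns; that hardware step is outside this sketch.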
IT News · How AI is being used to boost efficiency and security at truck terminal gates - A new automated gate platform from Outpost is designed to capture more reliable data as ... - https://www.geekwire.com/2025/how-computer-vision-and-ai-is-being-used-to-boost-efficiency-security-at-truck-terminal-gates/ #transportation #computervision #logistics #trucking #outpost #ai
IT News · Ai2 unveils MolmoAct: Open-source robotics system reasons in 3D and adjusts on the fly - Jiafei Duan, Ai2 researcher, shows MolmoAct controlling a robotic arm. (GeekWire ... - https://www.geekwire.com/2025/ai2-unveils-molmoact-an-open-source-robotics-system-that-reasons-in-3d-and-adjusts-on-the-fly/ #real-timerobotplanning #alleninstituteforai #computervision #open-sourceai #multimodalai #3dreasoning #seattletech #paulallen #molmoact #robotics #molmo #tech #ai2
Laurent Perrinet · 🧠 TODAY at #CCN2025! Poster A145, 1:30-4:30pm at de Brug & E-Hall. We've developed a bio-inspired "What-Where" CNN that mimics primate visual pathways, achieving better classification with less computation. Come chat! 🎯

Presented by main author Jean-Nicolas JÉRÉMIE, co-supervised with Emmanuel Daucé

https://laurentperrinet.github.io/publication/jeremie-25-ccn/

Our research introduces a novel "What-Where" approach to CNN categorization, inspired by the dual pathways of the primate visual system:

• The ventral "What" pathway for object recognition
• The dorsal "Where" pathway for spatial localization

Key innovations:

✅ Bio-inspired selective attention mechanism
✅ Improved classification performance with reduced computational cost
✅ Smart visual sensor that samples only relevant image regions
✅ Likelihood mapping for targeted processing

The results? Better accuracy while using fewer resources, proving that nature's designs can still teach us valuable lessons about efficient AI.

Come find us this afternoon for great discussions!

#CCN2025 #ComputationalNeuroscience #AI #MachineLearning #BioinspiredAI #ComputerVision #Research
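
As a rough illustration of the two-pathway idea (my own sketch, not the authors' code): a cheap "Where" head scores coarse grid cells, and only the winning cell is cropped and passed to the heavier "What" classifier, so most of the image is never processed at full cost. Shapes and layer sizes below are arbitrary, and the hard argmax is for illustration only.

```python
# Rough PyTorch sketch of a "What-Where" scheme (not the authors' code):
# a cheap "Where" head produces a likelihood map over a coarse grid, then
# only the most likely cell is cropped and sent to the "What" classifier.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WhatWhereNet(nn.Module):
    def __init__(self, num_classes: int = 10, grid: int = 4):
        super().__init__()
        self.grid = grid
        # "Where" pathway: low-resolution saliency over a grid x grid map.
        self.where = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1),
            nn.AdaptiveAvgPool2d(grid),
        )
        # "What" pathway: classifies a small crop, never the full image.
        self.what = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, _, h, w = x.shape
        sal = self.where(x).flatten(1)   # (B, grid*grid) likelihood map
        idx = sal.argmax(dim=1)          # hard selection, illustration only
        gy, gx = idx // self.grid, idx % self.grid
        ch, cw = h // self.grid, w // self.grid
        crops = torch.stack([
            x[i, :, gy[i] * ch:(gy[i] + 1) * ch, gx[i] * cw:(gx[i] + 1) * cw]
            for i in range(b)
        ])
        crops = F.interpolate(crops, size=(32, 32), mode="bilinear",
                              align_corners=False)
        return self.what(crops)

logits = WhatWhereNet()(torch.randn(2, 3, 128, 128))  # -> shape (2, 10)
```

Training the "Where" head this way would need a differentiable relaxation or reinforcement-style credit assignment; the point here is only the flow of computation.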

MultiADS: Defect-aware Supervision for Multi-type Anomaly Detection and Segmentation in Zero-Shot Learning

In manufacturing, quality control remains a critical yet complex task, especially when multiple defect types are involved. MultiADS introduces a system capable of detecting and segmenting a wide range of anomalies (e.g., scratches, bends, holes), even in zero-shot settings.

By combining visual analysis with descriptive textual input and drawing on a curated Knowledge Base for Anomalies, MultiADS generalizes to unseen defect types without requiring prior visual examples. It consistently outperforms state-of-the-art models across several benchmarks, offering a robust and scalable solution for industrial inspection tasks.

Sadikaj, Y., Zhou, H., Halilaj, L., Schmid, S., Staab, S., & Plant, C. MultiADS: Defect-aware Supervision for Multi-type Anomaly Detection and Segmentation in Zero-Shot Learning. International Conference on Computer Vision, ICCV 2025, Hawaii, Oct 19-23, 2025, #ICCV2025. arxiv.org/abs/2504.06740.

arXiv.org · MultiADS: Defect-aware Supervision for Multi-type Anomaly Detection and Segmentation in Zero-Shot Learning
Precise optical inspection in industrial applications is crucial for minimizing scrap rates and reducing the associated costs. Besides merely detecting whether a product is anomalous, it is crucial to know the distinct type of defect, such as a bend, cut, or scratch. The ability to recognize the "exact" defect type enables automated treatment of anomalies in modern production lines. Current methods are limited to solely detecting whether a product is defective, without providing any insight into the defect type, let alone detecting and identifying multiple defects. We propose MultiADS, a zero-shot learning approach able to perform Multi-type Anomaly Detection and Segmentation. The architecture of MultiADS comprises CLIP and extra linear layers to align the visual and textual representations in a joint feature space. To the best of our knowledge, our proposal is the first approach to perform a multi-type anomaly segmentation task in zero-shot learning. Contrary to the other baselines, our approach (i) generates specific anomaly masks for each distinct defect type, (ii) learns to distinguish defect types, and (iii) simultaneously identifies multiple defect types present in an anomalous product. Additionally, our approach outperforms zero-/few-shot learning SoTA methods on image-level and pixel-level anomaly detection and segmentation tasks on five commonly used datasets: MVTec-AD, VisA, MPDD, MAD and Real-IAD.
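
Not the paper's code, but the CLIP-alignment idea the abstract describes can be sketched with an off-the-shelf model: embed one text prompt per defect type plus a "good" prompt, score the image against each in the joint feature space, and read off the most likely condition. The model id, prompt wording, and input file below are my illustrative choices; MultiADS additionally aligns patch-level features through extra linear layers to produce per-defect masks, which this image-level sketch omits.

```python
# Illustrative CLIP scoring of an inspection image against defect prompts
# (not the MultiADS code; model id, prompts and filename are my choices).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

defects = ["bend", "cut", "scratch", "hole"]
prompts = [f"a photo of a product with a {d}" for d in defects]
prompts.append("a photo of a flawless product")

image = Image.open("part.png")  # hypothetical inspection image
inputs = processor(text=prompts, images=image,
                   return_tensors="pt", padding=True)
with torch.no_grad():
    # Softmax over prompts gives a probability per defect type (or "good").
    probs = model(**inputs).logits_per_image.softmax(dim=-1)[0]

for label, p in zip(defects + ["good"], probs.tolist()):
    print(f"{label}: {p:.3f}")
```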

#Qwen-Image seems to have strong image-generation capabilities. It followed the prompt very closely, producing a "ginger cat", an "expecting facial expression", a "thai restaurant", "thai style surroundings", a "window with view on a mountain", and a "thai menu". But it didn't put any people into the scene, although I asked for them; it misspelled "Chiang Mai" (I fixed that); and the words on the menu are baloney. A hedged generation sketch follows the links below.

chat.qwen.ai
github.com/QwenLM/Qwen-Image
arxiv.org/abs/2508.02324
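
For reference, a hedged sketch of reproducing the experiment, assuming Qwen-Image is exposed through Hugging Face diffusers under the "Qwen/Qwen-Image" model id as the repo's README suggests; verify the exact API against the repository before relying on this.

```python
# Hedged sketch, assuming Qwen-Image is available via Hugging Face diffusers
# under the "Qwen/Qwen-Image" model id (verify against the repo's README).
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "Qwen/Qwen-Image", torch_dtype=torch.bfloat16
).to("cuda")

# Prompt reconstructed from the fragments described in the post above.
prompt = (
    "a ginger cat with an expecting facial expression in a Thai restaurant, "
    "Thai style surroundings, a window with a view on a mountain, a Thai "
    "menu on the table, people dining in the background, Chiang Mai"
)
image = pipe(prompt=prompt, num_inference_steps=50).images[0]
image.save("ginger_cat.png")
```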

Data Annotation vs Data Labelling: Find the Right One for You

Key takeaways:

• Understand the core difference between annotation and labeling (a toy example follows below)
• Explore use cases across NLP, computer vision & more
• Learn how each process impacts model training and accuracy

Read now to make smarter data decisions:

hitechbpo.com/blog/data-annota
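
To make the first takeaway concrete, here is a toy contrast (my own illustrative records, not from the linked post): a label assigns a single class to the whole input, while an annotation attaches richer structure such as bounding boxes.

```python
# Toy records contrasting a plain image-level label with a richer,
# COCO-style bounding-box annotation (illustrative field names).
label = {"image": "img_001.jpg", "class": "street_scene"}   # labeling

annotation = {                                              # annotation
    "image": "img_001.jpg",
    "objects": [
        {"class": "car", "bbox": [34, 50, 120, 96]},        # [x, y, w, h]
        {"class": "pedestrian", "bbox": [200, 40, 60, 80]},
    ],
}
```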

Another one of my posts, this one on AI tools as assistive technology: what's working, what isn't, and why, all without the hype that too many people lean into when discussing this technology:

When Independence Meets Uncertainty: My Journey with AI-Powered Vision
A blind user's candid assessment of the promises and pitfalls of current AI accessibility tools
open.substack.com/pub/kaylielf

Kaylie’s Substack · 🤖👁️ From thermostat success to dryer disasters: my honest take on AI vision tools that promise independence but deliver uncertainty. A must-read for anyone curious about the real state of AI accessibility. By Kaylie L. Fox

“The nature of scientific progress is that it sometimes provides powerful tools that can be wielded for good or for ill: splitting the atom and nuclear weapons being a case in point. In such cases, it’s necessary that researchers involved in developing such #technologies participate actively in the ethical and political discussions about the appropriate boundaries for their use. Computer vision is one area in which more voices need to be heard.”

“This study backs up with clear evidence what many have long suspected: that computer-vision research is being used mainly in surveillance-enabling #applications.”

#ArtificialIntelligence / #ComputerVision / #research / #surveillance / #tech <nature.com/articles/d41586-025>

www.nature.com · Don’t sleepwalk from computer-vision research into surveillance
The output of computer-vision research is overwhelmingly aimed towards monitoring humans. The potential ethical implications need more scrutiny.

"An increasing number of scholars, policymakers and grassroots communities argue that artificial intelligence (AI) research—and computer-vision research in particular—has become the primary source for developing and powering mass surveillance. Yet, the pathways from computer vision to surveillance continue to be contentious. Here we present an empirical account of the nature and extent of the surveillance AI pipeline, showing extensive evidence of the close relationship between the field of computer vision and surveillance. Through an analysis of computer-vision research papers and citing patents, we found that most of these documents enable the targeting of human bodies and body parts. Comparing the 1990s to the 2010s, we observed a fivefold increase in the number of these computer-vision papers linked to downstream surveillance-enabling patents. Additionally, our findings challenge the notion that only a few rogue entities enable surveillance. Rather, we found that the normalization of targeting humans permeates the field. This normalization is especially striking given patterns of obfuscation. We reveal obfuscating language that allows documents to avoid direct mention of targeting humans, for example, by normalizing the referring to of humans as ‘objects’ to be studied without special consideration. Our results indicate the extensive ties between computer-vision research and surveillance."

nature.com/articles/s41586-025

Nature · Computer-vision research powers surveillance technology
An analysis of research papers and citing patents indicates the extensive ties between computer-vision research and surveillance.