MultiADS: Defect-aware Supervision for Multi-type Anomaly Detection and Segmentation in Zero-Shot Learning
In manufacturing, quality control remains a critical yet complex task, especially when multiple defect types are involved. MultiADS introduces a system capable of detecting and segmenting a wide range of anomalies (e.g., scratches, bends, holes), even in zero-shot settings.
By combining visual analysis with descriptive textual input and using a curated Knowledge Base for Anomalies, MultiADS generalizes to unseen defect types without requiring prior visual examples and consistently outperforms state-of-the-art models across several benchmarks, offering a robust and scalable solution for industrial inspection tasks.
Sadikaj, Y., Zhou, H., Halilaj, L., Schmid, S., Staab, S., & Plant, C. MultiADS: Defect-aware Supervision for Multi-type Anomaly Detection and Segmentation in Zero-Shot Learning. International Conference on Computer Vision, ICCV 2025, Hawaii, Oct 19-23, 2025, #ICCV2025. https://arxiv.org/abs/2504.06740.
#Qwen-Image seems to have great image generation capabilities. It followed the prompt very closely, including the "ginger cat", "expecting facial expression", "thai restaurant", "thai style surroundings", "window with view on a mountain", and "thai menu". But it didn't put any people into the scene, which I had asked for, it misspelled "Chiang Mai" (I fixed that), and the words on the menu are baloney.
https://chat.qwen.ai
https://github.com/QwenLM/Qwen-Image
https://arxiv.org/abs/2508.02324
Mark your calendar. The workshop takes place one day before the conference, on October 8, 2025. See you in Paris!
Get your tickets here https://eurorust.eu/workshops/rust-in-action/?utm_source=mastodon&utm_medium=social&utm_campaign=2025-07-28-workshop-rust-in-action
Sponsored by Helsing
"Go Computer Vision Package GoCV Adds Support for OpenCV 4.12" - me on Hackster.io about the new @gocv release!
GoCV 0.42 is out with support for the latest @opencv 4.12, new CUDA functions, ViT DNN tracking, and lots more!
Full release notes here: https://github.com/hybridgroup/gocv/releases/tag/v0.42.0
Go get it right now!
What if #AI could see the world like we do? That’s the idea behind #ComputerVision—machines interpreting visual data to navigate, detect, and decide. Our latest #ScienceGlossary entry explains how it works: http://go.tum.de/312381
TUM CCC/ R. Heckel, TUM CIT
Data Annotation vs Data Labelling: Find the Right One for You
Key takeaways:
• Understand the core difference between annotation and labeling
• Explore use cases across NLP, computer vision & more
• Learn how each process impacts model training and accuracy
Read now to make smarter data decisions:
Philips Taps AI to Manage Unwieldy, Outdated Image Library
Every company’s marketing department has thousands of photos that teams must sort through to find matches for advertising…
#NewsBeep #News #US #USA #UnitedStates #UnitedStatesOfAmerica #Artificialintelligence #AI #ArtificialIntelligence #ComputerVision #Philips #PYMNTSNews #Technology #VertexAI
https://www.newsbeep.com/us/9893/
Google's Gemini Veo3 now turns photos into 8-second videos with audio. The AI-powered feature includes built-in watermarks for transparency and authenticity.
Limited to Pro & Ultra users in select regions. Read the article to learn how it works and who can access it.
#Google #GeminiAI #AIVideo #ArtificialIntelligence #ComputerVision
OpenCV Version 4.12.0 is now available! Highlights include: GIF decode and encode for imgcodecs, improved handling of PNG and Animated PNG files, animated WebP support, and especially the new HAL for RISC-V RVV 1.0 platforms.
Read more: https://opencv.org/blog/opencv-4-12-0-is-now-available/
Another one of my posts. This one on the topic of AI tools as assistive technology, what's working, what isn't and why, all without the hype that too many people tend to lean into when discussing this technology:
When Independence Meets Uncertainty: My Journey with AI-Powered Vision
A blind user's candid assessment of the promises and pitfalls of current AI accessibility tools
https://open.substack.com/pub/kaylielfox/p/when-independence-meets-uncertainty?utm_campaign=post&utm_medium=web
We have a new proposal for adding improvements for hardware acceleration, but that would require a breaking interface change.
What do you think? Feedback wanted!
Dive into #ComputerVision with #Supervision from this #oSC25 talk! This talk shows how to streamline dataset loading, annotation & video analysis while staying lightweight for #edge & #IoT devices #AI #openSUSE https://www.youtube.com/watch?v=5CjYBrwhwS8
“The nature of scientific progress is that it sometimes provides powerful tools that can be wielded for good or for ill: splitting the atom and nuclear weapons being a case in point. In such cases, it’s necessary that researchers involved in developing such #technologies participate actively in the ethical and political discussions about the appropriate boundaries for their use. Computer vision is one area in which more voices need to be heard.”
…
“This study backs up with clear evidence what many have long suspected: that computer-vision research is being used mainly in surveillance-enabling #applications.”
#ArtificialIntelligence / #ComputerVision / #research / #surveillance / #tech <https://www.nature.com/articles/d41586-025-01965-5>
JOB: Postdoc in Digital Humanities (Computer Vision & Performing Arts) at Université Rennes 2
Full-time, starting Oct 2025, part of ERC project STAGE.
Apply by 8 Sep 2025
#DigitalHumanities #ComputerVision #PerformingArts #Postdoc #ERC #JobOpportunity #CulturalHeritage
https://euraxess.ec.europa.eu/jobs/348852
"An increasing number of scholars, policymakers and grassroots communities argue that artificial intelligence (AI) research—and computer-vision research in particular—has become the primary source for developing and powering mass surveillance. Yet, the pathways from computer vision to surveillance continue to be contentious. Here we present an empirical account of the nature and extent of the surveillance AI pipeline, showing extensive evidence of the close relationship between the field of computer vision and surveillance. Through an analysis of computer-vision research papers and citing patents, we found that most of these documents enable the targeting of human bodies and body parts. Comparing the 1990s to the 2010s, we observed a fivefold increase in the number of these computer-vision papers linked to downstream surveillance-enabling patents. Additionally, our findings challenge the notion that only a few rogue entities enable surveillance. Rather, we found that the normalization of targeting humans permeates the field. This normalization is especially striking given patterns of obfuscation. We reveal obfuscating language that allows documents to avoid direct mention of targeting humans, for example, by normalizing the referring to of humans as ‘objects’ to be studied without special consideration. Our results indicate the extensive ties between computer-vision research and surveillance."