#generativeai


#LegalEthics Tidbit: Can I destroy all my work product to hide that I used #AI?

A New Mexico criminal lawyer did not have time to devote to a case, so he hired help through the Lawclerk.legal freelance platform. The freelance attorney drafted a brief, and the NM lawyer submitted it without verifying the citations or reading the cases. The brief contained a bunch of fictional ... (cont.)

lnkd.in/eDJ-4t_9
#law #GenerativeAI #ArtificialIntelligence

"Professor Gina Neff of Queen Mary University London tells the BBC that ChatGPT is "burning through energy", and the data centres used to power it consume more electricity in a year than 117 countries."

Source:
"Everyone's jumping on the AI doll trend - but what are the concerns?", BBC News, 12 April 2025
bbc.co.uk/news/articles/c5yg69

On the left, a picture of Zoe. She is smiling. She has shoulder-length blonde hair, a blue jacket and a silver necklace. On the right, an image generated using ChatGPT of a doll-like version of her. The doll has the same clothes and necklace - but has morphed her dark eyes into a light green, and darkened her hair.
BBC News · ChatGPT AI action dolls: Concerns around the Barbie-like viral social trend. As online users create Barbie-like dolls of themselves, experts urge caution over AI's energy and data use.

"Personally, I’m in the lonely camp of being skeptical about many of AI 2027’s predictions, but appreciative of the format and conversation it sparked. When Jessica and Saffron compare AI 2027 to Kim Stanley Robinson’s The Ministry for the Future, I think: The Ministry for the Future was awesome, and I’m glad it exists! There are plenty of others writing policy reports and op-eds; we need new styles to shock people into thinking in new ways, and to consider a broader-than-usual range of possible outcomes (e.g. I loved this delightful AGI parable from Zhengdong Wang).

I also think it takes real guts to put out predictions that can be so concretely disproven: putting dates on predictions requires skin in the game. The authors will be clowned on when they inevitably get stuff wrong. That suggests the AI 2027 authors really believe in their scenario, rather than doing weird wish fulfillment as some critics say (like I doubt they want us all to die).

Therefore, I invited AI 2027 author Daniel Kokotajlo on the podcast to discuss his team’s approach to creating AI 2027, answers to common critiques of forecasting (e.g. is it just bad sci-fi), and why he thinks writing scenarios can improve your thinking."

jasmi.news/p/daniel-kokotaljo

Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity

Joel Becker, Nate Rush, Beth Barnes, David Rein

Model Evaluation & Threat Research (METR)

"Despite widespread adoption, the impact of AI tools on software development in the wild remains understudied. We conduct a randomized controlled trial (RCT) to understand how AI tools at the February–June 2025 frontier affect the productivity of experienced open-source developers. 16 developers with moderate AI experience complete 246 tasks in mature projects on which they have an average of 5 years of prior experience. Each task is randomly assigned to allow or disallow usage of early-2025 AI tools. When AI tools are allowed, developers primarily use Cursor Pro, a popular code editor, and Claude 3.5/3.7 Sonnet. Before starting tasks, developers forecast that allowing AI will reduce completion time by 24%. After completing the study, developers estimate that allowing AI reduced completion time by 20%. Surprisingly, we find that allowing AI actually increases completion time by 19%—AI tooling slowed developers down. This slowdown also contradicts predictions from experts in economics (39% shorter) and ML (38% shorter). To understand this result, we collect and evaluate evidence for 20 properties of our setting that a priori could contribute to the observed slowdown effect—for example, the size and quality standards of projects, or prior developer experience with AI tooling. Although the influence of experimental artifacts cannot be entirely ruled out, the robustness of the slowdown effect across our analyses suggests it is unlikely to primarily be a function of our experimental design."

metr.org/Early_2025_AI_Experie

Because search engines (Google in particular) have absolutely failed me, I'm gonna crowdsource this:

I'm looking for long-form blog posts on the state of #AI today. I don't mind if they get a bit technical; I'm just trying to get a deeper understanding of what #llms can do, how they work, their limitations and potential. I feel like most of what I've been exposed to is either overly optimistic takes (which I generally find off-putting) or pessimistic takes which appeal to my cynicism (unfortunately). But I'm trying to be more open-minded now.

I've seen a few talks on YouTube, one from Andrej Karpathy on his channel and another by Jodie Burchell at GOTO Conferences, which I think were pretty good. I'm just tired of being a non-believer who can't properly explain, from a technical perspective, why I don't believe, other than the fact that I've tried to use LLMs for actual complex tasks and even the almighty Claude seems to crumble under real pressure.

"Anthropic is very likely losing money on every single Claude Code customer, and based on my analysis, appears to be losing hundreds or even thousands of dollars per customer.

There is a gaping wound in the side of Anthropic, and it threatens financial doom for the company.

Some caveats before we continue:

- CCusage is not direct information from Anthropic, and thus there may be things we don’t know about how it charges customers, or any means of efficiency it may have.
- Despite the amount of evidence I’ve found, we do not have a representative sample of exact pricing. This evidence comes from people who use Claude Code, are measuring their usage, and elected to post their CCusage dashboards online — which likely represents a small sample of the total user base.
- Nevertheless, the number of cases I’ve found online of egregious, unrelentingly unprofitable burn is deeply concerning, and it’s hard to imagine that these examples are outliers.
- We do not know if the current, unrestricted version of Claude Code will last.

The reason I’m leading with these caveats is because the numbers I’ve found about the sheer amount of money Claude Code’s users are burning are absolutely shocking.

In the event that they are representative of the greater picture of Anthropic’s customer base, this company is wilfully burning 200% to 3000% of the monthly fee of each Pro or Max customer that interacts with Claude Code, and at each price point I have found repeated evidence that customers are allowed to burn their entire monthly payment in compute within, at best, eight days, with some cases involving customers on a $200-a-month subscription burning as much as $10,000 worth of compute."
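
For a sense of the arithmetic behind those percentages: CCusage-style dashboards estimate compute cost by multiplying token counts by Anthropic's API list prices and comparing the result to the flat subscription fee. A minimal sketch of that calculation, where the token counts and the per-million-token prices are illustrative assumptions rather than figures from the article:

```python
# All numbers below are illustrative assumptions, not figures from the article.
INPUT_USD_PER_MTOK  = 3.00   # assumed Sonnet input price per million tokens
OUTPUT_USD_PER_MTOK = 15.00  # assumed Sonnet output price per million tokens

def api_equivalent_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate what the same usage would cost at API list prices."""
    return (input_tokens / 1e6) * INPUT_USD_PER_MTOK \
         + (output_tokens / 1e6) * OUTPUT_USD_PER_MTOK

subscription_usd = 200.0                            # Max plan monthly fee
input_toks, output_toks = 400_000_000, 120_000_000  # hypothetical monthly usage

cost = api_equivalent_cost(input_toks, output_toks)
print(f"Compute cost ~${cost:,.0f} = "
      f"{100 * cost / subscription_usd:,.0f}% of the subscription")
# With these assumed numbers: ~$3,000, i.e. a 1,500% burn rate.
```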

wheresyoured.at/anthropic-is-b

Ed Zitron's Where's Your Ed At · Anthropic Is Bleeding Out

"Everyone should have access to answers, evidence, and data regarding the effectiveness and dangers of this technology. Axon and its customers claim this technology will revolutionize policing, but it remains to be seen how it will change the criminal justice system, and who this technology benefits most.

For months, EFF and other organizations have warned about the threats this technology poses to accountability and transparency in an already flawed criminal justice system. Now we've concluded the situation is even worse than we thought: There is no meaningful way to audit Draft One usage, whether you're a police chief or an independent researcher, because Axon designed it that way.

Draft One uses a ChatGPT variant to process body-worn camera audio of public encounters and create police reports based only on the captured verbal dialogue; it does not process the video. The Draft One-generated text is sprinkled with bracketed placeholders where officers are encouraged to add additional observations or information, or which can be quickly deleted. Officers are supposed to edit Draft One's report and correct anything the Gen AI misunderstood due to a lack of context, troubled translations, or just plain-old mistakes. When they're done, the officer is prompted to sign an acknowledgement that the report was generated using Draft One and that they have reviewed the report and made necessary edits to ensure it is consistent with the officer’s recollection. Then they can copy and paste the text into their report. When they close the window, the draft disappears.

Any new, untested, and problematic technology needs a robust process to evaluate its use by officers. In this case, one would expect police agencies to retain data that ensures officers are actually editing the AI-generated reports as required..."

eff.org/deeplinks/2025/07/axon
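
To make EFF's audit argument concrete: if a system like this retained the AI-generated draft alongside the officer's final text, verifying that edits were actually made would amount to storing and diffing the two. The following is a hypothetical sketch of such a retention layer, not a description of Draft One, which by EFF's account discards the draft:

```python
import difflib
from datetime import datetime, timezone

def retain_revision(report_id: str, ai_draft: str, final_report: str) -> dict:
    """Store the AI draft plus a diff against the final report, so an
    auditor can later check whether the officer actually edited it."""
    diff = list(difflib.unified_diff(
        ai_draft.splitlines(), final_report.splitlines(),
        fromfile="ai_draft", tofile="final_report", lineterm=""))
    changed = sum(1 for line in diff
                  if line[:1] in "+-" and line[:3] not in ("+++", "---"))
    return {
        "report_id": report_id,
        "retained_at": datetime.now(timezone.utc).isoformat(),
        "ai_draft": ai_draft,      # the artifact Draft One discards
        "diff": diff,
        "changed_lines": changed,  # zero would flag an unedited report
    }

record = retain_revision("rpt-001", "Subject was [DESCRIBE].", "Subject was calm.")
print(record["changed_lines"])  # 2: one line removed, one line added
```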

#USA #Axon #AI

Axon’s #DraftOne is Designed to Defy #Transparency

#Axon Enterprise’s Draft One — a generative artificial intelligence product that writes police reports based on audio from officers’ body-worn cameras — seems deliberately designed to avoid #audits that could provide any #accountability to the public, an @eff investigation has found.
#ai #artificialintelligence #gai #generativeai

eff.org/deeplinks/2025/07/axon

"If you want a job at McDonald’s today, there’s a good chance you'll have to talk to Olivia. Olivia is not, in fact, a human being, but instead an AI chatbot that screens applicants, asks for their contact information and résumé, directs them to a personality test, and occasionally makes them “go insane” by repeatedly misunderstanding their most basic questions.

Until last week, the platform that runs the Olivia chatbot, built by artificial intelligence software firm Paradox.ai, also suffered from absurdly basic security flaws. As a result, virtually any hacker could have accessed the records of every chat Olivia had ever had with McDonald's applicants—including all the personal information they shared in those conversations—with tricks as straightforward as guessing that an administrator account's username and password was "123456".

On Wednesday, security researchers Ian Carroll and Sam Curry revealed that they found simple methods to hack into the backend of the AI chatbot platform on McHire.com, McDonald's website that many of its franchisees use to handle job applications. Carroll and Curry, hackers with a long track record of independent security testing, discovered that simple web-based vulnerabilities—including guessing one laughably weak password—allowed them to access a Paradox.ai account and query the company's databases that held every McHire user's chats with Olivia. The data appears to include as many as 64 million records, including applicants' names, email addresses, and phone numbers."
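
The entry point was a default credential, exactly the class of flaw that a minimal deny-list check at login or password-change time is meant to catch. A generic sketch of such a check (not Paradox.ai's code; the list is a tiny sample of well-known weak passwords):

```python
# Generic deny-list check for default/common credentials; illustrative only.
COMMON_PASSWORDS = {"123456", "password", "admin", "12345678", "qwerty", "letmein"}

def is_acceptable_password(candidate: str, min_length: int = 12) -> bool:
    """Reject passwords that are too short or on a known-weak list."""
    return len(candidate) >= min_length and candidate.lower() not in COMMON_PASSWORDS

assert not is_acceptable_password("123456")  # the McHire admin password
assert is_acceptable_password("correct horse battery staple")
```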

wired.com/story/mcdonalds-ai-h

WIRED · McDonald’s AI Hiring Bot Exposed Millions of Applicants' Data to Hackers Using the Password ‘123456’ · By Andy Greenberg