A new AI coding challenge just published its first results – and they aren’t pretty

A new AI coding challenge has revealed its first winner, and set a new bar for AI-powered software engineers.

On Wednesday at 5pm PST, the nonprofit Laude Institute announced the first winner of the K Prize, a multi-round AI coding challenge launched by Databricks and Perplexity co-founder Andy Konwinski. The winner was a Brazilian prompt engineer named Eduardo Rocha de Andrade, who will receive the $50,000 prize. But more surprising than the win was his final score: he won with correct answers to just 7.5% of the questions on the test.

“We’re glad we built a benchmark that is actually hard,” said Konwinski. “Benchmarks should be hard if they’re going to matter,” he continued, adding: “Scores would be different if the big labs had entered with their biggest models. But that’s kind of the point. K Prize runs offline with limited compute, so it favors smaller and open models. I love that. It levels the playing field.”

Konwinski has pledged $1 million to the first open-source model that can score higher than 90% on the test.

Like the well-known SWE-Bench system, the K Prize tests models against flagged issues from GitHub to measure how well they can deal with real-world programming problems. But while SWE-Bench is based on a fixed set of problems that models can train against, the K Prize is designed as a “contamination-free version of SWE-Bench,” using a timed entry system to guard against any benchmark-specific training. For round one, models were due by March 12th. The K Prize organizers then built the test using only GitHub issues flagged after that date.
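The timed entry idea can be illustrated with a minimal sketch. This is not the organizers' actual pipeline: the `Issue` record, the `build_test_set` helper, the sample issues, and the year attached to the March 12 deadline are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Issue:
    """A GitHub issue reduced to the fields this sketch needs (hypothetical shape)."""
    repo: str
    number: int
    flagged_on: date


def build_test_set(issues, submission_deadline):
    """Keep only issues flagged strictly after the model submission deadline."""
    return [issue for issue in issues if issue.flagged_on > submission_deadline]


# Round-one deadline from the article; the year is an assumption.
DEADLINE = date(2025, 3, 12)

# Made-up sample issues, not real benchmark data.
candidates = [
    Issue("example/repo", 101, date(2025, 2, 28)),  # before the deadline
    Issue("example/repo", 102, date(2025, 3, 12)),  # on the deadline
    Issue("example/repo", 103, date(2025, 3, 20)),  # after the deadline
]

test_set = build_test_set(candidates, DEADLINE)
print([issue.number for issue in test_set])  # prints [103]
```

Because every submitted model is frozen before any test issue exists, nothing in the test set could have appeared in its training data.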

The 7.5% top score stands in marked contrast to SWE-Bench itself, which currently shows a 75% top score on its easier ‘Verified’ test and 34% on its harder ‘Full’ test. Konwinski still isn’t sure whether the disparity is due to contamination on SWE-Bench or just the challenge of collecting new issues from GitHub, but he expects the K Prize project to answer the question soon.

“As we get more runs of the thing, we’ll have a better sense,” he told TechCrunch, “because we expect people to adapt to the dynamics of competing on this every few months.”

It might seem like an odd place to fall short, given the wide range of AI coding tools already publicly available, but with benchmarks becoming too easy, many critics see projects like the K Prize as a necessary step toward solving AI’s growing evaluation problem.

“I’m quite bullish about building new tests for existing benchmarks,” says Princeton researcher Sayash Kapoor, who put forward a similar idea in a recent paper. “Without such experiments, we can’t actually tell if the issue is contamination, or even just targeting the SWE-Bench leaderboard with a human in the loop.”

For Konwinski, it’s not just a better benchmark, but an open challenge to the rest of the industry. “If you listen to the hype, it’s like we should be seeing AI doctors and AI lawyers and AI software engineers, and that’s just not true,” he says. “If we can’t even get more than 10% on a contamination-free SWE-Bench, that’s the reality check for me.”
