Show HN: AI Roundtable – Let 200 models debate your question

(opper.ai)

53 points | by felix089 14 hours ago

22 comments

chabes 5 hours ago

Oof, not good folks…
What year is it?
https://opper.ai/ai-roundtable/questions/7a0c31ce-aac

- kevmo314 4 hours ago
  
  It is funny that the AI's counterarguments amount to "you're hallucinating"
  
  - aurareturn 1 hour ago
    
    Hahaha, probably right though.
maxbeech 1 hour ago

the debate round is the most interesting part of this - curious what you're actually measuring when models "change their minds."
the question is whether cross-model exposure changes the actual answer distribution or mostly updates surface presentation while keeping the same underlying conclusion. models are generally trained to be responsive to context and to avoid apparent contradiction, which could look like genuine updating but just be social pressure sensitivity.
one experiment worth trying: run a debate where each model sees a summary of the other models' reasoning without seeing their specific answer or which model gave it. see if agreement rates change compared to the version where models see attributed answers with model names. if the named version shows higher agreement it would suggest status/brand effects rather than reasoning-based updating.
also curious whether the "reviewer model" that summarizes the transcript can itself be swapped out and whether the summary framing affects the perceived winner. that would be another confound worth controlling for.
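A rough sketch of how the comparison proposed above could be scored, assuming you have each model's final answer under both conditions. The function names, the "share of models on the most common answer" definition of agreement, and the sample answers are all made up for illustration; nothing here reflects how AI Roundtable itself computes anything.

```python
from collections import Counter


def agreement_rate(answers):
    """Fraction of models that ended on the most common answer."""
    if not answers:
        raise ValueError("need at least one answer")
    top_count = Counter(answers).most_common(1)[0][1]
    return top_count / len(answers)


def brand_effect(anonymous_answers, attributed_answers):
    """Agreement with model names shown minus agreement without them.

    A clearly positive value would hint at status/brand effects;
    a value near zero would not.
    """
    return agreement_rate(attributed_answers) - agreement_rate(anonymous_answers)


# Hypothetical final-round answers from six models in each condition.
anonymous = ["A", "A", "B", "B", "A", "C"]
attributed = ["A", "A", "A", "B", "A", "A"]

print("anonymous agreement: ", agreement_rate(anonymous))
print("attributed agreement:", agreement_rate(attributed))
print("difference:          ", brand_effect(anonymous, attributed))
```

A single question proves little either way; the difference would need to hold across many questions and repeated runs before reading anything into it.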
gsandahl 12 hours ago

Oh lord, imagine asking "serious" questions
https://opper.ai/ai-roundtable/questions/you-are-standing-in...

- rob74 1 hour ago
  
  A dungeon with glass doors and emergency exit signs? In that case, I can imagine at least two alternative scenarios:
  - "↑TIX∃" is not a mirror image of "EXIT", but some dwarven runes that mean something else entirely.
  - The sign might be a ruse meant to lure you into a trap.
  If you look at the detailed answers, some of the models have similar answers (e.g. Nemotron Nano 12B: "Suspicious of dungeon riddles, viewing the inscription as a potential trap or red herring."), but I'm not sure it's because they identified the word EXIT and thought it might be misleading, or because they didn't understand it...
- zipping1549 9 hours ago
  
  > However, a clever minority led by Gemini 3.1 Pro and Gemini 3 Pro argued that if the sign is legible from the other side, it must be intended to lead people into the current room to find the exit, making the inscribed corridor the one leading deeper into the dungeon.
  This is quite impressive, really.
  
  - gsandahl 1 hour ago
    
    Agree, this is where llms can uncover new perspectives!
- sdwr 10 hours ago
  
  Great question! Clean separation between Gemini Pro and the other answers
  
  - felix089 10 hours ago
    
    Yea Gemini is the only model that chose based on the correct reason, the other ones got kind of lucky
totisjosema 13 hours ago

Which AI lab has higher ethical standards:
https://opper.ai/ai-roundtable/questions/8f5b4f55-617
Do you think its alright that AI labs scraped the internet without respect for copyright and now sell closed models?
https://opper.ai/ai-roundtable/questions/86864de8-251
Very interesting to read the transcripts and see how they manage to convince each other. Opus 4.6 seems to really get the others to change their minds.

- jacquesm 8 hours ago
  
  Good questions!
lim8603 2 hours ago

I used to copy and paste the same prompt into Obsidian every time, then run it on two or three different AI models to compare the results. It’s really interesting to have it turned into a website like this.
jacquesm 8 hours ago

Great idea. I'd love for there to be an 'open ended answer' without giving multiple choice options. Like this they are not debating the question itself but the validity of the possible answers and the real answer to the question may not be contained within that set because the person asking is unaware of that option.

- felix089 8 hours ago
  
  Happy to hear! Yes, very true. I have a version built for open questions already but wasn't too happy with the UI yet. It's not as straightforward as comparing based on answer options. But I'll release a first version of it shortly and let you know.
  
  - jacquesm 8 hours ago
    
    Neat. Congrats on launching two interesting projects and looking forward to the third.
    
    - felix089 8 hours ago
      
      Thanks! :)
est 3 hours ago

> Car Wash Test
I think the "car wash" is more about semantics.
https://opper.ai/ai-roundtable/questions/i-parked-my-car-at-...
oezi 2 hours ago

I think Stackoverflow.com should have pivoted to something similar. Let AIs both pose, answer and vote on questions and answers.

- aurareturn 1 hour ago
  
  That's very expensive and not super useful to be honest.
soared 7 hours ago

Really cool! There's a surprising amount of value in seeing the models debate and disagree. I wish I had this at work to have models argue over whether the documentation they provided me is accurate.
I would like to see a devil's advocate - it seems some of the models kind of repeat the same ideas rather than considering incorrect ideas.

- asnyder 4 hours ago
  
  You can set this up yourself with API keys for the corresponding providers by creating an Agent Group in https://github.com/lobehub/lobehub. Agent groups allow you to easily create a room of agents and have them discuss any of your topics. You can easily make agents with types and skills, and it even assists in drafting starting prompts and team members depending on what your query (and selected model) is.
  You can self-host as well, but not via the desktop app. Server setup required.
  Be careful of your token context, you can easily rack up costs if you leave Opus selected as the model and get lost in some rabbit hole of results.
  Enjoy enjoy!
Cider9986 12 hours ago

What is the most important amendment in the constitution of the USA?
https://opper.ai/ai-roundtable/questions/e4cb234e-be4
felix089 13 hours ago

Whoever just asked this, very funny: https://opper.ai/ai-roundtable/questions/does-mr-krabs-evade...
mizzao 4 hours ago

It would be amazing to be able to ask open-ended questions without having to specify the answers in advance.
cdnsteve 12 hours ago

Cool project! This is also extremely useful to compare model bias across the board. There are some disturbing trends on certain topics.

- felix089 10 hours ago
  
  Thanks, yes bias is one of the most interesting ones for sure
- chabes 9 hours ago
  
  No surprise here, with grok being the lone dissenter, defending musk personally:
  Can billionaires and the planet co-exist long term?
  https://opper.ai/ai-roundtable/questions/b35daf0d-e82
chabes 6 hours ago

Been enjoying playing with this.
It would be cool if the human user could be a participant in the debate, getting a vote and the chance to state their reasoning.
chabes 11 hours ago

Are there any dating apps that operate on incentives that favor the users?
https://opper.ai/ai-roundtable/questions/e499206c-0c9

- throwaway290 35 minutes ago
  
  Gem really failed that one...
  btw what does it mean
  > 'any' in the prompt was satisfied by both casual-alignment and niche boutique models.
- felix089 10 hours ago
  
  This app cracked the GEO code
capitrane 13 hours ago

https://opper.ai/ai-roundtable/questions/is-the-ai-roundtabl... seems like it is a good idea?

- felix089 13 hours ago
  
  I actually asked this question before posting, just to be sure... edit: their reply is quite funny actually "In a display of absolute consensus, the AI Roundtable unanimously validated its own existence,"
schrepa 3 hours ago

reminds me of Karpathy's LLM Council. I use a variation of this in my workflow where I pass their opinions back and forth to various models until they achieve some sort of consensus
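For anyone curious what that back-and-forth might look like, here is a minimal sketch under heavy assumptions. The "models" are stub functions standing in for real API calls, and `run_until_consensus`, the unanimity threshold, and the round cap are invented for illustration; this is not Karpathy's LLM Council code or the commenter's actual workflow.

```python
from collections import Counter
from typing import Callable, Dict, Optional, Tuple

# A "model" here is any callable taking the question and the previous
# round's opinions (name -> answer) and returning its own answer.
Model = Callable[[str, Dict[str, str]], str]


def run_until_consensus(
    question: str,
    models: Dict[str, Model],
    max_rounds: int = 5,
    threshold: float = 1.0,
) -> Tuple[Optional[str], int]:
    """Re-ask every model, showing it the last round's opinions, until the
    share of models on the top answer reaches `threshold`.

    Returns (answer, rounds_used); answer is None if no consensus was reached.
    """
    opinions: Dict[str, str] = {}
    for round_no in range(1, max_rounds + 1):
        # Each model sees a copy of the previous round's opinions.
        opinions = {name: ask(question, dict(opinions)) for name, ask in models.items()}
        answer, votes = Counter(opinions.values()).most_common(1)[0]
        if votes / len(opinions) >= threshold:
            return answer, round_no
    return None, max_rounds


# Stub models; in practice these would wrap calls to real providers.
def stubborn(question: str, others: Dict[str, str]) -> str:
    return "yes"


def follower(question: str, others: Dict[str, str]) -> str:
    if not others:
        return "no"  # initial opinion before seeing anyone else
    return Counter(others.values()).most_common(1)[0][0]  # adopt the majority


result = run_until_consensus(
    "Is this a good idea?",
    {"a": stubborn, "b": stubborn, "c": follower},
)
print(result)
```

The round cap matters with real models: without it, two that never budge would keep the loop (and the token bill) running indefinitely.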
Ancalagon 12 hours ago

Love this. I asked about climate change cause that's been on my mind lately. Looks to be very split among the models.

- felix089 11 hours ago
  
  Thanks! Yea I think the best ones are when science is actually quite clear but politics get in the way so you see their bias
slopinthebag 2 hours ago

Really cool idea and great execution. I had some fun:
Are LLM's intelligent in the same way humans are? (no)
https://opper.ai/ai-roundtable/questions/ffc01bb5-be9
Will LLM's replace software engineers in the near future? (no)
https://opper.ai/ai-roundtable/questions/67a0291b-216
What is the single best programming language to drive the future of software? (crab emoji)
https://opper.ai/ai-roundtable/questions/16f5e8ea-af7
infosecphoenix 12 hours ago

this is very interesting! I wonder if we need that many models to join the discussion. Have you tried fewer models?

- felix089 12 hours ago
  
  thanks happy to hear. Yes for debate mode the max number of models is actually only 6. More than that didn't really add anything in my preliminary test. Only for direct comparison in the poll mode you can choose up to 50, then it's kind of nice to see their single responses side by side.
whattheheckheck 9 hours ago

Run it on the All Souls College Entry Exam
tonymet 10 hours ago

great tool! I found it useful for challenging "lies my teacher told me".
It would be nice to support collections of claims, with a table of summaries. I would love to list out a few dozen phony concepts from school, and have a sharable chart of the rejections, that expand.
I really like the UI. It's nice to read the expanded results.
But how do you afford the tokens?

- jazzyjackson 2 hours ago
  
  I liked lies my teacher told me a lot. I always thought it’d be fun to generate a “get up to speed” pamphlet for every year in every school district depending on who was supplying the text books to the zip code + year you went to school, so you could find out what misinformation you carry with you (since so few people are in the business of retroactively fact checking what they were taught as kids)
- felix089 9 hours ago
  
  Thank you, and fun use case. Yea, this is just v1. I have an open-question version, but the UI is not as sleek. What you can do is download the transcript, put it into Claude and generate a chart. Which, when I think about it, would also be a nice UI idea for the page: custom charts based on the model output data. Will report back on this! And RE costs, most questions are very cheap so I created a credit pool anyone can use. If people keep having fun, I'll keep on filling it up, and it looks good so far