14 comments

  • Aperocky 4 hours ago

    Off meta: Are we tired of "one single product that solves everything", which every single AI product has become?

    grep didn't try to also do what awk does, and jq and curl did exactly what they needed to do without trying to become an OS (looking at you, emacs). Can we have that in the AI world? I hope/think we will, in a few years, once this century's iteration of the FSF catches up.

    • throwaw12 1 hour ago

      I am interested as well in what the future will look like. So far what I am seeing is:

      (1) specialized AI agent -> (2) we should add 1790 agents to be competitive -> (3) pivot to agentic workforce platform

      now we have lots and lots of agentic workforce platforms and sandbox providers to run them. All have similar capabilities: create agent for HR, create agent for Sales,...

      Hope to see something interesting pop up. At least that was happening in the SaaS era, where people were inventing new ways of solving old problems: DocuSign, Salesforce, Zoho,...

      • Aperocky 7 minutes ago

        I think both product and engineering are lacking. The only thing that works great today is the LLM models themselves.

        Everything is dependent on "agents", but there is either barely any scaffolding around them or it is full spaghetti; at least it's hard to find one that's well constructed.

        For instance, humans zoom around in cars, these cars don't spontaneously combust (most of the time), have seatbelts and airbags, and don't need engine oil replacement every 1 mile. Humans are amazing, the cars are also relatively solidly engineered (at least the ones we drive around today).

        The agent products that we have today are decidedly NOT that. Maybe for a single week openclaw was it - and then it decided to add a trawler and a fishhook to the car along with 1000 other additions, because why not? And that has been true for almost every one of the LLM/AI products I have seen.

        • crooked-v 1 hour ago

          I think the winners here, such as they are, will be the companies that have an actual specialized service that actually does something, where any "agentic" functionality sits on top of that.

        • jedberg 3 hours ago

          The thing that AI is best at is summarizing vast quantities of information. That means the most natural thing for an AI to do is be "the one tool to rule them all".

          The more information it has access to, the more useful the answer can be. But that also means that it can answer all the questions.

          • skeeter2020 3 hours ago

            >> The thing that AI is best at is summarizing vast quantities of information

            By definition a summary is the best at nothing, though, and the mentality that the best way to rule is from a single summarized interpretation is both flawed and scary. It's not answering all questions; it's attempting to provide a single summation dramatically influenced by training. Go ahead and incorporate this into your balanced, multi-perspective decision-making process, but "one tool to rule them all" is not the same thing and definitely not what we're getting.

            • saltcured 3 hours ago

              "If all you have is an LLM, every problem looks like summarizing information."

              Emphasis on looks like ;-)

              • thrance 1 hour ago

                > the mentality that the best way to rule is from a single summarized interpretation is both flawed and scary.

                Very much agree. This reminded me of Project Cybersyn [1], an attempt by socialist Chile to build a central heavily-computerized room that would summarize their entire economy to a few men literally pushing the buttons. Complete with 70s aesthetics and Star Trek TOS feel.

                [1] https://thereader.mitpress.mit.edu/project-cybersyn-chiles-r...

              • Aperocky 3 hours ago

                Not until its context window and attention are infinite.

                It's best at summarizing/processing a modest amount of information quickly. But given more, its usefulness drastically decreases. This demands tooling that divides up the amount of information and the flow.

              • cyanydeez 1 hour ago

                this has exceedingly obvious limits. The primary limit is the context pollution that happens when you give it too much context.

                Elon and the rest of the AI crew claim LLMs can just grow forever, which is not realistic or borne out by real-world testing.

                It can do "everything", but by "everything" it'll still be fine-tuned and harnessed and agentified, which isn't really the same as the model itself being able to do everything.

              • add-sub-mul-div 3 hours ago

                I'm tired of endless LLM spam submissions from people who only use their accounts here to advertise and self promote.

                • Aperocky 3 hours ago

                  LLM submissions are no different from the tech submissions of yesterday. But most people used to build tools that do one thing well instead of whatever the current meta is.

                  • nickphx 1 hour ago

                    i have found flagging and ignoring is better than complaining... the bots dislike those that point out the uselessness of llms.

                  • iLoveOncall 4 hours ago

                    Developing and serving GenAI models is highly unprofitable, so, no, we're not going to have that in the AI world.

                    Either those model developers & providers package them in as many services as possible so that they can be somewhat profitable, or they die, and we don't have model developers & providers anymore.

                    • Aperocky 4 hours ago

                      Well, the product here has nothing to do with serving GenAI models. It's now application territory.

                      And I prefer unix philosophy vs. the Copilot product approach.

                  • fishtoaster 3 hours ago

                    > This is the part that doesn’t demo well. ETL pipelines feeding into BigQuery from every operational system: Salesforce, Zendesk, and a dozen other internal tools. dbt transformations that normalize and document the data. Column-level descriptions for every table in the warehouse, because an AI agent that doesn’t know what a column means will write SQL that looks right and returns wrong numbers.

                    I'm glad they called this out. For the first half of this, I kept thinking: "Either your answers are confidently wrong or you've done a ton of prep work to let your AIs be effective BI analysts." Sounds like it's the latter, and they're well aware of it!
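The column-level description work the article calls out could be surfaced to the agent along the lines of this sketch. The table names, column names, and descriptions below are invented for illustration, not taken from the article:

```python
# Hypothetical sketch: give the agent column-level documentation before it
# writes SQL, so queries that "look right" don't silently return wrong numbers.
# All table/column names and descriptions here are made up.

COLUMN_DOCS = {
    "fct_support_tickets": {
        "resolved_at": "Timestamp the ticket was closed; NULL while still open.",
        "first_response_minutes": "Minutes from creation to first agent reply.",
    },
    "dim_customers": {
        "arr_usd": "Annual recurring revenue in USD, synced nightly from Salesforce.",
    },
}

def describe_tables(tables):
    """Render a plain-text schema context block for the requested tables."""
    lines = []
    for table in tables:
        lines.append(f"Table {table}:")
        for col, doc in COLUMN_DOCS.get(table, {}).items():
            lines.append(f"  - {col}: {doc}")
    return "\n".join(lines)
```

The point is only that the descriptions live somewhere structured and get injected per-question, rather than living in one giant static prompt.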

                    • gavinray 3 hours ago

                      Shameless plug: My org does this, and we deactivated our Slack server to dogfood it

                      https://promptql.io

                      By building Hasura [0], we already had the ability to generate data catalogs + metadata layer from DB's + API's so the foundational infra was there

                      [0] https://github.com/hasura/graphql-engine

                      • pwr1 4 hours ago

                        We tried something similar at a previous company — ended up with 3 different bots all answering slightly differently depending on which doc chunk they hit. The consistency problem is real.

                        Curious how you handle updates. Like if someone edits the source doc, does the bot just start returning different answers or is there a review step?

                        • cat-turner 28 minutes ago

                          Great read. Shows you what is possible

                          • truelson 4 hours ago

                            Just going to say it... no mention of handling the security aspects of this. Scary.

                            This is cool, I should say, but I would be really worried about the security aspects. Prompt injection here could be really painful.

                            • georgeburdell 4 hours ago

                              The article mentions that there’s an identification process and that at least some data has access control. What were you expecting?

                              • truelson 4 hours ago

                                You're wiring up a number of critical systems... and prompt injection here could be really bad. I worry about such systems with a single point of contact

                                • jedberg 3 hours ago

                                  Reading through it, I didn't see any mention of write access. It looks like the agent is strictly read-only with access controls.

                            • Animats 1 hour ago

                              Why do you still need the CEO? With this, Claude should be able to do the job. Maybe have the board look at a weekly summary and tweak policy.

                              • themafia 1 hour ago

                                So all your competitors can do exactly as good a job as you. What's to differentiate your offerings from literally anything else?

                                Do you think customers are _eager_ to do business with a CEO-less entity?

                              • xmprt 3 hours ago

                                > When a question touches restricted data — student PII, sensitive HR information — the agent doesn’t just refuse. It explains what it can’t access and proposes a safe reformulation. "I can’t show individual student names, but here’s the same analysis using anonymized IDs."

                                This part is scary. It implies that if I'm in a department that shouldn't have access to this data, the AI will still run the query for me and then do some post-processing to "anonymize" the data. This isn't how security is supposed to work... did we learn nothing from SQL injection?

                                • stephbook 1 hour ago

                                  I see two vectors here

                                  - The bot giving out PII by accident. You ignore it and report it.

                                  - You trying to fool the bot into giving you PII you're not supposed to have. But you've created an audit trail of your 100 failed prompt injections. The company fires you.

                                  This isn't public facing, open to anyone. This is more like a shared printer in the office.

                                  • thunfischbrot 3 hours ago

                                    In the strongest interpretation, it would offer only data which the user is allowed to access. Why do you assume that, having implemented a feature to prevent PII from being accessed, they then turn around and return data which the user is not supposed to access?

                                    • xmprt 3 hours ago

                                      If it's PII, the best thing for them to do is to not even allow the AI to have access to it. They're admitting to that, so I doubt they've gone through the effort of forwarding the user's auth token to the downstream database.

                                      And with security it's always best to assume the worst case (unless you're certain that something is safe) because that would lead you to add more safeguards rather than less.
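The "assume the worst, fail closed" stance argued above amounts to checking entitlements before a query ever runs, rather than running it and scrubbing results afterwards. A minimal sketch of that pattern, with invented role and table names:

```python
# Deny-before-query sketch: restricted tables are refused up front, so a
# prompt-injected query never reaches the database. Roles and table names
# are illustrative assumptions.

RESTRICTED_TABLES = {"students_pii": {"registrar"}, "hr_reviews": {"hr"}}

def authorize(user_roles, tables):
    """Return the tables the user may NOT query (empty list = allowed)."""
    denied = []
    for t in tables:
        allowed_roles = RESTRICTED_TABLES.get(t)
        if allowed_roles is not None and not (allowed_roles & set(user_roles)):
            denied.append(t)
    return denied

def run_query(user_roles, tables, execute):
    denied = authorize(user_roles, tables)
    if denied:
        # Fail closed: no post-hoc "anonymization" of data already fetched.
        raise PermissionError(f"access denied to: {', '.join(sorted(denied))}")
    return execute()
```

The contrast with the quoted behavior is that the refusal happens before data retrieval, not as post-processing on rows the agent already holds.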

                                      • adambb 50 minutes ago

                                        To be fair to them, the architecture description said that each datasource has a unique agent, so the orchestrator AI doesn't have direct data access, and that they specifically only allow access to data the user has permission for.

                                        It's unclear if each datasource agent is ALSO AI-based, though, in which case the same concern has just been pushed down the line one hop.
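The architecture as described (an orchestrator with no data access of its own, forwarding the user's identity to per-datasource agents that each enforce their own ACL) might look roughly like this sketch. The class names and routing rule are assumptions, not the article's actual code:

```python
# Orchestrator/per-datasource-agent sketch. The orchestrator holds no
# credentials; each datasource agent enforces its own access check against
# the forwarded user identity. Everything here is illustrative.

class DatasourceAgent:
    def __init__(self, name, acl_check):
        self.name = name
        self.acl_check = acl_check  # callable(user) -> bool

    def answer(self, user, question):
        if not self.acl_check(user):
            return f"[{self.name}] access denied for {user}"
        return f"[{self.name}] answer to: {question}"

class Orchestrator:
    """Routes questions to datasource agents; never touches data directly."""
    def __init__(self, agents):
        self.agents = agents

    def ask(self, user, datasource, question):
        return self.agents[datasource].answer(user, question)
```

If the datasource agents are themselves LLM-driven, as the comment worries, the ACL check has to live outside the model call for this to hold.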

                                  • meryll_dindin 1 day ago

                                    We're a 30-person ed-tech company. I built a Slack bot that connects our data warehouse, 250k Google Drive files, support tickets, and codebase so anyone on the team can ask it a question and get a sourced answer back. The bot took two and a half weeks to build; the data infrastructure under it took two years. Wrote up the architecture, where trust breaks down, and what I'd build first if starting over.

                                    • croemer 4 hours ago

                                      Is the result good? Is it useful?

                                      And why does your comment say you're a 30-person company but the title says 60?

                                      • jedberg 3 hours ago

                                        > And why does your comment say you're a 30-person company but the title says 60?

                                        AI hallucination? :)

                                      • bob1029 3 hours ago

                                        > The bot took two and a half weeks to build; the data infrastructure under it took two years.

                                        This is the key lesson that everyone needs to step back and pay attention to here. The data is still king. If you have a clean relational database that contains all of your enterprise's information, pointing a modern LLM (i.e., late 2025+) at it without any further guidance often yields very good outcomes. Outcomes that would have genuinely shocked me as recently as 6 months ago.

                                        I am finding that 100 tables exposed as 1 tool performs significantly better than 100 tables exposed as 10~100 tools. Any time you find yourself tempted to patch things with more system-prompt tokens or additional tools, you should push yourself to solve things in other ways. More targeted and detailed error feedback from existing tools often goes a lot further than additional lines of aggressively worded prose.

                                        I think one big fat SQL database is probably getting close to the best possible way to organize everything for an agent to consume. I am not going to die on any specific vendor's hill, but SQL in general is such a competent solution to the problem of incrementally revealing the domain knowledge to the agent. You can even incrementalize the schema description process itself by way of the system tables. Intentionally not providing a schema description tool/document/prompt seems to perform better with the latest models than the other way around.

                                        • rick1290 41 minutes ago

                                          Agreed. When I watch the LLM start to explore the db, it really does impress me.

                                          Can you expand on this:

                                          > You can even incrementalize the schema description process itself by way of the system tables. Intentionally not providing a schema description tool/document/prompt seems to perform better with the latest models than the other way around.

                                          • bob1029 11 minutes ago

                                            If you tell GPT5.x that there is a database it can query by calling ExecuteSql(query), but you don't bother explaining anything about the schema, it will try to figure things out ad hoc. This has advantages for the token budget, because it will tend to look up metadata only for the tables that seem relevant to the user's query.

                                            If you have a gigantic data warehouse with 1000+ tables, there's no way you could fit all of that info into a system prompt without completely jacking something up in the black box. So why bother trying?

                                            Consider that the user's specific request serves as an additional constraint that can be used to your advantage to dramatically reduce the search space. Building a single prompt / schema description that will magically work for all potential user requests is a cursed mission by comparison.
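The single-tool, discover-as-you-go approach described above can be sketched with SQLite standing in for a real warehouse (its `sqlite_master` table plays the role of a warehouse's system tables). The function name and error-feedback shape are assumptions for illustration:

```python
import sqlite3

# Sketch of the "one ExecuteSql tool" idea: the model gets a single entry
# point and discovers schema ad hoc through the database's own system
# tables, instead of a giant schema dump in the system prompt.

def execute_sql(conn, query):
    """Single tool surface; errors come back as text the model can react to."""
    try:
        cur = conn.execute(query)
        return {"ok": True, "rows": cur.fetchall()}
    except sqlite3.Error as e:
        # Targeted error feedback often beats more system-prompt prose.
        return {"ok": False, "error": str(e)}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")

# The agent's first move: look up only the tables relevant to the question.
schema = execute_sql(conn, "SELECT name, sql FROM sqlite_master WHERE type='table'")
```

A failed query returns the engine's own message ("no such table: ..."), which is exactly the kind of targeted feedback the comment argues for over prompt prose.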

                                          • operatingthetan 2 hours ago

                                            I think context windows are too small for an agent to actually do this properly yet. I have much smaller databases and with 1b context frontier models they still need reminders or nudging or come up with completely wrong stuff in response to basic queries.

                                            Having the c-levels relying on this for off-the-cuff info seems ... dangerous?

                                          • r1290 4 hours ago

                                            I just did the exact same thing for my company. I didn't do the SQLite approach for gdrive though, just a direct search.

                                            The one part that is still difficult is the data modeling and table-level descriptions, etc. Maybe you make an update to a table - remove a column, etc. The 3rd-party systems all have their schemas defined, but the data warehouse is a bit more loose, so solving that really helps. Did you just use the dbt schema to describe tables and columns, then sync that to your bot? How did you keep it updated?

                                            And at the end of the day - worth building or buying? Also, how did you track costs? I let users choose their model, but have learned it can get expensive fast. As I can see, there are a lot of providers trying to solve this one thing. That said, the data warehouse aspect is the loosely defined area, and I can see dbt or one of those players trying to build something.
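On the "sync dbt descriptions to the bot" question: one plausible approach (an assumption, not what the article describes) is to parse dbt's `target/manifest.json` artifact after each run and extract table and column descriptions, so docs update whenever the project is rebuilt. The tiny manifest below is a hand-written stand-in for the real file:

```python
import json

# Sketch: pull table/column descriptions out of a dbt manifest so the bot's
# context stays in sync with each dbt run. The manifest here is a minimal
# hand-written example, not real project output.

SAMPLE_MANIFEST = json.dumps({
    "nodes": {
        "model.analytics.fct_revenue": {
            "name": "fct_revenue",
            "description": "Daily revenue rollup.",
            "columns": {
                "amount_usd": {"description": "Gross revenue in USD."},
            },
        }
    }
})

def extract_docs(manifest_json):
    """Map model name -> {description, columns: {col: description}}."""
    docs = {}
    for node in json.loads(manifest_json)["nodes"].values():
        cols = {c: meta.get("description", "")
                for c, meta in node.get("columns", {}).items()}
        docs[node["name"]] = {"description": node.get("description", ""),
                              "columns": cols}
    return docs
```

Re-running this after each dbt build (e.g., in the same CI job) is one way to handle the "someone removed a column" case the comment raises.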

                                            • mannanj 3 hours ago

                                              Hi, thanks for sharing. One thing I'd like to know is: how often do you validate the answers? If a human gives an answer like the one the AI is giving, you'd probably expect a margin of error of about 1%. For the AI, though, is it 1% or less - and who's validating it? Are you trusting it more or less than a human?

                                            • jgalt212 3 hours ago

                                              She opens Slack, types a question to our internal agent: “Give me the names of all employees who have recently complained about my leadership”

                                              Prior to having such a product, it was such a chore for her to track down all the people who may have objected (dispassionately or otherwise) to my plans, strategies, objectives, etc.

                                              • ricktdotorg 4 hours ago

                                                is it just me or was the scrollbar purposefully hidden on this site? in chrome on windows, i found it very jarring and user-hostile to NOT know how far along i was in reading the article.

                                                i make a judgement call early on: is this worth my time? my whole article calculation algo was thrown off by this.

                                                do not like.

                                                • maxothex 4 hours ago

                                                  [dead]

                                                  • oliver236 3 hours ago

                                                    data engineering is all you need.

                                                    everything else is smoke

                                                    all ai applications are smoke and will be obsolete in a year

                                                    do not be deceived

                                                    • tclancy 3 hours ago

                                                      Nice to see you getting out now and then Rorschach.

                                                    • mritchie712 4 hours ago

                                                      > The data infrastructure underneath it took two years.

                                                      yep, that's what Definite is for: https://www.definite.app/

                                                      All the data infra (datalake + ELT/ETL + dashboards) you need in 5 minutes.

                                                      • RobRivera 4 hours ago

                                                        If I order now, do I get a second set for free?