(Someone deleted a comment about why you'd want a mobile Codex app. This is the answer I wrote.)
Once you've used these coding agents a lot, you develop a pretty intuitive feel for how they work, what they're capable of, what they're good at, and where their weaknesses are. Hopefully, you're already pretty familiar with the code base you're working on. Combining the two, this means you can get quite far essentially "vibe coding" (i.e. not looking at the actual code) on a new branch.
So if you have some idea or some issue you want to fix on the go, you just iterate with the agent for a bit (presumably no more than a couple hours) until the agent outputs an implementation. Here, I do claim there is some "skill" (which is a function of your codebase familiarity, general SWE ability, and facility with AI agents), and if you're good, this implementation will be halfway decent a high percentage of the time. Then when you're back at your desktop, you can review the changes carefully, do some proper testing/debugging, etc. But you've saved a good chunk of time: an initial draft is already waiting for you.
I am not sure I understand the time savings you're describing here. Do you mean you saved the "time to write prompt into the text input box" because you got to do that sooner from your phone rather than write down your idea and do it when you got back to your computer?
Wouldn't you be doing the exact same thing had you been sitting at your computer when you had the idea?
Perhaps the person who wrote that had the mindset of "when I am away from my work, I want to be disconnected and present with the world around me; this update now makes it so that I have an excuse to carry work with me"
Maybe they're in a toxic/abusive work relationship where taking breaks is already difficult and this might lead to justifying working from your phone as "expected"
My question to you is: what is wrong with moving a little slower? Is time to prompt an optimization of a real bottleneck?
You can use STT and include a workflow that automatically extracts the requirements (filters all the um's, ah's, pauses) and it becomes more like an interaction where you act as the Product Owner/Manager and Codex is your Architect/Dev.
At least, that's how I code through my phone. But it does require some forethought in establishing your automated workflows. I'm at the point where my entire dev system has established templates for CI/CD, so I can preview work in staging; production is still a manual step (obviously).
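A minimal sketch of the transcript-cleanup step described above. The filler list and the regex are my assumptions; a real workflow might hand the raw transcript to a small LLM instead:

```python
import re

# Assumed filler vocabulary; extend to taste.
FILLERS = {"um", "uh", "ah", "er"}

def clean_transcript(raw: str) -> str:
    """Drop filler words and the long pauses STT often renders as '...'."""
    text = re.sub(r"\.{2,}", " ", raw)  # collapse pause markers
    words = [w for w in text.split() if w.lower().strip(",.") not in FILLERS]
    return " ".join(words)

print(clean_transcript("um so... add uh retry logic to the upload path"))
# -> "so add retry logic to the upload path"
```

The cleaned text is what you'd hand to the agent as the actual requirement.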
Sure, I do that on the computer too. Computers have microphones these days, and STT runs on macOS as well. What was your point in regard to my comment? I'm not sure I understood you.
I was doing exactly this for a while with Claude Code. Very helpful when I'm away from home but can't stop thinking about my project. The remote agent has access to all the docs and instructions in my repo and most of the time gives me a decent draft I just need to polish later.
I unsubscribed from Claude after the performance regressions around the time of the Opus 4.7 update made it unusable. Been using Codex since then, and I've definitely missed being able to make these drafts. So I'm looking forward to trying this out.
For major new features on my SaaS this is exactly what I do on my phone/laptop, sometimes over days or weeks. I never look at the code until I get a feeling that it's far enough along, and then I hop into the actual code and start manually making changes, or use CC locally to make the changes iteratively over weeks until it's ready for release. In the early stages of a major new feature/product it can be counterproductive to closely monitor the AI. Of course, like you said in your comment, this requires very, very strong knowledge of the code base and a lot of experience with using the agents in the first place. But once you can do this sort of workflow it's very powerful, because you can do it in parallel with other work (just an hour or two per day over a week or two on your phone can get you to a really good first draft even on a major new feature/product. And of course I'm not saying it's ready for production, that can still take weeks, but that's not really the point)
No, the phone connects to your local device. This isn't "codex web" on mobile. Basically you work through your desktop on your phone. So to be clear, there are security risks (you can wipe your entire desktop from your phone).
You can run Codex Desktop on Linux. It's on AUR already. Granted, it's just a repacked ASAR from the Windows version, but it still works quite well.
Haven't tested connection to mobile yet but the integration with cloud environments already works.
For now it appears that it talks only to the Codex App. Some users in this thread are saying that apparently the Codex CLI will support it on the next official release.
Not sure about how it works with Codex now, but with Claude you can just start a terminal session of claude code with your code checked out locally on your computer, and then enable remote control which lets you control that session from your phone.
So basically, it is like you are typing on your terminal on your computer from your phone.
I tried Codex web. It kinda sucks and OpenAI doesn't seem to be promoting it? Look elsewhere if you want a Linux VM in the cloud. (I quite like exe.dev and they do have good mobile support.)
I mean I'd love for them to take it further. If you put me on the phone with a talented software engineer I could supervise all sorts of changes. I wish I could do the same thing with my coding agent. Being able to be like, "hey remind me what's in that database table ... got it okay let's rename it to ..."
I'm also completely fine if it gives me hold music while it's working.
I've been vibe-refactoring a fork of get-shit-done (a skill collection for coding agents) for about the past week. I've had to revisit the same ideas multiple times because the agent doesn't always get it right at first, but it's still so much faster than I could have been at the same work + it's already mostly working (I've been dogfooding it for a day or two now). And I have gotten by just bringing up issues I notice from the LLM's implementation comments, rather than actually inspecting the code even once so far.
Forgetting code exists is by definition not suitable for serious work. However, OP said in the following paragraph that this would be a first draft, and that the code would actually be reviewed and tested properly before being integrated.
At which point it is by definition no longer vibe coding, because you do care about the code! It's just an AI assisted workflow, but now we call all of those vibe coding for some reason. (Naming things is hard!)
If vibe coding means not caring about the code, then a literal translation of the term would be "not caring about coding" coding.
> Forgetting code exists is by definition not suitable for serious work
This is just like everyone who says, “An iPad is not suitable for serious work.”
By which they (and you) generally mean, “What I do is serious work. What you do is unserious work.”
I think I do serious work – I mean, they pay me for it? And for the past 12 months or so I have only copy/pasted and run whatever code's been generated by AI. Whenever I can, I just let the AI run it itself.
Sad to learn that I’ve been so unserious all this time.
What OP said works quite well for a lot of tasks, and if you've set up base instructions on coding style they (Codex, Claude) generate code accordingly.
A key point is that after the "vibe" session you should also have a lot of tests written, so you can easily refactor the code afterwards if there are major aspects you don't like when you get back to your desktop.
I find it funny, this trend of software engineers being shocked at the idea that someone would issue a set of instructions to a coder and not look at the code, or only glance at it.
How do you think the world has worked for the past thirty years? AI has just caught up with human skill is all.
Unbelievable. This is the silent de-skilling of this industry.
Imagine saying that you don't need to look at the road or have your hands on the wheel whilst driving because someone else said that the car can 'drive' itself; therefore, no need for anyone (including taxi drivers) to learn how to drive.
Just because a machine can generate plausible looking code does not mean you don't need to look at it or not know how it works or why it doesn't work.
What's crazier is that Codex is free. I thought I had to pay to even try it out, but nope: you can use the desktop app or CLI for free; it's apparently included in the free plan. You just have to sign in to your ChatGPT account.
Of course I am aware that the caveat here is that all my interaction is part of training, but I'm fine with that. Even Qwen CLI discontinued its free plan.
I switched some time after Anthropic bricked their models with adaptive thinking. It's a legit mystery to me how people are still using CC professionally.
Codex is far less frustrating and manages context better. It's also costing me about 1/3rd as much as Opus 4.7 on CC.
IME, based on an in-house bench, it's still good up to about 20% of the 1M context window for 4.6 and 4.7 with a code base >50k LOC. The trick I used before switching providers was to have it write a handoff when it hit ~18% of context, then reset.
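The handoff trigger in that workflow is just a token-budget check; a sketch using the ~18%-of-1M numbers above (the names are mine, not from any SDK):

```python
CONTEXT_WINDOW = 1_000_000  # the 1M-token window mentioned above
HANDOFF_AT = 0.18           # write a handoff before quality drops off near 20%

def should_handoff(tokens_used: int) -> bool:
    """True once the session has consumed the usable slice of the window."""
    return tokens_used / CONTEXT_WINDOW >= HANDOFF_AT

print(should_handoff(150_000))  # still under budget
print(should_handoff(185_000))  # time to write a handoff and reset
```

The point is that the threshold is deliberately below where quality degrades, so the handoff summary itself is still written by a model that's performing well.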
There are also many people running 4.5 with specific parameters that claim to be having luck.
5.5 is absolutely comparable to Opus 4.7 (both on highest effort), maybe even better. It generally seems less lazy, faster, and writes code closer to what I'd write. The only downside is that for very, very long tasks it can kind of lose track of the goal. For tasks under ten minutes I'll go with Codex every time.
The main difference is in the frontend skills. GPT produces terrible design. What I do these days is ask Opus to produce an HTML mockup, then feed it to Codex.
I have not had problems with long goals. I let it chomp for 40 minutes on a proof in my custom theorem prover (xhigh fast), and it got there. Very happy with Codex, I ditched Claude for it.
I stopped trying to use Claude to do anything with 4.7 because it sucks up so many tokens so quickly. I still use the 4.6 model and have switched to Codex for larger tasks. It also works better than Claude at more complex coding tasks for web apps that have Python backends and TypeScript front ends.
Compaction is basically seamless, which is a major weak point of Claude. At effort=low, Claude is better than Codex but still slower. If you don't mind trading upfront quality of work for additional micromanaging at a faster speed, it is fine. I also think that, for that very reason, you absorb more of the code.
I was really unimpressed by the free Codex (for nodejs/react dev). I think it must be using a less powerful model or they’re limiting it in some other way.
Are you specifically pointing at a different experience between free + paid? Or just that the free version is unimpressive?
I'm using paid on TypeScript and it's genuinely terrific. Subjectively I think it has the edge over Opus.
I'd be surprised if OpenAI is hamstringing the free version. That would seem crazy from a GTM PoV. If anything the labs seem to throttle the heavy paid users.
I stopped using my Claude subscription because it became so prohibitive. Back to ChatGPT and Codex full time and been pretty happy. I miss the tone/writing style of Claude, but don't miss the frustration of being told I've reached my plan limits in a comically short amount of time.
Using these prompts/steering[0], setting Base style to Friendly, Warm to More, Enthusiastic to Default, Headers, Lists, and Emoji to Less, I have found I can get gpt-5.5 about ... 80% of the way there to writing as non-annoyingly as Claude. And it's so much faster and has such higher limits that that's worth it for me.
I also put together this ridiculous thing[1] because I missed the font and color scheme of Claude.
Some of it is in my customized instructions, some of it I fed pieces in at a time saying "remember this please:" so it goes into Memories.
I'm not entirely clear on the mechanism by which memories make it into context, so it's possible some of it isn't all the time, but it does seem to be working reasonably.
Again, it's not as good as Claude when it comes to writing "not like an AI". But it's significantly better than it was.
FYI I'm actively working on aimpostor, so check back in a couple days for some quality improvements. (I'm definitely not going to bother with a Sparkle updater or anything like that.)
On Codex I ran into limits maybe 2 times in 3 months, after doing several "upgrade this experimental game to my latest shared framework" passes on 5.5 Extra High.
I can go through a 5-hour limit with a $20/mo Plus subscription in a few minutes with 5.5 Extra High. This causes me to reserve the latest/best rev for the harder problems.
5.5 really does seem to be very superior to 5.4, but it's also very expensive to run: the gas gauge moves fast. It's not clear whether 5.5 will cost less to get a problem solved quickly, or whether a bunch of automatic iterations of 5.4 will solve it less expensively. Both are often frustrating to me on the $20 plan.
Most of the commits from the last few months are thanks to Codex reviews (but the code is not AI generated): 5.5 since it came out, and 5.4 etc. before that, almost always on Extra High, because it's for a framework that underlies the other stuff I do, so I want to make sure everything's correct.
Sometimes I have to run multiple passes on the same task: I rarely continue any session beyond 4-5 prompts, to avoid "bloat" or accumulating "stale context", so sometimes Codex finds different stuff in subsequent reviews of the same file/subsystem.
The project is modular enough that each file can be considered standalone with only 1-2 dependencies, and I already used to write a lot of comments everywhere (something some people laughed at), so maybe that helps the AI along?
I'm taking this, along with my own experience, to mean that the GPTs are cheaper to use for refactors of an existing body of work than they are for creating a new one.
(And perhaps part of that is in the name? These "LLM" contraptions are very good at translation, after all. And tokens seem to relate more to concepts than to specific phrases or words.)
That's the current state of the $20 Claude plan, despite them twice this week announcing better usage: first "double 5-hour usage", then 50% more overall usage per week.
MAYBE the 50% overall is true, but the doubled usage during a 5-hour window I just don't see at all. I've maxed out three 5-hour windows since this happened, and there's 0% chance it was double the normal amount; I ate up about 4-5% of my weekly total each time (this was ~10% each time pre-announcement). I wish I could give token numbers, but it's obscured; I just know it was around 120k of 4.6 with some delegation to Sonnet subagents.
So SURE, it's almost certainly more allotted weekly, but if those totals are consistent for 5-hour blocks, you've got to split your daily usage into at least three sessions with 5 hours between them to even hit that weekly limit. It's unreal how much they have burned their good reputation in a 2-month stretch, and I'm positive it's also being astroturfed by bots more than happy to advance the narrative.
The internet is annoying; these tools are overall cool. I just wish Anthropic would go back to being semi-predictable.
I’ve been using Codex from my phone for the past couple of months (through a tunnel, not this app).
I was initially quite excited, but I’ve found the results are less than great compared to being at a keyboard.
Something about the smaller screen size and/or lack of keyboard causes me to direct the agent less, which in turn creates more tech debt/code churn/etc.
Maybe I’m just showing my age, and I should practice voice dictation or something more, but my thoughts flow faster and more clearly on a keyboard (less ums).
It's not that I'm unimpressed by the results, it's that I think I'm saving time by pushing the agent along remotely, but the reality is that my messages to the agent(s) end up being a lot shorter, which inevitably leaves more up for interpretation.
Don't get me wrong, I still use Codex (and sometimes Claude Code) remotely every day, and am overall excited for this release, it's just that the benefit wasn't as high as I had initially hoped.
Part of this is due to the models getting better (no need to prod along with "continue"), and part of this is the nature of how I use my phone (short bursts of attention).
But again, maybe I'm just old and prefer big screens with a keyboard.
Just... write longer messages. Maybe it is age, but I've written huge forum posts, such as on HN, all from my phone, often with multiple tabs open to source various links for footnotes. When I type for an LLM, I'll type a lot too if needed, and will often even type a little, wait to think, then continue, over the course of maybe 15 minutes, so that the intention of the prompt is correct, since that saves much more time and produces better results than shorter messages.
I think you just need to type more rather than feeling constricted, as it's actually a form of liberation, to produce (or have an AI produce, whatever) something from wherever you are rather than needing to sit down on a laptop where you're gonna be waiting around anyway.
What tunnel setup do you use by the way? I'm on Android so it's kind of annoying all the LLM remote coding apps are iOS only.
Oh, I agree completely. I avoid loose language, revise my wording, and usually write prompts that require scrolling on mobile.
It isn’t so much that I feel restricted, I guess it’s that mobile wasn’t as big of a game changer as it was ~6 months ago.
My bandwidth feels more restricted by my own cognitive capacity (usually due to context switching) than by the limits of the model itself, and the mobile interface makes that worse.
I’ve recently found myself reserving larger tasks for “keyboard time” and reverting my thinking back to notes (in mobile), which I’ll then formulate to the LLM at some future time.
> What tunnel setup do you use by the way?
I “vibecoded” an agentic runtime that operates my machine generally (including TUIs like Codex/Claude Code), which I connect through a custom proxy and mobile app (both also vibecoded).
I previously tried Cloudflare Tunnels and an SSH setup, but it all felt a bit hacky.
Unfortunately the app is iOS only, but I could open source it and you’d probably be able to make an Android clone quickly (:
That could be cool, no issues with Claude Code not working in third party harnesses or their recent changes about different (more expensive) billing for programmatic usage? I guess I generally use OpenAI models which don't care.
I'm not sure about others but I can't coherently voice my thoughts through speech alone as I'd want to think and revise the message, so I generally don't use voice transcription with AI.
I've been coding on Android for a few months, mostly while walking around outside or showering. I'm on a mix of Tailscale + Termux + ssh server + tmux + codex CLI, Tailscale is great.
I think you may be able to optimize your workflow more by drafting your prompt in ChatGPT first; get it to expand out the intent for you. Doing that has made phone coding a lot more tolerable for me.
I like to think that I've given phone coding a fair shot (and I continue to do it), but I agree with the other poster that there's something about the lack of a keyboard that really gets to me :) I wish I knew what it was.
I was thinking about this, don't you think having everything in a terminal on a phone screen is a bit clunky to type in? Ideally I'd want it to be a clone of the official iOS apps from these AI companies with good mobile features like copy paste, smooth scroll, etc. Looks like the article's ChatGPT Codex functionality is in Android as well.
Nice to see some catch-up with Anthropic and others, but this doesn't actually offer cloud coding agents (my favorite Claude and Cursor capability).
I want to code from my mobile device when my laptop is off or unavailable, pushing PRs directly to GitHub. Codex mobile only works with a desktop machine, at which point I'd just use that machine directly, so what's the point?
The ability to unblock or redirect longer-running work from a phone seems underrated. Curious how often people will actually manage active coding threads this way.
Dang, I thought this was going to be integration for Codex Cloud, not the (still not available for Linux) Codex App. Not even Codex CLI, alas. You can still access the Cloud option from a mobile browser well enough but I prefer an app UI for poking at the things on the go.
You can do this from the CLI - `codex remote-control` works on Linux (I have no affiliation, just something I noticed).
They might just not have cut a new build yet today. It 'works' on master, but the mobile app thinks your build is outdated (v0.0.0) if you build from master without overriding the version, so it's probably easiest to wait until they cut a build if they haven't.
> You can do this from the CLI - `codex remote-control` works on Linux (I have no affiliation, just something I noticed).
Woah, hadn't seen this before!
Off-topic: what kind of compile times do people get for codex-rs in openai/codex? Even my very beefy computer takes like 30 minutes to compile in release mode, which makes me wonder why it's so slow and how this TUI got so large. But then I remember: agents like to write a lot of code, and compilers get slower when they have to compile a lot of code :)
Try turning off LTO. Their default codex-rs/Cargo.toml uses `lto = "fat"`, which is... expensive and slow and... you really really don't need it for a local build that you're not distributing.
In my experience, although the build is a little slow, it's that LTO step that takes a million years.
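Concretely, the local-only override being suggested here is a one-line profile change (assuming you're building from a checkout of openai/codex; don't commit this):

```toml
# codex-rs/Cargo.toml -- local tweak to speed up release builds
[profile.release]
lto = false   # the repo default is "fat"; "thin" is a middle ground
```

Fat LTO only pays off for distributed binaries, so a local build loses essentially nothing by turning it off.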
GPT is unusable, untrustworthy, and unreliable now [May 2026]. South Park poked fun at GPT calling everything a brilliant idea, and the retaliation by "Open"AI was to nerf any response that didn't denigrate the user and inflate the LLM.
Is there a native way to work remotely with Claude/Codex on a local folder or git repo on your main machine without having to connect it to GitHub? For creating apps for personal use I’d rather just keep the files local.
Edit: Running into issues setting it up on Windows. There's no "/remote-control" command in the CLI, so I installed the Windows Codex app. Then I updated the iOS app which now has the "Codex" feature in the sidebar, which should allow remote access to the Windows machine's instance - except it doesn't connect. The iOS app shows my desktop's hostname, so it knows there's an instance there, but refuses to connect. Issues like this would persuade a lot of folks to switch back to Claude.
You can also connect remotely. Tailscale to connect to your network/machine. Then use SSH to login. Then use tmux to persist the session even if you log out.
I ask because I tried the other week to use /remote-control in Claude, and it prompted to connect a Github repo with no local alternative. Things may have changed since then.
My experience today with the new Codex remote control has been that it doesn't connect at all.
I tried apps that do this workflow (happy coder being one), but the workflow itself is rather clunky. You have to first start the session inside the remote machine. I now only do ssh, I can start or resume on whatever device suits at the time. The only downside is latency and connection drops, mosh solves it.
The right way to do this is Google Jules: the boundary is a git repository, the interface is a chat window you can open anywhere (yes, even simultaneously), the output is a diff you can choose to merge.
But, for whatever reason, no one uses Google Jules.
I don't want my phone to have the ability to execute things on my computer. Much less with a LLM in-between!
Just tried it and it doesn't work... won't let me create a new task, as the repo selection is disabled... works fine on my laptop, on the other hand, and I've been using it there for some time.
This is extremely what I've been wanting -- I had previously thought about using one of the hackish apps that try to deliver this experience, or spinning up something for this myself, but integrating this directly is definitely the right way to provide the best system and product experience -- and this seems to work out of the box exactly as I would want!
This is good not only because I can work on code on the go. Codex excels not just at that but also at crunching through text. It's nice that now I can get an agent on my phone that understands my notes in Logseq. It's like my journals can now talk back to me.
Surely you mean grab a coffee and sit back down at your desk in your corporate office, because working remotely while your agent also does so is just preposterous.
I wish they'd have done this in a separate Codex app. On desktop I greatly prefer having Codex separate from ChatGPT... As compared to Claude, which is growing so fast and adding features so quickly it seems bolted together (I get why they do it, integrations/MCP-wise).
This specific feature is more akin to Remote Control in Claude. You could already kick off Codex Cloud tasks (although it's just a little more fiddly to do so).
If you can move to Codex Cloud (or "Claude Code for the Web"), I think it's the superior approach. Start it there, and just pick it up from the PR if necessary.
A while ago I created a Telegram bridge for AWS Kiro CLI; this allows me to talk to the agent running on my server from anywhere. Any remote access to any of these agents is a massive game changer: it means you don't need to hover in front of the PC while it works away at the problems. It changes your workflow, but I do find you need to force yourself to "turn off". It's easy to do that with the PC, e.g. just walk away, but when you can "get the agent to do one more change" while waiting to pick the kids up or taking the dog for a walk, it can get difficult to stop.
This is a very myopic and unnecessarily cynical sentiment. It's not about you: agents just need to run without your computer being on all the time. Coding is a background task that needs to run unattended now.
This is absolutely not true. I run dozens of Opus agents all day and they need so much constant attention and babysitting (lest everything turn to sh*t) that I would not qualify it anywhere close to “background”. And I’m sure as hell not wrangling these things from my *phone*.
I've been trying out various mobile, ai-assisted coding workflows.
Packing a Linux mini-pc in my rucksack, connected to display glasses, and voice-to-text with handy. Voice to text gets injected into a remote (Docker) codex session, running a hot reload web stack. I prompt to implement various features in an existing code base, where codex understands the structure and requirements.
If a feature is done, I take a moment to inspect the results on the display glasses, then move on to the next feature or keep iterating. It's not perfect, but I was able to implement a couple of not-too-complex features while walking in my local national park.
The display glasses have a built-in 4-microphone array, and solid speakers. No need for a bulky headset or earbuds. Glasses come with monochromatic dimming, you can easily switch between dimming and see through.
If this comes with Linux integration, I will certainly give it a try.
I have been using Omnara for some months now, on desktop and mobile. It's a web/mobile remote for Claude and Codex.
I can do some tasks on mobile, especially if they are follow up and steering only, greatly increasing productivity as you can keep working whilst in transit, etc.
This is the EXACT evolution of this product that I've wanted. For simple tasks on some of my desktop machines, I don't want to mess with SSH or remoting into them, I just want to tell an AI agent what I need, let it build a plan that I press "Approve" on, and let it rip. This is the ClawdBot killer!
Nice. The next step is giving Codex/Claude Code local device control... The problem is that current iOS/Android are so locked down that agents can't do much, but the space is so ripe for disruption that I bet we'll see AI-native devices coming out within the next few years that allow agents to interact with everything. I would be nervous if I were Apple right now.
Now how would Apple get that sweet Mac money if you could do everything from your iPad? And that's exactly why they artificially segment those devices.
Wondering if it's only me who vibe-coded a PWA mobile IDE and a remote agent hosted on the laptop, which uses claude -p and local code to allow coding via mobile?
Somewhat related, some of these AI remote coding apps are iOS, what are people using for Android? Looks like some people are using terminal emulators to ssh into their machine and use the LLM CLIs but that seems clunky.
This is neat! Now I'm curious, what's left to innovate in the coding agent space? Sure there are the usual suspects like maintenance, security, reliability and other scalability improvements and looks like they will be addressed in the next year or two.
The entire UI/UX? We went back in time and basically have a text streamer in a 70's style terminal or existing editor-like situation. If you want to read and (hand)write code, sure, you might be done and be happy with the new variant of what you had decades ago.
A UI like Jira/Trello to stage features and see (agentic)team status. A Figma-like UX to actually build out the app/interface/features. A system that aids human review. There's tons of paradigms to explore and improve upon.
There is something "wrong" with the UX that is hard to pin down. These things generate even text summaries more rapidly than I can read them. I need a better method for dumping info into my brain, plus dynamic control (if necessary).
When I take time to read all of the output, I often find that it's mostly noise. I don't like noise so I usually don't bother.
But a person can use subagents, if they want, to filter that down. This burns tokens in a big hurry, but I think subagents can be arbitrary local commands (eg, a local LLM).
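A sketch of that "arbitrary local command as a filter" idea: pipe the noisy agent output through whatever command you like (here plain grep stands in for a local LLM; the function name is mine):

```python
import subprocess

def filter_agent_output(raw: str, command: list[str]) -> str:
    """Pipe noisy agent output through an arbitrary local command
    (a local LLM, grep, anything on PATH) and return what survives."""
    result = subprocess.run(command, input=raw, capture_output=True,
                            text=True, check=True)
    return result.stdout

# Trivial stand-in filter: keep only the lines that mention errors.
noisy = "checked 14 files\nERROR: missing import in app.py\nall tests green\n"
print(filter_agent_output(noisy, ["grep", "ERROR"]), end="")
```

Swapping the command for a local summarizer gives the same shape of pipeline without burning hosted tokens.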
Or, you know: Just slow down. :) It doesn't always have to be a race, does it?
Agent farms. Have agents make tons of random high fidelity variations around the clock of the same app or feature from some vague ideas, and you use each of them to see which one you like best and can productize, and you skip the need to do iterative prompts.
Is texting your Coding Agent really the final form? Something that watches your interactions or process execution to surface improvements, or whips up prototypes while you brainstorm seems like the next step.
Not sure why this was flagged, this makes sense but only if inference gets sufficiently cheap. It would be awesome to see a bunch of interactive prototypes and iterate on the UX before ever building the full app. Historically that's been somewhat difficult even with UX designers.
It's refreshing that unlike Anthropic's Remote Control, this actually... works.
Feels like a testament to the value in taking time and doing it properly.
Now if only codex got its 1M token context window back.
---
Edit: Hmmm. Maybe I spoke too soon. Sigh. Definitely _more_ reliable by far overall, but still have queued messages with responses on my phone that don't show up on my computer, and responses that don't show up on my phone.
Edit 2: New threads created from my phone seem to have a little stall-out, but ones that are underway are behaving reasonably well.
Out of curiosity, what issues did you face with remote control on claude? I use it daily and it seems to work pretty well (bar the issues when my Mac would sleep and then the session would disconnect, but that's an issue on my end).
Myriad, to be honest. I find it to just constantly be in a 'torn' state, the UI is very mushy on mobile with a lot of the affordances from desktop missing, and... it's distinctly less useful when you can't... edit, rewind, start a new thread, etc.
That was my thought too. Claude Code and Codex are very close to Claw already (general purpose computer use) and moving increasingly in that direction (mobile integrations, built in memory features etc.)
The main issue is reliability, so I think the corporations are going to take a much more gradual, piecemeal approach, and probably end up with something like Claw within a year.
i'm not sure if i'm hallucinating, but i swear i had codex in the chatGPT app a long time ago (like the original codex on the web).
they added some new stuff, like remote control to wherever the desktop codex app is running, but these companies need to work much more on their press releases.
This sucks. Codex was already in the mobile app. And Codex in the browser or in the app sucks because it's not the same as local Codex (VS Code or CLI). And you can't pick the model. Sucks a$$.
What if you don't start on your laptop or workstation? Also, does the UI shown in the video reveal a model toggle? Doesn't look like it
Edit: actually it gets worse -- you can't start any tasks any longer in your mobile, you are required to sign in from your desktop/workstation. You can't "sign in" from your CLI.
Whoever made this, paper cuts I wish onto you. It sucks.
Edit2: actually you can ssh, so I presume that allows you to use the CLI -- and you can still do mobile-first tasks, but it's not intuitive at all. With mobile-first tasks you still can't pick the model, and I haven't tested the workstation-connection one. That said, indeed paper cuts.
For many people, that's exactly why this is useful: less time on the computer, more time doing other things and occasionally checking in.
In those scenarios, the goal is not "work at any time" but to "be anywhere at any time", or, rather, to "be able to work from anywhere, doing anything".
Codex has been great in the last 3-4 months I've been using it, almost exclusively to review existing GDScript code, and this was the feature I wanted most, because with gamedev you get the best ideas when you're out and about or in bed :)
Claude on the other hand has been jank all around from the UX to the UI to the AI itself that it's baffling how it's more popular here on HN: https://i.imgur.com/jYawPDY.png
Sadly this remote control feature doesn't seem to be for Mac to Mac yet? I love the MacBook Neo as a "thin client" for AI and keep the MacBook Pro at home/hotel, and it would be nice to share Codex desktop sessions (without SSH → resume link)
Say what you want about OpenAI, but their software is actually pretty damn good, especially compared to Anthropic and Google. Anthropic is just sloppy, and Google just doesn't live on this planet.
Both of the Codex apps are very good.
I tried this out and it works significantly better than Claude's remote control. In fact, the first few times I tried Claude's remote control it didn't even work, and to this day it's very buggy.
I use remote control every day and haven't had many issues with it, aside from the mobile app being pretty limited: e.g., there are no prompt suggestions like slash commands and skills, and everything in the textbox is just a raw string. You also can't start a new session directly from the app (you have to SSH into the host manually to do that).
Other than those limitations, the connection has been very stable for me, definitely more reliable than alternatives like happy.engineering or Omnara. What’s been buggy for you specifically?
> Start investigating a bug while waiting for your coffee.
So the whole idea is not to make work more efficient. It's just to make you work more, all the time, while waiting for your coffee, while in your commute.
Can someone explain how the ChatGPT Codex Connector works in concert with GitHub access controls? I am not sure how to add it to my GitHub repositories, accounts, or organizations without potentially giving any OpenAI customer access.
I don't like this direction. For accessibility aspect, sure it is good. But Codex is a coding product. I am increasingly concerned of lack of reviewing practice. I doubt that a mobile app is good for reviewing code changes.
> Stay connected to active work from anywhere
... (and anytime because it's on your phone). No thanks.
opencode behind an nginx proxy with standard user/password auth is sufficiently powerful. You can also upgrade to https://docs.linuxserver.io/images/docker-code-server/ and run any vscode plugins; opencode's plugin is pretty rudimentary, but cline has been making a lot of strides.
You can run your local LLM and just connect the docker containers. I'm paranoid about being disconnected from the LLM, so I never run any of this on the same machine, which makes it important to orchestrate a docker-compose file that provides the necessary services.
I'm still trying to find a good remote file system to loop into the setup for improved switching between cli and these web containers.
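For anyone curious, the nginx side of that setup looks something like this. This is a minimal sketch: the hostname, upstream port, and file paths are placeholders, not taken from the commenter's actual config.

```nginx
# Reverse proxy with HTTP basic auth in front of opencode's web UI.
server {
    listen 443 ssl;
    server_name code.example.com;          # placeholder hostname

    auth_basic           "restricted";
    auth_basic_user_file /etc/nginx/.htpasswd;  # created with `htpasswd -c`

    location / {
        proxy_pass http://127.0.0.1:8080;  # assumed opencode listen port
        proxy_http_version 1.1;
        # WebSocket upgrade headers, needed for interactive terminal UIs
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}
```

The same pattern works in front of code-server; basic auth over TLS is crude but enough to keep drive-by scanners out of a personal setup.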
They needed to announce something after the Anthropic slop rewrite of Bun.
In an ideal world they would allocate 50% of compute to finding errors in that rewrite and publish how bad Claude is, but that would undermine confidence in slop in general, so that is not going to happen.
The best way I've found to work with LLMs is another OpenAI project, Symphony (which I implemented for Linear/GitHub and OpenCode[0]).
It integrates with your issue tracker and makes the tracker the UI for the LLM. It also clones the repo for every ticket, and can set up fixtures/etc. I can work on multiple items at a time, which is fantastic because otherwise you have to wait for the LLMs a lot.
(Someone deleted a comment about why you'd want a mobile Codex app. This is the answer I wrote.)
Once you've used these coding agents a lot, you develop a pretty intuitive feel for how they work, what they're capable of, what they're good at, and where their weaknesses are. Hopefully, you're already pretty familiar with the code base you're working on. Combining the two, this means you can get quite far essentially "vibe coding" (i.e. not looking at the actual code) on a new branch.
So if you have some idea or some issue you want to fix on the go, you just iterate with the agent for a bit (presumably no more than a couple hours) until the agent outputs an implementation. Here, I do claim there is some "skill" (which is a function of your codebase familiarity, general SWE ability, and facility with AI agents), and if you're good, this implementation will be halfway decent a high percentage of the time. Then when you're back at your desktop, you can review the changes carefully, do some proper testing/debugging, etc. But you've saved a good chunk of time: an initial draft is already waiting for you.
I am not sure I understand the time savings you're describing here. Do you mean you saved the "time to write prompt into the text input box" because you got to do that sooner from your phone rather than write down your idea and do it when you got back to your computer?
Wouldn't you be doing the exact same thing had you been sitting at your computer when you had the idea?
Perhaps the person who wrote that had the mindset of "when I am away from my work, I want to be disconnected and present with the world around me; this update now makes it so that I have an excuse to carry work with me."
Maybe they're in a toxic/abusive work relationship where taking breaks is already difficult and this might lead to justifying working from your phone as "expected"
My question to you is: what is wrong with moving a little slower? Is time to prompt an optimization of a real bottleneck?
You can use STT and include a workflow that automatically extracts the requirements (filtering out all the ums, ahs, and pauses), and it becomes more like an interaction where you act as the Product Owner/Manager and Codex is your Architect/Dev.
At least, that's how I code through my phone. But it does require some forethought in establishing your automated workflows. I'm at the point where my entire dev system has established templates for CI/CD so I can preview work in staging and production is still a manual step (obviously).
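A toy version of the filler-filtering step looks something like this. This is a sketch of my own, not the commenter's actual workflow; a real pipeline would likely have the model itself extract the requirements, but even a dumb pre-pass cleans up STT output noticeably.

```python
# Naive filler-word filter for speech-to-text transcripts, run before
# handing the text to a coding agent as a prompt.
FILLERS = {"um", "uh", "ah", "er", "hmm"}

def clean_transcript(text: str) -> str:
    words = text.split()
    # drop filler words, ignoring case and trailing punctuation
    kept = [w for w in words if w.lower().strip(",.") not in FILLERS]
    # collapse immediate repeats ("the the") that STT often produces
    deduped = [w for i, w in enumerate(kept)
               if i == 0 or w.lower() != kept[i - 1].lower()]
    return " ".join(deduped)

print(clean_transcript("Um, so uh we need the the login page to remember the user"))
# → so we need the login page to remember the user
```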
Sure, I too do that on the computer. Computers have microphones these days, and STT runs on my macOS as well. What was your point about in regards to my comment? I am not sure I understood you.
I was doing exactly this for a while with Claude Code. Very helpful when I'm away from home but can't stop thinking about my project. The remote agent has access to all the docs and instructions in my repo and most of the time gives me a decent draft I just need to polish later.
I unsubscribed from Claude after the performance regressions around the time of the Opus 4.7 update made it unusable. Been using Codex since then, and I've definitely missed being able to make these drafts. So I'm looking forward to trying this out.
For major new features on my SaaS this is exactly what I do on my phone/laptop, sometimes over days or weeks. I never look at the code until I get a feeling that it's far enough along, and then I hop into the actual code and start manually making changes, or use CC locally to make the changes iteratively over weeks until it's ready for release. In the early stages of a major new feature/product it can be counterproductive to closely monitor the AI. Of course, like you said in your comment, this requires very strong knowledge of the code base and a lot of experience with using the agents in the first place. But once you can do this sort of workflow it's very powerful, because you can do it in parallel with other work: just an hour or two per day over a week or two on your phone can get you to a really good first draft, even on a major new feature/product. (And of course I'm not saying it's ready for production; that can still take weeks, but that's not really the point.)
So, the same thing we've all been doing already with Termius and Tailscale, just locked into ChatGPT?
But what if the code is on my laptop? Alongside the tools needed to work with it
Case in point, I have a Rust project with a target/ directory with about 10GB. Compile times from scratch takes about 10 minutes. (I do not love this)
With this mobile app I need to upload the code to the cloud, right? Or does OpenAI expect me to compile huge projects on my phone?
No, the phone connects to your local device. This isn't "codex web" on mobile. Basically you work through your desktop on your phone. So to be clear, there are security risks (you can wipe your entire desktop from your phone).
Not if you use Linux; app not available yet.
You can run Codex Desktop on Linux. It's on the AUR already. Granted, it's just a repacked ASAR from the Windows version, but it still works quite well. Haven't tested the connection to mobile yet, but the integration with cloud environments already works.
The announcement doesn't make this very clear, but I think this talks to the Codex CLI, not the Codex App? (Or possibly both)
For now it appears that it talks only to the Codex App. Some users in this thread are saying that apparently the Codex CLI will support it on the next official release.
Codex App can connect to codex-cli via ssh. So you can use Codex App on a Mac with your projects/compilation etc. on Linux
I just use tailscale and remote desktop.
Not sure about how it works with Codex now, but with Claude you can just start a terminal session of claude code with your code checked out locally on your computer, and then enable remote control which lets you control that session from your phone.
So basically, it is like you are typing on your terminal on your computer from your phone.
basically yeah. the codex/claude desktop apps work the same way
I tried Codex web. It kinda sucks and OpenAI doesn't seem to be promoting it? Look elsewhere if you want a Linux VM in the cloud. (I quite like exe.dev and they do have good mobile support.)
It's beyond terrible. Like they're routing to gpt4o mini with low effort behind the scenes. Just let us pick the model and the effort.
The processes you're controlling are on your computer, similarly to Claude remote control.
I mean I'd love for them to take it further. If you put me on the phone with a talented software engineer I could supervise all sorts of changes. I wish I could do the same thing with my coding agent. Being able to be like, "hey remind me what's in that database table ... got it okay let's rename it to ..."
I'm also completely fine if it gives me hold music while it's working.
Would make my walks much more productive.
I've been vibe-refactoring a fork of get-shit-done (a skill collection for coding agents) for about the past week. I've had to revisit the same ideas multiple times because the agent doesn't always get it right at first, but it's still so much faster than I could have been at the same work + it's already mostly working (I've been dogfooding it for a day or two now). And I have gotten by just bringing up issues I notice from the LLM's implementation comments, rather than actually inspecting the code even once so far.
(The refactor's been to support Jujutsu VCS.)
> i.e. not looking at the actual code
You must be kidding me.
> There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists.
https://x.com/karpathy/status/1886192184808149383
Forgetting code exists is by definition not suitable for serious work. However, OP said in the following paragraph that this would be a first draft, and that the code would actually be reviewed and tested properly before being integrated.
At which point it is by definition no longer vibe coding, because you do care about the code! It's just an AI assisted workflow, but now we call all of those vibe coding for some reason. (Naming things is hard!)
If vibe coding means not caring about the code, then a literal translation of the term would be "not caring about coding" coding.
> Forgetting code exists is by definition not suitable for serious work
This is just like everyone who says, “An iPad is not suitable for serious work.”
By which they (and you) generally mean, “What I do is serious work. What you do is unserious work.”
I think I do serious work – I mean they pay me for it? And I have only copy/pasted and just run whatever code’s been generated by AI for the past 12 months or so. Whenever I can I just let the AI run it itself.
Sad to learn that I’ve been so unserious all this time.
> Naming things is hard!
Indeed.
What OP said works quite well for a lot of tasks, and if you've set up base instructions on coding style they (Codex, Claude) generate code accordingly.
A key point is that after the "vibe" session you should also have a lot of tests written. So they can easily refactor the code afterwards if there are major aspects you don't like when you get back to your desktop.
I find funny the trend of software engineers being shocked at the idea that someone would issue a set of instructions to a coder and not look at the code, or only glance at it.
How do you think the world has worked for the past thirty years? AI has just caught up with human skill is all.
Unbelievable. This is the silent de-skilling of this industry.
Imagine saying that you don't need to look at the road or keep your hands on the wheel while driving because someone else said the car can 'drive' itself; therefore, no need for anyone (including taxi drivers) to learn how to drive.
Just because a machine can generate plausible looking code does not mean you don't need to look at it or not know how it works or why it doesn't work.
What's crazier is that Codex is free. I thought I had to pay to even try it out, but nope: you can use the desktop app or CLI for free; it's apparently included in the free plan. You just have to sign in to your ChatGPT account.
Of course I am aware that the caveat here is that all my interaction is part of training, but I'm fine with that. Even Qwen CLI discontinued its free plan.
First hit is free… got to get you hooked.
How much better is it than Claude? I have both but Claude sucks up so many tokens.
I found it actually thinks about architecture and tests, and doesn't just spit out code with TODOs in it like Claude.
I switched some time after Anthropic bricked their models with adaptive thinking. It's a legit mystery to me how people are still using CC professionally.
Codex is far less frustrating and manages context better. It's also costing me about 1/3rd as much as Opus 4.7 on CC.
The only way to keep using CC for me has been to stick to 4.6 1M
Oh I didn't know you could type /model claude-opus-4-6 and still use it.
Thanks!
Yes, and /model claude-opus-4-6[1m] gets you the larger context window. Happy to help :)
Thanks for the hint, but is a large context window actually that useful? I tend to get garbage too often with a normal big context window.
IME, based on an in-house bench it's still good to about 20% on the 1M for 4.6 and 4.7 with a code base >50k loc. The trick I used before switching providers was to have it write a handoff when it hit ~18% of context and reset.
There are also many people running 4.5 with specific parameters that claim to be having luck.
5.5 is absolutely comparable to opus 4.7 (both on highest effort), maybe even better. It generally seems less lazy, faster, and writes code closer to what I'd write. The only downside is that for very very long tasks, it can kind of lose track of the goal. For tasks under ten minutes I'll go with codex every time.
The main difference is in the frontend skills. GPT produces terrible design. What I do these days is ask Opus to produce an HTML mockup, then feed it to Codex.
I have not had problems with long goals. I let it chomp for 40 minutes on a proof in my custom theorem prover (xhigh fast), and it got there. Very happy with Codex, I ditched Claude for it.
They've added a new goal mode that might help with that
I stopped trying to use Claude to do anything with 4.7 because it sucks up so many tokens so quickly. I use the 4.6 model still and have switched to Codex for larger tasks. It also works better at more complex coding tasks than Claude for web apps that have python backends and typescript front ends.
Compaction is basically seamless, which is a major weak point of Claude. At effort=low, Claude is better than Codex but still slower. If you don't mind trading upfront quality of work for additional micromanaging at a faster speed, it's fine. I also think that, for that very reason, you absorb more of the code.
Less gibliterrating and more doing
Very fast
Can’t you just turn off training on your data in the settings?
I was really unimpressed by the free Codex (for nodejs/react dev). I think it must be using a less powerful model or they’re limiting it in some other way.
Are you specifically pointing at a different experience between free + paid? Or just that the free version is unimpressive?
I'm using paid on TypeScript and it's genuinely terrific. Subjectively I think it has the edge over Opus.
I'd be surprised if OpenAI is hamstringing the free version. That would seem crazy from a GTM PoV. If anything the labs seem to throttle the heavy paid users.
Yes, the free version doesn't have access to the same models that the paid does.
You have access to 5.5 xhigh on free. Which model is missing, except the 5.3 that runs on Cerebras?
It's only missing the trash models. Likely a user skill issue.
The free version of ChatGPT is definitely worse as well. My SO uses the free version and I can tell it's a significant downgrade.
Post your chat session
Can Codex chats be shared? (This is a genuine question; so far, I've only used Codex in CLI on Linux.)
Via jsonl file
I'm unimpressed by all LLMs, and especially unimpressed by the people claiming to be impressed by them.
I think it's free for about 2 useful requests and then you have to upgrade or wait?
Switching to GPT 5.4-mini can increase the number of requests we can use freely.
So basically a $20 Claude plan, lmao
I stopped using my Claude subscription because it became so prohibitive. Back to ChatGPT and Codex full time and been pretty happy. I miss the tone/writing style of Claude, but don't miss the frustration of being told I've reached my plan limits in a comically short amount of time.
Using these prompts/steering[0], setting Base style to Friendly, Warm to More, Enthusiastic to Default, Headers, Lists, and Emoji to Less, I have found I can get gpt-5.5 about ... 80% of the way there to writing as non-annoyingly as Claude. And it's so much faster and has such higher limits that that's worth it for me.
I also put together this ridiculous thing[1] because I missed the font and color scheme of Claude.
[0] https://gist.githubusercontent.com/dmd/91e9ca98b2c252a185e8e...
[1] https://github.com/dmd/aimpostor
How do you fit that entire prompt in the customized instructions ?
Some of it is in my customized instructions, some of it I fed pieces in at a time saying "remember this please:" so it goes into Memories.
I'm not entirely clear on the mechanism by which memories make it into context, so it's possible some of it isn't all the time, but it does seem to be working reasonably.
Again, it's not as good as Claude when it comes to writing "not like an AI". But it's significantly better than it was.
Thanks, I’ll give those a try!
FYI I'm actively working on aimpostor, so check back in a couple days for some quality improvements. (I'm definitely not going to bother with a Sparkle updater or anything like that.)
on Codex I ran into limits maybe like 2 times in 3 months, after doing several "upgrade this experimental game to my latest shared framework" passes on 5.5 Extra High
On which plan?
I can go through a 5-hour limit with a $20/mo Plus subscription in a few minutes with 5.5 Extra High. This causes me to reserve the latest/best rev for the harder problems.
5.5 really does seem to be very superior to 5.4, but it's also very expensive to run: The gas gauge moves fast. It's not very clearly defined whether 5.5 will cost less to get a problem solved quickly, or if a bunch of automatic iterations of 5.4 will solve it less-expensively. Both are often frustrating to me on the $20 plan.
(Also: Are you sure you're seeing it right? 5.5 has been in the wild for less than a month, so far. https://openai.com/index/introducing-gpt-5-5/ )
The standard $20 plan, on my existing Godot code: https://github.com/InvadingOctopus/comedot
Most of those commits from the last few months are thanks to Codex reviews (but the code is not AI-generated): 5.5 since it came out, and 5.4 etc. before that, almost always on Extra High, because it's for a framework that underlies the other stuff I do, so I want to make sure everything's correct.
Sometimes I have to run multiple passes on the same task: I rarely continue any session beyond 4-5 prompts to avoid "bloat" or accumulate "stale context", so sometimes Codex finds different stuff in subsequent reviews of the same file/subsystem.
The project is modular enough that each file can be considered standalone with only 1-2 dependencies, and I was already in the habit of writing a lot of comments everywhere (something some people laughed at), so maybe that helps the AI along?
Thanks. That's good data.
I'm taking this, along with my own experience, to mean that the GPTs are cheaper to use for refactors of an existing body of work than they are for creating a new one.
(And perhaps part of that is in the name? These "LLM" contraptions are very good at translation, after all. And tokens seem to relate more to concepts than to specific phrases or words.)
That's the current state of that $20 Claude plan, despite them twice this week claiming better usage: first "double 5-hour usage", then 50% more overall usage per week.
MAYBE the 50% overall is true, but the doubled usage during a 5-hour window I just don't see at all. I've maxed out three 5-hour windows since this happened, and there's 0% chance it was double the normal amount; I ate up about 4-5% of my weekly total each time (this was ~10% each time pre-announcements). I wish I could give token numbers, but it's obscured; I just know it was around 120k of 4.6 with some delegation to Sonnet subagents.
So SURE, it's almost certainly more allotted weekly, but if those totals are consistent for 5-hour blocks, you've got to split your daily usage into at least 3 sessions with 5 hours between them to even hit that weekly limit. It's unreal how much they've burned their good reputation in a 2-month stretch, and I am positive it's also being astroturfed by bots more than happy to advance the narrative.
The internet is annoying; these tools are overall cool. I just wish Anthropic would go back to being semi-predictable.
I’ve been using Codex from my phone for the past couple of months (through a tunnel, not this app).
I was initially quite excited, but I’ve found the results are less than great compared to being at a keyboard.
Something about the smaller screen size and/or lack of keyboard causes me to direct the agent less, which in turn creates more tech debt/code churn/etc.
Maybe I’m just showing my age, and I should practice voice dictation or something more, but my thoughts flow faster and more clearly on a keyboard (fewer ums).
I'm not sure I follow, you develop code on a remote machine by speaking to your phone and are unimpressed by the result?
It's not that I'm unimpressed by the results, it's that I think I'm saving time by pushing the agent along remotely, but the reality is that my messages to the agent(s) end up being a lot shorter, which inevitably leaves more up for interpretation.
Don't get me wrong, I still use Codex (and sometimes Claude Code) remotely every day, and am overall excited for this release, it's just that the benefit wasn't as high as I had initially hoped.
Part of this is due to the models getting better (no need to prod along with "continue"), and part of this is the nature of how I use my phone (short bursts of attention).
But again, maybe I'm just old and prefer big screens with a keyboard.
Just... write longer messages. Maybe it is age, but I've written huge forum posts, such as on HN, all from my phone, often with multiple tabs open to source various links for footnotes. When I type for an LLM, I will type a lot too if needed, and will often even type a little, wait to think, then continue, over the course of maybe 15 minutes, so that the intention of the prompt is correct, since that saves much more time and produces better results than shorter messages.
I think you just need to type more rather than feeling constricted, as it's actually a form of liberation, to produce (or have an AI produce, whatever) something from wherever you are rather than needing to sit down on a laptop where you're gonna be waiting around anyway.
What tunnel setup do you use by the way? I'm on Android so it's kind of annoying all the LLM remote coding apps are iOS only.
Oh, I agree completely. I avoid loose language, revise my wording, and usually write prompts that require scrolling on mobile.
It isn’t so much that I feel restricted, I guess it’s that mobile wasn’t as big of a game changer as it was ~6 months ago.
My bandwidth feels more restricted by my own cognitive capacity (usually due to do context switching), rather than the limits of the model itself, and the mobile interface makes that worse.
I’ve recently found myself reserving larger tasks for “keyboard time” and reverting my thinking back to notes (in mobile), which I’ll then formulate to the LLM at some future time.
> What tunnel setup do you use by the way?
I “vibecoded” an agentic runtime that operates my machine generally (including TUIs like Codex/Claude Code), which I connect through a custom proxy and mobile app (both also vibecoded).
I previously tried Cloudflare Tunnels and an SSH setup, but it all felt a bit hacky.
Unfortunately the app is iOS only, but I could open source it and you’d probably be able to make an Android clone quickly (:
That could be cool, no issues with Claude Code not working in third party harnesses or their recent changes about different (more expensive) billing for programmatic usage? I guess I generally use OpenAI models which don't care.
It's a lot easier to write long messages on a phone with something like Whisper.
I'm not sure about others but I can't coherently voice my thoughts through speech alone as I'd want to think and revise the message, so I generally don't use voice transcription with AI.
I haven't found that to be an issue. Just say what revisions you want. Once you're done, paste it into an LLM to clean it up into a usable prompt.
I've been coding on Android for a few months, mostly while walking around outside or showering. I'm on a mix of Tailscale + Termux + ssh server + tmux + codex CLI, Tailscale is great.
I think you may be able to optimize your workflow more by drafting your prompt in ChatGPT first; get it to expand out the intent for you. Doing that has made phone coding a lot more tolerable for me.
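For reference, the stack described above boils down to a handful of commands. This is a sketch with placeholder names (`me`, `my-desktop`, `codex` as the session name), and `pkg` is Termux's package manager:

```
# one-time, on the desktop: join the tailnet and keep the agent alive
tailscale up
tmux new -s codex
codex                    # run the CLI inside the tmux session

# from the phone, in Termux
pkg install openssh
ssh me@my-desktop        # Tailscale MagicDNS hostname
tmux attach -t codex     # resume exactly where you left off
```

Because tmux holds the session on the desktop, a dropped phone connection doesn't interrupt a running agent; you just reattach.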
I like to think that I've given phone coding a fair shot (and I continue to do it), but I agree with the other poster that there's something about the lack of a keyboard that really gets to me :) I wish I knew what it was.
I was thinking about this, don't you think having everything in a terminal on a phone screen is a bit clunky to type in? Ideally I'd want it to be a clone of the official iOS apps from these AI companies with good mobile features like copy paste, smooth scroll, etc. Looks like the article's ChatGPT Codex functionality is in Android as well.
They are unimpressed by their (current) ability to use it, not the technology.
The ums are exactly the sign that you speak much faster than you type, so you need a pause for your thoughts to catch up.
I've been trying voxtype (using whisper models) lately, and to my surprise all my ums are filtered out. It's really good now actually!
I don't see any way to use that on a phone.
Wispr flow cuts out ums. I love it
the main thing is functionality, you can always work around the ergonomics
Nice to see some catch-up with Anthropic and others, but this doesn't actually offer cloud coding agents (my favorite Claude and Cursor capability).
I want to code from my mobile device when my laptop is off or unavailable, pushing PRs directly to GitHub. Codex mobile only works with a desktop machine, at which point I'll just use that machine; what's the point?
Yes it does. Codex, three dots menu: Codex cloud
The ability to unblock or redirect longer-running work from a phone seems underrated. Curious how often people will actually manage active coding threads this way.
Dang, I thought this was going to be integration for Codex Cloud, not the (still not available for Linux) Codex App. Not even Codex CLI, alas. You can still access the Cloud option from a mobile browser well enough but I prefer an app UI for poking at the things on the go.
Linux option for Codex App https://github.com/ilysenko/codex-desktop-linux
Mobile remote connection works, pushed the PR earlier today.
You can do this from the CLI - `codex remote-control` works on Linux (I have no affiliation, just something I noticed).
They might just not have cut a new build yet, today. It 'works' on master, but the mobile app thinks that your build is outdated (v0.0.0) if you build from master without overriding version, so probably easiest to wait until they cut a build if they haven't.
> You can do this from the CLI - `codex remote-control` works on Linux (I have no affiliation, just something I noticed).
Woah, hadn't seen this before!
Off-topic: how long are people's compile times for codex-rs in openai/codex? Even my very beefy computer takes like 30 minutes to compile in release mode, which makes me wonder why it's so slow and how this TUI got so large. But then I remember: agents like to write a lot of code, and compilers get slower when they have to compile a lot of code :)
Try turning off LTO. Their default codex-rs/Cargo.toml uses `lto = "fat"`, which is... expensive and slow and... you really really don't need it for a local build that you're not distributing.
In my experience, although the build is a little slow, it's that LTO step that takes a million years.
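If you'd rather not edit their Cargo.toml, recent Cargo (1.63+) can override the profile from the command line. A minimal sketch, assuming you're building from a codex-rs checkout:

```shell
# Skip LTO entirely for a faster local release build
# (profile overrides via --config need Cargo 1.63+)
cargo build --release --config 'profile.release.lto=false'

# Or use thin LTO: much faster to link than "fat",
# while keeping most of the runtime benefit
cargo build --release --config 'profile.release.lto="thin"'
```

Either way the binary still lands in `target/release/`; only the link-time optimization step changes.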
Oh, that's promising, thanks! I've just been using the npm version.
thanks. i don't use the app, so this is cool
Codex Cloud has been in the chatgpt app for quite some time now. If you click out of the new dialogs then you can access your cloud threads
GPT is unusable, untrustworthy, unreliable now [may 2026] - South Park poked fun at GPT calling everything a brilliant idea, and the retaliation by "Open"AI was to nerf any response that didn't denigrate the user and inflate the LLM.
Is there a native way to work remotely with Claude/Codex on a local folder or git repo on your main machine without having to connect it to GitHub? For creating apps for personal use I’d rather just keep the files local.
Edit: Running into issues setting it up on Windows. There's no "/remote-control" command in the CLI, so I installed the Windows Codex app. Then I updated the iOS app which now has the "Codex" feature in the sidebar, which should allow remote access to the Windows machine's instance - except it doesn't connect. The iOS app shows my desktop's hostname, so it knows there's an instance there, but refuses to connect. Issues like this would persuade a lot of folks to switch back to Claude.
This is what /remote-control does in Claude Code, once it's running on your main machine. You can open it up in the phone app.
It flakes out in less than 24 hrs. I tried leaving a session open on remote control mode in a VM but it inevitably stopped with some token auth error.
You can run Codex and Claude on mobile from https://github.com/happier-dev/happier
I think the `/remote-control` feature does this, if I understand you correctly.
It's supposed to. I've always found it buggy and unreliable but maybe that's just me. (This command exists in Claude btw not sure about Codex)
Looks like codex has it too since last week, https://github.com/openai/codex/releases/tag/rust-v0.130.0
You can also connect remotely. Tailscale to connect to your network/machine. Then use SSH to login. Then use tmux to persist the session even if you log out.
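A minimal sketch of that workflow; the hostname `devbox`, user `me`, and session name `codex` are assumptions for your own setup:

```shell
# From the phone's SSH client, attach to a persistent tmux session.
# -A attaches to "codex" if it already exists, otherwise creates it.
ssh me@devbox -t 'tmux new-session -A -s codex'

# Inside tmux, start the agent, then detach with Ctrl-b d;
# it keeps running on the machine after you disconnect:
#   codex

# Optional: mosh instead of ssh survives roaming and connection drops
mosh me@devbox -- tmux new-session -A -s codex
```

Reconnecting later with the same command drops you straight back into the running session.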
Does it work on windows? And how do you then remote in?
That’s this announcement.
I ask because I tried the other week to use /remote-control in Claude, and it prompted to connect a Github repo with no local alternative. Things may have changed since then.
My experience today with the new Codex remote control has been that it doesn't connect at all.
I wish codex supported this, I use it all the time for claude.
I tried apps that do this workflow (happy coder being one), but the workflow itself is rather clunky: you first have to start the session inside the remote machine. I now only use SSH; I can start or resume on whatever device suits at the time. The only downsides are latency and connection drops, and mosh solves those.
The right way to do this is Google Jules: the boundary is a git repository, the interface is a chat window you can open anywhere (yes, even simultaneously), the output is a diff you can choose to merge.
But, for whatever reason, no one uses Google Jules.
I don't want my phone to have the ability to execute things on my computer. Much less with a LLM in-between!
because you can't rely on anything from them same for Antigravity, feels abandoned
only surprising product that feels non google for me is Google Stitch https://stitch.withgoogle.com/
that's because it's an acquisition (http://usegalileo.ai/)
Ah that makes a lot of sense now.
Google might shut it down soon.
Yes, and there are two entirely unrelated reasons for that:
1. It appears no one's using it.
2. Google.
It makes me sad though. This is how I want agents to work!
I don’t think I’d review code from my phone, but I can definitely see myself using this to keep a task from stalling while I’m away from the laptop.
Just tried it and it doesn't work... won't let me create a new task as the repo selection is disabled... works fine on my laptop on the other hand and have been using it for some time.
This is exactly what I've been wanting. I had previously thought about using one of the hackish apps that try to deliver this experience, or spinning up something myself, but integrating this directly is definitely the right way to provide the best system and product experience -- and it seems to work out of the box exactly as I would want!
This is good not just because I can work on code on the go. Codex excels not only at that but also at crunching through text. It's nice that I can now get an agent on my phone that understands my notes in Logseq. It's like my journals can now talk back to me.
The best part is you don't have to stay at home waiting while Codex thinks; you can go out and grab a coffee while staying connected to your Codex and asking it to work.
Surely you mean grab a coffee and sit back down at your desk in your corporate office, because working remotely while your agent also does so is just preposterous.
I wish they'd have done this in a separate Codex app. On desktop I greatly prefer having Codex separate from ChatGPT... As compared to Claude, which is growing so fast and adding features so quickly it seems bolted together (I get why they do it, integrations/MCP-wise).
This specific feature is more akin to Remote Control in Claude. You could already kick off Codex Cloud tasks (although it's just a little more fiddly to do so).
If you can move to Codex Cloud (or "Claude Code for the Web"), I think it's the superior approach. Start it there, and just pick it up from the PR if necessary.
OpenAI wants non-devs to start coding as well; it's just going to confuse users when there are two apps.
A while ago I created a Telegram bridge for AWS Kiro CLI, which allows me to talk to the agent running on my server from anywhere. Remote access to any of these agents is a massive game changer: you don't need to hover in front of the PC while it works away at the problems. It changes your workflow, but I do find you need to force yourself to "turn off". That's easy with the PC - just walk away - but when you can "get the agent to do one more change" while waiting to pick the kids up or taking the dog for a walk, it can get difficult to stop.
> Stay connected to active work from anywhere
And here I thought AI was gonna automate the world and we were gonna work less.
Turns out you’re gonna work 24/7 no matter where you are!
This is a very myopic and unnecessarily cynical sentiment. It's not about you: agents just need to run without your computer being on all the time. Coding is a background task that needs to run unattended now.
This is absolutely not true. I run dozens of Opus agents all day and they need so much constant attention and babysitting (lest everything turn to sh*t) that I would not qualify it anywhere close to “background”. And I’m sure as hell not wrangling these things from my *phone*.
What would it take to remove yourself from the loop so your agents can go parabolic and kick off the singularity?
(Have them cover their own token costs, hehe).
if i didn't have to prompt it to learn from its mistakes and it just "intuitively" knew to do that
Why not work the same amount and be at your desk less?
your boss will probably prefer the colleague that can now work more, and isn't occasionally absent from their desk
The same reason I don’t have social media apps on my phone.
I've been trying out various mobile, ai-assisted coding workflows.
Packing a Linux mini-pc in my rucksack, connected to display glasses, and voice-to-text with handy. Voice to text gets injected into a remote (Docker) codex session, running a hot reload web stack. I prompt to implement various features in an existing code base, where codex understands the structure and requirements. If a feature is done, I take a moment to inspect the results on the display glasses, then move onto the next feature or keep iterating. It's not perfect, but I was able to implement a couple of not too complex features while walking my local national park. The display glasses have a built-in 4-microphone array, and solid speakers. No need for a bulky headset or earbuds. Glasses come with monochromatic dimming, you can easily switch between dimming and see through.
If this comes with Linux integration, I will certainly give it a try.
I have been using Omnara for some months now, on desktop and mobile. It's a web/mobile remote for Claude and Codex.
I can do some tasks on mobile, especially if they are follow-up and steering only, which greatly increases productivity since you can keep working whilst in transit, etc.
I use Termius on my phone to remote in and make the agent do stuff while I chill or am on the road. This seems useful too.
I also do this, but I find the terminal ergonomics a little awkward on mobile. I do think Termius is the best version of it I've used, though.
This is the EXACT evolution of this product that I've wanted. For simple tasks on some of my desktop machines, I don't want to mess with SSH or remoting into them, I just want to tell an AI agent what I need, let it build a plan that I press "Approve" on, and let it rip. This is the ClawdBot killer!
Nice. The next step is giving Codex/Claude Code local device control... The problem is that current iOS/Android are so locked down that agents can't do much... but the space is so ripe for disruption that I bet we'll see AI-native devices coming out within the next few years that allow agents to interact with everything. I would be nervous if I were Apple right now.
I’d finally have a use case for my overpowered iPad if it could compile and run code
Now how would Apple get that sweet Mac money if you could do everything from your iPad? And that's exactly why they artificially segment those devices.
Android can allow an app to control the device using accessibility permissions.
Wondering if it's only me who vibe-coded a PWA mobile IDE and a remote agent hosted on the laptop, which uses `claude -p` and local code to allow coding via mobile?
Could Codex CLI get this support also? I am sure a lot of us are running remote linux machines with Nvidia GPUs, with codex CLI running
I think this is a thing, maybe need to upgrade your codex cli.
Seems we must have the macOS Codex app running to enable Linux box to phone communication.
Most of us are missing the point. This gives direct access to my machine with just one tool and no setup (apart from the Codex setup).
Now I can run scripts ad hoc or stop my loss-making stock market script without jumping through hoops.
Somewhat related: some of these AI remote coding apps are iOS-only, so what are people using on Android? Looks like some people are using terminal emulators to SSH into their machine and use the LLM CLIs, but that seems clunky.
As the winner of the everything app is revealed, I foresee this feature integrated in it. One platform to manage all agents by any provider.
Oh no, I am just adding Codex integration to my app, with in-app Tailscale networking communicating with the Codex app server via WebSocket over Tailscale.
But I will still consider releasing it anyway.
So Codex is also heading towards 'portability', and I can see that here, but I bet it will take time before it's cleanly optimized for heavy mobile use.
This is neat! Now I'm curious: what's left to innovate in the coding agent space? Sure, there are the usual suspects like maintenance, security, reliability, and other scalability improvements, and it looks like those will be addressed in the next year or two.
The entire UI/UX? We went back in time and basically have a text streamer in a 70's style terminal or existing editor-like situation. If you want to read and (hand)write code, sure, you might be done and be happy with the new variant of what you had decades ago.
A UI like Jira/Trello to stage features and see (agentic)team status. A Figma-like UX to actually build out the app/interface/features. A system that aids human review. There's tons of paradigms to explore and improve upon.
there is something "wrong" with the ux that is hard to pin down. these things generate even text summaries more rapidly than i can read them. i need a better method for dumping info into my brain + dynamic control (if necessary)
Tell it to create html summaries with diagrams and sidebar for navigation.
Or ask Codex to create an image that explains xyz.
When I take time to read all of the output, I often find that it's mostly noise. I don't like noise so I usually don't bother.
But a person can use subagents, if they want, to filter that down. This burns tokens in a big hurry, but I think subagents can be arbitrary local commands (eg, a local LLM).
Or, you know: Just slow down. :) It doesn't always have to be a race, does it?
Agent farms. Have agents make tons of random high-fidelity variations of the same app or feature around the clock from some vague ideas; you try each of them to see which one you like best and can productize, skipping the need for iterative prompts.
Some of us pay by the token.
Is texting your Coding Agent really the final form? Something that watches your interactions or process execution to surface improvements, or whips up prototypes while you brainstorm seems like the next step.
Not sure why this was flagged, this makes sense but only if inference gets sufficiently cheap. It would be awesome to see a bunch of interactive prototypes and iterate on the UX before ever building the full app. Historically that's been somewhat difficult even with UX designers.
Well, this just made it even easier to keep coding away from the desk.
It's refreshing that unlike Anthropic's Remote Control, this actually... works.
Feels like a testament to the value in taking time and doing it properly.
Now if only codex got its 1M token context window back.
---
Edit: Hmmm. Maybe I spoke too soon. Sigh. Definitely _more_ reliable by far overall, but still have queued messages with responses on my phone that don't show up on my computer, and responses that don't show up on my phone.
Edit 2: New threads created from my phone seem to have a little stall-out, but ones that are underway are behaving reasonably well.
Out of curiosity, what issues did you face with remote control on claude? I use it daily and it seems to work pretty well (bar the issues when my Mac would sleep and then the session would disconnect, but that's an issue on my end).
My own experience has been that it works for about five minutes before it just disconnects or hangs. I’ve never been able to use it successfully.
Myriad, to be honest. I find it to just constantly be in a 'torn' state, the UI is very mushy on mobile with a lot of the affordances from desktop missing, and... it's distinctly less useful when you can't... edit, rewind, start a new thread, etc.
Made a menu bar app you may find useful for MacBook sleep prevention, even when the lid is closed:
https://github.com/narcotic-sh/modafinil
Hey buddy, I submitted a pull request to you to support Intel chips
https://github.com/narcotic-sh/modafinil/pull/3
This feature could well be the reason OpenAI hired Peter Steinberger (OpenClaw).
That was my thought too. Claude Code and Codex are very close to Claw already (general purpose computer use) and moving increasingly in that direction (mobile integrations, built in memory features etc.)
The main issue is reliability, so I think the corporations are going to take a much more gradual, piecemeal approach, and probably end up with something like Claw within a year.
They are just copying features from Claude Code.
No.
I don't understand OpenAI's product strategy.
It seems pretty simple:
1) Keep getting investors to give them money.
2) Convince the right people that OpenAI is "critical to national security" so that when 1 runs out, they can get bailed out by the government.
Everything else is just set dressing.
What part is confusing?
> I don't understand OpenAI's product strategy.
Neither does OpenAI.
So we can finally stop tailscale + ssh + codex. Nice
i'm not sure if i'm hallucinating, but i swear i had codex in the chatGPT app a long time ago (like the original codex on the web).
they added some new stuff, like remote control of wherever the desktop codex app is running, but these companies need to work much harder on their press releases.
That was cloud codex. Not comparable
This sucks. Codex was already in the mobile app. And Codex in the browser or in the app sucks because it's not the same as local Codex (VS Code or CLI). And you can't pick the model. Sucks a$$.
Isn’t what you described exactly what they just released? Now it connects to local Codex and you can pick the model?
What if you don't start on your laptop or workstation? Also, does the UI shown in the video reveal a model toggle? Doesn't look like it
Edit: actually it gets worse -- you can no longer start any tasks on mobile; you're required to sign in from your desktop/workstation. You can't "sign in" from your CLI.
Whoever made this, paper cuts I wish onto you. It sucks.
Edit 2: actually you can SSH, so I presume that allows you to use the CLI -- and you can still do mobile-first tasks, but it's not intuitive at all. For mobile-first tasks you still can't pick the model, and I haven't tested the workstation connection. That said: indeed, paper cuts.
friends, you don’t have to always be productive. leave the agent on the computer and take care of yourself.
For many people, that's exactly why this is useful: less time on the computer, more time doing other things and occasionally checking in.
In those scenarios, the goal is not "work at any time" but to "be anywhere at any time", or, rather, to "be able to work from anywhere, doing anything".
Sort of....I guess.
I love how all these software engineers are willingly walking into a productivity trap where they will be exploited.
I’m not a swe but damn, I’d hate to be one.
Hammock-driven development will get a new meaning.
This is really useful for when you just need to approve plans or make small decisions.
Codex has been great in the last 3-4 months I've been using it, almost exclusively to review existing GDScript code, and this was the feature I wanted most, because with gamedev you get the best ideas when you're out and about or in bed :)
Claude on the other hand has been jank all around from the UX to the UI to the AI itself that it's baffling how it's more popular here on HN: https://i.imgur.com/jYawPDY.png
Sadly this remote control feature doesn't seem to be for Mac to Mac yet? I love the MacBook Neo as a "thin client" for AI and keep the MacBook Pro at home/hotel, and it would be nice to share Codex desktop sessions (without SSH → resume link)
Say what you want about OpenAI, but their software is actually pretty damn good, especially compared to Anthropic and Google. Anthropic is just sloppy, and Google just doesn't live on this planet.
Both of the Codex apps are very good.
I tried this out and it works significantly better than Claude's remote control. In fact, the first few times I tried Claude's remote control it didn't even work, and to this day it's very buggy.
I use remote-control every day and haven't had many issues with it, aside from the mobile app being pretty limited: e.g. there are no prompt suggestions like slash commands and skills, and everything in the textbox is just a raw string. You also can't start a new session directly from the app (you have to SSH into the host manually to do that).
Other than those limitations, the connection has been very stable for me, definitely more reliable than alternatives like happy.engineering or Omnara. What’s been buggy for you specifically?
Nigerian engineers are going to have the time of their lives
> Start investigating a bug while waiting for your coffee.
So the whole idea is not to make work more efficient. It's just to make you work more, all the time, while waiting for your coffee, while in your commute.
Ask yourselves: is that the society we want?
This is super nice!!!!!!
rust and opensource W
macOS only so far. "Windows is coming soon"
Can someone explain how the ChatGPT Codex Connector works in concert with GitHub access controls? I am not sure how to add it to my GitHub repositories, accounts, or organizations without potentially giving any OpenAI customer access.
I don't like this direction. For the accessibility aspect, sure, it's good. But Codex is a coding product, and I am increasingly concerned about the lack of reviewing practice. I doubt that a mobile app is good for reviewing code changes.
> Stay connected to active work from anywhere
... (and anytime because it's on your phone). No thanks.
Buggy af though
opencode behind an nginx proxy with standard user/password auth is sufficiently powerful. You can also upgrade to https://docs.linuxserver.io/images/docker-code-server/ and run any VS Code plugins; opencode's plugin is pretty rudimentary, but Cline has been making a lot of strides.
You can run your local LLM and just connect the Docker containers. I'm paranoid about being disconnected from the LLM, so I never run any of this on the same machine, which makes it important to orchestrate a docker-compose file that provides the necessary services.
I'm still trying to find a good remote file system to loop into the setup for improved switching between cli and these web containers.
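For reference, a minimal sketch of the nginx side of that setup. The upstream port (4096), domain, and cert paths are assumptions for your own environment; create the credentials file first with `htpasswd -c /etc/nginx/.htpasswd you`:

```nginx
server {
    listen 443 ssl;
    server_name code.example.com;   # assumption: your own domain

    ssl_certificate     /etc/letsencrypt/live/code.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/code.example.com/privkey.pem;

    # standard user/password auth
    auth_basic           "restricted";
    auth_basic_user_file /etc/nginx/.htpasswd;

    location / {
        # assumption: opencode's web UI listening on this local port
        proxy_pass http://127.0.0.1:4096;

        # WebSocket upgrade, needed for the interactive terminal
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}
```

The same `location` pattern works for the code-server container; just point `proxy_pass` at its port instead.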
They needed to announce something after the Anthropic slop rewrite of Bun.
In an ideal world they would allocate 50% of compute to finding errors in that rewrite and publishing how bad Claude is, but that would undermine confidence in slop in general, so it's not going to happen.
The best way I've found to work with LLMs is another OpenAI project, Symphony (which I implemented for Linear/GitHub and OpenCode[0]).
It integrates with your issue tracker and makes the tracker the UI for the LLM. It also clones the repo for every ticket, and can set up fixtures/etc. I can work on multiple items at a time, which is fantastic because otherwise you have to wait for the LLMs a lot.
[0] https://github.com/skorokithakis/symphony
Can someone recommend an IDE that can be used with a self-hosted model (via an OpenAI-compatible API or similar)?
Look up OpenCode
vs code supports local models (bring your own key/model)
you need a model server - ollama/llama.cpp/lm studio
> bring your own key
Do you mean support for OpenAI-compatible API URLs in Copilot? If so, you need either VS Code Insiders or a VS Code extension, I believe.