Jetro – JSON query engine for Rust (jq-like DSL with compilation and VM)

(github.com)

15 points | by mitghi 17 hours ago

2 comments

drzaiusx11 33 minutes ago

Random question: has the JQ query language been formalized? I have at least 3 separate implementations installed on my machine atm: JQ, Python YQ and the Go YQ CLI. They're all "mostly compatible" but at this point there should probably be a formal spec / grammar if there isn't one already. The minor differences between implementations are where they get you...

Hey everyone,

I’ve been building Jetro, a JSON query language for Rust inspired by the functional programming paradigm.

The project started about three years ago, but recently went through a serious design overhaul. The newer direction is planner-driven, and focused on being expressive without becoming overwhelming.

It aims to cover much of the everyday usefulness of jq, but with a smaller, more approachable query language and an embeddable Rust engine. Jetro uses simd-json as its primary JSON parser and tries hard not to do unnecessary work on top of it.

One fun part is demand propagation:

    use jetro::Jetro;

    let j = Jetro::from_bytes(bytes)?;
    let out = j.collect(
      "$.orders
        .map({ id: @.id, total: @.items.map(@.price * @.qty).sum() })
        .filter(@.total > 100)
        .map(@.id)
        .first()"
    )?;

In many query languages, this kind of chain naturally becomes:

    parse/materialize all orders
    compute totals for every order
    filter all shaped orders
    extract all ids
    then take the first one

Jetro tries to read the chain from the end backward and ask what is actually needed. Then elements stream through the pipeline only until that need is satisfied:

    need one id
    <- need one matching order
    <- pull orders one at a time
    <- compute total for the current candidate
    <- if it matches, emit the id and stop

So the pipeline is not "finish every stage, then move to the next." It is more like: pull one item, shape it, test it, maybe emit it, and stop as soon as the query has enough.
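For readers who think in Rust iterators, the pull-based behaviour described above is analogous to how lazy iterator adapters work. The sketch below is not Jetro's implementation or API; it uses only the standard library, with hypothetical `Item` and `Order` structs and a `Cell` counter that records how many totals are computed before the first match is returned.

```rust
use std::cell::Cell;

// Hypothetical data shapes, for illustration only (not Jetro types).
struct Item { price: f64, qty: u32 }
struct Order { id: u32, items: Vec<Item> }

fn total(order: &Order) -> f64 {
    order.items.iter().map(|i| i.price * f64::from(i.qty)).sum()
}

/// Returns the id of the first order whose total exceeds 100,
/// plus the number of orders whose total was actually computed.
fn first_big_order(orders: &[Order]) -> (Option<u32>, usize) {
    let computed = Cell::new(0usize);
    let id = orders
        .iter()
        .map(|o| {
            computed.set(computed.get() + 1); // count each total we compute
            (o.id, total(o))
        })
        .filter(|(_, t)| *t > 100.0)
        .map(|(id, _)| id)
        .next(); // pulls one order at a time and stops at the first match
    (id, computed.get())
}

fn main() {
    // 1000 orders; order `id` has a total of 30 * id.
    let orders: Vec<Order> = (1..=1000)
        .map(|id| Order { id, items: vec![Item { price: 30.0, qty: id }] })
        .collect();
    let (id, computed) = first_big_order(&orders);
    println!("first id = {:?}, totals computed = {}", id, computed);
}
```

With this input the fourth order is the first one over 100, so only four totals are computed even though the slice holds 1000 orders. An eager version that collected after each stage would compute all 1000.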

The same idea works for richer object-shaping queries too:

    let out = j.collect(
      "{
        errors: $.events
          .drop_while(@.level != 'error')
          .filter(@.service == 'checkout')
          .map(match @ with {
            { level: 'error', message: msg, timestamp: ts } -> {
              kind: 'error',
              ts: ts,
              msg: msg
            },
            { level: 'warn', message: msg } -> {
              kind: 'warning',
              msg: msg
            },
            _ -> {
              kind: 'other'
            }
          })
          .take(20),
    
        slow_orders: $.orders
          .filter(@.latency_ms > 500)
          .map({ id: @.id, latency: @.latency_ms })
          .take(10),
    
        first_vip: $.customers
          .filter(@.tier == 'vip')
          .map({ id: @.id, region: @.region })
          .first()
      }"
    )?;
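As a rough analogy for the `errors` field in the query above, here is the same shape written with standard-library iterators: `skip_while` stands in for `drop_while`, a `match` does the classification, and `take` bounds the work. This is only a sketch under assumptions; the `Event` struct and `classify` function are hypothetical and say nothing about how Jetro evaluates the query internally.

```rust
// Hypothetical event type, for illustration only (not a Jetro type).
struct Event {
    level: &'static str,
    service: &'static str,
    message: &'static str,
}

/// Skips the leading non-error events, keeps checkout events,
/// classifies each one, and stops after `limit` results.
fn classify(events: &[Event], limit: usize) -> Vec<String> {
    events
        .iter()
        .skip_while(|e| e.level != "error")
        .filter(|e| e.service == "checkout")
        .map(|e| match e.level {
            "error" => format!("error: {}", e.message),
            "warn" => format!("warning: {}", e.message),
            _ => "other".to_string(),
        })
        .take(limit) // later events are never inspected once the limit is hit
        .collect()
}

fn main() {
    let events = [
        Event { level: "info", service: "checkout", message: "started" },
        Event { level: "error", service: "checkout", message: "card declined" },
        Event { level: "warn", service: "search", message: "slow query" },
        Event { level: "warn", service: "checkout", message: "retrying" },
        Event { level: "info", service: "checkout", message: "done" },
    ];
    for line in classify(&events, 2) {
        println!("{}", line);
    }
}
```

Because every adapter is lazy, the final `info` event is never examined when `limit` is 2.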

There’s also jetrocli for the terminal, and a book for learning the language.

I would love to get some feedback, or ideas for what would make this useful to you.

Thanks

Jetro CLI: https://github.com/mitghi/jetrocli

Book: https://github.com/mitghi/jetro-book