
Something I've been working on out of frustration with the existing AI coder CLIs.

Yeah, we all converge to the same workflow. In the AI coding agent I'm working on now, I've added an "index" tool that uses tree-sitter to compress a code file and show the AI just its skeleton.
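maki's index tool uses tree-sitter (linked below); as a language-agnostic analogy, here is a rough sketch of the same "skeleton" idea using only Python's stdlib ast module: keep class and function signatures, drop the bodies. The function name and output shape are my own, not maki's.

```python
import ast

def skeleton(source: str) -> str:
    """Compress Python source to a signature-only skeleton."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            lines.append(f"def {node.name}({args}): ...")
        elif isinstance(node, ast.ClassDef):
            lines.append(f"class {node.name}: ...")
    return "\n".join(lines)
```

Tree-sitter gives you the same thing for any language with a grammar, which is why it's the natural fit here.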

Here's the implementation for the interested: https://github.com/tontinton/maki/blob/main/maki-code-index%...


I'm curious, what does your workflow look like? I saw a plan prompt there, but no specs. When you want to change something or implement a new feature, do you just prompt the requirements, have it write the plan, and then let it work on it?

Oh, that's great.

I've always wanted to explore how to fit tree-sitter into this workflow. It's great to know that this works well too.

Thanks for sharing the code.

Here is the AIPack runtime I built (MIT): https://github.com/aipack-ai/aipack, and here is the code for pro@coder: https://github.com/aipack-ai/packs-pro/tree/main/pro/coder (AIPack is in Rust, and AI Packs are in md / lua).


Is it similar to rtk, where the output of tool calls is compressed? Or does it actively compress your history once in a while?

If it's the latter, then users will pay for the entire token history uncached after each change: https://platform.claude.com/docs/en/build-with-claude/prompt...

How is this better?


This is a bit more akin to distill - https://github.com/samuelfaj/distill

The advantage of an SLM in between is that some outputs cannot be compressed without losing context, so a small model does that job. It works, but most of these solutions still have tradeoffs in real-world applications.


We do both:

We compress tool outputs at each step, so the cache isn't broken during the run. Once we hit 85% of the context window, we preemptively trigger a summarization step and load the summary when the context window actually fills up.


> we preemptively trigger a summarization step and load that when the context-window fills up.

How does this differ from auto compact? Also, how do you prove that yours is better than using auto compact?


For auto-compact we do essentially what Anthropic does, but at an 85%-filled context window. Then, when the window is 100% full, we pull in this precompaction and append the accumulated 15%. This lets compaction run instantly.
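A minimal sketch of that scheme as I understand it (my own names and token accounting, not the actual implementation): summarize at 85% full, keep appending, and when the window fills, swap in the precomputed summary plus the messages accumulated since.

```python
class ContextWindow:
    def __init__(self, limit: int, summarize, precompact_at: float = 0.85):
        self.limit = limit
        self.summarize = summarize     # LLM summarization call; stubbed in tests
        self.precompact_at = precompact_at
        self.messages = []             # list of (text, token_count)
        self.precompaction = None      # (summary_text, token_count) or None
        self.tail = []                 # messages appended after precompaction ran

    def used(self) -> int:
        return sum(t for _, t in self.messages)

    def append(self, text: str, tokens: int) -> None:
        if self.used() + tokens > self.limit:
            self._compact()            # instant: the summary was already computed
        self.messages.append((text, tokens))
        if self.precompaction is not None:
            self.tail.append((text, tokens))
        elif self.used() >= self.precompact_at * self.limit:
            # Preemptively summarize while there is still headroom.
            summary = self.summarize([t for t, _ in self.messages])
            self.precompaction = (summary, len(summary) // 4)  # rough token estimate
            self.tail = []

    def _compact(self) -> None:
        if self.precompaction is None:  # fallback: nothing precomputed yet
            summary = self.summarize([t for t, _ in self.messages])
            self.precompaction = (summary, len(summary) // 4)
            self.tail = []
        self.messages = [self.precompaction] + self.tail
        self.precompaction = None
        self.tail = []
```

The point of the design is that `_compact` does no model call on the hot path; the expensive summarization already happened at 85%, so hitting the limit costs only a list swap.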


Very cool, have you taken a look into what TigerBeetle does with VSR (and why they chose it instead of raft)?


Yes, I've read through TigerBeetle's VSR design and their rationale for not using Raft.

VSR makes a lot of sense for their problem space: fixed schema, deterministic state machine, and very tight control over replication and execution order.

Ayder has a different set of constraints:

- append-only logs with streaming semantics

- dynamic topics / partitions

- external clients producing arbitrary payloads over HTTP

Raft here is a pragmatic choice: it’s well understood, easier to reason about for operators, and fits the “easy to try, easy to operate” goal of the system.

That said, I think VSR is a great example of what’s possible when you fully own the problem and can specialize aggressively. Definitely a project I’ve learned from.


Can I select over multiple receivers concurrently, similar to select(2) on Linux?


Learn about the data structures & algorithms that make up modern log search engines like Elasticsearch.


You're right, I'll fix it in the post.

Thanks!


Yeah, I might have been wrong; I simply went with https://blog.stenmans.org/theBeamBook/#_reductions

Now that I reread that section, it also depends on whether you call a BIF or not. I'll think about how to phrase that better in the blog post.

Thanks!


Even some BIFs and NIFs (BIFs are mostly just built-in NIFs) can yield [1] and choose to reschedule themselves later. Inside the C code this is voluntary, of course. An example of this can be seen when running hash functions [2].

Another interesting idea: a NIF that cannot reschedule itself by yielding can still signal the scheduler how many "reductions" it consumed when it did some lengthy work. So if, say, it runs for 10 msec, it might want to bump the reduction counter a bit with enif_consume_timeslice [3]. Otherwise a tight recursive loop calling the same NIF could still effectively block the VM.

[1] https://www.erlang.org/doc/man/erl_nif#enif_schedule_nif

[2] https://github.com/erlang/otp/blob/ab7b354c37dac92704faac455...

[3] https://www.erlang.org/doc/man/erl_nif#enif_consume_timeslic...


Thanks a lot! :)


I thought about it; I just wanted people to get hooked with some eye candy and want to continue reading ;)

