This language is very popular among quant finance people associated with Morgan ...

wood_spirit · on June 1, 2024

Thinking about problems and data manipulation in the way array languages enable is hard, but once you push through what feels like a barrier of mainstream-programming-language thinking that stops you from “grokking” it, there is a sudden moment of clarity and you “get it”.

Perhaps a halfway house is SQL. The difference between ORM-style CRUD and a power user using window functions to make the data dance shows there is still art to be had in programming :)

breck · on June 1, 2024

Agreed. Pushing through until you can think in array languages is well worth it! In my experience one of the top 30 highest ROI mental circuits you can develop.

That being said, I'm not convinced that the extremely minimal syntax is essential. I think it can be done another way ;)

hoosieree · on June 1, 2024

I started with J but now prefer K and I'm writing an interpreter for a K-like language in my spare time for fun.

I'd say the minimal syntax isn't just a gimmick, because it really does help with mentally chunking phrases/idioms to a degree that's not possible when the same phrases are multiple lines long. Terseness also makes it physically faster to write the same thing, which encourages interactive experimentation much more than other languages.

These are small things, but taken together you get an experience that's more than the sum of its parts.

A lot of folks seem to tolerate K syntax because K jobs pay well. (Supposedly. I've never seen a super-high paying K job in real life.) But I actually like the K syntax because it helps organize my problem solving, and it gets out of the way during experimentation time. To me it's like NumPy/Pandas but better designed and without all the ceremonial boilerplate.

jimberlage · on June 2, 2024

Would an array language without the terseness, but with hygienic macros, fit the same niche?

rak1507 · on June 2, 2024

I don't see how it could, the terseness is the point. I'm not sure how a macro system could really help. Did you have a specific idea in mind?

jimberlage · on June 2, 2024

The thought was - if I have a long piece of code repeated, and I want it to be shorter, I can

- Use a language that minimizes the code to write

- Use a helper function, maybe at some runtime cost

- Use a macro, turning a short piece of code into a longer one, at a compile-time cost

Having a DTI (debt-to-income ratio) macro and a very short definition of DTI that looks similar everywhere in code sort of do the same thing.
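As a rough illustration of the helper-function option from the list above, here is a minimal Python sketch. The name `dti` and its parameters `monthly_debt` and `monthly_income` are hypothetical, chosen only for this example. Python has no compile-time macros, so only the helper route is shown; the point is that a short, consistently named definition reads the same everywhere it is used.

```python
def dti(monthly_debt: float, monthly_income: float) -> float:
    """Debt-to-income ratio: monthly debt payments divided by monthly income.

    Hypothetical helper for illustration only; names are assumptions.
    """
    if monthly_income <= 0:
        raise ValueError("monthly_income must be positive")
    return monthly_debt / monthly_income


# Every call site looks the same, which is the effect a macro
# or a terse array-language idiom would also give you.
print(dti(1500.0, 6000.0))
```

The runtime cost mentioned in the list is the function call itself, which is usually negligible next to the readability gain.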

bravura · on June 1, 2024

Can you please share a few of the other 30 highest ROI mental circuits to develop?

breck · on June 1, 2024

Good question.

I should make a ranked list.

With regard to programming, my top 30 would include:

ScrollSets: https://breckyunits.com/scrollsets.html

RegEx.

The dataflow paradigm as popularized by dplyr would be on there.

HIT ranking: https://breckyunits.com/hits.html

raverbashing · on June 1, 2024

But tbh it looks worse than matlab

eggy · on June 1, 2024

Most people I know who actually learn an array language like k or j usually grow to appreciate the expressiveness and cleverness of these languages. Typically, the people who have your reaction have only looked at it and tried it very briefly. I'm surprised. Why did you have to learn it? Where?

eigenvalue · on June 1, 2024

Was working at a quant pod at Millennium for a bit where they used it. I was ultimately able to use it but everything took me 20x longer than using Numpy/Pandas. The irony was that the Python code was shorter because there were so many more library functions and better abstractions and syntax. So it was slow and unintuitive for zero benefit whatsoever.

Qem · on June 1, 2024

> was ultimately able to use it but everything took me 20x longer than using Numpy/Pandas.

You can try klongpy, a K-like array language implementation that runs atop numpy: https://pypi.org/project/klongpy/

eggy · on June 4, 2024

Geometric mean in Numpy vs. J:

(Copied from some forum, since I don't use Python much)

  import numpy as np

  def geo_mean_overflow(iterable):
      return np.exp(np.log(iterable).mean())

Or,

  from statistics import geometric_mean

  geometric_mean([1.0, 0.00001, 10000000000.]) # 46.415888336127786

In J, since I don't know K:

  gm=:#%:*/

Even shorter than Python whether it's a canned lib routine or created from composing simple functions.

And I don't need to format code on HN in J because it's so short anyway, besides I don't know how!
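For comparison, a minimal standard-library Python sketch of the two formulations discussed above: the direct "n-th root of the product" that the J fork `# %: */` spells out, and the log-mean form used in the NumPy version. The function names `gm_root_of_product` and `gm_log_mean` are made up for this example. It assumes all inputs are positive and Python 3.8+ for `math.prod`.

```python
import math


def gm_root_of_product(xs):
    # Direct reading of the J fork: (count of xs)-th root of (product of xs).
    return math.prod(xs) ** (1.0 / len(xs))


def gm_log_mean(xs):
    # exp of the mean of the logs; never forms the full product,
    # so it does not overflow on large inputs.
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


data = [1.0, 0.00001, 10000000000.0]
print(gm_root_of_product(data), gm_log_mean(data))

# With big values the naive product overflows to inf,
# while the log-mean form stays finite.
big = [1e200, 1e200]
print(gm_root_of_product(big), gm_log_mean(big))
```

The two agree up to floating-point rounding on ordinary data; the log-mean form is the safer default when the product could overflow or underflow.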

wood_spirit · on June 1, 2024

But how did your perf compare to the best of the K kicking quants around you? Were they too being less productive than they would have been in python?

I’m not saying they were right or better. Horses for courses. Array languages do my head in and my choice is SQL.

eigenvalue · on June 1, 2024

I was able to explore new ideas much much faster using Python than the experienced k people could. But creativity is more important anyway. Ultimately, having good ideas/data/signals trumps fancy or fast data wrangling. Glad I’m doing other things now in any case.

gitonthescene · on June 2, 2024

Any detail whatsoever would make this a more credible claim. I haven’t met many people, including those skeptical of the performance claims, who have called K _slow_. Maybe for particular domains but I’d doubt that includes the kind of quant work that gets done at Millennium.

spopejoy · on June 2, 2024

I've heard plenty of complaints over the years, and only within quant, unsurprisingly since that is the only field you'll get paid to use K.

A brilliant programmer I met who came from DE Shaw said he reimplemented a K-based portfolio optimization pipeline because the performance hit a wall once the dataset got large enough. He was able to beat K with Java of all things.

Columnar and timeseries dbs have continued to evolve, while K is the same tech it was in the 2000s. The only reason it gets used at a Millennium is that whatever trade it runs is still printing money, not any tech advantage.

gitonthescene · on June 3, 2024

I appreciate it's probably not possible to share too many details, but I wouldn't be surprised if the choice of Java was simply preference. It may have been a problem with the pipeline rather than with K, i.e. a fix might have been available using K, but it can be easier (and harder) to just use some out-of-the-box solution. I agree that columnar and time series dbs may have caught up with K over the years, but most of the complaints I've heard about K aren't technical.