A hundredth the price and a quarter the quality means this is here to stay. It might be a little early on the accuracy curve to start riding AI-written briefs into court unchecked, but then I've never met a lawyer who didn't try to make their billing more efficient.
But logically, since all that's needed is improved accuracy, the fix is more likely to come from the models improving than from any change in human behavior.
> A hundredth the price and a quarter the quality means that this is here to stay
No, it's simply that those noobs don't know how to use LLMs. They'll eventually learn.
Basically, you don't use them to dig up new information, unless you're extremely careful about triple-checking that information. Google Scholar's legal database search is better for that. You use LLMs to write boilerplate, paraphrase, edit, and synthesize information from your own sources. Do it properly, and you'll never "hallucinate" a fake case in one of your legal filings, and you'll be able to write 'em in 5% of the time.
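Concretely, the workflow is something like the sketch below. The SDK call is the real OpenAI Python API; the model name, prompt wording, and sources are purely illustrative assumptions, not anyone's production setup:

    # A minimal sketch of "synthesize from your own sources": the model
    # only sees excerpts you've already vetted, and is told to refuse
    # rather than supply authority from memory. Model name, prompt, and
    # sources are toy placeholders.
    import re
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Excerpts you already verified, e.g. via Google Scholar.
    sources = {
        "S1": "Smith v. Jones, 123 F.3d 456 (9th Cir. 1997): held that ...",
        "S2": "Doe v. Roe, 789 P.2d 101 (Cal. 1990): established that ...",
    }
    source_block = "\n".join(f"[{k}] {v}" for k, v in sources.items())

    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative; use whatever you've validated
        messages=[
            {"role": "system", "content": (
                "Draft legal prose using ONLY the numbered sources below, "
                "cited as [S1], [S2], etc. If the sources don't support a "
                "point, say so; never supply authority from memory."
            )},
            {"role": "user", "content": (
                f"Sources:\n{source_block}\n\n"
                "Draft one paragraph summarizing what these cases held."
            )},
        ],
    )
    draft = response.choices[0].message.content

    # Cheap mechanical check: every citation must map to a real source.
    for tag in re.findall(r"\[(S\d+)\]", draft):
        assert tag in sources, f"model cited unknown source {tag}"

The check at the end is the whole trick: a hallucinated case can't survive a filter that only accepts citations you put in yourself.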
> Do it properly, and you'll never "hallucinate" a fake case in one of your legal filings, and you'll be able to write 'em in 5% of the time.
All fun and games for those who can, and good for them, but I'm betting the majority can't or won't. The result being that society pays for the latter's incompetence.
I've led a team building an LLM agent for customer service.
Our finding is that it runs between 10% and 50% of the operational cost of a human per case. That range is based on the spread between offshore and nearshore worker costs, and it doesn't account for a lot of the overhead of a human-powered service organisation (in the jargon, it isn't a fully loaded cost).
I believe the real cost is about 20% once dev expenses are included - but that's just my view of where, in between those bounds, things will come to rest.
Now, that's not a hundredth. In terms of quality, there are things it can't do, and despite our architecture (which is aimed at managing the deficiencies of LLMs) we still see some hallucinations creeping through. For example, our encoder has problems with directionality, as in it will write text like "average transaction value declined from $150 to $154 in October." We can catch (in our tests, anyway) all the mistakes about the values themselves, but the textual phrasing is hard to check - hard enough that I don't think the value of the system justifies the effort.
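For what it's worth, the value-level checks can be mechanical; directionality is where it gets ugly, because the claim only checks out if the verb and the numbers land in a pattern you anticipated. A toy sketch (the verb lists and the single regex are illustrative assumptions, not our actual pipeline):

    import re

    # Verbs that assert a direction of change; illustrative lists only.
    UP_VERBS = {"increased", "rose", "grew", "climbed"}
    DOWN_VERBS = {"declined", "decreased", "fell", "dropped"}

    # Matches phrases like "declined from $150 to $154".
    PATTERN = re.compile(
        r"(?P<verb>\w+)\s+from\s+\$(?P<a>[\d,.]+)\s+to\s+\$(?P<b>[\d,.]+)"
    )

    def check_directionality(text: str) -> list[str]:
        """Flag phrases whose verb contradicts the numeric change."""
        problems = []
        for m in PATTERN.finditer(text):
            a = float(m.group("a").replace(",", ""))
            b = float(m.group("b").replace(",", ""))
            verb = m.group("verb").lower()
            if verb in DOWN_VERBS and b > a:
                problems.append(f"'{m.group(0)}' says down, value went up")
            elif verb in UP_VERBS and b < a:
                problems.append(f"'{m.group(0)}' says up, value went down")
        return problems

    print(check_directionality(
        "Average transaction value declined from $150 to $154 in October."
    ))
    # -> ["'declined from $150 to $154' says down, value went up"]

And that's the problem in miniature: the numbers are checkable, but a model can phrase a directional claim a dozen ways that no single pattern catches, which is why the phrasing-level check doesn't pay for itself.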
I think, from customer feedback, that this sort of thing will be OK for the apps we're building, but it's a real problem with this generation of models, and it's not clear to me that it will be solved (although, like everyone else, I was blindsided by the jump from GPT-3 to GPT-4, so who knows).
Really interesting insights, and a great comment.
I expect the technology to accelerate, with dramatic leaps in accuracy and geometric improvements in LLMs (larger models and better hardware alone will improve them substantially, and that's already coming to market in 2024-2025).
But logically, since all that's needed is improved accuracy, the fix is more likely to come from the models improving than from any change in human behavior.