You could always ask your parent company to let you train on their usage. I hear they have incredibly massive codebases: Windows, Office, MSSQL, which stay out of the training data for some reason.
I thought neural nets never repeat the training data verbatim, and copyright does not pass through them, so what's the problem?
Those sellers never disappeared; although I'm not from Paraguay, the situation is familiar. These days they're selling desktops built on 10+ year old Xeons, which you can buy dirt cheap on AliExpress, installed on frankenstein motherboards from noname Chinese manufacturers that are desktop-oriented but take server processors. The graphics card is something old like an RX 480 that's been run into the ground by years of crypto"currency" mining, then resoldered onto a new board, also often made by Chinese manufacturers you've never heard of.
Graphics cards especially are very unreliable and frequently die within a few months of purchase. But when you can buy a whole PC for the price of one modern video card, many don't have a choice.
The notion that GPU chips can be "run into the ground" by years of crypto mining or AI workloads has been debunked pretty thoroughly by now. The hardware is quite resilient; it doesn't really fail at a higher rate.
I bought a XFX RX 580 that was used for cryptomining from eBay, used it for 3 years then handed it down to my son who used it for 2 more. It still worked when he removed it. Can confirm.
I used an AMD Radeon 7770 for years, then I upgraded to a GTX 1060, and the 7770 became my younger sister's GPU for her PC. It only recently gave out, after some 14 years of service.
Likewise for the 1060, it's still going strong. I upgraded to a 3060 around the time my younger brother decided he wanted a PC, so he's now using it without any issues. About 10 years of use out of it, plenty more to go.
GPUs are pretty damn resilient if you aren't pissing around with them.
I think the problem is the distinction here between chips and boards. The entire GPU assembly can absolutely be worn down from continuous use, thermal pads, paste, VRMs, fans do degrade. The chip itself may be fine but it's very rare to find anyone willing to transplant a GPU from one board to another.
Also, any miner worth their salt knows to undervolt: you save power (= money), run cooler, and last longer, while still running at close to full speed, or in some cases at 100% full speed, depending on the silicon "lottery".
Well, a lot of people here would have loved to have 10-year-old Xeons in their motherboards; while power hungry, they'd make decent CPUs thanks to their large caches. But no, there are no Xeons on offer here. What people get now are Intel Pentium- and Celeron-branded CPUs, or N-class CPUs, with only the onboard GPU, 4 GB of RAM and a 1 TB HDD running unlicensed Windows, with predictable results. But when you are a digitally illiterate parent seeking to purchase a first PC for your school-age children, this looks attractive enough at that price point.
Don't look at the branding. Look at the core type, count, and speed (maybe).
It's been a while since I shopped Intel, but they used to typically release a low-core-count, lower-clocked Pentium/Celeron on the mainstream cores, often with no hyperthreading. These were typically low cost and could be good value: you'd get decent single-core performance because it's the newest architecture, multicore performance would be iffy, but you can't have everything.
> N-class CPUs
These are definitely worth avoiding most of the time. Usually twice the cores, but much less performance per clock. Never feels fast for interactive work. But they make sense for some situations. Some of these get an n3 branding to trick people looking for i3s.
> These are definitely worth avoiding most of the time.
They may not be ideal for desktops, but they are great low power home server CPUs. In fact, they are much better than ARM alternatives like Raspberry Pis for the money.
E.g. gowinston.ai gives a 98% probability that the comment is human-written. LLM detectors aren't always correct, of course, but their detection accuracy on pure LLM text can be high (in the high 90s percent).
Do you have some specific techniques or strategies for LLM text detection? Have you validated them?
Nothing from Etymotic has ever failed me. The current ER3 SE has been going for 7 years, and the cable is replaceable when/if it fails (they're still on the original cable).
All Etys have a peculiar love/hate neutral sound profile, so you should try them before committing to them. I exclusively listen to podcasts, so they're a perfect match.
Very pleased to see such performance improvements in the era of Electron shit and general contempt for users' computers. One of the projects I'm working on has been going for many years (since before React hooks were introduced), and I remember building it back in the day with tooling that was considered standard at the time (vanilla react-scripts, assembled around Webpack). It took maybe two minutes on a decent developer desktop, and old slow CI servers were even worse. Now Vite 8 builds it in about a second on comparable hardware. Another demonstration of how many resources we're collectively wasting.
> Very pleased to see such performance improvements in the era of Electron shit and general contempt for users' computers.
Luckily, we have invented a completely new nightmare in the form of trying to graft machine-usable interfaces on top of AI models that were specifically designed to be used by humans.
It is especially weird because JavaScript was never supposed to be processed at all! This is all wrong if you ask me. Web development should strive to run unchanged sources in the browser. TypeScript was also specifically designed so that an engine could strip the types and execute the resulting code. These build tools should not exist in the first place.
Jobs’ complaint wasn’t actionscript the language, it was the security and performance nightmare of the Flash runtime.
Though it’s hard to imagine what the web would look like if the language had become the standard. JS is a pain but AS was even less suitable for general purpose compute.
And the "performance nightmare" is ironic from today's perspective, as the Flash player wasn't actually slow at all! The problem was Safari's inability to handle plugins well, especially on mobile devices. Today's mobile apps, JavaScript-heavy applications and websites are far more performance-hungry.
Both not minifying and including unenforced type hints consume a little bandwidth, though this can be largely offset by compression. It's an engineering trade-off against the complexity of getting source maps working reliably for debugging and alerting.
If I am shipping a video player or an internal company dashboard how much of my time is that bandwidth worth?
This feels like a ridiculous thread that captures everything wrong with the modern JavaScript ecosystem.
It's grown into a product of cults and attempted zingers rather than pragmatic or sensible technical discussions about what we should and shouldn't expect to be able to do with an individual programming language.
edit: to clarify, I assume there needs to be a basic level of comprehension of programming languages to debate the nuance of one, and if you can't think of a single reason why someone would want types removed, that's a possible indicator you don't have that necessary level yet; I think the most effective way for you to learn is to Google it. Sorry for coming across as rude if you genuinely don't know this stuff.
If you already know many reasons as to why types would be removed, then it seems disingenuous to ask that question, other than to make the point that you feel types shouldn't be stripped. If you think that, say it, and explain why you think they shouldn't be stripped.
The current state of JavaScript is that you _have_ to remove types; I was pointing out that I can think of reasons why I sometimes wouldn't want to. (Admittedly in a glib manner, though on this site many prefer that to four paragraphs.)
Hear me out: javascript integrates types as comments (ignored by default) in its standard and engines start to use types as performance / optimization hints. If you mistype, your program runs, but you get worse performance and warnings in console. If you type correctly, your program runs more efficiently. We already have different levels of optimization in V8 or JSC, why can't they use type hints to refine predictions?
> TypeScript also was specifically designed so engine could strip types and execute result code. These build tools should not exist in the first place.
More recently, it's been designed so this is the case. Namespaces, enums, and constructor parameter properties were all added relatively early on, before the philosophy of "just JS + types" had been fully defined.
These days, TypeScript will only add new features if they are either JavaScript features that have reached consensus (stage 3 iirc), or exist at the type system only.
There have been attempts to add type hints directly to JavaScript, so that you really could run something like TypeScript in the browser directly (with the types being automatically stripped out), but this causes a lot of additional parsing complexity and so nothing's really come of it yet. There's also the question of how useful it would even be in the end, given you can get much the same effect by using TypeScript's JSDoc-based annotations instead of `.ts` files, if you really need to be able to run your source code directly.
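For reference, a minimal sketch of the JSDoc approach mentioned above (this is documented TypeScript behavior, enabled per-file with `// @ts-check` or project-wide with `checkJs`): the file is plain JavaScript that any engine runs unchanged, while the compiler still type-checks it.

```javascript
// @ts-check
// Plain .js file: the types live entirely in comments, so node or a
// browser runs it as-is, while `tsc` (or an editor) still checks it.

/**
 * @param {number} a
 * @param {number} b
 * @returns {number}
 */
function add(a, b) {
  return a + b;
}

/** @type {number[]} */
const xs = [1, 2, 3];

console.log(xs.map((x) => add(x, 10))); // logs [ 11, 12, 13 ]
```

Passing, say, a string to `add` would be flagged by the checker but would still execute, which is exactly the "types as erasable annotations" property the browser proposals are after.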
> TypeScript also was specifically designed so engine could strip types and execute result code.
That's no less a build step than concatenating, bundling, minifying, etc. When people say "I'm against processing code before deploying it to a web site" but then also say "TypeScript is okay though" or "JSX is okay though," all they're really saying is "I like some build steps but not others." Which is fine! Just say that!
> It is especially weird because JavaScript was not supposed to be processed at all! This is all wrong if you ask me.
You're not actually suggesting that technology can't evolve, are you? Especially a technology whose original design goal was processing basic forms, and which is now being used to build full-blown apps?
It's absolutely wild to me that with everything that has happened in the last 2 decades with regard to the web there are still people who refuse to accept that it's changed. We are building far bigger and more complex applications with JavaScript. What would you propose instead?
If you want to make ultra-complicated clients, I assume that's what WebAssembly is heading towards. And it doesn't limit you to a poorly evolved language that wasn't intended for ultra-complicated software in the first place, or even force you to use that poorly evolved language on a server if you need to run the same logic in both places.
I've also been running (neo)vim as a manpager. You get the same features as with vim (like easily copying text or opening referenced files/other manpages without using the mouse), but neovim also parses the page and creates a table of contents, which can be used for navigation within the page. It doesn't always work perfectly, but is usually better than nothing.
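For anyone wanting to try it, the usual setup is a one-liner in your shell profile (this recipe is described in neovim's `:help :Man`):

```shell
# Use neovim as the man pager; +Man! tells nvim to format
# the page it receives on stdin with its built-in man plugin.
export MANPAGER='nvim +Man!'
```

After that, `man gzip` opens in nvim with `gO` showing the table of contents and `K`/`Ctrl-]` jumping to referenced pages.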
Because it's far more reliable to use proper parsers instead of a bunch of regular expressions. Most languages cannot be properly parsed with regexes.
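A toy illustration of why: balanced delimiters form a non-regular language, so a regex only handles nesting up to whatever depth is hardcoded into it, while even a degenerate one-pass parser has no such limit.

```javascript
// A regex that tries to match "balanced parens" only works to a fixed
// depth -- this one handles at most two levels of nesting:
const twoDeep = /^(\((\((?:[^()])*\)|[^()])*\))$/;

// The real check: a one-pass depth counter (the degenerate case of a parser).
function balanced(src) {
  let depth = 0;
  for (const ch of src) {
    if (ch === '(') depth++;
    if (ch === ')') depth--;
    if (depth < 0) return false; // closed before opened
  }
  return depth === 0;
}

console.log(twoDeep.test('((a))'));   // true  -- within the hardcoded depth
console.log(twoDeep.test('(((a)))')); // false -- the regex gives up here
console.log(balanced('(((a)))'));     // true  -- the parser doesn't care
```

Real grammars pile dozens of such nesting constructs on top of each other (strings inside interpolations inside strings, etc.), which is why highlighters built from regex stacks misfire where tree-sitter doesn't.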
Those files are compiled tree-sitter grammars, read up on why it exists and where it is used instead of me poorly regurgitating official documentation:
That would waste CPU time and introduce additional delays when opening files.
They could probably lazily install the grammars like neovim does, but as someone who doesn't have much faith in the reliability of internet infrastructure, I'll personally take it...
Just ran `:TSInstall all` in neovim out of curiosity, and the results were predictable:
If disk space is important for your use case, I guess filesystem compression would save far more than just compressing binaries with upx. btrfs+zstd handle those .so well:
$ compsize ~/.local/share/nvim/lazy/nvim-treesitter/parser
Type Perc Disk Usage Uncompressed Referenced
TOTAL 11% 26M 231M 231M
$ compsize /usr/lib/helix/runtime/grammars
Type Perc Disk Usage Uncompressed Referenced
TOTAL 12% 23M 184M 184M
For real parsing, a proper compiler (via a language server implementation) should be used. Writing something by hand can't work properly, especially for languages like C++ and Rust with complex includes/imports and macros. Newer LSP revisions support syntax-based highlighting/colorizing, and if a particular LSP implementation doesn't support it, a regex-based fallback is mostly fine.