tl;dr: The proportion of Reddit comments containing emdashes has more than doubled since ChatGPT's release!
From the data: for several years before ChatGPT's release, the proportion of Reddit comments containing emdashes was about 0.13-0.17%. In the few years since ChatGPT came out, it has been between 0.17% and 0.41%.
I noticed it shortly after commenting, and completely rewrote my comment accordingly. Excellent research! If you were to do a write-up on how you did this analysis, that would be very interesting (as the number of comments involved is large).
The "dev notes" in the top right links to https://intervolz.com/emdash-observer-writeup/
I downloaded torrents of reddit comments, and processed them in Go, written using AI assistance. Then Intervolz did this thing and wrote it up.
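For anyone curious what the core of such a count looks like, here is a minimal sketch. The original was written in Go; this is Python purely for illustration. It assumes the dump is newline-delimited JSON with the comment text in a `body` field. That layout and the function name `emdash_proportion` are my assumptions, not details from the write-up.

```python
import json
from typing import Iterable

EM_DASH = "\u2014"  # the em-dash character, written as an escape


def emdash_proportion(lines: Iterable[str]) -> float:
    """Return the fraction of comments whose body contains an em-dash.

    Assumes one JSON object per line with a "body" field (hypothetical
    layout). Blank or malformed lines are skipped, not counted.
    """
    total = 0
    with_dash = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            comment = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(comment, dict):
            continue
        body = comment.get("body")
        if not isinstance(body, str):
            continue
        total += 1
        if EM_DASH in body:
            with_dash += 1
    return with_dash / total if total else 0.0
```

Because it takes any iterable of lines, the same function works on an open file handle or a decompression stream, so a large dump never has to fit in memory.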
The misleading aspect is that the AI-generated content was in first person, so any reasonable reader would falsely attribute the statement to the person involved, when in fact it was concocted entirely by Meta's AI.
> The Sega Channel was an online game service developed by Sega for the Sega Genesis video game console, serving as a content delivery system. Launched on December 12, 1994, the Sega Channel was provided to the public by TCI and Time Warner Cable through cable television services by way of coaxial cable. It was a pay to play service, through which customers could access Genesis games online, play game demos, and get cheat codes. Lasting until July 31, 1998, the Sega Channel operated three years after the release of Sega's next generation console, the Sega Saturn. Though criticized for its poorly timed launch and costly subscription fee, the Sega Channel has been praised for its innovations in downloadable content and impact on online game services.
Tangential, but I found 'Have I Been Pwned' useless too because you can't enter your email and find leaked passwords associated with the address; instead you have to enter each password (and repeat for every password you want to check).
I know there's an explanation that the raw password is not being sent and instead being hashed locally and only part of the hash is sent. But I don't know how to verify that and it feels wild to type passwords into a random website. (if anyone knows how to verify HIBP does only what it says it does [rather than blindly trust and hope for the best], would love to read more about it)
I always thought that it could be reasonably simple to have a safe alternative. Have people enter a SHA256 of their password instead, and match against a database of other hashes.
Almost everyone interested in checking for password leaks knows how to generate the SHA256 of a string. And those who don't shouldn't put their passwords on the internet.
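For reference, generating that hash locally takes only the standard library. A minimal sketch, where `sha256_hex` is just a name picked for illustration:

```python
import hashlib


def sha256_hex(password: str) -> str:
    """Return the SHA-256 digest of the UTF-8 encoded password as lowercase hex."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()
```

Encoding matters here: both sides would have to agree on UTF-8 (and on whether a trailing newline is included), otherwise the same password produces a different digest.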
Or even better, generate hashes for all passwords in the database, package these hashes together with a simple search script and let people download it. That way, you are not sending any information anywhere, and no one can exploit the passwords, because a hash is a one-way function.
Then again, that download could be really large. I admit I have no idea how much storage that would take. But it's just text, so easily compressible. And with some smart indexing, it should be possible to keep most of it compressed and only unpack a relatively small portion to find a complete match.
Then again, I have virtually no background in cryptography, so there could be something horribly wrong with this.
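A rough sketch of the offline lookup idea: keep the hashes sorted and binary-search them, so nothing leaves the machine. Here an in-memory list stands in for the downloaded file, and `build_sorted_hashes` and `is_leaked` are hypothetical names. A real download would live on disk and likely need the kind of indexing described above.

```python
import bisect
import hashlib
from typing import Iterable, List, Sequence


def sha256_hex(text: str) -> str:
    """SHA-256 of the UTF-8 encoded text, as lowercase hex."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_sorted_hashes(passwords: Iterable[str]) -> List[str]:
    """Hash every leaked password and return the unique digests in sorted order."""
    return sorted({sha256_hex(p) for p in passwords})


def is_leaked(candidate: str, sorted_hashes: Sequence[str]) -> bool:
    """Binary-search the sorted digests for the candidate's hash."""
    target = sha256_hex(candidate)
    i = bisect.bisect_left(sorted_hashes, target)
    return i < len(sorted_hashes) and sorted_hashes[i] == target
```

Sorting is what makes the lookup logarithmic in the number of hashes. The same ordering would also let a file be split into chunks by hash prefix, so only one chunk needs unpacking per query.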
When you do a check on https://haveibeenpwned.com/Passwords the password itself is not sent to the server. Instead the password is hashed locally and the list for the matching hash range is downloaded, which contains all the hashes in that range and their number of occurrences.
The server doesn't receive the password in plain text, nor its full hash.
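To make the range idea concrete, here is a sketch of the client-side half as I understand the documented Pwned Passwords scheme: the password is hashed with SHA-1, only the first five hex characters identify the range, and the rest is compared locally against the downloaded `SUFFIX:COUNT` lines. The function names are made up for illustration.

```python
import hashlib
from typing import Tuple


def split_sha1(password: str) -> Tuple[str, str]:
    """Return (prefix, suffix) of the uppercase SHA-1 hex digest.

    Only the 5-character prefix would be used to request a range;
    the 35-character suffix stays on the client.
    """
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(suffix: str, response_text: str) -> int:
    """Look for the suffix in a range response made of SUFFIX:COUNT lines."""
    for line in response_text.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix and count.strip().isdigit():
            return int(count)
    return 0  # suffix not present in this range
```

With this split, the server learns which range was requested but not which entry in it, if any, matched.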
It would be easy enough to add this as a "secret" feature:
* user submits password
* gets hashed client side
* server compares it against stored hashes
* server also re-hashes the stored hash, and compares it against the hash received from the client
This would effectively mean that either entering the password, or the password hash would correctly match, since when entering the hash you are effectively "double" hashing the password which gets compared to the double hashed password on the server.
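The steps above can be sketched as follows. This is a toy model, not how any real service works: it assumes SHA-256 throughout and that a typed-in hash is the lowercase hex string, and the function names are hypothetical.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """SHA-256 of the UTF-8 encoded text, as lowercase hex."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def client_submit(user_input: str) -> str:
    """What the browser sends: always the hash of whatever was typed."""
    return sha256_hex(user_input)


def server_matches(received: str, stored_hash: str) -> bool:
    """Match if the user typed the password, or typed its hash.

    Typed password -> received == H(p) == stored_hash.
    Typed hash     -> received == H(H(p)) == sha256_hex(stored_hash).
    """
    return received == stored_hash or received == sha256_hex(stored_hash)
```

One possible design choice: if the re-hashed values were computed once and stored alongside the originals, the second comparison would become a plain lookup, at the cost of extra storage.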
The upside is that users who don't understand hashing or don't feel like opening a sha256 tool wouldn't have to change their behavior or even be confused by a dialog explaining why they should hash the input, while advanced users could find out about the feature via another channel (e.g. hackernews).
The downside would be that it adds an extra hash step to every comparison on the server. It's hard to know how expensive this would be for them.
Care to explain how you can tell what scripts gp was sent for the page https://haveibeenpwned.com/Passwords and what scripts he will be sent on future visits?
Well of course a hostile actor could use this incredibly accessible resource to test a bunch of emails and find their passwords.
Though perhaps there could be a service where you enter an email address and it sends an email to that address containing the passwords. That would be a slightly more complicated server to set up, though.
> (if anyone knows how to verify HIBP does only what it says it does [rather than blindly trust and hope for the best], would love to read more about it)
I recall HIBP documents their hashing protocol so that it should be possible to have a non-web client you can trust more.
I don't know how to verify what the website does, but I think that in a few minutes I'll be able to put together a curl call that does what we're hoping the website does.
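In the same spirit, a small non-web client might look like the sketch below. It assumes the publicly documented range endpoint at `https://api.pwnedpasswords.com/range/` and its `SUFFIX:COUNT` response lines. The `fetch` parameter is my own addition so the network step can be swapped out or logged, which makes it easy to see exactly what gets sent.

```python
import hashlib
import urllib.request
from typing import Callable

RANGE_URL = "https://api.pwnedpasswords.com/range/"


def fetch_range(prefix: str) -> str:
    """Download the list of hash suffixes for a 5-character SHA-1 prefix."""
    request = urllib.request.Request(
        RANGE_URL + prefix,
        # Identifies the client; harmless if the endpoint does not require it.
        headers={"User-Agent": "hibp-range-sketch"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def pwned_count(password: str, fetch: Callable[[str], str] = fetch_range) -> int:
    """Return how often the password appears in the range, or 0 if absent.

    Only the first five hex characters of the SHA-1 digest are passed
    to `fetch`; the comparison against the suffix happens locally.
    """
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    prefix, suffix = digest[:5], digest[5:]
    for line in fetch(prefix).splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix and count.strip().isdigit():
            return int(count)
    return 0
```

Calling `pwned_count("some password")` performs a real network request. Passing a custom `fetch` that records its argument is one way to confirm that only the prefix leaves the machine.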
Delivery apps like Grab and Uber Eats are even worse since they have even more perverse incentives (minimising delivery time and maximising 'sponsored' listings).
Other than being willing to scroll a lot, I haven't found any great ways to find new restaurants when using delivery apps, and I'm sure I use them far less because of the tedium involved. I think scraping listings and re-doing the algorithm yourself (as per post) is perhaps the best approach. E.g., just being able to rank by user rating and filter for no less than 200 reviews and within 5km would be an outstanding improvement on the status quo, which is always the 50 closest restaurants to the delivery address - what a coincidence! - with a few 'sponsored' listings thrown in.
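As a sketch of that ranking, assuming the listings have already been scraped into records with a rating, a review count and a distance (the `Restaurant` fields and the `rank` function are hypothetical, not any app's real data model):

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Restaurant:
    name: str
    rating: float        # average user rating
    review_count: int    # number of reviews behind the rating
    distance_km: float   # distance from the delivery address


def rank(
    restaurants: Iterable[Restaurant],
    min_reviews: int = 200,
    max_km: float = 5.0,
) -> List[Restaurant]:
    """Keep well-reviewed, nearby places and sort by rating, best first.

    Ties on rating are broken by review count (more reviews first).
    """
    eligible = [
        r for r in restaurants
        if r.review_count >= min_reviews and r.distance_km <= max_km
    ]
    return sorted(eligible, key=lambda r: (-r.rating, -r.review_count))
```

The thresholds are parameters so they can be loosened in areas where few places reach 200 reviews.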
> Bypassing the Mouse.. I use Vimium in the browser.
Vimium seems great for navigation.
Is there any way to get vim keybindings inside text boxes? (I looked at 'wasavi' chrome extension which hasn't been updated in 8 years [0] and the website's down [1])
You could use real vim in there with GhostText, but it's not a native integration; you'd have a separate editor window.
Another upside is that (if your editor is properly set up to not lose data) a page crash will never lose your precious, long, carefully crafted comment, since it will persist in the editor.
cc: @dang