Hacker News | sorenjan's comments

I disagree, I don't want another ffmpeg binary, I already have one. Winget works well, especially since this is already a terminal program.

I don't find trimming videos with ffmpeg particularly difficult, it's basically just `-ss xx -to xx -c copy`. Sure, you need to get those timestamps using a media player, but you probably already have one, so that isn't really an issue.
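A minimal sketch of that stream-copy trim. The synthetic source and the forced short GOP (`-g 25`) are only there to make the example reproducible; with `-c copy` the cut snaps to the nearest keyframe, so real footage with sparse keyframes will cut less precisely:

```shell
# Build a 10 s test clip with a keyframe every second, then stream-copy a trim.
ffmpeg -y -v error -f lavfi -i testsrc=duration=10:rate=25 \
  -pix_fmt yuv420p -g 25 in.mp4
# Trim [2,5): fast, no re-encode, but keyframe-aligned.
ffmpeg -y -v error -ss 2 -to 5 -i in.mp4 -c copy out.mp4
```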

What I've found to be trickier is dividing a video into multiple clips, where one clip can start at the end of another, but not necessarily.


> I don't find sharing files with people very difficult, just log in to your FTP and give an account to another user.

- Person commenting on OneDrive

Missed opportunity to reference the famous Dropbox hn comment.

I just think there are other closely related use cases where a separate program can add more value, especially in the terminal. I wouldn't suggest most people use ffmpeg instead of a GUI, those are too dissimilar. Another example is cutting out a part of a video: with ffmpeg you need to make two temporary videos and then concatenate them, and that process would greatly benefit from a better UX.


Point of order: the Dropbox HN comment is famously misconstrued. People think it was about Dropbox; it was about the Dropbox YC application, and was both well-intentioned and constructive.

> with ffmpeg you need to make two temporary videos and then concatenate them

It can be done in a single command, no temp files needed.
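One way to do it in a single command (a sketch; this re-encodes rather than stream-copies) is the `select` filter, dropping the unwanted time range and rebuilding contiguous timestamps with `setpts`:

```shell
# Make a 6 s test clip, then remove t=[2,4] from it in one pass.
ffmpeg -y -v error -f lavfi -i testsrc=duration=6:rate=25 -pix_fmt yuv420p in.mp4
# select keeps frames outside [2,4]; setpts renumbers so playback is contiguous.
ffmpeg -y -v error -i in.mp4 \
  -vf "select='not(between(t,2,4))',setpts=N/FRAME_RATE/TB" \
  -an cut.mp4
```

For a clip with audio you'd pair this with `aselect`/`asetpts` on the audio stream.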


There's nothing easy about it. Here's a taste.

  # make a 6 second long video that alternates from green to red every second.
  ffmpeg -f lavfi -i "color=red[a];color=green[b];[a][b]overlay='mod(floor(t)\,2)*w'" -t 6 master.mp4; # creates 150 frames @ 25fps.

  # try to make a 1 second clip starting at 0sec. it should be all green.
  ffmpeg -ss 0 -i "master.mp4" -t 1 -c copy "clip1.mp4"; # exports 27 frames. you see some red.
  ffmpeg -ss 0 -t  1 -i "master.mp4" -c copy "clip2.mp4"; # exports 27 frames. you see some red.
  ffmpeg -ss 0 -to 1 -i "master.mp4" -c copy "clip3.mp4"; # exports 27 frames. you see some red.

  # -t and -to stop after the limit, so subtract a frame. but that leaves 26...
  # so perhaps offset the start time so that frame#0 is at 0.04 (ie, list starts at 1)?
  ffmpeg -itsoffset 0.04 -ss 0 -i "master.mp4" -t 0.96 -c copy "clip4.mp4"; # exports 25 frames, all green, time = 1.00. success.

  # try to make another 1 second clip starting at 2sec. it should be all green.
  ffmpeg -itsoffset 0.04 -ss 2 -i "master.mp4" -t 0.96 -c copy "clip5.mp4"; # exports 75 frames, time = 1.08, and you see red-green-red.
  # maybe don't offset the start, and drop 2 at the end?
  ffmpeg -ss 2 -i "master.mp4" -t 0.92 -c copy "clip6.mp4"; # exports 75 frames, time = 1.08, and you see green-red.
  ffmpeg -ss 2 -t 0.92 -i "master.mp4" -c copy "clip7.mp4"; # exports 75 frames, time = 0.92, and you see green-red.
  
  # try something different...
  ffmpeg -ss 2 -i "master.mp4" -c copy -frames 25 "clip8.mp4"; # video is broken.
  ffmpeg -ss 2 -i "master.mp4" -c copy -frames 25 -avoid_negative_ts make_zero "clip9.mp4"; # exports 25 frames, all green, time = 1.00. success?
  # try exporting a red video the same way.
  ffmpeg -ss 3 -i "master.mp4" -c copy -frames 25 -avoid_negative_ts make_zero "clip10.mp4"; # oh no, it's all green!

I've never tried doing frame-perfect clips like that, that does sound annoying. But from a cursory read of the source, I don't think this program will solve that issue either? Because the timestamps in your examples are all correct, and the TUI is using ffmpeg with -ss and -t as well.

  func BuildFFmpegCommand(opts ExportOptions) string {
      output := opts.Output
      if output == "" {
          output = generateOutputName(opts.Input)
      }
      duration := opts.OutPoint - opts.InPoint

      args := []string{"ffmpeg", "-y",
          "-ss", fmt.Sprintf("%.3f", opts.InPoint.Seconds()),
          "-i", filepath.Base(opts.Input),
          "-t", fmt.Sprintf("%.3f", duration.Seconds()),
      }

I think the best way of getting frame-accurate clips like that is putting the starting time after the input (or rather, before the output), which decodes the video up to that time and re-encodes it instead of copying. Both of these commands give the expected output:

  ffmpeg -i master.mp4 -ss 0 -t 1 -c:v libx264 green.mp4
  ffmpeg -i master.mp4 -ss 1 -t 1 -c:v libx264 red.mp4

Yer, I noticed that this tool was just doing `-ss -i -t` from its demo gif, which is what prompted me to reply. I'm sure people will discover all sorts of problems that manifest if they don't start a lossless clip on a keyframe. One such scenario: you make a clip that plays perfectly on your PC, but then you send it to someone over FB Messenger, and all of a sudden there's a few seconds of extra video at the start!

You can't make frame-perfect cuts without re-encoding, unless your cut points just so happen to be keyframe-aligned.

There are incantations that can dump metadata for you about the individual packets a given video stream is made up of, ordered by timecode. That way you can sanity-check things.
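One such incantation is ffprobe's packet listing, which shows each packet's timestamps and flags (a `K` in the flags column marks a keyframe). The generated test clip here is just to make the example self-contained:

```shell
# 2 s test clip with a keyframe every second (-g 25 at 25 fps).
ffmpeg -y -v error -f lavfi -i testsrc=duration=2:rate=25 \
  -pix_fmt yuv420p -g 25 in.mp4
# One CSV row per packet: pts_time, dts_time, flags ("K__" = keyframe).
ffprobe -v error -select_streams v:0 \
  -show_entries packet=pts_time,dts_time,flags \
  -of csv in.mp4 > packets.csv
```

That lets you see exactly which timestamps a `-c copy` cut can legally start on.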

This is terribly frustrating. The paths of least resistance lead either to improper cuts or to wasteful re-encoding. Re-encoding just up to the nearest keyframe is surely also possible, but yeah, this does suck, and the tool above doesn't seem to make it any more accessible either, according to the sibling comment.


> Re-encoding just until the nearest keyframe I'm sure is also possible

Yer, I've done that, and it's a pain to do "manually" (ie, without having a script ready to do it for you). I've also manually sliced the bitstream to re-insert the keyframe, which, if applied to my clip5.mp4 example, could potentially reduce the 50* negative-ts frames to maybe 2 or 3. It would be easier if there were tools that could "unpack" and "repack" the frames within the bitstream, and allow you to modify "pointers"/etc in the process - but I don't know of any such thing.

For frame-perfect cuts you need to re-encode. You can use lossless H264 encoding for intermediary cuts before the final one so that you don't unnecessarily degrade quality.
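A sketch of that workflow: x264 with `-qp 0` is mathematically lossless, so intermediate cuts don't compound generation loss, and only the final export re-encodes lossily. The test source is synthetic just to keep the example self-contained:

```shell
# 4 s test source.
ffmpeg -y -v error -f lavfi -i testsrc=duration=4:rate=25 -pix_fmt yuv420p src.mp4
# Frame-accurate intermediate cut, losslessly encoded (-qp 0).
ffmpeg -y -v error -i src.mp4 -ss 1 -t 2 \
  -c:v libx264 -qp 0 -preset ultrafast cut_lossless.mp4
# Final export, one lossy encode.
ffmpeg -y -v error -i cut_lossless.mp4 -c:v libx264 -crf 23 final.mp4
```

Note the lossless intermediates can get very large, so this is best for short clips.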

I wonder if there is a solution which would just copy the pieces in between the starting and ending points while only re-encoding the first and last piece as required.
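That approach is sometimes called a "smart cut". A rough sketch of the idea, with the keyframe positions forced so the numbers line up; on real footage you'd first read the actual keyframe times with ffprobe instead of assuming them:

```shell
# Source with keyframes at t = 0, 2, 4 (-g 50 at 25 fps). Goal: cut from t=1.
ffmpeg -y -v error -f lavfi -i testsrc=duration=6:rate=25 \
  -pix_fmt yuv420p -g 50 src.mp4
# Re-encode only the head, from the cut point to the next keyframe: [1,2).
ffmpeg -y -v error -i src.mp4 -ss 1 -to 2 -c:v libx264 head.mp4
# Stream-copy the rest, starting exactly on the keyframe at t=2.
ffmpeg -y -v error -ss 2 -i src.mp4 -c copy tail.mp4
# Join the two without another re-encode.
printf "file 'head.mp4'\nfile 'tail.mp4'\n" > list.txt
ffmpeg -y -v error -f concat -safe 0 -i list.txt -c copy joined.mp4
```

The fiddly part in practice is making the re-encoded head bit-compatible with the copied tail (same codec parameters), which is exactly what tools like LosslessCut's experimental smart-cut mode try to automate.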


FWIW, here's a simple command line utility for joining and trimming the multiple video files produced by a video camera.

https://metacpan.org/dist/App-fftrim/view/script/fftrim


I've been trying to cut precise clips from a long mp4 video over the past week or so and learned a lot. I started with ffmpeg on the command line, but between getting accurate timestamps and keyframe/encoding issues it is not trivial. For my needs I want a very precise starting frame, and the best results came from first re-encoding at much higher quality, then marking & batching with LosslessCut, then transcoding my clips down to the desired quality. Even then there's still some manual review and touch-up. It's not crazy-hard, but by no means trivial or simple.

https://github.com/mifi/lossless-cut


I used a plugin in mpv to do it, but I can't find it anymore. You just pressed a key to mark the start and end, and with . and , you could do it at keyframe resolution, not just seconds.

Found a few links to projects that fit this description in an awesome-mpv repo.

https://github.com/stax76/awesome-mpv?tab=readme-ov-file#vid...

Appreciate you mentioning the MPV route for making clips, I might actually go through and process all the game recordings I saved for clips over the years.


There's mpv-webm, which is great, but has no way to make a lossless clip AFAIK.

Both Russia and Ukraine build millions of drones per year, most of them FPV drones that are basically remote-controlled flying grenades. There's plenty of electronic warfare with radio jamming, so in some places they use drone-mounted spools of fiber-optic cable to control them instead. It's probably been the most impactful weapon type in the war for the past few years.

  > uvx --with pillow --with okmain python -c "from PIL import Image; import okmain; print(okmain.colors(Image.open('bluemarble.jpg')))"
  [RGB(r=79, g=87, b=120), RGB(r=27, g=33, b=66), RGB(r=152, g=155, b=175), RGB(r=0, g=0, b=0)]

It would make sense to add an entry point in the pyproject.toml so you can use `uvx okmain` directly.
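Something like the following in pyproject.toml would do it; the module path and function name here are guesses at okmain's layout, so adjust to wherever its CLI entry function actually lives:

```toml
[project.scripts]
okmain = "okmain.__main__:main"
```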

I wonder if one of the LLMs could generate code from a screenshot of a layout designed by this.

Claude Code built a TUI for me last night, in this case to step through nanosecond-timestamped ITCH market data messages and rebuild an order book visual in the terminal. This type of stuff would have taken a day, but it's done in 5 minutes now.

You can right click on it and choose "Show controls", at least in Firefox.

Oh, that's odd, it didn't show up in Chrome when I first tried it, but it does now. I was wondering how they'd managed to hide the video context menu.

It's probably just a <video> element without the "controls" attribute.

https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/...

> controls

> If this attribute is present, the browser will offer controls to allow the user to control video playback, including volume, seeking, and pause/resume playback.
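A minimal sketch of what such a page likely does (the filename and the extra attributes here are illustrative, not taken from the actual site):

```html
<!-- No "controls" attribute, so the browser draws no playback UI;
     autoplay + muted + loop makes it behave like an animated gif. -->
<video src="demo.mp4" autoplay muted loop playsinline></video>
```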

Edit: I misunderstood, you are asking

> how they'd managed to hide the video context menu

Not sure, but it works in FF for me


It's entirely possible I did something to it accidentally that made the context menu not work properly. I had the dev tools open to pull the actual video address when I right-clicked, so I might have messed something up. Or maybe the devs are secretly looking at the comments and fixed it between me and you trying :P

It won't let me reply to the parent's child comment, but I wanted to say:

That is what HN is for!


Planet Labs has a solution specifically for ships.

https://www.planet.com/pulse/illuminate-the-dark-fleet-with-...


Dang, that is hard to do, 4 pixels of orange to work with.

But that's not something you'd use an LLM for. There have been computer vision systems sorting bad peas for more than a decade[0], and of course there are plenty of use cases for very fast inspection systems. But when would you use an LLM for anything like that?

[0] https://www.youtube.com/watch?v=eLDxXPziztw


Nobody said you would use an LLM for that. It's an example of a process where "industrial inspection, in particular, [would] benefit from lower latency in exchange for accuracy".

The point of their comment isn't that you would use an LLM to sort fruit. It was just an illustrative example.


The discussion was about fine-tuned Qwen models, not industrial inspection in general. I would also find it interesting to learn what kind of edge-AI industrial inspection task you could do with fine-tuned LLMs, not some handwavy answer about how sometimes latency is important in real-time systems. Of course it is, which is why you generally don't use models with several billion parameters unless you need to.


The thread you're in broke away from the main discussion topic.

Again: Nobody is using LLMs to (for example) sort fruit. But there are some industrial processes that prioritize latency over reliability.


No, we are literally trying to find a use case where using a lower accuracy LLM makes sense for a vision task.

But fine: what are these industrial processes that prioritize latency over reliability, where using an LLM, as mentioned by the OP, makes sense?


> No, we are literally trying to find a use case where using a lower accuracy LLM makes sense for a vision task.

They're reconfigurable on the fly with little technical expertise and without training data, which is really useful. Personally, in projects for people, I've found these models have fewer unusual edge cases than traditional models, are less sensitive to minor changes in input, and are easier to debug by asking them what they can see.


Seems like using a sledgehammer to hammer in screws, and inviting nondeterminism into important systems. Besides being way larger and more complex than what most specialized industrial processes need, they are also vulnerable to adversarial attacks.

https://www.lakera.ai/blog/visual-prompt-injections

https://www.theverge.com/2021/3/8/22319173/openai-machine-vi...


> Seems like a way to use a sledgehammer to hammer in screws

The lazy analogy the other way is that developing a custom system for these jobs is like hiring a team of experts to spend 2 years designing the perfect crosshead screwdriver that fits exactly one screw (and doesn't work if the screw starts slightly rotated) when you have a flathead one right next to you that'll work, and work right now.

> and inviting nondeterminism in important systems.

Traditional ML is just as non-deterministic.

> they are also vulnerable to adversarial attacks.

Typically not relevant in these kinds of cases, but this is also easily a problem in many traditional ML algos.

Have you worked on things like this?


A flathead screwdriver is not a valid analogy, because LLMs are big, complicated, opaque machines. And while other ML methods are non-deterministic as well, Gaussian processes, decision trees, or even CNNs are easier to make sense of than these huge black boxes.

And I still haven't seen a single example of anyone actually using a fine-tuned Qwen in industrial inspection, which leads me to believe that nobody is actually using it for that, but some people want to because it's their new favorite toy. You don't need a VLM to count cells in microscopy images, find scratches in painted parts, or estimate output from a log in a sawmill. I can see the use case for things like describing a scene from a surveillance camera, finding a car of a certain model and colour, or other tasks that demand more reasoning or description. But in those cases latency is not super important compared to getting the right output, which was the tradeoff discussed from the start of this thread.

The last thing I'd want to deal with is to have a computer say something like "You're absolutely right, it was wrong of me to classify the metal debris as food".


I've used multimodal LLMs for this sort of task, and if a fine-tuned model got reasonable performance compared to frontier models I'd use that. Running things purely locally lets you massively simplify the overall architecture and data-transfer requirements of some of these tasks, if nothing else, and lower latency means you can report problems much faster (vs transferring images off-device and batch processing).

> The last thing I'd want to deal with is to have a computer say something like "You're absolutely right, it was wrong of me to classify the metal debris as food".

The CNN will do that potentially more often, and it can be because it just hasn't seen enough examples of the debris at that angle, or something else equally irrelevant to a human.


You would use a VLM (vision language model). The model analyzes the image and outputs text, along with general context, that can drive intelligent decisions. https://tryolabs.com/blog/llms-leveraging-computer-vision


They worked best when everyone was a farmer and had to get up early and go to bed early. Now most people don't live their lives centered around noon; our free time comes after our work is done, at around 17:00, so having more light in the evening instead of worthless light in the night makes sense.


That's a myth.

Farmers have to wake up early because their animals wake up at sunrise and some tasks are best performed at that time. So they wake up before sunrise regardless of the clock time.

Humans, like farm animals, are better off if they wake up at sunrise and go to sleep in full dark. At the equator that's easy: wake at 6, bed at 10 PM. And standard work hours are 7-3 or 8-4.


So it sounds like you're actually arguing that the numbers are just a construct, and that we should all just use UTC and set work hours to the times that best correlate with the solar day in our region, rather than adjust the clock approximately 1 hour per 15 degrees of longitude and have an International Date Line.

I think this would make way more sense: when they say the Olympic opening ceremony starts at 18:00, it's 18:00 for everyone around the world. No one has to work out which TZ Italy is in, and scheduling meetings with tech support in far-flung locales doesn't require knowing how far ahead or behind IST is.


Yes. https://en.wikipedia.org/wiki/Sandford_Fleming ( https://www.smithsonianmag.com/smithsonian-institution/sandf... )

> He promoted worldwide standard time zones, a prime meridian, and use of the 24-hour clock as key elements to communicating the accurate time, all of which influenced the creation of Coordinated Universal Time.

The one bit where this would be problematic would be "what day is it?" When does today become tomorrow?

There are a lot of systems we've built that depend on that distinction: things like business days, and running end-of-day so that everything that happens on March 2nd is logged as March 2nd. I've encountered fun with Black Friday sales where the store is open over the midnight boundary and the backend system really wants today to be today rather than yesterday (sometimes this has involved unplugging a register from the network so that it doesn't run end-of-day, running EOD on the store systems, then plugging the register back in after it completes and running a reconciliation).

Other than that particular mess of banks and businesses... yeah, running everything on UTC would be something nice in today's world.

---

This is also kind of what happens in China (with a complicated history). https://github.com/eggert/tz/blob/main/asia#L272

https://en.wikipedia.org/wiki/Time_in_China UTC+08:00 is observed throughout the country even though it spans about 60° of longitude.

---

Aside on "changing clocks" (and realizing my flexible-schedule privilege): at a company I worked at, I switched my schedule from 8-4 to 9-5 with the change in daylight savings so that I maintained a consistent "this is the hour I wake up".


China shows why this is impossible.

When people propose switching to UTC, what they are actually proposing is that everyone nominally switches to UTC but still uses local time informally in everyday life, which is a worse system than time zones. At least with time zones there is a way to know what time it is in any given place. With informal time you lose that.


How so?

Eastern parts of China get up at 05:00 and western parts get up at 10:00.

People get used to it.


Local time tells you things like "when is it a good time to call this person?" Unless the person you're calling is in China.


That's a fair point. And a CRM system should take note of it. Not everyone lives on a 9-to-5 schedule.



> arguing that the numbers are just a construct

Yes.

> and that we should all just use UTC and ...

No. that does not follow. Abstraction is useful. Having commonly understood terms (in this case hours of the day) that share certain traits regardless of where you happen to be in the world facilitates communication.


Right, but where I live, sunrise is in the middle of the night in the summer (around 03:30). Using standard time in the summer gives me one less hour of useful sunlight in the evening, and while it doesn't technically disappear, it gets moved to where I can't use it, because that's when I sleep. It's the same for people further south: another bright hour in the early morning before they wake up is a wasted bright hour that would make more sense in the evening, when most modern humans are awake. The argument "noon should coincide with solar noon" is nonsensical to me; the clock is a social construct and should make sense for how most of us live our lives.


But the social construct of work hours shifted later by more than that one hour over the last century, so this is not what people actually prefer, judging by their actions.


Optimizing for summer is silly. Summer gets lots of daylight already. We need to optimize for winter.


People disagree on whether to prioritize mornings or afternoons in the winter. For the summer, only very few people care if the sun rises at four or five (or whatever), but most people like having long summer evenings. Therefore the summer tips the scales.


Then there are also social activities that you just need to wait for in summer, because they can only happen after sunset. Watching a movie (outside), sitting around a fire, having a party: all of these really happen only after sunset.


The extra hour of daylight in the evening on summer time is even more valuable in the winter.


Is this a distillation of Nano Banana Pro?


Gemini 3.1 Flash Image is based on Gemini 3 Flash.

source: https://deepmind.google/models/model-cards/gemini-3-1-flash-...

